Turn ordinary camera feeds into an autonomous, reasoning command center — with local AI, no cloud, and no data leaving your network.
Sentigon ingests live camera feeds and runs them through a fleet of 12 cooperating AI agents (perception → reasoning → action → supervision) that detect, verify, reason about, and respond to physical-security threats in real time. Every model — vision, language, OCR, audio — runs locally. There is no cloud, no API key, and no telemetry. Your footage stays yours.
Most "AI security" products are a thin wrapper around a cloud API: your footage leaves the building, you pay per frame, and a "person detected" box is the extent of the intelligence. Sentigon takes the opposite approach.
| | Typical AI surveillance | Sentigon |
|---|---|---|
| Where the AI runs | Someone else's cloud | Your hardware, fully offline |
| Your footage | Uploaded and retained | Never leaves your network |
| Cost model | Per-frame / per-camera SaaS | Free — open and self-hosted |
| Intelligence | Bounding boxes | Agents that reason, verify, and act |
| False positives | You drown in them | An adversarial verifier filters them out |
| Lock-in | Proprietary | Open and self-hosted |
A single self-hosted platform that replaces a rack of disconnected tools.
Perception — see everything
- State-of-the-art detection: RT-DETR (transformer, NMS-free), YOLO11, YOLO-World open-vocabulary ("knife", "person on the ground", "fire" by text prompt, no retraining), BoT-SORT re-ID tracking, and SAM2 mask segmentation for occlusion.
- Structured scene intelligence: a local vision model (qwen2.5-VL) produces a real scene graph (objects, attributes, relationships), captions, activities, and an evidence-calibrated threat assessment.
- Behavior over time: geometric, hallucination-free temporal detection of loitering, running, falls (pose-based), and abandoned objects — behaviors that only exist across frames.
- Real ALPR (local EasyOCR plate reading), audio event detection (gunshot / glass-break / scream / alarm), and CLIP appearance embeddings.
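The geometric approach to behavior over time can be illustrated with loitering. The following is a minimal sketch, not Sentigon's actual code: it assumes a tracker already yields timestamped centroids per track, and `TrackPoint`, `is_loitering`, and the radius and dwell defaults are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class TrackPoint:
    t: float  # timestamp in seconds
    x: float  # centroid x in pixels
    y: float  # centroid y in pixels


def is_loitering(track, radius_px=80.0, min_dwell_s=30.0):
    """Return True if the track stays within radius_px of an anchor point
    for at least min_dwell_s seconds. Thresholds are illustrative."""
    if not track:
        return False
    anchor = track[0]
    for p in track[1:]:
        if hypot(p.x - anchor.x, p.y - anchor.y) > radius_px:
            # The subject moved away: restart the dwell timer from here.
            anchor = p
        elif p.t - anchor.t >= min_dwell_s:
            return True
    return False
```

Because the decision depends only on geometry and time, it cannot hallucinate: the same track always produces the same answer.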
Reasoning — connect the dots
- Adversarial threat verifier: a second, skeptical model re-examines every flagged threat and tries to refute it. Only the survivors become alerts — the single biggest lever against alert fatigue.
- Escalation chains: recognizes a sequence on one person (loiter → test door → approach) and escalates it long before any single step would.
- Trajectory prediction, cross-camera entity tracking and re-ID, real-time BOLO appearance/plate matching, and semantic "looks-like" forensic search.
- Adaptive thresholds that learn each camera's normal and stop crying wolf in naturally-busy areas.
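An escalation chain such as loiter → test door → approach can be sketched as ordered-sequence matching per person. This is an assumption-laden illustration rather than the project's implementation: the event labels, the `EscalationTracker` class, and the 10-minute window are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event labels mirroring the loiter -> test door -> approach example.
CHAIN = ("loiter", "test_door", "approach")


class EscalationTracker:
    def __init__(self, chain=CHAIN, window_s=600.0):
        self.chain = chain
        self.window_s = window_s  # illustrative time budget for the whole chain
        # person_id -> (index of the next expected step, time the chain started)
        self._progress = defaultdict(lambda: (0, None))

    def observe(self, person_id, event, t):
        """Feed one event; return True when the full chain completes in the window."""
        idx, started = self._progress[person_id]
        if started is not None and t - started > self.window_s:
            idx, started = 0, None  # the chain went stale, start over
        if event == self.chain[idx]:
            if idx == 0:
                started = t
            idx += 1
        if idx == len(self.chain):
            self._progress[person_id] = (0, None)
            return True
        self._progress[person_id] = (idx, started)
        return False
```

Unrelated events in between are ignored, so the chain can complete even when the steps are interleaved with other activity.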
Action — do something about it
- Autonomous response pipeline (incident recording → SOP playbook → operator dispatch → emergency-services lookup) with a shadow-mode safety gate.
- SOC Copilot: an agentic, tool-using chat that answers questions like "what's happening on the loading dock right now?" with a reasoned, data-grounded answer.
- SOP execution, compliance forecasting, predictive analytics, and red-team self-testing.
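One way to picture a shadow-mode safety gate is a wrapper that records every step of the response pipeline but only executes it when shadow mode is off. The sketch below is a guess at the idea, not the project's code; `ResponseGate` and the step names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResponseGate:
    shadow_mode: bool = True                 # default to the safe setting
    log: list = field(default_factory=list)  # (step name, outcome) pairs

    def run(self, step_name, action, *args):
        """Record every step; call the action only when shadow mode is off."""
        if self.shadow_mode:
            self.log.append((step_name, "skipped (shadow)"))
            return None
        result = action(*args)
        self.log.append((step_name, "executed"))
        return result


# Usage: the pipeline steps named above, with a stand-in action.
gate = ResponseGate(shadow_mode=True)
for step in ("record_incident", "run_sop_playbook", "dispatch_operator"):
    gate.run(step, print, f"would run {step}")
```

In shadow mode the log shows what the pipeline would have done, which makes it possible to review the automation before letting it act.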
Supervision
- A SENTINEL Cortex agent orchestrates the fleet, maintains the security posture, issues directives, and synthesizes shift briefings.
A purpose-built mission-control interface, not a generic admin dashboard: layered surfaces, monospaced telemetry, status LEDs, a live command bar, and real-time agent feeds across 60+ operational views.
Spin it up (below) and open http://localhost:3000 to see it live.
Sentigon runs on bare metal — no Docker required. Everything (Postgres, Redis, Qdrant, the backend, and the frontend) comes up with one script.
Prerequisites: Python 3.12, Node 20+, Ollama, and a GPU (recommended, not required).
```bash
# 1. Clone
git clone https://github.com/Sherin-SEF-AI/Sentigon.git
cd Sentigon

# 2. Pull the local models (the only download you need; no API keys)
ollama pull qwen2.5:7b      # reasoning / language
ollama pull qwen2.5vl:7b    # vision

# 3. Backend deps (into a venv)
python3.12 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 4. Frontend deps
cd frontend && npm install && cd ..

# 5. Launch the entire stack
bash start-local.sh
```

Then open http://localhost:3000 and log in:

```text
Email: admin@sentinel.local
Password: changeme123   # change this before exposing it anywhere
```
The backend serves on :8002. On first boot it auto-runs migrations, seeds 165+ threat signatures, loads the detector and agents, and registers any available cameras. Add an RTSP/USB camera from Settings → Cameras.
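To confirm that both services came up, a quick readiness check can probe the two ports mentioned above. This is a small standard-library sketch; the helper names `port_open` and `stack_status` are made up for illustration and are not part of Sentigon.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def stack_status(host="127.0.0.1", ports=None):
    """Map each service name to whether its port accepts connections."""
    # Ports taken from the quick start: frontend on 3000, backend on 8002.
    ports = ports or {"frontend": 3000, "backend": 8002}
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in stack_status().items():
        print(f"{name}: {'up' if up else 'down'}")
```

A port that accepts connections only shows that the process is listening; first-boot migrations and model loading may still be in progress.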
A layered perception → reasoning → action → supervision pipeline:
```mermaid
flowchart LR
    subgraph Ingest["Ingest"]
        CAM["RTSP / USB / ONVIF"]
        IOT["IoT / PACS / Alarms"]
    end
    subgraph Perceive["Perception"]
        DET["RT-DETR / YOLO-World<br/>pose / BoT-SORT / SAM2"]
        VLM["Scene intelligence<br/>(qwen2.5-VL)"]
        AUD["Audio / ALPR / CLIP"]
    end
    subgraph Reason["Reasoning"]
        VERIFY["Adversarial verifier"]
        TEMP["Temporal / escalation<br/>trajectory / BOLO"]
        CORTEX["SENTINEL Cortex<br/>(orchestrator)"]
    end
    subgraph Act["Action"]
        RESP["Autonomous response<br/>SOP / dispatch"]
        COPILOT["SOC Copilot"]
    end
    Ingest --> Perceive --> Reason --> Act
    Reason <--> CORTEX
```
Every box runs locally. The LLMs are Ollama (qwen2.5 / qwen2.5-VL), the detectors are ultralytics (RT-DETR / YOLO / SAM2), and embeddings are CLIP. No external inference calls.
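To make "no external inference calls" concrete, here is a minimal sketch of a request to a local Ollama server using only the standard library. It assumes Ollama's default port (11434) and its `/api/generate` endpoint; the helper names `build_request` and `generate` are hypothetical, and how Sentigon itself talks to Ollama is not shown in this README.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="qwen2.5:7b"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="qwen2.5:7b", timeout=120):
    """Send the prompt and return the generated text (requires Ollama running)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running and the model pulled, `generate("Summarize: door forced open on camera 3")` would return the model's reply as a string.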
Backend — FastAPI (async), SQLAlchemy 2.0 + asyncpg, PostgreSQL, Qdrant (vectors), Redis, Celery, Alembic, JWT/bcrypt, Prometheus, structlog.
AI / CV — Ollama (qwen2.5 / qwen2.5-VL), ultralytics (RT-DETR, YOLO11, YOLO-World, pose, SAM2, BoT-SORT), CLIP, EasyOCR, librosa, OpenCV.
Frontend — Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS v4, Radix UI, Recharts, Leaflet.
Deep dive — agents, services, and the full feature set
The agent fleet (12)
- Perception: Watcher, Detector, Audio Sentinel.
- Reasoning: Threat Analyzer, Tracker, Investigator.
- Action: Responder, Reporter.
- Supervision: SENTINEL Cortex.
- Specialized: Access Guardian, Environmental, Red Team.
Agents communicate over Redis pub/sub channels and call internal tools through a local LLM function-calling loop.
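As an illustration of agent messaging over pub/sub, the sketch below defines a small JSON envelope and a publish helper. The envelope fields, the channel name, and the function names are assumptions, not Sentigon's wire format; `publish_event` accepts any client exposing `publish(channel, message)`, which is the method redis-py's `Redis` client provides.

```python
import json
import time


def encode_event(agent, kind, payload):
    """Serialize an agent event as JSON (hypothetical envelope fields)."""
    return json.dumps(
        {"agent": agent, "kind": kind, "ts": time.time(), "payload": payload}
    )


def decode_event(raw):
    """Parse an event received from a subscription; accepts bytes or str."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


def publish_event(client, channel, agent, kind, payload):
    """Publish through any client with publish(channel, message)."""
    return client.publish(channel, encode_event(agent, kind, payload))


# Usage with redis-py (requires a running Redis; channel name is made up):
#   import redis
#   publish_event(redis.Redis(), "agents.detections", "detector",
#                 "person_detected", {"camera": "dock-1", "confidence": 0.91})
```

Keeping the envelope separate from the transport makes the message format easy to test without a Redis server.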
Notable services
scene_intelligence, threat_verifier, temporal_behavior, escalation_tracker, trajectory_predictor, sam_segmenter, alpr_service, audio_detection_service, bolo_matcher, adaptive_thresholds, baseline_learning, autonomous_response, sop_engine, compliance (with forecasting), forensic_search (semantic), feedback_tuning, entity_tracker, context_fusion.
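Of these services, adaptive thresholds lend themselves to a short illustration. The README does not say how the baseline is learned, so the sketch below swaps in a common technique: a running mean and standard deviation per camera (Welford's algorithm) with an alert limit of mean plus `k` standard deviations. The class name and defaults are hypothetical.

```python
from math import sqrt


class CameraBaseline:
    """Running per-camera baseline of an activity metric, e.g. people per frame."""

    def __init__(self, k=3.0, min_samples=30):
        self.k = k                      # how many standard deviations count as unusual
        self.min_samples = min_samples  # stay quiet until the baseline is learned
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0                  # sum of squared deviations (Welford)

    def update(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    def threshold(self):
        if self.n < self.min_samples:
            return None
        std = sqrt(self._m2 / (self.n - 1))
        return self.mean + self.k * std

    def is_anomalous(self, value):
        limit = self.threshold()
        return limit is not None and value > limit
```

A lobby that normally sees about 20 people ends up with a much higher limit than a storeroom that normally sees none, so the same head count can be routine on one camera and alert-worthy on another.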
Surface area
- 60+ frontend views across Operations, Alerts & Response, Investigation, Detection & AI, Threat Management, Access & Patrol, Analytics & Maps, Compliance, and System.
- Hundreds of API endpoints, WebSocket live feeds, RBAC (admin / analyst / operator / viewer), audit logging, and multi-tenant scaffolding.
Roadmap
- One-command installer and prebuilt model bundle
- Live multi-camera demo dataset
- Deep audio model (PANNs / YAMNet) drop-in to replace the DSP classifier
- Mask-based occlusion re-acquisition (SAM2 video memory)
- Edge deployment guide (Jetson / mini-PC)
Issues, ideas, and PRs are welcome — a new detector, an agent skill, a UI pass, or docs. Open an issue to start a conversation. If the project is useful to you, a star helps others find it.
Built for operators who would rather not send their footage to someone else's cloud.
