Case Study

AI Security Monitoring

Vision AI pipeline for real-time security monitoring on commodity Android hardware.

91%

True positive rate on configured detection scenarios

End-to-end alert latency (seconds)

Locations deployed

Any

Modern Android device — no proprietary hardware

The Challenge

Traditional security cameras record everything and alert on nothing. Operators spend hours reviewing footage for incidents that may or may not have happened. When alerts do trigger, they typically rely on motion detection so primitive that raccoons, passing cars, and lighting changes generate constant false positives, and monitoring staff learn to ignore them.

Sophisticated monitoring systems that actually understand scenes, such as enterprise video analytics platforms, require expensive dedicated hardware and proprietary software licenses, and they come with vendor lock-in. For the client, an operation with three sites, the hardware cost alone was prohibitive, and the configuration complexity meant that adding a new detection scenario required vendor engagement.

The requirement was precise: intelligent monitoring that distinguishes genuine threats from normal activity, running on consumer-grade Android devices that could be mounted anywhere, configured in minutes using plain language, and generating alerts only for events that actually warranted attention.

The Solution

We built a vision AI pipeline that runs on standard Android devices and understands what it sees, not just that something moved. The core architectural insight: modern vision-language models can evaluate a scene against a description written in plain language. Rather than building a rule engine with coded logic for each detection scenario, we built an inference layer that lets operators define rules as natural language prompts.

The pipeline was designed with deduplication as a first-class concern. Alert fatigue is the silent killer of monitoring systems — when operators receive too many alerts for the same event, they stop responding to alerts entirely. We implemented configurable suppression windows, event clustering, and confidence thresholding to ensure every alert that pages an operator represents a genuinely new, high-confidence event.
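As an illustration of how confidence thresholding and a suppression window can work together, here is a minimal TypeScript sketch. It is not the production implementation: the `Detection` shape, the `AlertSuppressor` class, and the configuration field names are all hypothetical, and it assumes the vision model reports a confidence between 0 and 1.

```typescript
// Hypothetical shape of a detection emitted by the vision pipeline.
interface Detection {
  ruleId: string;      // which configured rule matched
  cameraId: string;    // which device produced the frame
  confidence: number;  // model-reported confidence in [0, 1] (assumed)
  timestampMs: number; // capture time, epoch milliseconds
}

// Hypothetical configuration; both values are operator-tunable.
interface SuppressionConfig {
  minConfidence: number;       // detections below this never page anyone
  suppressionWindowMs: number; // quiet period after an alert for the same rule and camera
}

class AlertSuppressor {
  // Time of the last alert actually sent, keyed by "ruleId:cameraId".
  private lastAlertAt = new Map<string, number>();
  private config: SuppressionConfig;

  constructor(config: SuppressionConfig) {
    this.config = config;
  }

  // Returns true if this detection should page an operator.
  shouldAlert(d: Detection): boolean {
    if (d.confidence < this.config.minConfidence) {
      return false; // below the confidence threshold
    }
    const key = `${d.ruleId}:${d.cameraId}`;
    const last = this.lastAlertAt.get(key);
    if (last !== undefined && d.timestampMs - last < this.config.suppressionWindowMs) {
      return false; // same rule and camera, still inside the suppression window
    }
    this.lastAlertAt.set(key, d.timestampMs);
    return true;
  }
}
```

In this sketch a suppressed detection does not extend the window, so a long-running event re-alerts once per window rather than staying silent indefinitely. Whether that is the right behavior depends on the site, which is one reason to keep the window configurable.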

Capture Layer

Standard Android phones mounted as fixed cameras. Custom app handles frame capture, compression, and transmission.

Vision Pipeline

Frames are analyzed by a vision model for scene understanding — not just motion detection. The system understands context: a person standing still is different from a person running, which is different from a person hiding.

Alert Engine

Configurable detection rules using natural language descriptions. "Person in restricted area after 10pm" or "More than 3 people gathered at entrance" are expressed as prompts, not code.
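To make the idea of rules as prompts concrete, the TypeScript sketch below turns an operator's plain-language rule into a prompt and validates a structured reply. It is an assumption-laden illustration, not the deployed code: `DetectionRule`, `RuleVerdict`, `buildRulePrompt`, `parseVerdict`, and the JSON reply format are all hypothetical, and the call to the vision model itself is deliberately left out because it depends on the provider.

```typescript
// Hypothetical rule record: the operator writes only `description`.
interface DetectionRule {
  id: string;
  description: string; // e.g. "Person in restricted area after 10pm"
}

// Hypothetical verdict the model is asked to return as JSON.
interface RuleVerdict {
  ruleId: string;
  matched: boolean;
  confidence: number; // expected in [0, 1]
  reason: string;
}

// Builds the text prompt sent alongside the frame. The capture time is
// included so time-based rules ("after 10pm") can be evaluated.
function buildRulePrompt(rule: DetectionRule, capturedAtIso: string): string {
  return [
    "You are reviewing a single frame from a fixed security camera.",
    `Frame captured at: ${capturedAtIso}`,
    `Rule to evaluate: "${rule.description}"`,
    'Reply with JSON only: {"matched": boolean, "confidence": number between 0 and 1, "reason": string}',
  ].join("\n");
}

// Validates the model's reply; returns null for anything malformed so a
// bad reply can never turn into an alert.
function parseVerdict(rule: DetectionRule, modelReply: string): RuleVerdict | null {
  let raw: unknown;
  try {
    raw = JSON.parse(modelReply);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const { matched, confidence, reason } = raw as {
    matched?: unknown;
    confidence?: unknown;
    reason?: unknown;
  };
  if (typeof matched !== "boolean" || typeof confidence !== "number" || typeof reason !== "string") {
    return null;
  }
  if (confidence < 0 || confidence > 1) return null;
  return { ruleId: rule.id, matched, confidence, reason };
}
```

Adding a new detection scenario in this design means adding one more `DetectionRule` record; no branching logic changes.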

Notification System

Real-time alerts via push notification with frame captures. Intelligent deduplication prevents alert fatigue: the same event does not trigger multiple alerts.

Technical Details

Hardware

Any modern Android device as camera hardware

Capture App

Custom Android app (Kotlin) for capture and transmission

Vision Model

GPT-4V / Gemini Vision for scene understanding (newer deployments use Claude's vision capabilities)

Alert Rules

Defined in plain language, no code required

Deduplication

Event deduplication with configurable suppression windows

Storage

Secure frame storage with retention policies
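As a rough illustration of what a retention policy for stored frames might look like, the TypeScript sketch below selects frames that have aged out. The two-tier policy (routine frames versus frames attached to an alert), the `StoredFrame` and `RetentionPolicy` shapes, and the `framesToDelete` function are assumptions made for this example; the actual retention rules are not described here.

```typescript
// Hypothetical metadata row for a stored frame.
interface StoredFrame {
  id: string;
  capturedAtMs: number;   // capture time, epoch milliseconds
  linkedToAlert: boolean; // assumed: frames attached to an alert are kept longer
}

// Hypothetical policy: retention periods in days for each tier.
interface RetentionPolicy {
  routineFrameDays: number;
  alertFrameDays: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the frames whose age exceeds the retention period for their tier.
function framesToDelete(
  frames: StoredFrame[],
  policy: RetentionPolicy,
  nowMs: number,
): StoredFrame[] {
  return frames.filter((frame) => {
    const keepDays = frame.linkedToAlert ? policy.alertFrameDays : policy.routineFrameDays;
    return nowMs - frame.capturedAtMs > keepDays * DAY_MS;
  });
}
```

With frame metadata in a database, the same rule would more likely run as a scheduled delete query than as an in-memory filter; the function form is used here only to keep the logic visible.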

Key Takeaways

Natural language rules are the killer feature

Traditional security systems require days of configuration work to define a new detection scenario. Natural language rules — expressed as prompts to the vision model — take minutes. This changes the economics of custom security monitoring fundamentally.

Deduplication is critical to adoption

Alert fatigue kills security monitoring systems. We spent significant engineering time on the deduplication layer — suppression windows, event clustering, confidence thresholds. A system that pages 50 times per hour for the same event will be ignored within a week.
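Event clustering can be sketched as grouping repeated detections of the same rule on the same camera into one event whenever they arrive close together in time. The TypeScript below is a simplified illustration under that assumption; `TimedDetection`, `EventCluster`, `clusterDetections`, and the `maxGapMs` parameter are hypothetical names, and a production layer would likely consider more than timing alone.

```typescript
// Hypothetical detection record produced by the vision pipeline.
interface TimedDetection {
  ruleId: string;
  cameraId: string;
  timestampMs: number; // capture time, epoch milliseconds
  confidence: number;  // assumed to be in [0, 1]
}

// One cluster represents one real-world event, however many frames matched.
interface EventCluster {
  ruleId: string;
  cameraId: string;
  startMs: number;
  endMs: number;
  count: number;          // how many detections were merged
  peakConfidence: number; // highest confidence seen in the cluster
}

// Merges detections for the same rule and camera when the gap between
// consecutive detections is at most maxGapMs.
function clusterDetections(detections: TimedDetection[], maxGapMs: number): EventCluster[] {
  const sorted = [...detections].sort((a, b) => a.timestampMs - b.timestampMs);
  const openClusters = new Map<string, EventCluster>();
  const clusters: EventCluster[] = [];

  for (const d of sorted) {
    const key = `${d.ruleId}:${d.cameraId}`;
    const current = openClusters.get(key);
    if (current !== undefined && d.timestampMs - current.endMs <= maxGapMs) {
      // Close enough to the previous detection: extend the existing event.
      current.endMs = d.timestampMs;
      current.count += 1;
      current.peakConfidence = Math.max(current.peakConfidence, d.confidence);
    } else {
      // First detection, or the gap was too long: start a new event.
      const fresh: EventCluster = {
        ruleId: d.ruleId,
        cameraId: d.cameraId,
        startMs: d.timestampMs,
        endMs: d.timestampMs,
        count: 1,
        peakConfidence: d.confidence,
      };
      openClusters.set(key, fresh);
      clusters.push(fresh);
    }
  }
  return clusters;
}
```

Under this sketch, 50 detections of one rule spread evenly across an hour (one every 72 seconds) collapse into a single event when `maxGapMs` is set to two minutes, so the operator is paged once instead of 50 times.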

Commodity hardware changes the cost model

Enterprise security hardware is expensive, proprietary, and locked into vendor ecosystems. Consumer Android devices cost one-tenth the price, are available everywhere, and can be replaced within 24 hours. The cost of deploying intelligent monitoring at a new location dropped from tens of thousands of dollars to under $500.

Technology Stack

Android, Kotlin, GPT-4V, Gemini Vision, Claude Vision, Node.js, PostgreSQL, FCM

Key Advantage

Custom detection scenarios are defined in natural language with no code required, so rules that used to take days to configure now take minutes.

Need vision AI?

Vision AI monitoring applies far beyond security — quality control, customer flow, compliance verification. Let's talk about your use case.
