June 2025 - March 2026

Project

Graph Investigation AI

Introduction

Data Intelligence

Built an AI investigation workspace that turns CDR/IPDR data into graph evidence, multi-hop links, and analyst-ready insights.

Case Study

Investigation Overhead

Law-enforcement investigations often rely on large Excel/SQL-style datasets containing CDR, IPDR, tower, IP, location, and banking records. The challenge is not storage alone, but relationship discovery under scale, time pressure, and high error cost. Manual joins, filters, and pivots are slow, technically demanding, and fragile. This platform was built to replace static querying with intelligence-driven graph reasoning: it converts heterogeneous evidence into Neo4j, lets investigators ask natural-language questions, and uses an orchestrated AI pipeline to infer direct, indirect, and temporal links across entities.

Solution

1. AI Orchestration Pipeline

Input query > intent detection > schema-grounded Cypher generation > bounded graph retrieval > candidate strategy evaluation > multi-LLM reasoning > final synthesis. The orchestrator uses history-aware context, intent classification, fallback strategies, and safety constraints such as hop limits and temporal normalization. Because it can switch strategies when a query returns nothing useful, the system adapts to the question instead of running a single fixed retrieval.
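The first stages of the flow above (intent detection, hop limiting, strategy fallback) can be sketched roughly as follows. This is an illustrative sketch only: the keyword classifier, the MAX_HOPS value, the strategy ordering, and every function name are assumptions, not the project's actual code.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_HOPS = 4  # assumed safety cap; the real limit is not stated in the source


@dataclass
class QueryPlan:
    intent: str
    max_hops: int
    strategies: list[str] = field(default_factory=list)


def detect_intent(question: str) -> str:
    """Toy keyword classifier standing in for the real intent model."""
    q = question.lower()
    if "between" in q or "connect" in q or "link" in q:
        return "link_discovery"
    if "when" in q or "same time" in q:
        return "temporal_correlation"
    return "entity_lookup"


def build_plan(question: str, requested_hops: int = 2) -> QueryPlan:
    """Pick an ordered list of strategies and clamp the hop count."""
    intent = detect_intent(question)
    hops = max(1, min(requested_hops, MAX_HOPS))  # enforce the hop limit
    order = {
        "link_discovery": ["direct_edge", "known_intermediate", "bounded_multi_hop"],
        "temporal_correlation": ["timestamp_correlation"],
        "entity_lookup": ["direct_edge"],
    }[intent]
    return QueryPlan(intent, hops, order)


def run_with_fallback(
    plan: QueryPlan,
    retrievers: dict[str, Callable[[int], list[dict]]],
) -> tuple[Optional[str], list[dict]]:
    """Try each strategy in order; return the first one that yields rows."""
    for name in plan.strategies:
        rows = retrievers[name](plan.max_hops)
        if rows:
            return name, rows
    return None, []
```

Each retriever here is a plain callable that takes the hop limit and returns rows, so cheap strategies run first and the broader multi-hop search is reached only when they come back empty.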

2. Neo4j Batch Uploads

Batch ingestion uses UNWIND to load large evidence files efficiently into Neo4j. This is necessary because investigative datasets contain repeated entities, many-to-many links, and time-bound events. Sending rows in batches through UNWIND avoids issuing one write query per record, and merging nodes and relationships on their keys prevents duplicate entities, which preserves graph integrity during high-volume ingestion.
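A minimal sketch of this batching pattern, assuming a hypothetical schema of Subscriber nodes keyed by msisdn and CALLED relationships keyed by start time. The labels, property names, batch size, and helper names are illustrative and not taken from the project. The session argument is expected to behave like a Neo4j Python driver session, whose run method accepts query parameters as keyword arguments.

```python
from typing import Any, Iterable, Iterator

# Hypothetical schema: (:Subscriber {msisdn})-[:CALLED {start_time}]->(:Subscriber)
CALL_BATCH_QUERY = """
UNWIND $rows AS row
MERGE (a:Subscriber {msisdn: row.caller})
MERGE (b:Subscriber {msisdn: row.callee})
MERGE (a)-[c:CALLED {start_time: row.start_time}]->(b)
SET c.duration_s = row.duration_s
"""


def chunked(rows: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield lists of at most `size` rows."""
    batch: list[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def upload_calls(session: Any, rows: Iterable[dict], batch_size: int = 1000) -> int:
    """Send one UNWIND query per batch instead of one query per record."""
    total = 0
    for batch in chunked(rows, batch_size):
        session.run(CALL_BATCH_QUERY, rows=batch)
        total += len(batch)
    return total
```

MERGE on the business key is what keeps a repeated caller from becoming a second node; in practice a uniqueness constraint on that key would usually accompany it.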

3. LLM Reasoning and Synthesis

LLMs are used as evidence interpreters, not truth sources. Raw facts come from Neo4j; the model reasons over the retrieved paths, timestamps, and entity links to produce analyst-readable conclusions. This reduces hallucination risk and supports explanations for hidden links, co-location, session tracing, and multi-hop correlation.
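One way to keep the model in the interpreter role is to hand it only the facts retrieved from the graph and ask it to cite them by ID. The sketch below assumes a hypothetical Fact record and prompt wording; neither comes from the project, and the actual LLM call is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One retrieved graph edge (hypothetical shape)."""
    fact_id: str
    source: str
    relation: str
    target: str
    timestamp: str


def build_prompt(question: str, facts: list[Fact]) -> Optional[str]:
    """Build a prompt that restricts the model to the retrieved evidence."""
    if not facts:
        # Nothing was retrieved: skip the LLM rather than invite a guess.
        return None
    lines = [
        f"[{f.fact_id}] {f.source} -{f.relation}-> {f.target} at {f.timestamp}"
        for f in facts
    ]
    return (
        "Answer using ONLY the evidence below. Cite fact IDs for every claim.\n"
        "If the evidence is insufficient, say so.\n\n"
        "Evidence:\n" + "\n".join(lines) + "\n\n"
        f"Question: {question}"
    )
```

Returning None when no facts were retrieved lets the caller report "no evidence found" directly, so the model is never asked to answer from nothing.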

4. Query Generation Strategies

Implemented strategies include direct-edge lookup, known-intermediate traversal, bounded multi-hop fallback, shortest-path search, all-path search, timestamp correlation, and merged CE/PE evidence flows. The system prioritizes schema-specific routes, uses business keys such as msisdn, imei, cell_id, ip, and session_id, and avoids unsafe or semantically incorrect query patterns.
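Three of these strategies (direct-edge lookup, bounded multi-hop, shortest path) might be templated as in the sketch below. The node labels mapped to each business key, the MAX_HOPS cap, and the LIMIT values are assumptions for illustration. Cypher does not accept a parameter for variable-length path bounds, so the hop count is validated and clamped before it is interpolated, while the entity values stay as query parameters.

```python
# Hypothetical mapping from business key to node label.
ALLOWED_KEYS = {
    "msisdn": "Subscriber",
    "imei": "Device",
    "cell_id": "Tower",
    "ip": "IPAddress",
    "session_id": "Session",
}
MAX_HOPS = 4  # assumed cap


def _label(key: str) -> str:
    """Reject any key that is not a known business key."""
    if key not in ALLOWED_KEYS:
        raise ValueError(f"unsupported key: {key!r}")
    return ALLOWED_KEYS[key]


def _clamp(hops: int) -> int:
    return max(1, min(int(hops), MAX_HOPS))


def direct_edge_query(key: str) -> str:
    label = _label(key)
    return (
        f"MATCH (a:{label} {{{key}: $src}})-[r]-(b:{label} {{{key}: $dst}}) "
        "RETURN a, r, b LIMIT 50"
    )


def bounded_path_query(key: str, hops: int) -> str:
    label = _label(key)
    return (
        f"MATCH p = (a:{label} {{{key}: $src}})-[*1..{_clamp(hops)}]-"
        f"(b:{label} {{{key}: $dst}}) RETURN p LIMIT 25"
    )


def shortest_path_query(key: str, hops: int) -> str:
    label = _label(key)
    return (
        f"MATCH (a:{label} {{{key}: $src}}), (b:{label} {{{key}: $dst}}) "
        f"MATCH p = shortestPath((a)-[*..{_clamp(hops)}]-(b)) RETURN p"
    )
```

For example, bounded_path_query("msisdn", 99) produces a pattern bounded at [*1..4], and an unknown key raises ValueError before any query text is built.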

Legacy MAP

Conversion

Their AI writes you a subject line

Conversion Agents draft the campaign, QA it, send it

Workflows fire on if/then rules you wrote a year ago

Agents reason about each lead with the context that exists right now

Every request starts from a blank brief

Every request lands with a drafted campaign attached

A broken UTM is discovered Monday morning, after the send

Campaign QA catches it before a single email leaves the queue

New use case = a workflow build, a QA cycle, and a sprint

New use case = describe it in chat, ship it in minutes

AI is a chat widget bolted onto a 15-year-old platform

Agents are the operating layer of the platform itself

Live Context

Knows your assets

Works where your team already works. Every chat and every scheduled run starts from the same shared context: your data, your brand, your stack.

CRM

Every object, campaign, and custom field reads the way your ops team already reads it.

Warehouse

Snowflake, BigQuery, and other warehouses surface usage, billing, and tables in plain English.

Brand

Voice, visual systems, approved patterns, and historical assets stay aligned with the team tone.

Semantic search

Find any audience, asset, or insight by intent, not by exact keyword or filter chain.

Pattern recognition

Surface anomalies, learn what is converting, and flag the moments worth your attention.

Connected tools

Slack, Asana, Salesforce, and your warehouse all stay in the same working context.

Every time we dig into the platform, we find another use case worth building. It's not just solving the priorities I came in with, it's expanding what we can reasonably take on this year.

Jason Ginsberg

Head of Marketing, GovWell

75%

Reduction in MOps time spent on recurring work

10x

Faster campaign-to-launch cycle time

4wk

Avg. migration time using agents