About
The Scaffold for Your Knowledge Graph
Graffold isn't a demo or a wrapper around an API. It's a production-grade platform — built to ingest at scale, consolidate intelligently, and answer with provenance.
☁️ Cloud
Managed databases and hosted LLMs. Deploy with AWS CDK or Docker Compose — scales to millions of entities.
🏠 Local
Full stack on your machine. No external API calls. Air-gapped deployments for sensitive data. One docker compose up.
🔀 Hybrid
Mix and match. Local database with a cloud LLM. Remote graph with a local model. Swap any component independently — no lock-in at any layer.
Core Architecture
Database Agnostic
Not locked to one vendor. Adapter pattern supports multiple backends simultaneously.
LLM Provider Agnostic
Swap providers without changing your pipeline. Cost tracking per session and operation.
Multi-Source Ingestion
Any source with an API plugs in as a connector. The PDF pipeline uses multi-strategy extraction — vision-based OCR, Nougat for scientific papers, PyMuPDF for fast text — with parallel processing for batch runs of thousands of documents.
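The multi-strategy extraction can be pictured as a fallback chain: try the cheap pass first, escalate only when it fails. The functions below are stand-ins for the real extractors (PyMuPDF, Nougat, vision OCR), and the file-naming trigger is purely illustrative:

```python
from typing import Callable, Optional

def fast_text(path: str) -> Optional[str]:
    # Stand-in for a PyMuPDF text pass; returns None for image-only pages.
    return None if path.endswith("scan.pdf") else "plain text layer"

def scientific_ocr(path: str) -> Optional[str]:
    # Stand-in for Nougat, which targets scanned scientific papers.
    return "nougat markdown" if path.endswith("scan.pdf") else None

def vision_ocr(path: str) -> Optional[str]:
    # Last-resort stand-in for a vision-model OCR call.
    return "vision ocr text"

STRATEGIES: list[Callable[[str], Optional[str]]] = [fast_text, scientific_ocr, vision_ocr]

def extract(path: str) -> str:
    # Cheapest strategy first; fall through until one succeeds.
    for strategy in STRATEGIES:
        text = strategy(path)
        if text is not None:
            return text
    raise ValueError(f"no extraction strategy handled {path}")
```

For batch runs, `extract` is trivially parallelizable since each document is independent.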
Provenance & Temporal Tracking
Every entity and relationship traces to its source. Temporal validity windows, contradiction detection, and confidence decay over time.
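Confidence decay can follow many curves; an exponential half-life is one common choice, sketched here as an assumption (the 365-day half-life is illustrative, not Graffold's setting):

```python
def decayed_confidence(base: float, age_days: float, half_life_days: float = 365.0) -> float:
    # Confidence halves every half_life_days since the evidence was ingested.
    return base * 0.5 ** (age_days / half_life_days)
```

Under this model a claim extracted a year ago with confidence 0.8 surfaces at 0.4, nudging the query layer toward fresher evidence when sources conflict.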
Query Intelligence
Two-Phase Agent
Discovery phase finds relevant entities. Expansion phase traverses neighborhoods iteratively. Configurable search depth — fast, balanced, or deep.
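The two phases can be sketched over a toy adjacency map. In the real system, discovery would use vector or full-text search rather than substring matching, and the graph lives in a database:

```python
GRAPH = {  # toy adjacency map standing in for the knowledge graph
    "IL-6": ["STAT3", "IL6R"],
    "STAT3": ["JAK2"],
    "IL6R": [],
    "JAK2": [],
}

def discover(query: str) -> set[str]:
    # Phase 1: find seed entities relevant to the query.
    return {name for name in GRAPH if name.lower() in query.lower()}

def expand(seeds: set[str], depth: int) -> set[str]:
    # Phase 2: iterative neighborhood traversal, up to `depth` rounds.
    found, frontier = set(seeds), set(seeds)
    for _ in range(depth):
        frontier = {n for e in frontier for n in GRAPH.get(e, [])} - found
        if not frontier:
            break
        found |= frontier
    return found
```

The fast/balanced/deep presets would map to different `depth` values (and, presumably, different frontier-size caps).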
Query Mode Router
Auto-classifies queries into naive, local, global, or hybrid modes. Local searches entity neighborhoods. Global uses community summaries. Hybrid merges both.
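A toy router makes the four-way split concrete. This keyword heuristic is a stand-in; the real classifier presumably uses an LLM, and the keyword lists are invented:

```python
def route(query: str) -> str:
    # Broad phrasing -> global (community summaries); focused phrasing ->
    # local (entity neighborhoods); both -> hybrid; neither -> naive.
    q = query.lower()
    broad = any(w in q for w in ("overall", "themes", "summariz", "landscape"))
    focused = any(w in q for w in ("which", "who", "interact", "bind"))
    if broad and focused:
        return "hybrid"
    if broad:
        return "global"
    if focused:
        return "local"
    return "naive"
```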
Entity Disambiguation
Resolves ambiguous mentions using canonical IDs, synonym dictionaries, and context. Multi-strategy: exact match → synonym → fuzzy → LLM fallback.
Community Detection
Louvain algorithm identifies entity clusters. LLM generates summaries per community. Global queries aggregate across communities for broad answers.
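The clustering step can be sketched with union-find connected components as a stand-in for Louvain (Louvain additionally splits components by optimizing modularity; in practice you would call something like `networkx.community.louvain_communities`). Each resulting cluster is what the LLM would then summarize:

```python
from collections import defaultdict

EDGES = [("IL-6", "STAT3"), ("STAT3", "JAK2"), ("TP53", "MDM2")]

def communities(edges: list[tuple[str, str]]) -> list[set[str]]:
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)            # union the two components

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=len, reverse=True)
```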
KNN Expansion
Expand results with k-nearest neighbors to discover connected entities beyond the initial query. Iterative rounds with configurable depth.
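A minimal KNN sketch over toy 2-d embeddings; a production system would use a vector index rather than a full scan, and these vectors are invented for illustration:

```python
import math

EMBEDDINGS = {  # toy 2-d vectors; real ones come from an embedding model
    "IL-6":  [1.0, 0.1],
    "STAT3": [0.9, 0.2],
    "TP53":  [0.0, 1.0],
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def knn(entity: str, k: int = 1) -> list[str]:
    # Rank every other entity by embedding similarity to `entity`.
    ranked = sorted(
        (name for name in EMBEDDINGS if name != entity),
        key=lambda name: cosine(EMBEDDINGS[entity], EMBEDDINGS[name]),
        reverse=True,
    )
    return ranked[:k]
```

Iterative rounds would re-run `knn` on each newly found entity, bounded by the configured depth.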
SSE Streaming
Real-time token-by-token streaming via Server-Sent Events. Retrieval progress, synthesis tokens, and final citations delivered as they're generated.
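The wire format is plain SSE frames. Event names below (`progress`, `token`, `citations`) are illustrative, not necessarily Graffold's actual protocol:

```python
import json

def sse_event(event: str, data: dict) -> str:
    # One Server-Sent Events frame per the WHATWG EventSource wire format.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def stream(tokens):
    # Retrieval progress first, then synthesis tokens, then final citations.
    yield sse_event("progress", {"stage": "retrieval", "done": True})
    for tok in tokens:
        yield sse_event("token", {"text": tok})
    yield sse_event("citations", {"sources": ["doc-1"]})
```

Any ASGI server can send these frames as a streaming response with `Content-Type: text/event-stream`.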
Performance & Operations
Rust-Accelerated Hot Paths
Critical paths use Rust-backed libraries for measurable throughput gains with zero application-code rewrites.
- ▸ orjson — 3-10× faster JSON serialization
- ▸ Granian — 2-4× HTTP throughput (Rust ASGI)
- ▸ neo4j-rust-ext — up to 10× faster result deserialization for large result sets
- ▸ Polars — 2-5× faster DataFrames
- ▸ tiktoken — native Rust BPE tokenizer
- ▸ hiredis — 10× Redis parse speed
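"Zero application-code rewrites" typically means a drop-in swap behind one call site. A sketch of that pattern for orjson, with a stdlib fallback so it runs even where orjson isn't installed:

```python
try:
    import orjson

    def dumps(obj) -> bytes:
        return orjson.dumps(obj)              # Rust-backed serializer
except ImportError:
    import json

    def dumps(obj) -> bytes:                  # stdlib fallback, same call site
        return json.dumps(obj, separators=(",", ":")).encode()
```

Application code calls `dumps` everywhere; whether the Rust path is active is decided once, at import time.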
Full Observability
Pre-built Grafana dashboards, Prometheus metrics, structured audit logging, and cost tracking across every LLM call.
- ▸ 5 Grafana dashboards (eval, API, LLM costs, E2E, infra)
- ▸ Regression alerts on eval metric drops
- ▸ JSON-line audit log for every query and pipeline run
- ▸ Per-session LLM cost tracking by provider
- ▸ Pipeline progress with ETA and callbacks
- ▸ Health/readiness/liveness probes
Entity Consolidation Engine
Multi-strategy deduplication that runs across the entire graph after ingestion.
- ▸ Canonical ID matching (UniProt, MONDO, custom)
- ▸ Gene symbol and synonym resolution
- ▸ Fuzzy name matching for typos and abbreviations
- ▸ Relationship consolidation preserving all evidence
- ▸ Incremental consolidation for weekly updates
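The evidence-preserving merge can be sketched as a greedy fuzzy pass (the real engine layers canonical-ID and synonym passes before falling back to name similarity; the 0.9 cutoff and sample entities are invented):

```python
import difflib

ENTITIES = [
    {"id": "e1", "name": "tumor protein p53", "evidence": ["doc-1"]},
    {"id": "e2", "name": "tumour protein p53", "evidence": ["doc-2"]},
    {"id": "e3", "name": "MDM2", "evidence": ["doc-3"]},
]

def consolidate(entities: list[dict], cutoff: float = 0.9) -> list[dict]:
    # Merge near-duplicate names into one node, concatenating all evidence.
    merged: list[dict] = []
    for ent in entities:
        for target in merged:
            ratio = difflib.SequenceMatcher(
                None, ent["name"].lower(), target["name"].lower()
            ).ratio()
            if ratio >= cutoff:
                target["evidence"] += ent["evidence"]   # keep all provenance
                break
        else:
            merged.append({**ent, "evidence": list(ent["evidence"])})
    return merged
```

Note that the duplicate's evidence list survives the merge, so provenance queries still reach both source documents.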
Production Deployment
Docker Compose for dev, AWS CDK for production. Redis for distributed caching. Rate limiting, auth, and security headers built in.
- ▸ 7 CDK stacks (Neo4j, Neptune, Bedrock, Cognito, etc.)
- ▸ Docker Compose with dev/monitoring/ingest profiles
- ▸ Redis distributed session management
- ▸ Bearer token auth + rate limiting
- ▸ Parallel ingestion with tunnel keepalive
Collaboration
Session Sharing
Share entire query sessions with teammates — view-only for review, or editable for collaborative exploration. All via the Org Panel.
Branching Threads
Fork a new thread from any answer. Explore tangents without losing the original conversation. Non-linear exploration by design.
Context-Aware Routing
User role, team, and dataset context become graph nodes. "Show me what the proteomics team knows about IL-6" returns different results than the clinical team's view.
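At its simplest, modeling team context as graph nodes means retrieval filters along a team edge. A toy sketch with invented facts and field names:

```python
FACTS = [  # each fact node linked to the team that ingested it
    {"text": "IL-6 drives STAT3 signaling", "team": "proteomics"},
    {"text": "IL-6 is elevated in sepsis cohorts", "team": "clinical"},
]

def team_view(entity: str, team: str) -> list[str]:
    # Filter facts by team edge; a stand-in for routing through role/team
    # context nodes during graph traversal.
    return [f["text"] for f in FACTS if entity in f["text"] and f["team"] == team]
```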
Why We Built This
Critical knowledge is buried across millions of documents, inaccessible to the people who need it most. Graffold consolidates this fragmented information into a queryable graph so teams can discover connections in seconds instead of months.