Architecture

How Graffold Works

A multi-source ingestion pipeline transforms unstructured documents into a consolidated knowledge graph — queryable via hybrid vector + graph retrieval with full provenance.

Ingestion Pipeline

The ingestion pipeline moves documents through four stages:

  1. Data Sources: documents & PDFs, APIs & feeds, open-access archives, structured data (CSV / Excel / API)
  2. Processing: token-aware chunking, LLM entity extraction, multi-pass gleaning, entity consolidation, relationship consolidation
  3. Enrichment: canonical ID mapping, ontology alignment, taxonomy hierarchies, vector embeddings, community detection
  4. Knowledge Graph: a graph database holding entities, relations, and evidence
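Processing starts with token-aware chunking: whole sentences are packed into chunks that stay under a token budget, so no sentence is split mid-thought. A minimal sketch, assuming a pluggable token counter; the whitespace-based counter and function names below are illustrative stand-ins for the real BPE tokenizer:

```python
import re

def count_tokens(text: str) -> int:
    # Stand-in for a real BPE tokenizer (e.g. tiktoken); counts whitespace tokens.
    return len(text.split())

def chunk_sentences(text: str, max_tokens: int = 100) -> list[str]:
    """Pack whole sentences into chunks that fit within a token budget."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sent in sentences:
        n = count_tokens(sent)
        # Flush the current chunk when adding this sentence would overflow it.
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sent)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunks break only at sentence boundaries, a single sentence longer than the budget still becomes its own (oversized) chunk rather than being truncated.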

Query & Retrieval Architecture

A query flows from the user, through the agent layer, to the retrieval backends and back:

  1. User: natural-language questions submitted via the REST API, with answers streamed back over SSE
  2. LLM Agent Layer: query mode router (local / global / hybrid), two-phase discovery + expansion, entity disambiguation, Cypher generation, and synthesis with citations, backed by an LLM provider (Bedrock / Ollama / OpenAI)
  3. Retrieval: vector search, fulltext search, graph traversal, and community summaries over the data sources
  4. Storage backends: Neo4j, Amazon Neptune, Kuzu, DuckDB
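The mode router decides whether a question is best answered from a local graph neighborhood, from corpus-wide community summaries, or both. A heuristic sketch of such a router; the cue lists and function name are hypothetical, not Graffold's actual routing logic:

```python
def route_query(query: str) -> str:
    """Pick a retrieval mode for a question (heuristic sketch).

    local  -> entity-centric questions answered from a graph neighborhood
    global -> corpus-wide questions answered from community summaries
    hybrid -> everything else: combine both strategies
    """
    q = query.lower()
    # Corpus-wide phrasing suggests community summaries.
    global_cues = ("overall", "themes", "across", "summar", "trend")
    # Entity lookups suggest a local graph neighborhood.
    local_cues = ("who", "what is", "when did", "where", "which")
    if any(cue in q for cue in global_cues):
        return "global"
    if any(q.startswith(cue) or f" {cue}" in q for cue in local_cues):
        return "local"
    return "hybrid"
```

A production router would more likely ask the LLM to classify the query, but the contract is the same: one of three mode strings handed to the retrieval layer.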

Integrations

Graph Databases

  • Neo4j
  • Amazon Neptune
  • Kuzu (embedded)
  • DuckDB (analytics)

LLM Providers

  • AWS Bedrock
  • AWS SageMaker
  • Ollama (local / air-gapped)
  • OpenAI-compatible APIs

Data Sources

  • Any source with an API
  • PDF files (vision + OCR)
  • CSV / Excel / Parquet
  • PubMed / bioRxiv (built-in)

Infrastructure

  • Redis (distributed cache)
  • Grafana + Prometheus
  • Docker Compose / AWS CDK
  • HuggingFace embeddings

Performance Stack

  • Rust-accelerated JSON: 3-10× faster serialization
  • Rust ASGI server: 2-4× request throughput
  • Native graph driver: up to 10× faster for large result sets
  • Rust-native DataFrames: 2-5× faster data processing
  • Token-aware chunking: BPE tokenizer with sentence boundaries
  • Hybrid retrieval: vector + fulltext + graph traversal
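Hybrid retrieval needs a way to merge ranked results from retrievers whose scores are not comparable (cosine similarity vs BM25 vs traversal depth). One common technique is Reciprocal Rank Fusion, which combines rank positions instead of raw scores; a sketch under that assumption (the source does not say which fusion method Graffold uses):

```python
def rrf_fuse(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Fuse best-first ID lists from several retrievers via Reciprocal Rank Fusion.

    rankings maps a retriever name (e.g. "vector", "fulltext", "graph")
    to its ranked list of document/entity IDs.
    """
    scores: dict[str, float] = {}
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each retriever contributes 1/(k + rank); k damps the head of the list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Rank-based fusion rewards items that several retrievers agree on, which is exactly the behavior you want when vector, fulltext, and graph signals disagree on absolute scores.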

Performance Benchmarks

Measured speedups from Rust- and C-backed drop-in replacements across the stack. Zero application-code rewrites required.

  • 3–10× JSON serialization: Rust-backed encoder vs stdlib
  • 2–4× HTTP throughput: Rust ASGI server vs Python default
  • Up to 10× graph query results: native driver extensions for large result sets
  • 2–5× DataFrame operations: Rust-native DataFrames vs legacy libraries
  • ~10× cache parsing: C-backed parser vs pure-Python
  • 250 ms P50 query latency: hybrid vector + graph retrieval
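The "drop-in, zero rewrites" claim works because the replacement keeps the stdlib's call shape behind a thin shim. A sketch using orjson, a widely used Rust-backed encoder (the source does not name the library, so orjson is an assumption here), with a stdlib fallback so call sites never change:

```python
import json

# orjson is one widely used Rust-backed JSON encoder with a near
# drop-in API. Fall back to the stdlib when it isn't installed;
# either way, callers just use dumps(obj) -> str.
try:
    import orjson

    def dumps(obj) -> str:
        # orjson returns bytes and emits compact output by default.
        return orjson.dumps(obj).decode()
except ImportError:
    def dumps(obj) -> str:
        # Match orjson's compact separators for byte-identical output.
        return json.dumps(obj, separators=(",", ":"))
```

The same pattern (shim module, identical signatures) applies to the ASGI server, DataFrame library, and cache parser swaps.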

Processing Throughput

  • 1,000+ documents per pipeline run
  • Multi-source: parallel ingestion across data feeds
  • 47 → 1 duplicate edges consolidated per entity pair
  • < 15 min incremental update for ~100 new documents
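The 47 → 1 figure comes from relationship consolidation: many extracted edges between the same entity pair collapse into one edge whose evidence list preserves provenance. A minimal sketch; the field names (`src`, `dst`, `type`, `evidence`) are illustrative, not Graffold's actual schema:

```python
def consolidate_edges(edges: list[dict]) -> list[dict]:
    """Collapse duplicate edges between the same entity pair into one,
    merging their evidence so provenance survives consolidation."""
    merged: dict[tuple, dict] = {}
    for e in edges:
        # Edges are duplicates when source, target, and relation type match.
        key = (e["src"], e["dst"], e["type"])
        if key not in merged:
            merged[key] = {"src": e["src"], "dst": e["dst"],
                           "type": e["type"], "evidence": []}
        merged[key]["evidence"].extend(e["evidence"])
    return list(merged.values())
```

Keying on (source, target, type) means distinct relation types between the same pair survive as separate edges, while repeated extractions of the same claim merge into one well-evidenced edge.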