Architecture
How Graffold Works
A split database architecture (Memgraph for graph traversals, Cloudflare Vectorize for semantic search) with multi-source ingestion and hybrid retrieval. Full provenance on every answer.
Ingestion Pipeline
Query & Retrieval Architecture
Integrations
Graph & Vector Layer
- Memgraph + MAGE
- Cloudflare Vectorize
- Neo4j · FalkorDB
LLM Providers
- AWS Bedrock (Claude, Titan)
- AWS SageMaker
- Cloudflare Workers AI
- Ollama (local / air-gapped)
- OpenAI-compatible APIs
Data Sources
- Any source with an API
- PDF files (vision + OCR)
- CSV / Excel / Parquet
- PubMed / bioRxiv (built-in)
Infrastructure
- Redis (cache + sessions)
- OpenTelemetry + Grafana
- Docker (per-tenant isolation)
- HuggingFace embeddings (768d)
Split Database Architecture
Graph storage and vector search are decoupled, so each layer scales independently.
Memgraph: Graph Layer
- In-memory graph with sub-millisecond traversals
- MAGE algorithms (PageRank, community detection, shortest path)
- openCypher queries over Bolt protocol
- Dedicated container per tenant (256 MB–1 GB)
- Bolt-compatible: same driver code works with Neo4j
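As a rough illustration of the openCypher-over-Bolt bullets above, the sketch below builds a parameterized neighborhood query and shows how it might be sent with the third-party `neo4j` Python driver, which speaks Bolt. The `Entity` label, the `name` property, both function names, and the idea of passing a per-tenant URI are hypothetical; the page does not show Graffold's actual schema or driver code.

```python
from typing import Any


def neighborhood_query(name: str, max_hops: int = 2, limit: int = 25) -> tuple[str, dict[str, Any]]:
    """Build an openCypher query for entities within max_hops of a named node."""
    # Hop bounds cannot be supplied as query parameters, so validate the
    # integer strictly and inline it; everything else stays parameterized.
    if type(max_hops) is not int or not 1 <= max_hops <= 5:
        raise ValueError("max_hops must be an integer between 1 and 5")
    query = (
        "MATCH (a:Entity {name: $name})-[*1.." + str(max_hops) + "]-(b:Entity) "
        "WHERE b <> a "
        "RETURN DISTINCT b.name AS name LIMIT $limit"
    )
    return query, {"name": name, "limit": limit}


def run_over_bolt(uri: str, auth: tuple[str, str], query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
    """Send a query over Bolt. Requires the third-party `neo4j` package."""
    from neo4j import GraphDatabase  # imported lazily so the builder works without it

    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(query, params)]
```

With one container per tenant, a caller would presumably look up the tenant's Bolt URI first and pass it to `run_over_bolt`; that lookup is left out here because the page does not describe it.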
Cloudflare Vectorize: Vector Layer
- 768-dimensional embeddings (all-distilroberta-v1)
- Cosine similarity search at the edge
- Managed scaling with no infrastructure to maintain
- Decoupled from graph memory
- Metadata-enriched vectors for result context
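To make the cosine-similarity and metadata bullets concrete, here is a small local sketch of what a vector lookup computes. It is not the Vectorize API: `VectorRecord` and `top_k` are hypothetical names, and the toy vectors used to try it out can have 3 dimensions instead of 768.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorRecord:
    id: str
    values: list[float]
    metadata: dict = field(default_factory=dict)  # travels with the vector


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], records: list[VectorRecord], k: int = 3) -> list[dict]:
    """Score every record against the query and return the k best with metadata."""
    scored = [(cosine(query, r.values), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [{"id": r.id, "score": s, "metadata": r.metadata} for s, r in scored[:k]]
```

Returning the metadata alongside each score is what lets a caller show result context (source document, section, and so on) without a second lookup in the graph.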
Alternative Backends
Neo4j: enterprise graph with native HNSW vector indexes and APOC/GDS algorithms
FalkorDB: Redis-native graph with named graph isolation, suited to lightweight deployments
Any Bolt-compatible database works via the DatabaseInterface abstraction
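The page names a `DatabaseInterface` abstraction but does not show it. One plausible shape, offered only as a sketch, is a small protocol that application code depends on, with a Bolt-backed implementation and an in-memory stand-in for tests. Every method and class name below other than `DatabaseInterface` is an assumption.

```python
from typing import Any, Protocol


class DatabaseInterface(Protocol):
    """Hypothetical contract: run a query, get rows back as dictionaries."""

    def run(self, query: str, params: dict[str, Any]) -> list[dict[str, Any]]: ...


class BoltDatabase:
    """Wraps any session object exposing run(query, params) that yields
    records with a .data() method, as Bolt driver sessions typically do."""

    def __init__(self, session: Any) -> None:
        self._session = session

    def run(self, query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
        return [record.data() for record in self._session.run(query, params)]


class InMemoryDatabase:
    """Test double: records each call and returns canned rows."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self.rows = rows
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def run(self, query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
        self.calls.append((query, params))
        return list(self.rows)


def count_nodes(db: DatabaseInterface) -> int:
    """Application code sees only the interface, never a concrete backend."""
    rows = db.run("MATCH (n) RETURN count(n) AS total", {})
    return rows[0]["total"]
```

Because `count_nodes` only relies on `run`, swapping Memgraph for another Bolt-compatible backend would mean constructing a different session, not changing the caller.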
Performance Stack
Rust-accelerated JSON
3–10× faster serialization
Rust ASGI Server
2–4× request throughput
Memgraph Bolt Driver
In-memory graph, sub-ms traversals
Rust-native DataFrames
2–5× faster for data processing
Token-aware Chunking
BPE tokenizer with sentence boundaries
Hybrid Retrieval
Vector + fulltext + graph traversal
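The token-aware chunking listed above can be sketched as packing whole sentences into chunks under a token budget. The regex sentence splitter and the whitespace-based token counter below are simplifying assumptions; a real BPE tokenizer would be passed in through the hypothetical `count_tokens` parameter.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_by_tokens(
    text: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> list[str]:
    """Greedily pack sentences into chunks of at most max_tokens tokens.

    Chunks only ever end on a sentence boundary. A single sentence longer
    than max_tokens is kept whole as its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sentence in split_sentences(text):
        n = count_tokens(sentence)
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Per-sentence counts are summed here, which is only an approximation for BPE tokenizers, since tokens can merge differently once sentences are joined; leaving a little headroom in `max_tokens` covers that.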
Performance Benchmarks
Measured speedups from Rust- and C-backed drop-in replacements across the stack. Zero application-code rewrites required.
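A drop-in replacement of this kind usually amounts to a single import seam. The sketch below assumes `orjson` as the Rust-backed encoder, which the page does not name, and falls back to the standard library when it is not installed; the `dumps` wrapper is hypothetical.

```python
import json

try:
    import orjson  # assumption: one possible Rust-backed encoder

    def dumps(obj) -> bytes:
        """Serialize with the accelerated encoder (orjson returns bytes)."""
        return orjson.dumps(obj)

except ImportError:

    def dumps(obj) -> bytes:
        """Stdlib fallback producing compact UTF-8 bytes like the fast path."""
        return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
```

Callers import `dumps` from one module and never learn which encoder is active, which is how a speedup can land without application-code rewrites.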
3–10×
JSON Serialization
Rust-backed encoder vs stdlib
2–4×
HTTP Throughput
Rust ASGI server vs Python default
<1 ms
Graph Traversal
In-memory Memgraph, sub-millisecond
2–5×
DataFrame Operations
Rust-native DataFrames vs legacy libraries
~10×
Cache Parsing
C-backed parser vs pure-Python
250 ms
P50 Query Latency
Hybrid vector + graph retrieval
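The page does not say how the hybrid retriever merges its vector, fulltext, and graph result lists. Reciprocal rank fusion (RRF) is one common technique for this, shown here purely as an assumption; the function name and the retriever labels are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: dict[str, list[str]], k: int = 60
) -> list[tuple[str, float, list[str]]]:
    """Fuse ranked ID lists from several retrievers into one ranking.

    Each appearance contributes 1 / (k + rank), with rank starting at 1.
    Returns (doc_id, fused_score, contributing_retrievers), best first.
    """
    scores: dict[str, float] = defaultdict(float)
    sources: dict[str, list[str]] = defaultdict(list)
    for retriever, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
            sources[doc_id].append(retriever)
    # Sort by descending score, breaking ties by ID for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return [(d, scores[d], sources[d]) for d in ordered]
```

Keeping the list of contributing retrievers per result is a cheap way to carry provenance through the fusion step, so an answer can report which signals surfaced each document.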
Processing Throughput
1,000+
Documents per pipeline run
Multi-source
Parallel ingestion across data feeds
47 → 1
Duplicate edges consolidated per entity pair
< 15 min
Incremental update for ~100 new documents
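The duplicate-edge consolidation figure above could come from a merge step along these lines. This is a sketch under stated assumptions: edges are plain dictionaries with hypothetical `source`, `target`, `relation`, and `doc_id` keys, and duplicates are defined as sharing the same entity pair and relation type.

```python
from dataclasses import dataclass, field


@dataclass
class MergedEdge:
    source: str
    target: str
    relation: str
    mentions: int = 0
    provenance: list[str] = field(default_factory=list)  # contributing documents


def consolidate_edges(edges: list[dict]) -> list[MergedEdge]:
    """Collapse repeated edges into one edge per (source, target, relation).

    The merged edge keeps a mention count and the distinct source documents,
    so provenance survives consolidation.
    """
    merged: dict[tuple[str, str, str], MergedEdge] = {}
    for edge in edges:
        key = (edge["source"], edge["target"], edge["relation"])
        if key not in merged:
            merged[key] = MergedEdge(*key)
        target_edge = merged[key]
        target_edge.mentions += 1
        if edge["doc_id"] not in target_edge.provenance:
            target_edge.provenance.append(edge["doc_id"])
    return list(merged.values())
```

The mention count is kept because it can double as an edge weight for graph algorithms, although whether Graffold uses it that way is not stated.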
Ontology Studio
Define your domain schema visually. The extraction pipeline adapts to your ontology, producing typed, validated knowledge graphs instead of generic entity soup.
Visual Schema Designer
Drag-and-drop node types, draw relationships, define properties. See your ontology as a live graph.
Guided Extraction
Your ontology constrains the LLM: extraction produces typed entities that match your schema, not generic blobs.
OWL / JSON-LD
Import existing ontologies from OWL or JSON-LD. Export for use in Protégé, Neo4j, or any RDF-compatible tool.
Available on Enterprise tier
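One way to picture ontology-guided extraction is as a validation pass over whatever the LLM returns. In this sketch the ontology format, the `Gene` and `Disease` types, and the `validate_extraction` function are all hypothetical; the page does not describe the real schema representation or how the constraint is enforced.

```python
# Hypothetical ontology: required properties per node type, plus the
# allowed (source type, relationship, target type) triples.
ONTOLOGY = {
    "node_types": {
        "Gene": {"symbol"},
        "Disease": {"name"},
    },
    "relationships": {
        ("Gene", "ASSOCIATED_WITH", "Disease"),
    },
}


def validate_extraction(entities: list[dict], relations: list[dict], ontology: dict = ONTOLOGY):
    """Keep only entities and relations that conform to the ontology."""
    valid_entities: dict[str, dict] = {}
    for entity in entities:
        required = ontology["node_types"].get(entity["type"])
        # Unknown types are dropped; known types need all required properties.
        if required is not None and required.issubset(entity["properties"]):
            valid_entities[entity["id"]] = entity

    valid_relations = []
    for relation in relations:
        source = valid_entities.get(relation["source"])
        target = valid_entities.get(relation["target"])
        if source is None or target is None:
            continue
        triple = (source["type"], relation["type"], target["type"])
        if triple in ontology["relationships"]:
            valid_relations.append(relation)
    return list(valid_entities.values()), valid_relations
```

A production pipeline would more likely feed the schema into the prompt as well and use a check like this as a backstop, but that split is a guess.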