Architecture

How Graffold Works

A split database architecture (Memgraph for graph traversals, Cloudflare Vectorize for semantic search) with multi-source ingestion and hybrid retrieval. Full provenance on every answer.

Ingestion Pipeline

Data Sources

  • Documents & PDFs
  • APIs & Feeds
  • Open Access Archives
  • Structured Data (CSV / Excel, API)

Processing

  • Token-aware Chunking
  • LLM Entity Extraction
  • Multi-pass Gleaning
  • Entity Consolidation
  • Relationship Consolidation

Enrichment

  • Canonical ID Mapping
  • Ontology Alignment
  • Taxonomy Hierarchies
  • Vector Embeddings
  • Community Detection

Knowledge Graph (Split Architecture)

  • Memgraph: Nodes · Relationships, MAGE Algorithms, openCypher · Bolt
  • Cloudflare Vectorize: 768d Embeddings · Cosine, Semantic Similarity
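To make the Token-aware Chunking stage concrete, here is a minimal sketch of packing whole sentences into chunks under a token budget. It is an illustration, not Graffold's implementation: `count_tokens` and `chunk_by_sentences` are hypothetical names, and the whitespace word count stands in for a real BPE tokenizer.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: a whitespace word count stands in for a real BPE tokenizer.
    return len(text.split())


def chunk_by_sentences(text: str, max_tokens: int) -> list[str]:
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Close the current chunk if this sentence would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        # A single sentence longer than the budget is kept whole, never cut.
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = chunk_by_sentences("One two three. Four five. Six seven eight nine.", max_tokens=5)
```

Because chunks only ever break between sentences, no entity mention is cut in half before it reaches the extraction step.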

Query & Retrieval Architecture

User

  • Input: Natural Language or API
  • Output: answer

LLM Agent Layer

  • REST API + SSE Streaming
  • Query Mode Router (local / global / hybrid)
  • Two-Phase Discovery + Expansion
  • Entity Disambiguation
  • Cypher Generation
  • Synthesis + Citations
  • LLM Provider (Bedrock / Cloudflare / Ollama)

Data Sources (receive queries, return results)

  • Vector Search
  • Fulltext Search
  • Graph Traversal
  • Community Summaries

Backends

  • Memgraph
  • CF Vectorize
  • Neo4j
  • FalkorDB

Integrations

Graph & Vector Layer

  • Memgraph + MAGE
  • Cloudflare Vectorize
  • Neo4j · FalkorDB

LLM Providers

  • AWS Bedrock (Claude, Titan)
  • AWS SageMaker
  • Cloudflare Workers AI
  • Ollama (local / air-gapped)
  • OpenAI-compatible APIs

Data Sources

  • Any source with an API
  • PDF files (vision + OCR)
  • CSV / Excel / Parquet
  • PubMed / bioRxiv (built-in)

Infrastructure

  • Redis (cache + sessions)
  • OpenTelemetry + Grafana
  • Docker (per-tenant isolation)
  • HuggingFace embeddings (768d)

Split Database Architecture

Graph storage and vector search are decoupled, so each layer scales independently.

Memgraph: Graph Layer

  • In-memory graph with sub-millisecond traversals
  • MAGE algorithms (PageRank, community detection, shortest path)
  • openCypher queries over Bolt protocol
  • Dedicated container per tenant (256 MB–1 GB)
  • Bolt-compatible: same driver code works with Neo4j

Cloudflare Vectorize: Vector Layer

  • 768-dimensional embeddings (all-distilroberta-v1)
  • Cosine similarity search at the edge
  • Managed scaling with no infrastructure to maintain
  • Decoupled from graph memory
  • Metadata-enriched vectors for result context
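For readers unfamiliar with the metric, the sketch below shows what a cosine similarity search computes. It is a plain-Python illustration over toy 2-dimensional vectors, not the Cloudflare Vectorize API; `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product divided by the product of the two vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 3):
    # Score every stored vector against the query, highest similarity first.
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index; production embeddings would have 768 dimensions.
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
best = top_k([1.0, 0.0], index, k=2)
```

A managed vector index answers the same question with an approximate search structure instead of this exhaustive scan.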

Alternative Backends

Neo4j

Enterprise graph: native HNSW vector indexes, APOC/GDS algorithms

FalkorDB

Redis-native graph: named graph isolation, lightweight deployments

OpenCypher Compatible

Any Bolt-compatible database works via the DatabaseInterface abstraction
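One way such an abstraction could look is sketched below. Only the name DatabaseInterface comes from this page; the `run_query` method, its signature, the `InMemoryFake` test double, and the `neighbors` helper are assumptions made for illustration.

```python
from typing import Any, Protocol


class DatabaseInterface(Protocol):
    # Hypothetical method: run a parameterized openCypher query, return rows.
    def run_query(self, cypher: str, params: dict[str, Any]) -> list[dict[str, Any]]: ...


class InMemoryFake:
    """Test double that records queries and returns canned rows."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self.rows = rows
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def run_query(self, cypher: str, params: dict[str, Any]) -> list[dict[str, Any]]:
        self.calls.append((cypher, params))
        return self.rows


def neighbors(db: DatabaseInterface, entity_id: str) -> list[str]:
    # Application code depends only on the interface, not on a specific backend.
    rows = db.run_query(
        "MATCH (a {id: $id})--(b) RETURN b.id AS id",
        {"id": entity_id},
    )
    return [row["id"] for row in rows]


db = InMemoryFake([{"id": "x"}, {"id": "y"}])
result = neighbors(db, "a")
```

A Memgraph, Neo4j, or FalkorDB adapter would implement the same method on top of its own driver, leaving callers such as `neighbors` unchanged.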

Performance Stack

Rust-accelerated JSON

3–10× faster serialization

Rust ASGI Server

2–4× request throughput

Memgraph Bolt Driver

In-memory graph, sub-ms traversals

Rust-native DataFrames

2–5× faster for data processing

Token-aware Chunking

BPE tokenizer with sentence boundaries

Hybrid Retrieval

Vector + fulltext + graph traversal
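Hybrid retrieval has to merge ranked results from several retrievers into one list. This page does not say how Graffold does that; the sketch below uses reciprocal rank fusion, a common general-purpose technique, purely as an assumed stand-in. The function name and the constant `k = 60` are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each retriever contributes 1 / (k + rank) for every item it returned.
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = ["a", "b", "c"]
fulltext_hits = ["b", "a", "d"]
graph_hits = ["b", "c"]
fused = reciprocal_rank_fusion([vector_hits, fulltext_hits, graph_hits])
```

Items that several retrievers agree on rise to the top, even when no single retriever ranked them first.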

Performance Benchmarks

Measured speedups from Rust- and C-backed drop-in replacements across the stack. Zero application-code rewrites required.

3–10×

JSON Serialization

Rust-backed encoder vs stdlib

2–4×

HTTP Throughput

Rust ASGI server vs Python default

<1 ms

Graph Traversal

In-memory Memgraph, sub-millisecond

2–5×

DataFrame Operations

Rust-native DataFrames vs legacy libraries

~10×

Cache Parsing

C-backed parser vs pure-Python

250 ms

P50 Query Latency

Hybrid vector + graph retrieval

Processing Throughput

1,000+

Documents per pipeline run

Multi-source

Parallel ingestion across data feeds

47 → 1

Duplicate edges consolidated per entity pair

< 15 min

Incremental update for ~100 new documents
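The duplicate-edge figure above refers to collapsing many extracted edges between the same two entities into one. A minimal sketch of that idea follows; the field names (`source`, `target`, `type`, `doc_id`) and the merge policy are assumptions, not Graffold's schema.

```python
from collections import defaultdict


def consolidate_edges(edges: list[dict]) -> list[dict]:
    # Group every extracted edge by its (source, target) entity pair.
    groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for edge in edges:
        groups[(edge["source"], edge["target"])].append(edge)

    merged = []
    for (source, target), group in groups.items():
        merged.append({
            "source": source,
            "target": target,
            # Keep every distinct relationship label seen for the pair.
            "types": sorted({e["type"] for e in group}),
            # Number of raw edges folded into this one.
            "weight": len(group),
            # Retain the originating documents so provenance survives the merge.
            "doc_ids": sorted({e["doc_id"] for e in group}),
        })
    return merged


raw = [
    {"source": "A", "target": "B", "type": "INHIBITS", "doc_id": "d1"},
    {"source": "A", "target": "B", "type": "INHIBITS", "doc_id": "d2"},
    {"source": "A", "target": "B", "type": "BINDS", "doc_id": "d1"},
    {"source": "A", "target": "C", "type": "BINDS", "doc_id": "d3"},
]
merged = consolidate_edges(raw)
```

Keeping the list of source documents on the merged edge is what lets an answer still cite where each relationship came from.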

Ontology Studio

Define your domain schema visually. The extraction pipeline adapts to your ontology, producing typed, validated knowledge graphs instead of generic entity soup.

Visual Schema Designer

Drag-and-drop node types, draw relationships, define properties. See your ontology as a live graph.

Guided Extraction

Your ontology constrains the LLM, so extraction produces typed entities that match your schema, not generic blobs.

OWL / JSON-LD

Import existing ontologies from OWL or JSON-LD. Export for use in Protégé, Neo4j, or any RDF-compatible tool.
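As a rough picture of what "typed, validated" can mean for guided extraction, the sketch below checks extracted entities against a tiny schema. The ontology contents (`Gene`, `Disease`), the dictionary layout, and `validate_entity` are hypothetical examples, not the Ontology Studio format.

```python
# Hypothetical ontology: each node type maps to its required property names.
ONTOLOGY: dict[str, set[str]] = {
    "Gene": {"symbol"},
    "Disease": {"name"},
}


def validate_entity(entity: dict, ontology: dict[str, set[str]]) -> list[str]:
    # Return a list of problems; an empty list means the entity fits the schema.
    entity_type = entity.get("type")
    required = ontology.get(entity_type)
    if required is None:
        return [f"unknown type: {entity_type!r}"]
    present = set(entity.get("properties", {}))
    return [
        f"{entity_type} missing property: {name}"
        for name in sorted(required - present)
    ]


ok = validate_entity({"type": "Gene", "properties": {"symbol": "TP53"}}, ONTOLOGY)
unknown = validate_entity({"type": "Blob", "properties": {}}, ONTOLOGY)
incomplete = validate_entity({"type": "Disease", "properties": {}}, ONTOLOGY)
```

Entities that fail the check can be rejected or sent back for another extraction pass rather than written to the graph untyped.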

Available on Enterprise tier

Are you drowning in documents?
Try us out.

Or email us directly at hello@graffold.com