Architecture

How Graffold Works

A multi-source ingestion pipeline transforms unstructured documents into a consolidated knowledge graph — queryable via hybrid vector + graph retrieval with full provenance.

Ingestion Pipeline

The ingestion pipeline moves documents through four stages:

  1. Data Sources: documents & PDFs, APIs & feeds, open-access archives, structured data (CSV / Excel / API)
  2. Processing: token-aware chunking, LLM entity extraction, multi-pass gleaning, entity consolidation, relationship consolidation
  3. Enrichment: canonical ID mapping, ontology alignment, taxonomy hierarchies, vector embeddings, community detection
  4. Knowledge Graph: a graph database holding entities, relations, and evidence
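Processing starts with token-aware chunking: whole sentences are packed into chunks that stay under a token budget, so no sentence is split mid-thought. A minimal sketch, assuming a pluggable token counter; the whitespace-based counter and function names below are illustrative stand-ins for the real BPE tokenizer:

```python
import re

def count_tokens(text: str) -> int:
    # Stand-in for a real BPE tokenizer (e.g. tiktoken); counts whitespace tokens.
    return len(text.split())

def chunk_sentences(text: str, max_tokens: int = 100) -> list[str]:
    """Pack whole sentences into chunks that fit within a token budget."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sent in sentences:
        n = count_tokens(sent)
        # Flush the current chunk when adding this sentence would overflow it.
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sent)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunks break only at sentence boundaries, a single sentence longer than the budget still becomes its own (oversized) chunk rather than being truncated.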

Query & Retrieval Architecture

A query flows from the user, through the agent layer, to the retrieval backends and back:

  1. User: natural-language questions submitted via the REST API, with answers streamed back over SSE
  2. LLM Agent Layer: query mode router (local / global / hybrid), two-phase discovery + expansion, entity disambiguation, Cypher generation, and synthesis with citations, backed by an LLM provider (Bedrock / Ollama / OpenAI)
  3. Retrieval: vector search, fulltext search, graph traversal, and community summaries over the data sources
  4. Storage backends: Neo4j, Amazon Neptune, Kuzu, DuckDB
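The mode router decides whether a question is best answered from a local graph neighborhood, from corpus-wide community summaries, or both. A heuristic sketch of such a router; the cue lists and function name are hypothetical, not Graffold's actual routing logic:

```python
def route_query(query: str) -> str:
    """Pick a retrieval mode for a question (heuristic sketch).

    local  -> entity-centric questions answered from a graph neighborhood
    global -> corpus-wide questions answered from community summaries
    hybrid -> everything else: combine both strategies
    """
    q = query.lower()
    # Corpus-wide phrasing suggests community summaries.
    global_cues = ("overall", "themes", "across", "summar", "trend")
    # Entity lookups suggest a local graph neighborhood.
    local_cues = ("who", "what is", "when did", "where", "which")
    if any(cue in q for cue in global_cues):
        return "global"
    if any(q.startswith(cue) or f" {cue}" in q for cue in local_cues):
        return "local"
    return "hybrid"
```

A production router would more likely ask the LLM to classify the query, but the contract is the same: one of three mode strings handed to the retrieval layer.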

Integrations

Graph Databases

  • Neo4j
  • Amazon Neptune
  • Kuzu (embedded)
  • DuckDB (analytics)

LLM Providers

  • AWS Bedrock
  • AWS SageMaker
  • Ollama (local / air-gapped)
  • OpenAI-compatible APIs

Data Sources

  • Any source with an API
  • PDF files (vision + OCR)
  • CSV / Excel / Parquet
  • PubMed / bioRxiv (built-in)

Infrastructure

  • Redis (distributed cache)
  • Grafana + Prometheus
  • Docker Compose / AWS CDK
  • HuggingFace embeddings

Performance Stack

  • Rust-accelerated JSON: 3-10× faster serialization
  • Rust ASGI server: 2-4× request throughput
  • Native graph driver: up to 10× faster for large result sets
  • Rust-native DataFrames: 2-5× faster data processing
  • Token-aware chunking: BPE tokenizer with sentence boundaries
  • Hybrid retrieval: vector + fulltext + graph traversal
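Hybrid retrieval needs a way to merge ranked results from retrievers whose scores are not comparable (cosine similarity vs BM25 vs traversal depth). One common technique is Reciprocal Rank Fusion, which combines rank positions instead of raw scores; a sketch under that assumption (the source does not say which fusion method Graffold uses):

```python
def rrf_fuse(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Fuse best-first ID lists from several retrievers via Reciprocal Rank Fusion.

    rankings maps a retriever name (e.g. "vector", "fulltext", "graph")
    to its ranked list of document/entity IDs.
    """
    scores: dict[str, float] = {}
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each retriever contributes 1/(k + rank); k damps the head of the list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Rank-based fusion rewards items that several retrievers agree on, which is exactly the behavior you want when vector, fulltext, and graph signals disagree on absolute scores.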

Performance Benchmarks

Measured speedups from Rust- and C-backed drop-in replacements across the stack. Zero application-code rewrites required.

  • 3–10× JSON serialization: Rust-backed encoder vs stdlib
  • 2–4× HTTP throughput: Rust ASGI server vs Python default
  • Up to 10× graph query results: native driver extensions for large result sets
  • 2–5× DataFrame operations: Rust-native DataFrames vs legacy libraries
  • ~10× cache parsing: C-backed parser vs pure-Python
  • 250 ms P50 query latency: hybrid vector + graph retrieval
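The "drop-in, zero rewrites" claim works because the replacement keeps the stdlib's call shape behind a thin shim. A sketch using orjson, a widely used Rust-backed encoder (the source does not name the library, so orjson is an assumption here), with a stdlib fallback so call sites never change:

```python
import json

# orjson is one widely used Rust-backed JSON encoder with a near
# drop-in API. Fall back to the stdlib when it isn't installed;
# either way, callers just use dumps(obj) -> str.
try:
    import orjson

    def dumps(obj) -> str:
        # orjson returns bytes and emits compact output by default.
        return orjson.dumps(obj).decode()
except ImportError:
    def dumps(obj) -> str:
        # Match orjson's compact separators for byte-identical output.
        return json.dumps(obj, separators=(",", ":"))
```

The same pattern (shim module, identical signatures) applies to the ASGI server, DataFrame library, and cache parser swaps.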

Processing Throughput

  • 1,000+ documents per pipeline run
  • Multi-source: parallel ingestion across data feeds
  • 47 → 1 duplicate edges consolidated per entity pair
  • < 15 min incremental update for ~100 new documents
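The 47 → 1 figure comes from relationship consolidation: many extracted edges between the same entity pair collapse into one edge whose evidence list preserves provenance. A minimal sketch; the field names (`src`, `dst`, `type`, `evidence`) are illustrative, not Graffold's actual schema:

```python
def consolidate_edges(edges: list[dict]) -> list[dict]:
    """Collapse duplicate edges between the same entity pair into one,
    merging their evidence so provenance survives consolidation."""
    merged: dict[tuple, dict] = {}
    for e in edges:
        # Edges are duplicates when source, target, and relation type match.
        key = (e["src"], e["dst"], e["type"])
        if key not in merged:
            merged[key] = {"src": e["src"], "dst": e["dst"],
                           "type": e["type"], "evidence": []}
        merged[key]["evidence"].extend(e["evidence"])
    return list(merged.values())
```

Keying on (source, target, type) means distinct relation types between the same pair survive as separate edges, while repeated extractions of the same claim merge into one well-evidenced edge.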