Don't want to self-host? Try the hosted version: the free tier includes Neo4j, GraphRAG, and Llama 3.
Developer Documentation
From Docker to First Query in Minutes
Install, ingest, query. Everything you need to self-host Graffold.
Graffold v0.2.0
Turn anything into actionable knowledge

API       http://localhost:8000 (granian)
Memgraph  bolt://localhost:7687 (memgraph@mydb)
Redis     localhost:6379 (sessions + cache)
Auth      enabled (API_AUTH_TOKEN)
LLM       bedrock (default)

Usage: graffold <resource> <command> [flags]

Resources:
  ingest       PubMed, bioRxiv, PMC, PDF ingestion
  enrich       CSV/Excel enrichment pipeline
  pipeline     Full automated KG creation
  query        Natural language graph queries
  consolidate  Entity & relationship merging
  embeddings   Vector embedding generation
  serve        Start the API server
  health       Check API / Memgraph / Redis status

Documentation:
  Quick Start  docs/KNOWLEDGE_GRAPH_CREATION_GUIDE.md
  API Docs     http://localhost:8000/docs
  Frontend     http://localhost:5173
  GitHub       github.com/graffold/graffold-api

Press Ctrl+C to stop
Run with --help for all options
Setup
Option A: Docker (recommended)
# Start the full stack
docker compose up -d

# What's running:
# - API server on http://localhost:8000
# - Memgraph on bolt://localhost:7687
# - Redis for session caching

# Verify
curl http://localhost:8000/health/ready
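The containers can take a few seconds to come up, so scripts that run right after `docker compose up -d` may want to wait for the readiness endpoint. A minimal sketch, assuming `/health/ready` answers with HTTP 200 once its checks pass (the status-code contract is an assumption, and `http_status` and `wait_until_ready` are hypothetical helper names, not part of Graffold):

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_ready(
    url: str = "http://localhost:8000/health/ready",
    attempts: int = 30,
    delay: float = 1.0,
    probe: Callable[[str], int] = http_status,
) -> bool:
    """Poll `url` until it answers 200 (assumed to mean ready) or attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next probe
    return False
```

Calling `wait_until_ready()` with the defaults polls for roughly 30 seconds. The `probe` parameter exists only so the loop can be exercised without a running server.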
Option B: Python package
pip install graffold

# Or with uv
uv pip install graffold
Configure .env
# .env
MEMGRAPH_URI=bolt://localhost:7687
MEMGRAPH_USER=memgraph
MEMGRAPH_PASSWORD=your_password

# LLM provider (pick one)
OLLAMA_API_URL=http://localhost:11434/api/generate  # local
# BEDROCK_MODEL_ID=meta.llama3-8b-instruct-v1:0  # cloud
# CF_API_TOKEN=your_token  # cloudflare

# Auth
API_AUTH_TOKEN=your_secret_token
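Because the file asks for one LLM provider, a small pre-flight check can catch a `.env` with none or several uncommented. The sketch below is illustrative only: it reads "pick one" as "exactly one", the provider labels are used only by this sketch, and `parse_env` and `pick_provider` are hypothetical helpers rather than Graffold functions. The variable names come from the example above.

```python
from typing import Dict

# Variable names are taken from the .env example; the labels on the right
# are only used by this sketch.
PROVIDER_KEYS = {
    "OLLAMA_API_URL": "ollama",
    "BEDROCK_MODEL_ID": "bedrock",
    "CF_API_TOKEN": "cloudflare",
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comment lines, and trailing ' # ...' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "  # local" after the value.
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def pick_provider(env: Dict[str, str]) -> str:
    """Return the single configured provider, or raise if zero or several are set."""
    configured = [name for key, name in PROVIDER_KEYS.items() if env.get(key)]
    if len(configured) != 1:
        raise ValueError(f"expected exactly one LLM provider, found {configured}")
    return configured[0]
```

With the example file as written, `pick_provider(parse_env(open(".env").read()))` would return `"ollama"`, since the other two provider lines are commented out.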
Ingest Data
Feed documents into the knowledge graph. The pipeline handles chunking, entity extraction, consolidation, and embedding generation.
CLI
# Ingest PDFs
graffold ingest --source pdf --files report.pdf contract.pdf --database mydb

# Ingest from any API source
graffold ingest --source api --endpoint https://your-api.com/docs --database mydb

# Ingest CSVs with column mapping
graffold ingest --source csv --files data.csv --database mydb \
  --column-handlers "0:entity-id,1:entity-name,2:properties"

# Bulk parallel ingestion
graffold ingest --source pdf --files docs/*.pdf --database mydb --parallel
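The `--column-handlers` value reads as a comma-separated list of `index:handler` pairs. As a rough illustration of that reading (an assumption; Graffold's own parsing is not shown here, and `parse_column_handlers` and `apply_handlers` are hypothetical names), the spec can be turned into a mapping and applied to CSV rows:

```python
import csv
import io
from typing import Dict, List


def parse_column_handlers(spec: str) -> Dict[int, str]:
    """Turn '0:entity-id,1:entity-name' into {0: 'entity-id', 1: 'entity-name'}."""
    mapping: Dict[int, str] = {}
    for part in spec.split(","):
        index, sep, handler = part.strip().partition(":")
        if not sep or not handler:
            raise ValueError(f"malformed column handler: {part!r}")
        mapping[int(index)] = handler
    return mapping


def apply_handlers(csv_text: str, spec: str, has_header: bool = True) -> List[Dict[str, str]]:
    """Label each row's cells by handler name; columns without a handler are dropped."""
    mapping = parse_column_handlers(spec)
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header:
        rows = rows[1:]  # skip the header row
    return [
        {handler: row[i] for i, handler in mapping.items() if i < len(row)}
        for row in rows
    ]
```

Running a spec through a check like this before ingestion can surface a mistyped column index early.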
REST API
curl -X POST "http://localhost:8000/v1/ingestion/jobs" \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -F "source=pdf" \
  -F "database=mydb" \
  -F "files=@report.pdf" \
  -F "files=@contract.pdf"

# Response:
# {"job_id": "ing_abc123", "status": "processing", "files": 2}
Python
from graffold import Client

client = Client(base_url="http://localhost:8000", token="your_token")

# Ingest PDFs
job = client.ingest(
    source="pdf",
    files=["report.pdf", "contract.pdf"],
    database="mydb"
)
print(f"Job {job.id}: {job.status}")

# Ingest from DataFrame
import pandas as pd

df = pd.read_csv("data.csv")
job = client.ingest(source="dataframe", data=df, database="mydb")

Limits: 10 MB per file, 50 files per batch via API. CLI has no limits. Supports PDF (vision + OCR), CSV, Excel, Parquet, and any REST API source.
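When a corpus exceeds the API limits, the file list can be split client-side before submitting jobs. A minimal sketch under stated assumptions: "10MB" is taken to mean 10 × 1024 × 1024 bytes, oversized files are rejected rather than skipped, and `plan_batches` is a hypothetical helper, not part of the Graffold client.

```python
import os
from typing import Callable, Iterable, List

MAX_FILE_BYTES = 10 * 1024 * 1024  # "10MB per file"; binary megabytes are an assumption
MAX_FILES_PER_BATCH = 50           # "50 files per batch via API"


def plan_batches(
    paths: Iterable[str],
    size_of: Callable[[str], int] = os.path.getsize,
) -> List[List[str]]:
    """Group paths into API-sized batches, rejecting files over the size limit."""
    accepted: List[str] = []
    for path in paths:
        size = size_of(path)
        if size > MAX_FILE_BYTES:
            raise ValueError(f"{path} is {size} bytes, over the {MAX_FILE_BYTES}-byte limit")
        accepted.append(path)
    # Slice the accepted list into consecutive chunks of at most 50 files.
    return [
        accepted[i:i + MAX_FILES_PER_BATCH]
        for i in range(0, len(accepted), MAX_FILES_PER_BATCH)
    ]
```

Each resulting batch could then be passed to `client.ingest(source="pdf", files=batch, database="mydb")` as in the example above.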
Query
Create a session
# Create a session
curl -X POST "http://localhost:8000/v1/sessions" \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "llm_service": "ollama",
    "database_name": "mydb",
    "agent_type": "graph_rag_agent",
    "connection_details": {
      "uri": "bolt://localhost:7687",
      "user": "memgraph",
      "password": "your_password"
    }
  }'

# Response:
# {"session_id": "sess_abc123", "created_at": "2026-04-09T09:00:00"}
Execute a query
# Basic query
curl -X POST "http://localhost:8000/v1/sessions/sess_abc123/query" \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What connects entity A to entity B?",
    "mode": "hybrid",
    "search_depth": "balanced"
  }'
Streaming (SSE)
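With `"stream": true`, the response arrives as Server-Sent Events. For callers not using the Python client, the stream can be parsed by hand. The sketch below assumes the standard SSE framing (`event:` and `data:` lines, a blank line between events) with JSON in each `data` field. The event listing in this section is a shorthand, so the exact wire format is an assumption. `iter_sse_events` and `collect_answer` are hypothetical helper names.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, parsed_json_data) pairs from raw SSE lines."""
    event: str = "message"  # default event name in the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line dispatches the event collected so far (if it has data).
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format allows one optional space after the colon
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def collect_answer(lines: Iterable[str]) -> str:
    """Concatenate the `text` of every `token` event into the streamed answer."""
    return "".join(
        payload.get("text", "")
        for name, payload in iter_sse_events(lines)
        if name == "token"
    )
```

Any iterable of decoded lines works as input, for example the lines of an open HTTP response decoded as UTF-8.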
# Streaming query (SSE)
curl -N "http://localhost:8000/v1/sessions/sess_abc123/query" \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"question": "Overview of risk factors", "stream": true}'

# Events:
# event: retrieval_started
# event: retrieval_complete → {"source_count": 12, "retrieval_time_ms": 340}
# event: token → {"text": "Several", "index": 0}
# event: token → {"text": " risk", "index": 1}
# ...
# event: complete → {"answer": "...", "sources": [...], "cost": {...}}
Python client (full example)
from graffold import Client

client = Client(base_url="http://localhost:8000", token="your_token")

# Create session
session = client.create_session(
    llm_service="ollama",
    database="mydb",
    connection={"uri": "bolt://localhost:7687", "user": "memgraph", "password": "..."}
)

# Query
result = session.query("What connects entity A to entity B?", mode="hybrid")
print(result.answer)
print(result.sources)  # [{doc_id, excerpt, confidence}, ...]
print(result.cost)     # {prompt_tokens, completion_tokens, estimated_usd}

# Follow-up (uses session context)
result2 = session.query("Which of those have active certifications?")

# KNN expansion
expanded = session.expand_knn(expansion_level=1)
print(expanded.entities)  # newly discovered neighbors

# Streaming
for event in session.query("Summarize all findings", stream=True):
    if event.type == "token":
        print(event.text, end="")
Query Modes
Direct LLM answer, no graph retrieval
Local: entity neighborhood search
Global: community summary-based retrieval
Hybrid: merges local + global with deduplication
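To make "merges local + global with deduplication" concrete, here is one plausible way two result lists could be combined. This is a sketch, not Graffold's implementation: it assumes sources look like the `{doc_id, excerpt, confidence}` records returned by the Python client, dedupes on `doc_id`, and keeps the higher-confidence copy. `merge_sources` is a hypothetical name.

```python
from typing import Any, Dict, List

Source = Dict[str, Any]


def merge_sources(local: List[Source], global_: List[Source]) -> List[Source]:
    """Merge two result lists, keeping the highest-confidence entry per doc_id."""
    best: Dict[Any, Source] = {}
    for source in [*local, *global_]:
        key = source["doc_id"]
        # Keep the first occurrence unless a later one is strictly more confident.
        if key not in best or source["confidence"] > best[key]["confidence"]:
            best[key] = source
    # Most confident sources first.
    return sorted(best.values(), key=lambda s: s["confidence"], reverse=True)
```

Deduplicating on `doc_id` and breaking ties in favor of the first-seen entry are both choices made for this illustration.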
Search Depth
Discovery only, no expansion
1 expansion round, Cypher fallback if < 3 targets
Up to 3 rounds, always runs Cypher fallback
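The three depth levels differ in how many expansion rounds run and when the Cypher fallback fires. The control flow might look roughly like the sketch below, which is one interpretation of the descriptions above, not the actual implementation. The levels are indexed in the order listed because their API names are not given here. `DepthPolicy`, `retrieve`, and the three callables are hypothetical, and stopping early when a round finds nothing new is an assumed reading of "up to 3 rounds".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set


@dataclass(frozen=True)
class DepthPolicy:
    max_rounds: int  # expansion rounds after initial discovery
    fallback: str    # "never", "if_few", or "always"


# The three levels in the order listed above (placeholder indexing, not API names).
LEVELS = [
    DepthPolicy(max_rounds=0, fallback="never"),   # discovery only
    DepthPolicy(max_rounds=1, fallback="if_few"),  # 1 round, fallback if < 3 targets
    DepthPolicy(max_rounds=3, fallback="always"),  # up to 3 rounds, always fallback
]
MIN_TARGETS = 3  # threshold from "Cypher fallback if < 3 targets"


def retrieve(
    policy: DepthPolicy,
    discover: Callable[[], Iterable[str]],
    expand: Callable[[Set[str]], Iterable[str]],
    cypher_fallback: Callable[[], Iterable[str]],
) -> Set[str]:
    """Run discovery, bounded expansion, and the fallback rule for one depth level."""
    targets = set(discover())
    for _ in range(policy.max_rounds):
        new = set(expand(targets)) - targets
        if not new:
            break  # nothing new found, so further rounds would not help
        targets |= new
    too_few = len(targets) < MIN_TARGETS
    if policy.fallback == "always" or (policy.fallback == "if_few" and too_few):
        targets |= set(cypher_fallback())
    return targets
```

Passing the retrieval steps in as callables keeps the sketch independent of any particular graph backend.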
Expand & Explore
Discover connected entities beyond the initial answer with KNN expansion.
# Expand results with k-nearest neighbors
curl -X POST "http://localhost:8000/v1/sessions/sess_abc123/expand-knn" \
  -H "Authorization: Bearer $API_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"expansion_level": 1}'
All Endpoints
GET /health # Version, uptime (no auth)
GET /health/ready # DB + LLM connectivity check (no auth)
GET /health/live # Process alive (no auth)
POST /v1/sessions # Create session
GET /v1/sessions/{session_id} # Get session details
DELETE /v1/sessions/{session_id} # Delete session
POST /v1/sessions/{session_id}/query # Execute query
POST /v1/sessions/{session_id}/expand-knn # KNN expansion
GET /v1/sessions/{session_id}/stats # Database statistics
POST /v1/ingestion/jobs # Start ingestion job
GET /v1/costs # LLM cost summary
GET /v1/metrics # Performance metrics
GET /v1/metrics/latency # Query latency percentiles
GET /v1/metrics/multi-hop # Multi-hop success rate
GET /v1/metrics/community # Community usage stats
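For scripts that call these endpoints without the Python client, a small request builder keeps the auth handling in one place. This is a sketch using only the standard library. It assumes every endpoint outside `/health` needs the bearer token, which is inferred from the curl examples rather than stated for each route. `build_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000"


def build_request(
    method: str,
    path: str,
    token: Optional[str] = None,
    body: Optional[Any] = None,
) -> urllib.request.Request:
    """Build a request for `path`, adding auth and a JSON body where needed."""
    headers = {}
    if not path.startswith("/health"):
        # Assumption: everything except the health checks requires the token.
        if not token:
            raise ValueError(f"{path} requires an API token")
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
```

A built request can be sent with `urllib.request.urlopen(build_request("GET", "/v1/costs", token=...))`, using the token value from `API_AUTH_TOKEN`.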