Real pipelines you can clone, run, and bend to your data. Each example is production-wired — one source, a declarative flow, a live target. Pick the one closest to what you need and change the parts that don't fit.
The most complete end-to-end example — a custom source, LLM extraction, and a queryable target.
Pulls every new HackerNews thread, extracts topics with an LLM, and keeps a Postgres index continuously fresh. Custom source + live mode = 92% fewer API calls after the first sync.
The cleanest "hello world" for CocoIndex + embeddings — index markdown, query it with natural language.
Walk a repo, split by syntax, embed, and query your codebase in English. Real-time RAG for code.
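The shape of that pipeline — chunk, embed, rank by similarity — fits in a few lines. This is a conceptual sketch only, not the CocoIndex API: the toy bag-of-words embedding stands in for a real model, and `embed`, `cosine`, and `query` are illustrative names.

```python
import math
from collections import Counter

def embed(text: str, vocab: list[str]) -> list[float]:
    # Toy embedding: normalized word counts over a shared vocabulary.
    # A real pipeline would call an embedding model here.
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def query(chunks: list[str], question: str, top_k: int = 1) -> list[str]:
    # Build one vocabulary for all texts so vectors are comparable.
    vocab = sorted({w for t in chunks + [question] for w in t.lower().split()})
    q = embed(question, vocab)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c, vocab), q), reverse=True)
    return ranked[:top_k]

chunks = [
    "def connect(url): open a database connection",
    "class Parser: tokenize markdown into an AST",
]
print(query(chunks, "how do I connect to the database"))
```

Swap the toy `embed` for a model call and the in-memory list for a vector store, and you have the example's retrieval loop.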
Extract metadata, chunk and embed abstracts, enable semantic + author-based search over academic PDFs.
Bring your own source, target, or parser. Same declarative flow.
Use an existing Postgres table as a CocoIndex source. AI transforms + data mappings flow into pgvector.
Treat any API as a first-class incremental source. A custom HN connector that stays in sync with Postgres.
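The core idea behind an incremental source is a cursor: remember the last item you ingested and only ask the API for what came after it. A minimal sketch, with hypothetical names (`fetch_since`, `IncrementalSource`) rather than the actual connector interface:

```python
def fetch_since(all_items: list[dict], cursor: int) -> list[dict]:
    # Stand-in for an API call like "give me items with id > cursor".
    return [item for item in all_items if item["id"] > cursor]

class IncrementalSource:
    def __init__(self):
        self.cursor = 0                    # last item id already ingested
        self.store: dict[int, dict] = {}   # stand-in for the Postgres target

    def sync(self, upstream: list[dict]) -> int:
        new_items = fetch_since(upstream, self.cursor)
        for item in new_items:
            self.store[item["id"]] = item  # upsert into the target
            self.cursor = max(self.cursor, item["id"])
        return len(new_items)              # work done this round

threads = [{"id": 1, "title": "Show HN"}, {"id": 2, "title": "Ask HN"}]
src = IncrementalSource()
print(src.sync(threads))   # first sync ingests everything -> 2
print(src.sync(threads))   # nothing new -> 0
threads.append({"id": 3, "title": "Launch HN"})
print(src.sync(threads))   # only the new thread -> 1
```

After the first full sync, every later round touches only new items — that is where the drop in API calls comes from.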
Export markdown files to local HTML using a custom target. The simplest file-to-file pipeline shape.
Turn loose prose into structured data with LLMs, BAML, DSPy, or Ollama.
Extract structured data from the Python manual's markdown files with a local Ollama model.
Extract nested structured data from patient intake forms with field-level transformation and data mapping.
BAML as the typed contract between LLM and code. Same intake problem, stronger guarantees.
DSPy-style prompt programming on vision models. Compare the ergonomics to the BAML variant side by side.
Bring your own parser. Google Document AI extracts, CocoIndex embeds and stores for semantic search.
Give agents a persistent, graph-shaped memory from conversations, meetings, products.
Build live knowledge for agents from documentation — incremental triple extraction with LLMs.
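Triple extraction reduces prose to (subject, predicate, object) facts that upsert cleanly into a graph. A minimal sketch of that shape — the regex here is a trivial stand-in for the LLM extraction step, and all names are illustrative:

```python
import re

def extract_triples(sentence: str) -> list[tuple[str, str, str]]:
    # Stand-in extractor: matches "X <verb> Y" for a tiny verb set.
    # A real pipeline would prompt an LLM for the triples instead.
    m = re.match(r"(\w+) (uses|contains|extends) (\w+)", sentence)
    return [m.groups()] if m else []

graph: set[tuple[str, str, str]] = set()
lines = ["CocoIndex uses Postgres", "FlowBuilder extends Builder", "hello world"]
for line in lines:
    for triple in extract_triples(line):
        graph.add(triple)  # set add = idempotent upsert; re-runs add nothing

print(sorted(graph))
```

The idempotent upsert is what makes the graph safe to rebuild incrementally: re-processing an unchanged document leaves the graph untouched.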
Turn Google Drive meeting notes into an automatically updating Neo4j knowledge graph.
Real-time recommendation engine — product taxonomy understanding via LLM, stored in a graph database.
ColPali embeddings served behind a FastAPI endpoint. Page-level multi-vector image search.
CLIP embeddings over a folder of images. Query by text or reference image.
ColPali over PDFs, images, academic papers, and slides — mixed together in the same vector space, no OCR.
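ColPali-style retrieval scores with late interaction: each page is a bag of patch vectors, each query a bag of token vectors, and the score (MaxSim) takes, for every query vector, its best-matching page vector, then sums. A miniature version with hand-made stand-in vectors, not real model outputs:

```python
def maxsim(query_vecs: list[list[float]], page_vecs: list[list[float]]) -> float:
    # For each query vector, keep only its best dot product against the
    # page's vectors, then sum those maxima across the query.
    return sum(
        max(sum(q * p for q, p in zip(qv, pv)) for pv in page_vecs)
        for qv in query_vecs
    )

page_a = [[1.0, 0.0], [0.0, 1.0]]   # patches covering two "topics"
page_b = [[0.5, 0.5]]
query = [[1.0, 0.0]]                 # one query token
print(maxsim(query, page_a))  # best patch matches exactly -> 1.0
print(maxsim(query, page_b))  # -> 0.5
```

Because each page keeps many vectors instead of one, a query token can match a single diagram or text region on the page — which is why no OCR pass is needed.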
Extract, embed, and index both text and images from PDFs — SentenceTransformers + CLIP in one vector space.
Detect, extract, and embed faces from photos. Export to a vector DB for face similarity queries.
Clone the closest example, swap the source or the target, and keep the rest. Or request a new example — we ship the ones developers ask for.