Knowledge Layer

Contextual data architecture for RAG, memory, and document retrieval. The Knowledge Layer provides the primitives your pipelines need to access, store, and reason over structured and unstructured data.

Namespaces

Isolation containers within apps. Each namespace has its own threads, sources, memory, and artifacts, providing a clean boundary for data separation.

Multi-tenant data isolation — each tenant or user can have their own namespace with completely separated knowledge
Conversation contexts — group related threads and their associated memory under a single namespace
Workflow isolation — run distinct pipelines against different namespaces without data leakage between them
Scoped resources — threads, sources, memory, and artifacts within a namespace are invisible to other namespaces
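
The isolation boundary described above can be pictured with a small in-memory model. This is only a sketch: the `App` and `Namespace` classes and the `namespace()` method are hypothetical names chosen for illustration, not the product's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Hypothetical container holding the four resource kinds named above."""
    name: str
    threads: dict = field(default_factory=dict)
    sources: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)
    artifacts: dict = field(default_factory=dict)


class App:
    """Hypothetical app that owns namespaces (an assumption, not the real API)."""

    def __init__(self):
        self._namespaces = {}

    def namespace(self, name):
        # Create the namespace on first use, then always return the same object.
        if name not in self._namespaces:
            self._namespaces[name] = Namespace(name)
        return self._namespaces[name]


app = App()
app.namespace("tenant-a").memory["plan"] = "enterprise"
app.namespace("tenant-b").memory["plan"] = "free"

# Every lookup goes through a namespace, so tenant-a never sees tenant-b's data.
print(app.namespace("tenant-a").memory)  # {'plan': 'enterprise'}
```

Because each resource dictionary lives on the namespace object, there is no shared collection through which one tenant's data could reach another.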

Threads

Conversation-like entities that maintain state and history across interactions. Threads are the primary unit for tracking multi-turn exchanges and agent workflows.

Chat applications — maintain full conversation history with user and assistant messages across sessions
Agent interactions — track multi-step agent reasoning, tool calls, and intermediate results within a single thread
Stateful workflows — use threads to persist state between pipeline executions, enabling long-running processes
Scoped to namespaces — each thread belongs to a namespace, inheriting its isolation and access boundaries
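
As a rough illustration of how a thread might carry both message history and workflow state, consider the sketch below. The `Thread` and `Message` classes, their field names, and the role strings are assumptions made for this example; the real data model may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str      # assumed roles: "user", "assistant", or "tool"
    content: str


@dataclass
class Thread:
    """Hypothetical thread record; field names are illustrative assumptions."""
    thread_id: str
    namespace: str                                 # the owning namespace
    messages: list = field(default_factory=list)   # ordered history
    state: dict = field(default_factory=dict)      # persisted between runs

    def append(self, role, content):
        self.messages.append(Message(role, content))

    def history(self, last_n=None):
        # Return a copy of the full history, or only the last_n newest messages.
        if last_n is None:
            return list(self.messages)
        return self.messages[-last_n:] if last_n > 0 else []


thread = Thread(thread_id="t-1", namespace="tenant-a")
thread.append("user", "What is our refund policy?")
thread.append("tool", "search_sources(query='refund policy')")
thread.append("assistant", "Refunds are available within 30 days.")
thread.state["last_step"] = "answered"

print([m.role for m in thread.history(last_n=2)])  # ['tool', 'assistant']
```

Keeping tool calls in the same ordered list as user and assistant messages is one way to make multi-step agent reasoning replayable from a single thread.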

Memory

Persistent embeddings, facts, and structured knowledge that pipeline steps can read and write. Memory enables your AI to recall context, learn from interactions, and build up knowledge over time.

Multiple scopes — augmentation-level, step-level, thread-level, and system-level memory for fine-grained control
Queryable from pipeline steps — retrieve relevant memories using semantic search during augmentation execution
Persistent embeddings — facts and knowledge are embedded and stored for efficient similarity-based retrieval
Structured knowledge — store typed facts, key-value pairs, and structured data alongside unstructured text
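
To make scoped, similarity-based recall concrete, here is a minimal sketch. The `MemoryStore` class and its `write` and `query` methods are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model, used only so the example runs without external dependencies.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical store; scope names mirror the four scopes listed above."""
    SCOPES = {"augmentation", "step", "thread", "system"}

    def __init__(self):
        self._items = []  # list of (scope, text, embedding) tuples

    def write(self, scope, text):
        if scope not in self.SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self._items.append((scope, text, embed(text)))

    def query(self, scope, text, top_k=1):
        # Rank only the memories in the requested scope by similarity.
        q = embed(text)
        scored = [(cosine(q, emb), t) for s, t, emb in self._items if s == scope]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [t for _, t in scored[:top_k]]


memory = MemoryStore()
memory.write("thread", "user prefers answers in french")
memory.write("thread", "user works on the finance team")
memory.write("system", "always cite the source document")

print(memory.query("thread", "does the user want french answers"))
# ['user prefers answers in french']
```

Filtering by scope before ranking means a system-level memory never competes with thread-level memories in the same query.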

Sources

Upload and index documents for retrieval-augmented generation. Sources are the foundation of your RAG pipeline, providing the external knowledge your models need to give accurate, grounded responses.

Supported formats — PDF, DOCX, TXT, CSV, Excel, images, and code files for document-based knowledge
Web and API sources — index content from URLs, API endpoints, and notes for dynamic, up-to-date knowledge
Automatic chunking and semantic indexing — documents are split into optimal chunks and embedded for fast retrieval
Flexible scoping — sources can be workspace-scoped (shared across apps) or namespace-scoped (isolated to a context)
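
The chunking step can be sketched as a sliding window over words. The page does not say which chunking strategy is used, so the fixed-size window with overlap below is an assumption, and `chunk_words` with its `chunk_size` and `overlap` parameters is a hypothetical helper.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into windows of chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


document = " ".join(f"w{i}" for i in range(12))
for chunk in chunk_words(document, chunk_size=5, overlap=2):
    print(chunk)
# w0 w1 w2 w3 w4
# w3 w4 w5 w6 w7
# w6 w7 w8 w9 w10
# w9 w10 w11
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk. In a full pipeline each chunk would then be embedded and indexed.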

Artifacts

Generated and processed data outputs from pipeline executions. Artifacts capture the results of your AI systems, making them inspectable, reusable, and manageable.

Multiple types — text, JSON, files, URLs, images, tables, HTML, and code artifacts for any output format
Lifecycle management — artifacts transition through active, archived, and expired states for clean data hygiene
Pipeline outputs — automatically generated from augmentation steps, capturing intermediate and final results
Scoped to namespaces — artifacts belong to the namespace where their pipeline executed, maintaining isolation
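
The lifecycle states can be modeled as a small state machine. The three state names come from the text above; the allowed transitions, the `Artifact` class, and its fields are assumptions for illustration, since the page names the states but not the permitted moves between them.

```python
from enum import Enum


class ArtifactState(Enum):
    ACTIVE = "active"
    ARCHIVED = "archived"
    EXPIRED = "expired"


# Assumed transition table: archived artifacts can be restored, expired is final.
ALLOWED = {
    ArtifactState.ACTIVE: {ArtifactState.ARCHIVED, ArtifactState.EXPIRED},
    ArtifactState.ARCHIVED: {ArtifactState.ACTIVE, ArtifactState.EXPIRED},
    ArtifactState.EXPIRED: set(),
}


class Artifact:
    """Hypothetical artifact record scoped to a namespace."""

    def __init__(self, artifact_id, kind, namespace):
        self.artifact_id = artifact_id
        self.kind = kind            # e.g. "text", "json", "table"
        self.namespace = namespace
        self.state = ArtifactState.ACTIVE

    def transition(self, new_state):
        # Reject any move that the assumed transition table does not allow.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        self.state = new_state


report = Artifact("a-1", kind="json", namespace="tenant-a")
report.transition(ArtifactState.ARCHIVED)
report.transition(ArtifactState.EXPIRED)
print(report.state.value)  # expired
```

Centralizing the rules in one table makes invalid moves, such as reviving an expired artifact, fail loudly instead of silently corrupting state.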