Knowledge
Namespaces, sources, threads, memory, artifacts, and vault secrets for contextual AI.
The knowledge system provides contextual data to your AI pipelines. It includes document sources, conversation threads, semantic memory, generated artifacts, and vault secrets — all organized within namespaces for clean isolation.
Namespaces
A namespace is an isolation boundary within an app. Each namespace has its own threads, memory entries, sources, artifacts, and vault secrets. This enables multi-tenant scenarios where each customer, project, or workflow gets its own context without cross-contamination.
Creating a Namespace
To create a namespace, navigate to an app and open the Namespaces tab. Click Create Namespace and fill in the dialog:
| Field | Required | Description |
|---|---|---|
| Name | Yes | A descriptive name for the namespace |
| Description | No | What this namespace is used for |
| Environment | Yes | Select an environment from the dropdown (defaults to the workspace default environment) |
| Embedding Model | No | Optional embedding model override for this namespace |
| Is Private | No | Checkbox to restrict access to the namespace |
Namespace Tabs
Once created, a namespace detail page has five tabs:
- Threads — conversation entities with message history
- Sources — uploaded documents processed into searchable chunks
- Memory — semantic, vector-indexed knowledge entries
- Artifacts — generated content from pipeline executions
- Secrets — namespace-scoped vault secrets and credentials
Namespace Isolation
Each namespace is fully isolated. Threads in one namespace cannot access memory or sources from another. This makes namespaces ideal for per-user, per-tenant, or per-project scoping.
Sources
Sources are documents that you upload or paste for your pipelines to reference. When ingested, they go through an automated processing pipeline that converts raw content into searchable, vector-indexed chunks.
Creating a Source
Click Add Source on the Sources page. The creation dialog offers two input methods:
After selecting your input method, fill in the remaining fields:
| Field | Required | Description |
|---|---|---|
| Name | Yes | A descriptive name for the source |
| Description | No | What this source contains |
| Tags | No | Multi-select autocomplete for categorization |
Advanced Options
Expand the advanced options section for additional controls:
- Smart Chunking — toggle for semantic-aware document splitting (enabled by default)
- Indexing Level — controls the trade-off between recall and precision:
- Standard — balanced performance and recall
- High Recall — maximize document retrieval at the cost of precision
- High Precision — prioritize relevance over recall
Processing Pipeline
When you add a source, it progresses through a series of processing stages:
Created
Source record created in the system.
Pending
Queued for processing.
Uploading
File data is being transferred to storage.
Extracting
Raw text and structure are extracted from the document.
Chunking
Content is split into semantically meaningful segments.
Embedding
Vector embeddings are generated for each chunk.
Indexing
Embeddings are indexed for fast similarity search.
Completed
Source is fully processed and available for retrieval.
Each stage updates in real time in the UI. If processing fails at any stage, the source shows a Failed status with an error message.
Supported File Types
| Category | Formats |
|---|---|
| Documents | PDF, DOCX, DOC, TXT, RTF, Markdown |
| Web | HTML |
| Spreadsheets | CSV, XLSX, XLS |
| Data | JSON, XML, YAML |
| Images | PNG, JPG, JPEG, TIFF, WebP, GIF |
| Code | Python, JavaScript, TypeScript |
Source Scopes
Sources can be scoped at two levels:
- Workspace — shared across all apps in the workspace
- Namespace — isolated to a specific namespace within an app
Threads
Threads represent ongoing conversations. Each thread has a message history with user and assistant messages, along with metadata and timestamps.
- Threads are scoped to a namespace
- Each thread tracks message count, creation time, and last update
- Message history is available to inference steps for conversational context
- Pipeline steps can read from and write to threads using
thread_updateandthread_querystep types
Memory
Memory entries are semantic, vector-indexed knowledge that persists across conversations. Unlike threads (which store raw messages), memory stores distilled knowledge that the AI can reference for long-term context.
Memory Types
| Type | Description |
|---|---|
fact | Objective information about the world or the user |
preference | User preferences and settings |
summary | Condensed summaries of past interactions |
entity | Information about specific people, places, or things |
instruction | Standing instructions for how the AI should behave |
custom | Application-specific memory with a custom type label |
Memory entries have a status (active or archived) and a scope (namespace or thread). They are vector-indexed for semantic retrieval during pipeline execution.
Artifacts
Artifacts are content generated by your pipelines during execution. They can be code files, documents, reports, images, or any other output. Artifacts have three scope levels that determine their lifetime and visibility:
| Scope | Description |
|---|---|
| Execution | Only available during the pipeline run that created them |
| Thread | Persisted within a specific thread conversation |
| Namespace | Persisted at the namespace level for long-term access |
Artifacts are viewable in the namespace Artifacts tab and in Cortex's scratchpad.
Vault Secrets
Each namespace has its own Secrets tab for managing namespace-scoped credentials. Secrets stored here are isolated from other namespaces and can be referenced by tools and pipeline configurations. See Security for more on vault scoping and access control.
Using Knowledge in Pipelines
Pipeline steps interact with knowledge through dedicated step types:
source_injection/source_query— query and retrieve document chunkssource_ingest— add new sources programmaticallythread_update/thread_query— read and write thread messagesmemory_update/memory_query— store and retrieve memory entriesartifact_create/artifact_query/artifact_get— manage generated artifacts