Managing Sources
Upload files, paste text, configure chunking and indexing, and monitor processing status.
Overview
Sources are knowledge documents that power retrieval in your AI pipelines. When uploaded, MechaMental processes them through extraction, chunking, embedding, and indexing so they can be searched semantically at runtime.
Supported Formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text extraction with layout awareness |
| Word | .docx | Full formatting support |
| Plain Text | .txt | Direct text processing |
| Markdown | .md | Preserves structure and headings |
| CSV | .csv | Row-level chunking with headers |
| JSON | .json | Structured data processing |
| Images | .png, .jpg | OCR and visual analysis |
Uploading a Source
Open the Create Source Dialog
- Navigate to Sources in the sidebar (for workspace-level sources) or to a namespace's Sources tab (for namespace-scoped sources)
- Click the New Source button (Plus icon)
- The Add Workspace Source dialog opens as a wide split-panel view
Fill In Source Details (Left Panel)
The left panel contains the metadata fields:
- Name (required) -- a descriptive name, e.g., "Company Documentation"
- Description -- brief description of the source content
- Source Type -- auto-detected from the file; you can override it manually
- Tags -- add tags for organization and filtering
Choose Input Method (Right Panel)
The right panel has two input modes, selectable via tabs:
- File upload -- drag and drop files onto the upload area, or click to browse your device. You can upload multiple files at once. Supported file types are listed in the format table above.
- Text paste -- paste text content directly into a text area. Useful for snippets, notes, or content copied from other applications.
Configure Advanced Options
Expand the Advanced section at the bottom of the left panel:
- Smart Chunking -- toggle on to use intelligent document-aware chunking instead of fixed-size splits
- Indexing Strategy -- choose how aggressively to index:
| Strategy | Description |
|---|---|
| Standard | Balanced performance and recall |
| High Recall | Maximize document retrieval at the cost of precision |
| High Precision | Prioritize relevance over recall |
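To illustrate the difference the Smart Chunking toggle refers to, here is a minimal sketch contrasting fixed-size splits with a simple document-aware split. It is not MechaMental's actual algorithm, which this page does not describe; the function names, the character-based sizes, and the heading rule are all assumptions made for illustration.

```python
def fixed_size_chunks(text: str, size: int = 200, overlap: int = 50) -> list:
    """Split text into windows of `size` characters.

    Consecutive windows share `overlap` characters. Sizes are in characters
    here purely for simplicity (an assumption, not a product setting).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def heading_aware_chunks(markdown: str) -> list:
    """Start a new chunk at every Markdown heading so sections stay together.

    A deliberately simple stand-in for "document-aware" chunking.
    """
    chunks = []
    current = []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Fixed-size splitting can cut a sentence or section in half, which is why overlap is used; a structure-aware split keeps each heading with its body at the cost of uneven chunk sizes.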
Upload and Monitor Processing
Click the upload button. A progress indicator appears showing the current processing state.
The source transitions through these states in order:
- Created -- source record created
- Pending -- queued for processing
- Uploading -- file data being transferred
- Extracting -- text content being extracted from the file format
- Chunking -- text split into manageable chunks with overlap
- Embedding -- vector embeddings generated for each chunk
- Indexing -- chunks and embeddings stored in the vector index
- Completed -- source is ready for queries
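The state sequence above can be modeled as an ordered enumeration. The following is a sketch only: the state names come from this page (including Failed, described under Processing Status), while the class, the helper functions, and the idea of a fractional progress value are hypothetical and not part of any documented MechaMental API.

```python
from enum import Enum
from typing import Optional


class SourceState(Enum):
    CREATED = "Created"
    PENDING = "Pending"
    UPLOADING = "Uploading"
    EXTRACTING = "Extracting"
    CHUNKING = "Chunking"
    EMBEDDING = "Embedding"
    INDEXING = "Indexing"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Happy-path order as listed above; Failed can occur instead of any later step.
PIPELINE_ORDER = [
    SourceState.CREATED,
    SourceState.PENDING,
    SourceState.UPLOADING,
    SourceState.EXTRACTING,
    SourceState.CHUNKING,
    SourceState.EMBEDDING,
    SourceState.INDEXING,
    SourceState.COMPLETED,
]


def is_terminal(state: SourceState) -> bool:
    """A source stops changing state once it is Completed or Failed."""
    return state in (SourceState.COMPLETED, SourceState.FAILED)


def next_state(state: SourceState) -> Optional[SourceState]:
    """Return the following happy-path state, or None for terminal states."""
    if is_terminal(state):
        return None
    return PIPELINE_ORDER[PIPELINE_ORDER.index(state) + 1]


def progress(state: SourceState) -> float:
    """Fraction of the happy path reached so far, from 0.0 to 1.0."""
    if state is SourceState.FAILED:
        raise ValueError("a failed source has no meaningful progress value")
    return PIPELINE_ORDER.index(state) / (len(PIPELINE_ORDER) - 1)
```

For example, `next_state(SourceState.CHUNKING)` returns `SourceState.EMBEDDING`, and `is_terminal` gives a simple stopping condition for any code that waits on a source.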
Processing Status
You can monitor the status of every source in real time from the sources list. Each source shows a colored badge with its current state. If processing fails, the status shows Failed with an error message.
Source Scoping
Sources exist at one of two scopes:
- Workspace -- available to all apps and namespaces in the workspace. Good for shared knowledge bases (company docs, FAQs).
- Namespace -- scoped to a specific namespace within an app. Good for tenant-specific or context-specific documents.
When a source_query step runs, it can search workspace sources, namespace sources, or both, depending on the search scope configuration.
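As a rough sketch of how the search scope narrows the set of sources a step considers, the following models the two scopes in plain Python. The `Source` class, the `searchable_sources` function, and the scope values `"workspace"`, `"namespace"`, and `"both"` are hypothetical names chosen for illustration; the actual configuration keys may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Source:
    name: str
    namespace: Optional[str] = None  # None means a workspace-level source


def searchable_sources(
    sources: List[Source], search_scope: str, namespace: Optional[str]
) -> List[Source]:
    """Return the sources a query running in `namespace` may search."""
    if search_scope not in {"workspace", "namespace", "both"}:
        raise ValueError(f"unknown search scope: {search_scope!r}")
    selected = []
    for source in sources:
        is_workspace = source.namespace is None
        is_same_namespace = (
            source.namespace is not None and source.namespace == namespace
        )
        if is_workspace and search_scope in ("workspace", "both"):
            selected.append(source)
        elif is_same_namespace and search_scope in ("namespace", "both"):
            selected.append(source)
    return selected


sources = [
    Source("Company FAQ"),                       # workspace-level
    Source("Tenant A contracts", "tenant-a"),    # namespace-scoped
    Source("Tenant B contracts", "tenant-b"),    # different namespace
]
names = [s.name for s in searchable_sources(sources, "both", "tenant-a")]
```

Under these assumptions, a query in `tenant-a` with scope `"both"` sees the shared FAQ and that tenant's own documents, but never another tenant's sources.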
Managing Tags
Tags help you organize and filter sources. You can add tags when creating a source or edit them later. Use the Tags field in the source details, and filter the source list by tag using the filter controls at the top of the sources table.
Reindexing Sources
If you update the embedding model or change chunking settings, you can reindex existing sources without re-uploading the original files:
- Select the source (or namespace) you want to reindex
- Trigger Reindex from the source actions or namespace actions
- The source re-enters the processing pipeline (Extracting, Chunking, Embedding, Indexing)
- Existing chunks are replaced with newly generated ones
Using Sources in Pipelines
Add a Source Query step to your pipeline to retrieve relevant chunks at runtime:
- Set the Query to a Jinja template like {{ endpoint_payload.message }}
- Configure Top K (number of results) and Score Threshold (minimum similarity)
- Optionally pin specific sources to limit the search scope
- The step output contains the matched chunks, which you can reference in subsequent steps
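To show how Top K and Score Threshold interact, here is a minimal sketch of the selection logic applied to already-scored chunks. The `ScoredChunk` class and `select_chunks` function are hypothetical, and treating the threshold as inclusive is an assumption; the real step may break ties or apply the limits differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScoredChunk:
    text: str
    score: float  # similarity to the query; higher means more similar


def select_chunks(
    candidates: List[ScoredChunk], top_k: int, score_threshold: float
) -> List[ScoredChunk]:
    """Drop chunks below the threshold, then keep the top_k best of the rest."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    kept = [c for c in candidates if c.score >= score_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


candidates = [
    ScoredChunk("Refunds are issued within 14 days.", 0.91),
    ScoredChunk("Our office is closed on public holidays.", 0.42),
    ScoredChunk("Refund requests need an order number.", 0.78),
    ScoredChunk("Shipping takes 3 to 5 business days.", 0.65),
]
selected = select_chunks(candidates, top_k=2, score_threshold=0.6)
```

Note that the threshold can return fewer than Top K results: raising `score_threshold` to 0.8 in this example leaves a single chunk even though `top_k` is 2.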
See Configuring Step Types for full details on source step configuration.