Managing Sources

Upload files, paste text, configure chunking and indexing, and monitor processing status.

Overview

Sources are knowledge documents that power retrieval in your AI pipelines. When you upload a source, MechaMental processes it through extraction, chunking, embedding, and indexing so that it can be searched semantically at runtime.

Supported Formats

Format	Extensions	Notes
PDF	`.pdf`	Text extraction with layout awareness
Word	`.docx`	Full formatting support
Plain Text	`.txt`	Direct text processing
Markdown	`.md`	Preserves structure and headings
CSV	`.csv`	Row-level chunking with headers
JSON	`.json`	Structured data processing
Images	`.png`, `.jpg`	OCR and visual analysis
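As a rough illustration, the table above can be turned into a client-side check before uploading. This is a minimal sketch: the `detect_format` helper is hypothetical and not part of MechaMental, and the case-insensitive matching is an assumption, since the table does not say how extension case is handled.

```python
from pathlib import Path
from typing import Optional

# Extension-to-format mapping copied from the Supported Formats table.
SUPPORTED_FORMATS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".txt": "Plain Text",
    ".md": "Markdown",
    ".csv": "CSV",
    ".json": "JSON",
    ".png": "Images",
    ".jpg": "Images",
}


def detect_format(filename: str) -> Optional[str]:
    """Return the format name for a file, or None if the extension is not listed.

    Hypothetical helper. Assumption: extensions are matched case-insensitively.
    """
    return SUPPORTED_FORMATS.get(Path(filename).suffix.lower())
```

For example, `detect_format("handbook.PDF")` returns `"PDF"`, while `detect_format("archive.zip")` returns `None`.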

Uploading a Source

Open the Create Source Dialog

Navigate to Sources in the sidebar (for workspace-level sources) or to a namespace's Sources tab (for namespace-scoped sources)
Click the New Source button (Plus icon)
The Add Workspace Source dialog opens as a wide split-panel view

Fill In Source Details (Left Panel)

The left panel contains the metadata fields:

Name (required) -- a descriptive name, e.g., "Company Documentation"
Description -- brief description of the source content
Source Type -- auto-detected from the file; you can override it manually
Tags -- add tags for organization and filtering

Choose Input Method (Right Panel)

The right panel has two input modes, selectable via tabs:

File upload -- drag and drop files onto the upload area, or click to browse your device. You can upload multiple files at once. Supported file types are listed in the Supported Formats table above.

Paste text -- paste text content directly into a text area. Useful for snippets, notes, or content copied from other applications.

Configure Advanced Options

Expand the Advanced section at the bottom of the left panel:

Smart Chunking -- toggle on to use intelligent document-aware chunking instead of fixed-size splits
Indexing Strategy -- choose how aggressively to index:

Strategy	Description
Standard	Balanced performance and recall
High Recall	Maximize the number of relevant results retrieved, at the cost of precision
High Precision	Prioritize relevance over recall
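For contrast with Smart Chunking, the fixed-size alternative can be pictured as a sliding window over the text. The sketch below is illustrative only: the function name, the character-based sizes, and the default values are assumptions, and it does not describe how MechaMental splits documents internally.

```python
from typing import List


def fixed_size_chunks(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window.

    Hypothetical helper; sizes are in characters purely for illustration.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With `size=4` and `overlap=2`, the string `"abcdefghij"` yields `["abcd", "cdef", "efgh", "ghij"]`. A fixed window like this can cut through sentences and headings, which is the problem document-aware chunking aims to avoid.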

Upload and Monitor Processing

Click the upload button. A progress indicator appears showing the current processing state.

The source transitions through these states in order:

Created -- source record created
Pending -- queued for processing
Uploading -- file data being transferred
Extracting -- text content being extracted from the file format
Chunking -- text split into manageable chunks with overlap
Embedding -- vector embeddings generated for each chunk
Indexing -- chunks and embeddings stored in the vector index
Completed -- source is ready for queries

Processing Status

You can monitor the status of every source in real time from the sources list. Each source shows a colored badge with its current state. If processing fails, the status shows Failed with an error message.
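The state sequence and the Failed status can be modelled as a small enum with a polling loop. This is a sketch under stated assumptions: `SourceState`, `wait_until_done`, and the `fetch_state` callable are hypothetical names, and how the current state is actually retrieved (UI, API, or otherwise) is not covered by this page, so it is left to the caller.

```python
import time
from enum import Enum
from typing import Callable


class SourceState(Enum):
    CREATED = "Created"
    PENDING = "Pending"
    UPLOADING = "Uploading"
    EXTRACTING = "Extracting"
    CHUNKING = "Chunking"
    EMBEDDING = "Embedding"
    INDEXING = "Indexing"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Normal progression, in the order documented above (Failed is not a step).
PIPELINE_ORDER = [s for s in SourceState if s is not SourceState.FAILED]

# States after which no further processing happens.
TERMINAL_STATES = {SourceState.COMPLETED, SourceState.FAILED}


def wait_until_done(
    fetch_state: Callable[[], SourceState],
    interval: float = 1.0,
    max_polls: int = 60,
) -> SourceState:
    """Poll `fetch_state` until the source is Completed or Failed.

    `fetch_state` is a caller-supplied function (hypothetical) that returns
    the current state of one source.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError("source did not reach a terminal state")
```

Passing the state lookup in as a callable keeps the loop independent of any particular client, which also makes it easy to exercise with a canned sequence of states.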

Source Scoping

Sources exist at one of two scopes:

Workspace -- available to all apps and namespaces in the workspace. Good for shared knowledge bases (company docs, FAQs).
Namespace -- scoped to a specific namespace within an app. Good for tenant-specific or context-specific documents.

When a `source_query` step runs, it can search workspace sources, namespace sources, or both, depending on the search scope configuration.
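One way to picture the scope rules is as a filter over the available sources. The sketch below is an assumption-laden illustration: the `Source` record, the `sources_in_scope` function, and the scope values `"workspace"`, `"namespace"`, and `"both"` are hypothetical and may not match the real configuration options.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Source:
    name: str
    namespace: Optional[str] = None  # None marks a workspace-level source


def sources_in_scope(
    sources: List[Source], namespace: str, scope: str = "both"
) -> List[Source]:
    """Return the sources a query running in `namespace` would search.

    Hypothetical helper; `scope` is one of "workspace", "namespace", "both".
    """
    if scope not in {"workspace", "namespace", "both"}:
        raise ValueError(f"unknown scope: {scope!r}")

    selected = []
    for source in sources:
        if source.namespace is None:
            # Workspace sources are visible from every namespace.
            if scope in ("workspace", "both"):
                selected.append(source)
        elif source.namespace == namespace and scope in ("namespace", "both"):
            # Namespace sources are visible only from their own namespace.
            selected.append(source)
    return selected
```

Under this model, a source in another namespace is never returned, regardless of the scope setting.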

Managing Tags

Tags help you organize and filter sources. Add them in the Tags field when creating a source, or edit them later from the source details. To filter the source list by tag, use the filter controls at the top of the sources table.

Reindexing Sources

If you update the embedding model or change chunking settings, you can reindex existing sources without re-uploading the original files:

Select the source (or namespace) you want to reindex
Trigger Reindex from the source actions or namespace actions
The source re-enters the processing pipeline (Extracting, Chunking, Embedding, Indexing)
Existing chunks are replaced with newly generated ones

Using Sources in Pipelines

Add a Source Query step to your pipeline to retrieve relevant chunks at runtime:

Set the Query to a Jinja template like `{{ endpoint_payload.message }}`
Configure Top K (number of results) and Score Threshold (minimum similarity)
Optionally pin specific sources to limit the search scope
The step output contains the matched chunks, which you can reference in subsequent steps
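To show how Top K and Score Threshold interact, here is a minimal retrieval sketch. It assumes cosine similarity as the score, which this page does not confirm (it only says "minimum similarity"), and the function names and the toy two-dimensional vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 3,
    score_threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Score every chunk, drop those below the threshold, keep the best top_k.

    Assumption: the score is cosine similarity between query and chunk vectors.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    kept = [pair for pair in scored if pair[1] >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return kept[:top_k]
```

The threshold is applied before the cut-off, so a step can return fewer than Top K chunks (or none) when few chunks are similar enough to the query.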

See Configuring Step Types for full details on source step configuration.
