Models

Organization-level model management with workspace entitlements, multi-provider routing, and capability tracking.

MechaMental abstracts away the complexity of managing multiple LLM providers behind a logical model layer. Models are managed at the organization level and granted to workspaces via entitlements.

Organization-Level Management

Models are configured and managed centrally at the organization level. This gives administrators a single place to:

Add and configure provider accounts (Anthropic, OpenAI, Google, etc.)
Define logical models with provider targets and fallback chains
Control which workspaces have access to which models
Monitor usage across the platform

Dashboard Stats

The organization model dashboard displays key metrics:

Stat	Description
Available Models	Total number of logical models configured
Total Requests	Aggregate request count across all models
Total Tokens	Aggregate token consumption across all models
Provider Targets	Number of provider/model target configurations

Model Categories

MechaMental supports four categories of models, each serving a different purpose in your pipelines:

Chat models are conversational LLMs used in inference steps. They handle system prompts, user messages, tool use, and streaming responses.

Common capabilities: Streaming, Tools, Vision, JSON

Vector embedding models are used for source processing, memory indexing, and semantic search. They power the source_query, memory_query, and other retrieval step types.

Image generation and analysis models support visual content workflows. They are used when pipelines need to create, interpret, or transform images.

Speech-to-text and text-to-speech models support voice-based applications. They enable audio input and output in your pipelines.

Capabilities

Each model tracks its supported capabilities, displayed in the model catalog:

Capability	Description
Streaming	Supports streaming token-by-token responses via SSE
Tools	Supports function/tool calling for invoking external integrations
Vision	Supports image inputs alongside text
JSON	Supports structured JSON output mode

Capabilities determine which features are available when you select a model for an inference step. For example, only models with the Tools capability can be used in inference steps that have tools attached.
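
The capability rule above can be pictured as a simple subset check. The following is a minimal sketch, not MechaMental's implementation: the `Model` class, the `models_supporting` helper, and the sample catalog entries are hypothetical; only the capability names come from the table.

```python
from dataclasses import dataclass

# Hypothetical catalog entry; only the capability names come from the docs.
@dataclass(frozen=True)
class Model:
    name: str
    capabilities: frozenset

def models_supporting(models, required):
    """Return the models whose capabilities include every required one."""
    required = frozenset(required)
    return [m for m in models if required <= m.capabilities]

# Illustrative catalog (assumed, not real configuration).
catalog = [
    Model("fast-chat", frozenset({"Streaming", "Tools", "JSON"})),
    Model("vision-chat", frozenset({"Streaming", "Vision"})),
]

# An inference step with tools attached may only select Tools-capable models.
tool_capable = models_supporting(catalog, {"Tools"})
print([m.name for m in tool_capable])
```

A step that needs several features at once (for example Tools and Vision) would pass both names in `required`, and only models that have all of them would remain.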

Workspace Entitlements

Organizations grant model access to workspaces through entitlements. This controls:

Which models are available in a given workspace
Which teams can use expensive or high-capability models
Cost allocation across projects

When building a pipeline, the model dropdown in an inference step only shows models that have been entitled to your workspace.
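
As an illustration of how entitlements narrow the model list, the sketch below filters organization models down to those granted to a workspace. The workspace names, the `entitlements` mapping, and the `visible_models` function are all assumptions made for this example, not part of the product's API.

```python
# Hypothetical entitlement table: workspace name -> granted logical model names.
entitlements = {
    "support-bot": {"fast-chat"},
    "research": {"fast-chat", "high-quality-reasoning"},
}

# Logical models defined at the organization level (names from the docs' examples).
org_models = ["fast-chat", "high-quality-reasoning"]

def visible_models(workspace, org_models, entitlements):
    """Return the organization models entitled to the given workspace."""
    granted = entitlements.get(workspace, set())
    # Keep the organization's ordering so the list is stable.
    return [m for m in org_models if m in granted]

print(visible_models("support-bot", org_models, entitlements))
```

A workspace with no entitlements sees an empty list in this sketch, which matches the case where an administrator has not yet granted any model.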

Entitlement Management

If a model you need is not available in your workspace, contact your organization administrator to grant the entitlement. Model entitlements are managed from the organization settings.

Logical Models and Provider Targets

A logical model is a named abstraction (e.g., "fast-chat", "high-quality-reasoning") that maps to one or more provider targets — specific provider/model combinations.

Logical Model: "fast-chat"
├── Target 1: Anthropic / Claude Sonnet  (priority: 1)
├── Target 2: OpenAI / GPT-4o           (priority: 2)
└── Target 3: Google / Gemini Pro        (priority: 3)

Logical Model — the name you reference in pipeline inference steps
Provider Target — a specific provider + model (e.g., Anthropic / Claude Sonnet)
Each logical model can have multiple targets for redundancy and cost optimization
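
One way to picture the relationship between a logical model and its targets is as a small data structure. This is a sketch under assumptions: the class names, the field names, and the convention that a lower priority number is tried first are illustrative and not taken from MechaMental's configuration schema.

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderTarget:
    provider: str
    model: str
    priority: int  # assumed convention: lower number is tried first

@dataclass(frozen=True)
class LogicalModel:
    name: str
    targets: tuple[ProviderTarget, ...]

    def ordered_targets(self):
        """Return the targets sorted by priority."""
        return sorted(self.targets, key=lambda t: t.priority)

# The "fast-chat" example from the diagram, declared out of order on purpose.
fast_chat = LogicalModel(
    name="fast-chat",
    targets=(
        ProviderTarget("Google", "Gemini Pro", 3),
        ProviderTarget("Anthropic", "Claude Sonnet", 1),
        ProviderTarget("OpenAI", "GPT-4o", 2),
    ),
)

for t in fast_chat.ordered_targets():
    print(t.priority, t.provider, t.model)
```

Pipelines would reference only the name `fast-chat`; the targets behind it can change without touching the pipeline.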

Fallback Chains

When a logical model has multiple targets, they form a fallback chain. If the primary target fails or hits a rate limit, the request automatically falls back to the next target. This provides built-in resilience without any pipeline changes.
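
The fallback behavior can be sketched as a loop that tries each target in order and returns the first success. Everything here is hypothetical: `TargetError`, `call_with_fallback`, and the stub `fake_send` stand in for whatever the platform does internally, and the stub simply pretends the primary target is rate-limited.

```python
class TargetError(Exception):
    """Hypothetical error raised when a target fails or is rate-limited."""

def call_with_fallback(targets, send, prompt):
    """Try each target in order; return (target, response) for the first success."""
    errors = []
    for target in targets:
        try:
            return target, send(target, prompt)
        except TargetError as exc:
            # Record the failure and move on to the next target in the chain.
            errors.append((target, str(exc)))
    raise TargetError(f"all targets failed: {errors}")

# Stub transport for illustration: the primary target is rate-limited.
def fake_send(target, prompt):
    if target == "Anthropic / Claude Sonnet":
        raise TargetError("rate limited")
    return f"{target} answered: {prompt}"

chain = ["Anthropic / Claude Sonnet", "OpenAI / GPT-4o", "Google / Gemini Pro"]
used, reply = call_with_fallback(chain, fake_send, "hello")
print(used)
```

The caller never names a provider. It only sees which target ended up serving the request, which is why no pipeline change is needed when a target goes down.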

Cost Optimization

Order targets by cost or capability. For example, make a faster, cheaper model the primary target and fall back to a more capable (and more expensive) model only when the primary fails or hits a rate limit.

Router Policies

Router policies control how requests are distributed across targets:

Policy	Behavior
Priority (ordered fallback)	Try targets in order; fall back on failure
Lowest cost	Route to the cheapest available target
Lowest latency	Route to the fastest responding target
Round robin	Distribute requests evenly across targets
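
The four policies in the table can be compared side by side in a small selection function. This is only a sketch: the `Target` fields, the policy identifiers, the `request_index` parameter, and the cost and latency figures are invented for illustration, and the real router may also account for availability and health.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Target:
    name: str
    priority: int              # lower number is tried first (assumed)
    cost_per_1k_tokens: float  # made-up figure for illustration
    avg_latency_ms: float      # made-up figure for illustration

def pick_target(targets, policy, request_index=0):
    """Choose one target according to the named router policy."""
    if policy == "priority":
        return min(targets, key=lambda t: t.priority)
    if policy == "lowest_cost":
        return min(targets, key=lambda t: t.cost_per_1k_tokens)
    if policy == "lowest_latency":
        return min(targets, key=lambda t: t.avg_latency_ms)
    if policy == "round_robin":
        # Cycle through the targets as the request counter advances.
        return targets[request_index % len(targets)]
    raise ValueError(f"unknown policy: {policy}")

targets = [
    Target("Anthropic / Claude Sonnet", 1, 3.0, 900.0),
    Target("OpenAI / GPT-4o", 2, 2.5, 700.0),
    Target("Google / Gemini Pro", 3, 1.25, 1100.0),
]

for policy in ("priority", "lowest_cost", "lowest_latency"):
    print(policy, "->", pick_target(targets, policy).name)
```

Passing the request counter explicitly keeps the round-robin case deterministic in this sketch; a real router would track that state itself.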

Provider Accounts

Provider accounts store the credentials for each LLM provider. These are managed at the organization level and shared across workspaces via model entitlements.

Supported providers include Anthropic, OpenAI, Google, Azure OpenAI, and others. See the Model Configuration admin guide for provider setup details.
