Models

Organization-level model management with workspace entitlements, multi-provider routing, and capability tracking.

MechaMental abstracts away the complexity of managing multiple LLM providers behind a logical model layer. Models are managed at the organization level and granted to workspaces via entitlements.

Organization-Level Management

Models are configured and managed centrally at the organization level. This gives administrators a single place to:

  • Add and configure provider accounts (Anthropic, OpenAI, Google, etc.)
  • Define logical models with provider targets and fallback chains
  • Control which workspaces have access to which models
  • Monitor usage across the platform

Dashboard Stats

The organization model dashboard displays key metrics:

Stat               Description
Available Models   Total number of logical models configured
Total Requests     Aggregate request count across all models
Total Tokens       Aggregate token consumption across all models
Provider Targets   Number of provider/model target configurations
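
As a rough illustration of how these four stats relate to per-model data, the sketch below rolls up a list of usage records. The `ModelUsage` record, its field names, and the sample figures are all hypothetical and are not part of the MechaMental API.

```python
from dataclasses import dataclass


@dataclass
class ModelUsage:
    """Hypothetical per-model usage record; field names are illustrative."""
    name: str
    requests: int
    tokens: int
    target_count: int


def dashboard_stats(models: list[ModelUsage]) -> dict[str, int]:
    """Roll per-model figures up into the four dashboard stats."""
    return {
        "available_models": len(models),
        "total_requests": sum(m.requests for m in models),
        "total_tokens": sum(m.tokens for m in models),
        "provider_targets": sum(m.target_count for m in models),
    }


# Made-up sample figures, for illustration only.
usage = [
    ModelUsage("fast-chat", requests=1200, tokens=450_000, target_count=3),
    ModelUsage("high-quality-reasoning", requests=300, tokens=900_000, target_count=2),
]
stats = dashboard_stats(usage)
```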

Model Categories

MechaMental supports four categories of models, each serving a different purpose in your pipelines:

  • Chat models: conversational LLMs used in inference steps. Chat models handle system prompts, user messages, tool use, and streaming responses. Common capabilities: Streaming, Tools, Vision, JSON.
  • Embedding models: vector embedding models used for source processing, memory indexing, and semantic search. These power the source_query, memory_query, and other retrieval step types.
  • Image models: image generation and analysis models for visual content workflows. Used when pipelines need to create, interpret, or transform images.
  • Audio models: speech-to-text and text-to-speech models for voice-based applications. They enable audio input and output in your pipelines.

Capabilities

Each model tracks its supported capabilities, displayed in the model catalog:

Capability   Description
Streaming    Supports streaming token-by-token responses via SSE
Tools        Supports function/tool calling for invoking external integrations
Vision       Supports image inputs alongside text
JSON         Supports structured JSON output mode

Capabilities determine which features are available when you select a model for an inference step. For example, only models with the Tools capability can be used in inference steps that have tools attached.
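
The capability rule above amounts to a subset check: a model qualifies for a step only if it supports every capability the step requires. The following is a minimal sketch of that idea. The `Capability` enum, the `CatalogModel` class, and the "vision-chat" model name are assumptions made for illustration, not MechaMental identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class Capability(Enum):
    STREAMING = "Streaming"
    TOOLS = "Tools"
    VISION = "Vision"
    JSON = "JSON"


@dataclass(frozen=True)
class CatalogModel:
    """Hypothetical catalog entry: a model name plus its capability set."""
    name: str
    capabilities: frozenset[Capability]


def models_supporting(catalog: list[CatalogModel], required: set[Capability]) -> list[str]:
    """Return names of models whose capabilities cover everything required."""
    return [m.name for m in catalog if required <= m.capabilities]


catalog = [
    CatalogModel("fast-chat", frozenset({Capability.STREAMING, Capability.TOOLS, Capability.JSON})),
    CatalogModel("vision-chat", frozenset({Capability.STREAMING, Capability.VISION})),
]

# An inference step with tools attached needs the Tools capability.
usable_for_tools = models_supporting(catalog, {Capability.TOOLS})
```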

Workspace Entitlements

Organizations grant model access to workspaces through entitlements. This controls:

  • Which models are available in a given workspace
  • Which teams can use expensive or high-capability models
  • Cost allocation across projects

When building a pipeline, the model dropdown in an inference step only shows models that have been entitled to your workspace.
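
Conceptually, the dropdown is the organization's model list filtered by the workspace's entitlements. The sketch below shows that filter under stated assumptions: the `entitlements` mapping, the function name, and the workspace names are hypothetical, and the real platform may store entitlements differently.

```python
def selectable_models(
    org_models: list[str],
    entitlements: dict[str, set[str]],
    workspace: str,
) -> list[str]:
    """Return the organization's models that are entitled to a workspace.

    A workspace with no entitlements sees an empty list.
    """
    granted = entitlements.get(workspace, set())
    return [name for name in org_models if name in granted]


org_models = ["fast-chat", "high-quality-reasoning"]

# Hypothetical entitlement grants, keyed by workspace name.
entitlements = {
    "support-bot": {"fast-chat"},
    "research": {"fast-chat", "high-quality-reasoning"},
}

dropdown = selectable_models(org_models, entitlements, "support-bot")
```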

Entitlement Management

If a model you need is not available in your workspace, contact your organization administrator to grant the entitlement. Model entitlements are managed from the organization settings.

Logical Models and Provider Targets

A logical model is a named abstraction (e.g., "fast-chat", "high-quality-reasoning") that maps to one or more provider targets — specific provider/model combinations.

Logical Model: "fast-chat"
├── Target 1: Anthropic / Claude Sonnet  (priority: 1)
├── Target 2: OpenAI / GPT-4o           (priority: 2)
└── Target 3: Google / Gemini Pro        (priority: 3)

  • Logical Model: the name you reference in pipeline inference steps
  • Provider Target: a specific provider + model (e.g., Anthropic / Claude Sonnet)
  • Each logical model can have multiple targets for redundancy and cost optimization
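
One way to picture the relationship is as a small data structure: a logical model holds a list of provider targets, each with a priority. The sketch below mirrors the "fast-chat" example above. The class and field names are assumptions for illustration and do not reflect MechaMental's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProviderTarget:
    """A specific provider + model combination (hypothetical shape)."""
    provider: str
    model: str
    priority: int  # lower number = tried earlier


@dataclass
class LogicalModel:
    """The named abstraction that pipeline inference steps reference."""
    name: str
    targets: list[ProviderTarget] = field(default_factory=list)

    def ordered_targets(self) -> list[ProviderTarget]:
        """Return targets sorted by priority, primary target first."""
        return sorted(self.targets, key=lambda t: t.priority)


fast_chat = LogicalModel(
    name="fast-chat",
    targets=[
        ProviderTarget("Google", "Gemini Pro", priority=3),
        ProviderTarget("Anthropic", "Claude Sonnet", priority=1),
        ProviderTarget("OpenAI", "GPT-4o", priority=2),
    ],
)

primary = fast_chat.ordered_targets()[0]
```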

Fallback Chains

When a logical model has multiple targets, they form a fallback chain. If the primary target fails or hits a rate limit, the request automatically falls back to the next target. This provides built-in resilience without any pipeline changes.
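
The fallback behavior can be sketched as a loop that tries each target in order and moves on when one is unavailable. This is an illustration of the general pattern, not the platform's implementation: `TargetUnavailable`, `call_with_fallback`, and `fake_send` are hypothetical names, and the simulated rate limit stands in for a real provider error.

```python
from typing import Callable


class TargetUnavailable(Exception):
    """Hypothetical error for a target that failed or hit a rate limit."""


def call_with_fallback(targets: list[str], send: Callable[[str], str]) -> str:
    """Try each target in order; return the first successful response."""
    last_error = None
    for target in targets:
        try:
            return send(target)
        except TargetUnavailable as exc:
            last_error = exc  # remember the failure and try the next target
    raise RuntimeError("all targets in the fallback chain failed") from last_error


def fake_send(target: str) -> str:
    """Stand-in for a provider call; simulates a rate-limited primary target."""
    if target == "Anthropic / Claude Sonnet":
        raise TargetUnavailable("rate limited")
    return f"response from {target}"


chain = ["Anthropic / Claude Sonnet", "OpenAI / GPT-4o", "Google / Gemini Pro"]
result = call_with_fallback(chain, fake_send)
```

Because the loop lives in the routing layer, the pipeline that references "fast-chat" does not change when a target fails.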

Cost Optimization

Order targets by cost or capability. Use a faster, cheaper model as the primary target and fall back to a more capable (and more expensive) model only when needed.

Router Policies

Router policies control how requests are distributed across targets:

Policy                        Behavior
Priority (ordered fallback)   Try targets in order; fall back on failure
Lowest cost                   Route to the cheapest available target
Lowest latency                Route to the fastest responding target
Round robin                   Distribute requests evenly across targets
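
To make the differences between the policies concrete, the sketch below picks a first-choice target under each one. The policy strings, the `Target` fields, and the cost and latency figures are assumptions for illustration. How MechaMental measures cost and latency, and how it handles failures after the first choice, is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical target record with made-up cost and latency figures."""
    name: str
    priority: int
    cost_per_1k_tokens: float
    avg_latency_ms: float


def pick_target(policy: str, targets: list[Target], request_index: int = 0) -> Target:
    """Pick the first-choice target for a request under the given policy."""
    if not targets:
        raise ValueError("no targets available")
    if policy == "priority":
        return min(targets, key=lambda t: t.priority)
    if policy == "lowest_cost":
        return min(targets, key=lambda t: t.cost_per_1k_tokens)
    if policy == "lowest_latency":
        return min(targets, key=lambda t: t.avg_latency_ms)
    if policy == "round_robin":
        # Cycle through targets using a per-request counter.
        return targets[request_index % len(targets)]
    raise ValueError(f"unknown policy: {policy}")


targets = [
    Target("target-a", priority=1, cost_per_1k_tokens=3.0, avg_latency_ms=800.0),
    Target("target-b", priority=2, cost_per_1k_tokens=5.0, avg_latency_ms=400.0),
    Target("target-c", priority=3, cost_per_1k_tokens=1.0, avg_latency_ms=600.0),
]

choices = {p: pick_target(p, targets).name for p in ("priority", "lowest_cost", "lowest_latency")}
rotation = [pick_target("round_robin", targets, i).name for i in range(4)]
```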

Provider Accounts

Provider accounts store the credentials for each LLM provider. These are managed at the organization level and shared across workspaces via model entitlements.

Supported providers include Anthropic, OpenAI, Google, Azure OpenAI, and others. See the Model Configuration admin guide for provider setup details.
