Pipelines are DAG-like processing chains that define exactly how your AI systems execute. Compose stages and steps into deterministic, version-controlled pipelines that handle everything from inference to tool calls, memory operations, and data transformations.
Pipelines follow a clear hierarchy that gives you granular control over execution while keeping things organized at every level.
The top-level container. An app groups related augmentations, endpoints, and configurations into one deployable unit.
A single pipeline definition within an app. Each augmentation is a self-contained DAG of stages that processes a request from start to finish.
A grouping of steps within an augmentation. Stages can run sequentially or in parallel, and support iteration over collections with for_each.
The atomic unit of work. Each step performs a single operation: run inference, call a tool, query memory, transform data, or create an artifact.
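The four levels above can be pictured as nested containers. The following is a minimal sketch only: the class names, field names, and sample values are hypothetical and chosen for illustration, not taken from the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    """Atomic unit of work (hypothetical model, not the real schema)."""
    name: str
    kind: str  # e.g. "inference", "tool", "memory", "transform", "artifact"


@dataclass
class Stage:
    """A grouping of steps inside an augmentation."""
    name: str
    steps: list[Step] = field(default_factory=list)
    parallel: bool = False          # assumed flag: may run alongside other stages
    for_each: Optional[str] = None  # assumed: name of the collection to iterate over


@dataclass
class Augmentation:
    """A single pipeline definition: a self-contained set of stages."""
    name: str
    stages: list[Stage] = field(default_factory=list)


@dataclass
class App:
    """Top-level container grouping related augmentations."""
    name: str
    augmentations: list[Augmentation] = field(default_factory=list)


def count_steps(app: App) -> int:
    """Walk the hierarchy and count every step in the app."""
    return sum(len(stage.steps) for aug in app.augmentations for stage in aug.stages)


# Illustrative app with one augmentation, two stages, and one step per stage.
app = App(
    name="support-bot",
    augmentations=[
        Augmentation(
            name="answer-question",
            stages=[
                Stage("retrieve", steps=[Step("search-docs", "retrieval")]),
                Stage("respond", steps=[Step("draft-answer", "inference")]),
            ],
        )
    ],
)
```

Modelling each level as its own type keeps the containment explicit: a step never exists outside a stage, and a stage never exists outside an augmentation.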
Each step in a pipeline performs a specific type of operation. Combine them to build rich, multi-step AI systems.
Run LLM inference against any configured model. Supports system prompts, structured output schemas, temperature controls, and streaming.
Invoke registered tools including webhooks, MCP servers, custom code runners, and external API integrations.
Read from and write to the knowledge layer. Inject conversation history, thread context, or persistent memory into the pipeline.
Query document sources using semantic search and RAG. Retrieve relevant chunks from indexed knowledge bases to ground inference.
Reshape, filter, and transform data between steps. Map outputs from one step into the expected inputs of another.
Generate and store output artifacts such as files, structured data, or reports that persist beyond the pipeline execution.
Branch pipeline logic based on runtime conditions. Evaluate step outputs, model responses, or input parameters to determine which stages execute next.
Stream real-time server-sent events from pipeline steps to clients. Push intermediate results, progress updates, and partial responses as they are generated.
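To make two of the step types above concrete, here is a small sketch of a transform step feeding a condition step. Everything in it is an assumption for illustration: the handler names, the context keys (`chunks`, `prompt_context`, `next_stage`), and the stage names are hypothetical and do not reflect the platform's real interfaces.

```python
from typing import Any, Callable

Context = dict[str, Any]


def transform_step(ctx: Context) -> Context:
    """Reshape retrieval output into the single string the next step expects."""
    joined = "\n".join(chunk["text"] for chunk in ctx["chunks"])
    return {**ctx, "prompt_context": joined}


def condition_step(ctx: Context) -> Context:
    """Pick the next stage from a runtime condition on earlier output."""
    next_stage = "answer" if ctx["chunks"] else "fallback"
    return {**ctx, "next_stage": next_stage}


# Hypothetical registry mapping a step kind to the function that runs it.
HANDLERS: dict[str, Callable[[Context], Context]] = {
    "transform": transform_step,
    "condition": condition_step,
}


def run_steps(kinds: list[str], ctx: Context) -> Context:
    """Run the named step kinds in order, threading the context through."""
    for kind in kinds:
        ctx = HANDLERS[kind](ctx)
    return ctx
```

Each handler returns a new context rather than mutating its input, which keeps the data flow between steps easy to inspect.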
Pipelines support flexible execution patterns to handle both simple linear flows and complex parallel workloads.
Stages execute one after another in a defined order. Each stage waits for the previous one to complete before starting. Ideal for linear workflows where each step depends on prior results.
Multiple stages execute concurrently when they have no dependencies on each other. Running independent operations at the same time, such as querying multiple sources, reduces end-to-end pipeline latency.
Stages can iterate over a collection, executing their steps once per item. Useful for processing batches of documents, running inference on lists of inputs, or fan-out patterns.
Define explicit dependencies between stages to build complex DAGs. The execution engine resolves the graph and runs stages as soon as their dependencies are satisfied.
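Dependency resolution of this kind can be sketched with a topological sort. The example below uses Python's standard `graphlib` module (Python 3.9+) to group stages into batches whose dependencies are already satisfied. The stage names and the `execution_batches` helper are hypothetical, and this is an illustration of the general technique rather than the platform's execution engine.

```python
from graphlib import TopologicalSorter


def execution_batches(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into batches; stages in one batch could run in parallel.

    `depends_on` maps each stage name to the set of stages it waits for.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    batches: list[list[str]] = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies completed.
        ready = sorted(sorter.get_ready())  # sorted for a stable, repeatable plan
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical pipeline: two independent lookups feed one generation stage.
stages = {
    "load_input": set(),
    "search_docs": {"load_input"},
    "read_memory": {"load_input"},
    "generate": {"search_docs", "read_memory"},
}

plan = execution_batches(stages)
# plan == [["load_input"], ["read_memory", "search_docs"], ["generate"]]
```

Sorting each batch means the same graph always yields the same plan, and a cyclic graph fails at `prepare()` before anything runs.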
Built for production from day one. Every pipeline comes with enterprise-grade reliability, observability, and version control out of the box.
Given the same inputs and configuration, a pipeline produces the same execution plan every time. No hidden state, no surprises. The execution engine guarantees that stage ordering, step sequencing, and data flow are fully reproducible.
Every change to a pipeline — whether it's a new step, a modified prompt, or a configuration tweak — is tracked as a commit. You can browse the full history, compare any two versions, and revert to a previous state instantly.
Changes are diffed at the field level with word-by-word inline comparisons. See exactly which words in a system prompt changed, which parameters were modified, and which steps were added or removed — not just that "the file changed".
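A field-level diff with word-by-word comparison can be approximated with Python's standard `difflib`. This is a sketch of the idea only: the `-word` / `+word` output format, the tuple shapes, and the sample configuration fields are assumptions, not the product's diff format.

```python
import difflib

_MISSING = object()  # sentinel: distinguishes "field absent" from a None value


def word_diff(old: str, new: str) -> list[str]:
    """Return words with '-' marking removals and '+' marking additions."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out: list[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            out.extend(f"-{w}" for w in old_words[i1:i2])
        if tag in ("insert", "replace"):
            out.extend(f"+{w}" for w in new_words[j1:j2])
    return out


def field_diff(old: dict, new: dict) -> dict:
    """Compare two flat configs field by field; unchanged fields are omitted."""
    changes: dict = {}
    for key in sorted(old.keys() | new.keys()):
        before = old.get(key, _MISSING)
        after = new.get(key, _MISSING)
        if before == after:
            continue
        if before is _MISSING:
            changes[key] = ("added", after)
        elif after is _MISSING:
            changes[key] = ("removed", before)
        elif isinstance(before, str) and isinstance(after, str):
            changes[key] = ("words", word_diff(before, after))
        else:
            changes[key] = ("changed", before, after)
    return changes


# Hypothetical before/after versions of one inference step's configuration.
v1 = {"system_prompt": "You are a helpful assistant", "temperature": 0.2}
v2 = {"system_prompt": "You are a concise assistant", "temperature": 0.7, "stream": True}
diff = field_diff(v1, v2)
```

Here `diff["system_prompt"]` reports that only "helpful" became "concise", while `temperature` and `stream` appear as a changed value and an added field.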
Configure retry policies per step with exponential backoff, max attempts, and timeout thresholds. Circuit breakers automatically halt execution when downstream services are unhealthy, preventing cascading failures across your pipeline.
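The retry and circuit-breaker behavior described above can be sketched as follows. All names and defaults here (`run_with_retry`, `CircuitBreaker`, `base_delay`, the failure threshold) are hypothetical. The sketch omits timeout thresholds and any half-open or reset logic, so it shows the general pattern rather than the platform's implementation.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the step is not attempted."""


class CircuitBreaker:
    """Opens after a number of consecutive failures (no reset timer here)."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.failure_threshold

    def record_success(self) -> None:
        self.consecutive_failures = 0

    def record_failure(self) -> None:
        self.consecutive_failures += 1


def run_with_retry(
    step: Callable[[], T],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    breaker: Optional[CircuitBreaker] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `step`, retrying with delays of base_delay * 2**(attempt - 1)."""
    for attempt in range(1, max_attempts + 1):
        if breaker is not None and breaker.is_open:
            raise CircuitOpenError("downstream marked unhealthy; step not attempted")
        try:
            result = step()
        except Exception:
            if breaker is not None:
                breaker.record_failure()
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
        else:
            if breaker is not None:
                breaker.record_success()
            return result
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets tests record the delays without actually waiting. Sharing one `CircuitBreaker` across several steps lets them all stop calling a dependency that keeps failing.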
Inject contextual memory at multiple levels: augmentation-scoped memory shared across all steps, step-level memory for isolated context, thread memory for conversational continuity, and system memory for global platform context.
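One way to picture the four memory levels is as layered lookups. The sketch below assumes that narrower scopes override broader ones (step, then thread, then augmentation, then system). That precedence is an assumption made for illustration and is not stated by the platform, and the function and key names are hypothetical.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_memory(
    *,
    system: Mapping[str, Any],
    augmentation: Mapping[str, Any],
    thread: Mapping[str, Any],
    step: Mapping[str, Any],
) -> ChainMap:
    """Layer the scopes so lookups check the narrowest scope first.

    ASSUMPTION: step > thread > augmentation > system precedence.
    """
    return ChainMap(dict(step), dict(thread), dict(augmentation), dict(system))


# Illustrative values only.
context = resolve_memory(
    system={"locale": "en", "tone": "neutral"},
    augmentation={"tone": "friendly"},
    thread={"user_name": "Ada"},
    step={"tone": "terse"},
)
# context["tone"] is "terse"; context["locale"] falls through to "en".
```

`ChainMap` avoids copying values between layers, so each scope stays separately inspectable while reads see a single merged view.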
Endpoints are the bridge between the outside world and your pipelines. They map external triggers to augmentations, giving you full control over how pipelines are invoked.
Traditional request-response endpoints that expose your pipelines as APIs. Trigger augmentations via HTTP calls with structured input payloads and receive the pipeline output as a response.
Conversational endpoints that maintain thread context across interactions. Designed for chat-like interfaces where the pipeline needs access to conversation history, user memory, and multi-turn context.
Both endpoint types support scheduling via CRON expressions and event-driven invocation. Endpoints are versioned alongside the rest of your app, so every change to routing configuration is tracked in the same commit history as your pipelines.
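The mapping from triggers to augmentations can be sketched as a small routing table. The field names, the `"api"` and `"chat"` labels, the paths, and the cron expression below are all hypothetical placeholders, not the platform's configuration format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical endpoint record: a trigger mapped to one augmentation."""
    path: str
    kind: str                   # "api" (request-response) or "chat" (threaded)
    augmentation: str           # name of the augmentation to invoke
    cron: Optional[str] = None  # optional schedule as a cron expression


ENDPOINTS = [
    Endpoint("/summarize", "api", "summarize-document"),
    Endpoint("/support", "chat", "answer-question"),
    Endpoint("/digest", "api", "daily-digest", cron="0 7 * * *"),  # 07:00 daily
]


def resolve(path: str, endpoints: list[Endpoint] = ENDPOINTS) -> Endpoint:
    """Find the endpoint registered for an incoming request path."""
    for endpoint in endpoints:
        if endpoint.path == path:
            return endpoint
    raise KeyError(f"no endpoint registered for {path!r}")


def scheduled(endpoints: list[Endpoint] = ENDPOINTS) -> list[str]:
    """List the augmentations that also run on a schedule."""
    return [e.augmentation for e in endpoints if e.cron is not None]
```

Keeping routing in plain data like this is what allows it to be versioned and diffed in the same history as the pipelines it points to.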