Tracing
OpenTelemetry-style distributed tracing with 18 span kinds for complete visibility into agent execution, LLM calls, tool invocations, media generation, and more.
PromptRails provides comprehensive distributed tracing that captures every step of an agent execution. Inspired by OpenTelemetry, the tracing system records spans for LLM calls, tool invocations, guardrail evaluations, data source queries, memory operations, and more.
How Tracing Works
Every agent execution generates a trace -- a tree of spans that represent the individual operations performed during execution. Each span captures:
- What happened (name, kind)
- How long it took (duration, start/end timestamps)
- What went in and came out (input/output)
- Whether it succeeded (status, error details)
- How much it cost (token usage, cost)
- Context links (agent, prompt, model, execution, session)
Spans form a parent-child hierarchy that reflects the execution flow:
agent (root span)
├── guardrail (input scan)
├── prompt (template rendering)
├── llm (model call)
│   └── tool (tool call by LLM)
│       └── mcp_call (MCP server invocation)
├── guardrail (output scan)
└── memory (memory update)
Span Kinds
PromptRails defines 18 span kinds:
| Kind | Identifier | Description |
|---|---|---|
| Agent | agent | Top-level agent execution span |
| LLM | llm | LLM model call (prompt + completion) |
| Tool | tool | Tool invocation within an agent |
| Data Source | datasource | Database or file query |
| Prompt | prompt | Prompt template rendering |
| Guardrail | guardrail | Input or output guardrail scan |
| Chain | chain | Chain-type agent orchestration |
| Workflow | workflow | Workflow step execution |
| Agent Step | agent_step | Individual step in a multi-agent execution |
| MCP Call | mcp_call | Remote MCP server tool call |
| Preprocessing | preprocessing | Input preprocessing before LLM |
| Postprocessing | postprocessing | Output postprocessing after LLM |
| Memory | memory | Memory retrieval or storage |
| Embedding | embedding | Vector embedding generation |
| Speech | speech | Text-to-speech or speech-to-text operation |
| Image | image | Image generation or editing |
| Video | video | Video generation |
| Storage | storage | Asset storage upload/download |
Span Hierarchy
Spans are organized in a tree structure using trace IDs, span IDs, and parent span IDs:
- trace_id -- Groups all spans belonging to the same execution
- span_id -- Uniquely identifies a single span
- parent_span_id -- Links a span to its parent (empty for root spans)
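These three IDs are enough to rebuild the tree from a flat list of spans. A minimal sketch in Python, assuming spans arrive as plain dicts keyed by the field names above (the sample records are illustrative, not real API output):

```python
# Sketch: rebuilding the span tree from a flat list of span records.
# Record shape (span_id / parent_span_id keys) follows the fields described
# above; the sample spans are illustrative.

def build_tree(spans):
    """Attach each span to its parent and return the root spans."""
    children = {}
    for span in spans:
        children.setdefault(span["parent_span_id"], []).append(span)
    for span in spans:
        span["children"] = children.get(span["span_id"], [])
    return children.get("", [])  # root spans have an empty parent_span_id

spans = [
    {"span_id": "a1", "parent_span_id": "", "name": "agent"},
    {"span_id": "b2", "parent_span_id": "a1", "name": "llm"},
    {"span_id": "c3", "parent_span_id": "b2", "name": "tool"},
]
roots = build_tree(spans)
```

In practice you would feed this the span records returned by the traces API for a single trace_id.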
Example Trace for a Simple Agent
[agent] Customer Support Bot (15ms total)
├── [guardrail] prompt_injection input scan (2ms)
├── [prompt] Render main prompt (1ms)
├── [llm] gpt-4o call (10ms, 450 prompt + 120 completion tokens, $0.003)
└── [guardrail] pii output scan (2ms)
Example Trace for a Chain Agent
[chain] Data Analysis Pipeline (45ms total)
├── [agent_step] Step 1: Data Extraction
│   ├── [datasource] Query analytics DB (8ms)
│   ├── [prompt] Render extraction prompt (1ms)
│   └── [llm] gpt-4o call (15ms)
└── [agent_step] Step 2: Analysis
    ├── [memory] Retrieve analysis templates (2ms)
    ├── [prompt] Render analysis prompt (1ms)
    └── [llm] claude-3.5-sonnet call (18ms)
Span Status and Levels
Status
| Status | Description |
|---|---|
| ok | The span completed successfully |
| error | The span encountered an error |
Level
| Level | Description |
|---|---|
| debug | Detailed diagnostic information |
| default | Standard operational information |
| warning | Something unexpected but non-fatal |
| error | An error occurred |
Span Attributes
Each span carries an attributes JSON object with kind-specific metadata:
LLM Span Attributes
{
  "model": "gpt-4o",
  "provider": "openai",
  "temperature": 0.7,
  "max_tokens": 1024,
  "prompt_tokens": 450,
  "completion_tokens": 120,
  "total_tokens": 570,
  "cost": 0.003
}
Guardrail Span Attributes
{
  "scanner_type": "prompt_injection",
  "direction": "input",
  "action": "block",
  "triggered": false
}
Tool Span Attributes
{
  "tool_name": "weather_api",
  "tool_type": "api",
  "parameters": { "location": "New York" }
}
Media Span Attributes (Speech / Image / Video)
{
  "provider": "fal",
  "model": "fal-ai/flux/schnell",
  "media_type": "image_gen",
  "prompt": "A futuristic city skyline at sunset",
  "asset_url": "https://storage.example.com/assets/image.png",
  "content_type": "image/png"
}
Filtering Traces
List and filter traces with multiple criteria:
Python SDK
traces = client.traces.list(
    agent_id="your-agent-id",     # Filter by agent
    execution_id="execution-id",  # Filter by execution
    kind="llm",                   # Filter by span kind
    status="ok",                  # Filter by status
    model_name="gpt-4o",          # Filter by model
    session_id="session-id",      # Filter by session
    user_id="user-id",            # Filter by user
    page=1,
    limit=50
)
JavaScript SDK
const traces = await client.traces.list({
  agentId: 'your-agent-id',
  kind: 'llm',
  status: 'ok',
  page: 1,
  limit: 50,
})
Cost and Token Tracking Per Span
Every LLM span includes precise cost and token tracking:
| Field | Description |
|---|---|
| prompt_tokens | Number of input tokens sent to the model |
| completion_tokens | Number of output tokens generated |
| total_tokens | Sum of prompt and completion tokens |
| cost | Cost in USD for this specific span |
| model_name | The model used for this call |
This enables granular cost attribution -- you can see exactly which LLM call in a multi-step execution was the most expensive.
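As a sketch of that attribution, assuming span records are plain dicts carrying the kind, cost, and token fields above (the sample data is illustrative):

```python
# Sketch: granular cost attribution over the spans of one execution.
# Span records are treated as plain dicts with the cost/token fields
# described above; the numbers are illustrative.

spans = [
    {"name": "extraction llm call", "kind": "llm", "cost": 0.003, "total_tokens": 570},
    {"name": "analysis llm call", "kind": "llm", "cost": 0.009, "total_tokens": 1420},
    {"name": "weather_api", "kind": "tool", "cost": 0.0, "total_tokens": 0},
]

# Keep only LLM spans, then find the single most expensive call.
llm_spans = [s for s in spans if s["kind"] == "llm"]
most_expensive = max(llm_spans, key=lambda s: s["cost"])
total_cost = sum(s["cost"] for s in llm_spans)
total_tokens = sum(s["total_tokens"] for s in llm_spans)
```

The same fold works over spans fetched with kind="llm" from the filtering API shown above.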
Error Information
When a span has an error status, additional fields provide diagnostic information:
| Field | Description |
|---|---|
| error_message | Human-readable error description |
| error_type | Error classification (e.g., rate_limit, timeout, validation) |
| error_stack | Stack trace for debugging |
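A minimal sketch of surfacing those diagnostics, assuming span records are plain dicts with the field names in the table (the sample records are illustrative):

```python
# Sketch: one human-readable diagnostic line per errored span.
# Field names (status, error_type, error_message) follow the tables above;
# the sample records are illustrative.

def summarize_errors(spans):
    """Return one diagnostic line per span whose status is error."""
    return [
        f"{s['name']}: [{s['error_type']}] {s['error_message']}"
        for s in spans
        if s["status"] == "error"
    ]

spans = [
    {"name": "gpt-4o call", "status": "error",
     "error_type": "rate_limit",
     "error_message": "Rate limit exceeded for openai"},
    {"name": "pii output scan", "status": "ok"},
]
report = summarize_errors(spans)
```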
Trace Fields
| Field | Type | Description |
|---|---|---|
| id | KSUID | Unique span record ID |
| trace_id | string | Trace group identifier |
| span_id | string | Unique span identifier |
| parent_span_id | string | Parent span (empty for root) |
| name | string | Span name |
| kind | string | Span kind (one of 18 types) |
| status | string | ok or error |
| level | string | debug, default, warning, error |
| input | JSON | Span input data |
| output | JSON | Span output data |
| attributes | JSON | Kind-specific metadata |
| tags | JSON | Custom tags |
| token_usage | JSON | Token consumption |
| cost | float | Cost in USD |
| duration_ms | integer | Duration in milliseconds |
| model_name | string | LLM model name |
| agent_id | KSUID | Associated agent |
| prompt_id | KSUID | Associated prompt |
| execution_id | KSUID | Associated execution |
| session_id | string | Associated chat session |
| started_at | timestamp | Span start time |
| ended_at | timestamp | Span end time |
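Note that duration_ms is consistent with the two timestamps: it is the elapsed time between started_at and ended_at. A quick sketch, assuming the timestamps are ISO 8601 strings (the record below is illustrative):

```python
# Sketch: deriving duration_ms from started_at / ended_at.
# ISO 8601 timestamp strings are an assumption; the record is illustrative.

from datetime import datetime

span = {
    "started_at": "2024-05-01T12:00:00.000000",
    "ended_at": "2024-05-01T12:00:00.015000",
}
delta = (datetime.fromisoformat(span["ended_at"])
         - datetime.fromisoformat(span["started_at"]))
duration_ms = round(delta.total_seconds() * 1000)  # 15, matching the 15ms example trace
```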
Related Topics
- Executions -- Execution lifecycle
- Media Generation -- Speech, image, and video generation
- Cost Tracking -- Aggregated cost analysis
- Scoring and Evaluation -- Scoring individual spans