# Tracing

> OpenTelemetry-style distributed tracing with 18 span kinds for complete visibility into agent execution, LLM calls, tool invocations, media generation, and more.

PromptRails provides comprehensive distributed tracing that captures every step of an agent execution. Inspired by OpenTelemetry, the tracing system records spans for LLM calls, tool invocations, guardrail evaluations, data source queries, memory operations, and more.

## How Tracing Works

Every agent execution generates a **trace** -- a tree of **spans** that represent the individual operations performed during execution. Each span captures:

- What happened (name, kind)
- How long it took (duration, start/end timestamps)
- What went in and came out (input/output)
- Whether it succeeded (status, error details)
- How much it cost (token usage, cost)
- Context links (agent, prompt, model, execution, session)
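
Putting these together, a single span record might look like the following (illustrative values; field names match the Trace Fields table later on this page):

```json
{
  "trace_id": "trace-abc123",
  "span_id": "span-def456",
  "parent_span_id": "",
  "name": "gpt-4o call",
  "kind": "llm",
  "status": "ok",
  "level": "default",
  "duration_ms": 10,
  "cost": 0.003,
  "token_usage": { "prompt_tokens": 450, "completion_tokens": 120, "total_tokens": 570 }
}
```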

Spans form a parent-child hierarchy that reflects the execution flow:

```
agent (root span)
  ├── guardrail (input scan)
  ├── prompt (template rendering)
  ├── llm (model call)
  │   └── tool (tool call by LLM)
  │       └── mcp_call (MCP server invocation)
  ├── guardrail (output scan)
  └── memory (memory update)
```

## Span Kinds

PromptRails defines 18 span kinds:

| Kind               | Identifier       | Description                                |
| ------------------ | ---------------- | ------------------------------------------ |
| **Agent**          | `agent`          | Top-level agent execution span             |
| **LLM**            | `llm`            | LLM model call (prompt + completion)       |
| **Tool**           | `tool`           | Tool invocation within an agent            |
| **Data Source**    | `datasource`     | Database or file query                     |
| **Prompt**         | `prompt`         | Prompt template rendering                  |
| **Guardrail**      | `guardrail`      | Input or output guardrail scan             |
| **Chain**          | `chain`          | Chain-type agent orchestration             |
| **Workflow**       | `workflow`       | Workflow step execution                    |
| **Agent Step**     | `agent_step`     | Individual step in a multi-agent execution |
| **MCP Call**       | `mcp_call`       | Remote MCP server tool call                |
| **Preprocessing**  | `preprocessing`  | Input preprocessing before LLM             |
| **Postprocessing** | `postprocessing` | Output postprocessing after LLM            |
| **Memory**         | `memory`         | Memory retrieval or storage                |
| **Embedding**      | `embedding`      | Vector embedding generation                |
| **Speech**         | `speech`         | Text-to-speech or speech-to-text operation |
| **Image**          | `image`          | Image generation or editing                |
| **Video**          | `video`          | Video generation                           |
| **Storage**        | `storage`        | Asset storage upload/download              |

## Span Hierarchy

Spans are organized in a tree structure using trace IDs, span IDs, and parent span IDs:

- **trace_id** -- Groups all spans belonging to the same execution
- **span_id** -- Uniquely identifies a single span
- **parent_span_id** -- Links a span to its parent (empty for root spans)
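
These three fields are enough to reconstruct the tree from a flat list of spans. A minimal sketch in Python (the dict-based span shape is illustrative; the SDK may return richer objects):

```python
from collections import defaultdict

def build_span_tree(spans):
    """Group a flat list of span dicts into roots plus a child index."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent = span.get("parent_span_id") or ""
        if parent:
            children[parent].append(span)
        else:
            roots.append(span)  # an empty parent_span_id marks a root span
    return roots, children

def print_tree(span, children, depth=0):
    """Indent each span under its parent, mirroring the trace diagrams above."""
    print("  " * depth + f"[{span['kind']}] {span['name']}")
    for child in children[span["span_id"]]:
        print_tree(child, children, depth + 1)

# Illustrative spans: an agent span with an LLM child and a tool grandchild.
spans = [
    {"span_id": "a", "parent_span_id": "", "kind": "agent", "name": "Support Bot"},
    {"span_id": "b", "parent_span_id": "a", "kind": "llm", "name": "gpt-4o call"},
    {"span_id": "c", "parent_span_id": "b", "kind": "tool", "name": "weather_api"},
]
roots, children = build_span_tree(spans)
for root in roots:
    print_tree(root, children)
```
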

### Example Trace for a Simple Agent

```
[agent] Customer Support Bot (15ms total)
  [guardrail] prompt_injection input scan (2ms)
  [prompt] Render main prompt (1ms)
  [llm] gpt-4o call (10ms, 450 prompt + 120 completion tokens, $0.003)
  [guardrail] pii output scan (2ms)
```

### Example Trace for a Chain Agent

```
[chain] Data Analysis Pipeline (45ms total)
  [agent_step] Step 1: Data Extraction
    [datasource] Query analytics DB (8ms)
    [prompt] Render extraction prompt (1ms)
    [llm] gpt-4o call (15ms)
  [agent_step] Step 2: Analysis
    [memory] Retrieve analysis templates (2ms)
    [prompt] Render analysis prompt (1ms)
    [llm] claude-3.5-sonnet call (18ms)
```

## Span Status and Levels

### Status

| Status  | Description                     |
| ------- | ------------------------------- |
| `ok`    | The span completed successfully |
| `error` | The span encountered an error   |

### Level

| Level     | Description                        |
| --------- | ---------------------------------- |
| `debug`   | Detailed diagnostic information    |
| `default` | Standard operational information   |
| `warning` | Something unexpected but non-fatal |
| `error`   | An error occurred                  |

## Span Attributes

Each span carries an `attributes` JSON object with kind-specific metadata:

### LLM Span Attributes

```json
{
  "model": "gpt-4o",
  "provider": "openai",
  "temperature": 0.7,
  "max_tokens": 1024,
  "prompt_tokens": 450,
  "completion_tokens": 120,
  "total_tokens": 570,
  "cost": 0.003
}
```

### Guardrail Span Attributes

```json
{
  "scanner_type": "prompt_injection",
  "direction": "input",
  "action": "block",
  "triggered": false
}
```

### Tool Span Attributes

```json
{
  "tool_name": "weather_api",
  "tool_type": "api",
  "parameters": { "location": "New York" }
}
```

### Media Span Attributes (Speech / Image / Video)

```json
{
  "provider": "fal",
  "model": "fal-ai/flux/schnell",
  "media_type": "image_gen",
  "prompt": "A futuristic city skyline at sunset",
  "asset_url": "https://storage.example.com/assets/image.png",
  "content_type": "image/png"
}
```

## Filtering Traces

List and filter traces with multiple criteria:

**Python SDK**

```python
traces = client.traces.list(
    agent_id="your-agent-id",      # Filter by agent
    execution_id="execution-id",   # Filter by execution
    kind="llm",                    # Filter by span kind
    status="ok",                   # Filter by status
    model_name="gpt-4o",           # Filter by model
    session_id="session-id",       # Filter by session
    user_id="user-id",             # Filter by user
    page=1,
    limit=50
)
```

**JavaScript SDK**

```typescript
const traces = await client.traces.list({
  agentId: 'your-agent-id',
  kind: 'llm',
  status: 'ok',
  page: 1,
  limit: 50,
})
```
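
Paging through a large result set can be wrapped in a small helper. A sketch for the Python SDK, assuming `client.traces.list` returns either a plain list or a page object exposing an `items` attribute (that attribute name is an assumption):

```python
def iter_spans(client, limit=50, **filters):
    """Yield spans across all pages of a filtered trace listing."""
    page = 1
    while True:
        batch = client.traces.list(page=page, limit=limit, **filters)
        items = getattr(batch, "items", batch)  # assumption: page object or plain list
        if not items:
            return
        yield from items
        if len(items) < limit:
            return  # a short page signals the last page
        page += 1

# Illustrative stand-in for the real client, so the helper can be exercised locally.
class FakeTraces:
    def list(self, page, limit, **filters):
        data = list(range(120))  # 120 fake spans
        return data[(page - 1) * limit : page * limit]

class FakeClient:
    traces = FakeTraces()

print(len(list(iter_spans(FakeClient(), kind="llm"))))  # → 120
```
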

## Cost and Token Tracking Per Span

Every LLM span includes precise cost and token tracking:

| Field               | Description                              |
| ------------------- | ---------------------------------------- |
| `prompt_tokens`     | Number of input tokens sent to the model |
| `completion_tokens` | Number of output tokens generated        |
| `total_tokens`      | Sum of prompt and completion tokens      |
| `cost`              | Cost in USD for this specific span       |
| `model_name`        | The model used for this call             |

This enables granular cost attribution -- you can see exactly which LLM call in a multi-step execution was the most expensive.
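
For example, given the per-span `kind` and `cost` fields, finding the most expensive LLM call in an execution takes a few lines (the span dicts below are illustrative):

```python
# Illustrative spans from one multi-step execution.
spans = [
    {"name": "gpt-4o extraction", "kind": "llm", "cost": 0.003},
    {"name": "claude-3.5-sonnet analysis", "kind": "llm", "cost": 0.011},
    {"name": "weather_api", "kind": "tool", "cost": 0.0},
]

# Restrict to LLM spans, then pick the costliest.
llm_spans = [s for s in spans if s["kind"] == "llm"]
most_expensive = max(llm_spans, key=lambda s: s["cost"])
total_cost = sum(s["cost"] for s in spans)
print(f"{most_expensive['name']}: ${most_expensive['cost']:.3f} of ${total_cost:.3f}")
# → claude-3.5-sonnet analysis: $0.011 of $0.014
```
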

## Error Information

When a span has an error status, additional fields provide diagnostic information:

| Field           | Description                                                        |
| --------------- | ------------------------------------------------------------------ |
| `error_message` | Human-readable error description                                   |
| `error_type`    | Error classification (e.g., `rate_limit`, `timeout`, `validation`) |
| `error_stack`   | Stack trace for debugging                                          |
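
These fields make error triage straightforward. A sketch that tallies failed spans by `error_type` (the span dicts are illustrative):

```python
from collections import Counter

# Illustrative error spans, e.g. from a listing filtered with status="error".
error_spans = [
    {"status": "error", "error_type": "rate_limit", "error_message": "429 from provider"},
    {"status": "error", "error_type": "timeout", "error_message": "tool call timed out"},
    {"status": "error", "error_type": "rate_limit", "error_message": "429 from provider"},
]

# Count failures per classification to spot the dominant failure mode.
by_type = Counter(s["error_type"] for s in error_spans)
print(by_type.most_common())  # → [('rate_limit', 2), ('timeout', 1)]
```
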

## Trace Fields

| Field            | Type      | Description                            |
| ---------------- | --------- | -------------------------------------- |
| `id`             | KSUID     | Unique span record ID                  |
| `trace_id`       | string    | Trace group identifier                 |
| `span_id`        | string    | Unique span identifier                 |
| `parent_span_id` | string    | Parent span (empty for root)           |
| `name`           | string    | Span name                              |
| `kind`           | string    | Span kind (one of 18 types)            |
| `status`         | string    | `ok` or `error`                        |
| `level`          | string    | `debug`, `default`, `warning`, `error` |
| `input`          | JSON      | Span input data                        |
| `output`         | JSON      | Span output data                       |
| `attributes`     | JSON      | Kind-specific metadata                 |
| `tags`           | JSON      | Custom tags                            |
| `token_usage`    | JSON      | Token consumption                      |
| `cost`           | float     | Cost in USD                            |
| `duration_ms`    | integer   | Duration in milliseconds               |
| `model_name`     | string    | LLM model name                         |
| `agent_id`       | KSUID     | Associated agent                       |
| `prompt_id`      | KSUID     | Associated prompt                      |
| `execution_id`   | KSUID     | Associated execution                   |
| `session_id`     | string    | Associated chat session                |
| `started_at`     | timestamp | Span start time                        |
| `ended_at`       | timestamp | Span end time                          |

## Related Topics

- [Executions](/docs/executions) -- Execution lifecycle
- [Media Generation](/docs/media-generation) -- Speech, image, and video generation
- [Cost Tracking](/docs/cost-tracking) -- Aggregated cost analysis
- [Scoring and Evaluation](/docs/scoring-and-evaluation) -- Scoring individual spans
