Data Sources
Connect databases and static files to your agents with versioned query templates and parameterized queries.
Data sources allow your agents to query external databases and files during execution. PromptRails supports multiple database technologies and provides versioned, parameterized query templates that can be safely used by agents.
What Are Data Sources?
A data source represents a connection to an external data store. It defines:
- The database technology (PostgreSQL, MySQL, BigQuery, etc.)
- Connection credentials (linked to an encrypted credential)
- A query template with parameters
- Caching configuration
- Version history for safe iteration
When an agent executes, it can query its linked data sources to retrieve contextual information, look up records, or pull analytics data.
Supported Databases
| Type | Identifier | Description |
|---|---|---|
| PostgreSQL | postgresql | Standard PostgreSQL databases |
| MySQL | mysql | MySQL and MariaDB databases |
| BigQuery | bigquery | Google BigQuery data warehouse |
| Snowflake | snowflake | Snowflake cloud data platform |
| Redshift | redshift | Amazon Redshift data warehouse |
| MSSQL | mssql | Microsoft SQL Server |
| ClickHouse | clickhouse | ClickHouse analytics database |
| Static File | static_file | CSV, JSON, or other static files |
Query Templates
Data source versions include a query template that uses parameter placeholders. Parameters are substituted at execution time from the agent's input.
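As an illustration of how substitution works, here is a minimal sketch of named-placeholder rendering with defaults and required-parameter checks. The render_query helper is an assumption for illustration, not a PromptRails internal; a production implementation would use driver-level bind parameters rather than string substitution.

```python
import re

def render_query(template, parameters, values):
    """Resolve :name placeholders from supplied values, applying defaults
    and rejecting missing required parameters (illustrative only)."""
    resolved = {}
    for p in parameters:
        if p["name"] in values:
            resolved[p["name"]] = values[p["name"]]
        elif "default" in p:
            resolved[p["name"]] = p["default"]
        elif p.get("required"):
            raise ValueError(f"missing required parameter: {p['name']}")
    # Replace each :name placeholder with its resolved value.
    # NOTE: real systems pass values as bind parameters to avoid injection.
    return re.sub(r":(\w+)", lambda m: str(resolved[m.group(1)]), template)

params = [
    {"name": "customer_id", "type": "string", "required": True},
    {"name": "limit", "type": "integer", "required": False, "default": "10"},
]
sql = render_query(
    "SELECT * FROM orders WHERE customer_id = :customer_id LIMIT :limit",
    params,
    {"customer_id": "c-42"},
)
# sql == "SELECT * FROM orders WHERE customer_id = c-42 LIMIT 10"
```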
Example: PostgreSQL Query
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
AND status = :status
ORDER BY created_at DESC
LIMIT :limit
Parameters
Each query template defines its parameters:
[
{
"name": "customer_id",
"type": "string",
"required": true,
"description": "The customer's unique identifier"
},
{
"name": "status",
"type": "string",
"required": false,
"default": "active",
"description": "Order status filter"
},
{
"name": "limit",
"type": "integer",
"required": false,
"default": "10",
"description": "Maximum number of results"
}
]
Creating a Data Source
Python SDK
# Create the data source
ds = client.data_sources.create(
name="Customer Orders",
description="Query customer order history",
type="postgresql"
)
# Create a version with query template
version = client.data_sources.create_version(
data_source_id=ds["data"]["id"],
credential_id="your-postgresql-credential-id",
query_template="""
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
ORDER BY created_at DESC
LIMIT :limit
""",
parameters=[
{"name": "customer_id", "type": "string", "required": True},
{"name": "limit", "type": "integer", "required": False, "default": "10"}
],
cache_timeout=300,
message="Initial query for customer orders"
)
JavaScript SDK
const ds = await client.dataSources.create({
name: 'Customer Orders',
description: 'Query customer order history',
type: 'postgresql',
})
const version = await client.dataSources.createVersion(ds.data.id, {
credentialId: 'your-postgresql-credential-id',
queryTemplate: `
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
ORDER BY created_at DESC
LIMIT :limit
`,
parameters: [
{ name: 'customer_id', type: 'string', required: true },
{ name: 'limit', type: 'integer', required: false, default: '10' },
],
cacheTimeout: 300,
message: 'Initial query for customer orders',
})
Version Management
Data sources use the same immutable versioning pattern as agents and prompts:
- Each version captures the query template, parameters, credential, and connection config
- Exactly one version per data source is marked as current
- Promote versions to make them active
- Roll back by promoting a previous version
# List versions
versions = client.data_sources.list_versions(data_source_id="your-ds-id")
# Promote a version
client.data_sources.promote_version(
data_source_id="your-ds-id",
version_id="version-to-promote"
)
Connection Configuration
The connection_config object varies by database type. For databases whose connection details are fully captured by the linked credential, it may be empty or minimal. For BigQuery or Snowflake, it may include project, dataset, or warehouse identifiers.
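As an illustration, connection_config payloads for a few database types might look like the following. The field names here are assumptions for the sake of example, not a reference for the actual schema:

```python
# Hypothetical connection_config payloads -- actual field names may differ.
postgres_config = {}  # connection details live entirely in the credential

bigquery_config = {
    "project": "analytics-prod",  # GCP project to query against
    "dataset": "orders",          # default dataset for unqualified tables
}

snowflake_config = {
    "warehouse": "REPORTING_WH",  # compute warehouse for query execution
    "database": "ANALYTICS",
    "schema": "PUBLIC",
}
```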
Cache Timeout
Each version specifies a cache_timeout in seconds:
- 0 -- No caching (every query hits the database)
- 3600 -- Cache results for 1 hour (default)
- Any positive integer -- Cache duration in seconds
Caching is keyed on the rendered query (after parameter substitution), so different parameter values produce independent cache entries.
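One way to picture this keying scheme is a hash of the rendered SQL. This is a sketch of the idea, not the platform's actual cache implementation:

```python
import hashlib

def cache_key(data_source_id, rendered_query):
    """Key the cache on the fully rendered query, so distinct parameter
    values map to distinct cache entries (illustrative only)."""
    digest = hashlib.sha256(rendered_query.encode("utf-8")).hexdigest()
    return f"{data_source_id}:{digest}"

k1 = cache_key("ds-1", "SELECT * FROM orders WHERE customer_id = 'a'")
k2 = cache_key("ds-1", "SELECT * FROM orders WHERE customer_id = 'b'")
assert k1 != k2  # different parameter values -> independent cache entries
```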
Output Format
Data source versions support configurable output formats:
- json (default) -- Query results as JSON arrays
- csv -- Query results as CSV text
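The difference between the two formats can be pictured with the standard library (a sketch of the shapes involved, not the platform's serializer):

```python
import csv
import io
import json

rows = [
    {"order_id": "o-1", "status": "active", "total_amount": 19.99},
    {"order_id": "o-2", "status": "shipped", "total_amount": 5.00},
]

# json: results as an array of row objects
as_json = json.dumps(rows)

# csv: a header row followed by one line per record
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
as_csv = buf.getvalue()
```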
Test Connections
Before using a data source in production, test the connection:
result = client.data_sources.execute(
data_source_id="your-ds-id",
parameters={
"customer_id": "test-customer",
"limit": 5
}
)
print(f"Status: {result['data']['status']}")
print(f"Duration: {result['data']['duration_ms']}ms")
print(f"Results: {result['data']['result']}")
Using Data Sources in Agents
Link data sources to agents through the agent version configuration. When the agent executes, it can query linked data sources as part of its pipeline:
- The agent receives input
- Parameters from the input are mapped to data source query parameters
- The query is executed and results are returned
- Results are injected into the prompt context for the LLM
This enables agents to provide data-grounded responses based on real-time database queries.
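The pipeline above can be sketched end to end. Here execute_query and build_prompt are placeholders standing in for the platform's internals, and the input-to-parameter mapping is a hard-coded assumption for illustration:

```python
def execute_query(data_source_id, parameters):
    # Placeholder: in PromptRails this would run the data source's
    # current query template against the linked database.
    return [{"order_id": "o-1", "status": "active"}]

def build_prompt(template, context):
    # Placeholder for prompt-context injection.
    return template.format(context=context)

def run_agent(agent_input):
    # 1. The agent receives input
    # 2. Input fields are mapped to data source query parameters
    params = {"customer_id": agent_input["customer_id"], "limit": 5}
    # 3. The query is executed and results are returned
    rows = execute_query("customer-orders", params)
    # 4. Results are injected into the prompt context for the LLM
    return build_prompt("Answer using these orders: {context}", rows)

prompt = run_agent({"customer_id": "c-42"})
```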
Data Source Status
| Status | Description |
|---|---|
| active | Available for use by agents |
| archived | Hidden and cannot be queried |
Related Topics
- Credentials -- Database connection credentials
- Agents -- Using data sources in agent configurations
- Tracing -- Data source queries appear as datasource spans