Data Sources
Let agents read structured data through managed queries instead of hard-coding database access in your application.
Data sources let agents retrieve structured context from databases and files during execution. You define a versioned query template in PromptRails and attach it to the agent that needs it.
What Are Data Sources?
A data source defines:
- The database technology (PostgreSQL, MySQL, BigQuery, etc.)
- Connection credentials (linked to an encrypted credential)
- A query template with parameters
- Caching configuration
- Version history for safe iteration
When an agent executes, it can query its linked data sources to retrieve contextual information, look up records, or pull analytics data.
Supported Databases
| Type | Identifier | Description |
|---|---|---|
| PostgreSQL | postgresql | Standard PostgreSQL databases |
| MySQL | mysql | MySQL and MariaDB databases |
| BigQuery | bigquery | Google BigQuery data warehouse |
| Snowflake | snowflake | Snowflake cloud data platform |
| Redshift | redshift | Amazon Redshift data warehouse |
| MSSQL | mssql | Microsoft SQL Server |
| ClickHouse | clickhouse | ClickHouse analytics database |
| Static File | static_file | CSV, JSON, or other static files |
Technical details: Query templates and parameters
Query Templates
Data source versions include a query template that uses parameter placeholders. Parameters are substituted at execution time from the agent's input.
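As a rough illustration of how named placeholders such as `:customer_id` receive values at execution time, the sketch below binds a parameter dictionary to a template using Python's built-in `sqlite3` module, which accepts the same `:name` style. The in-memory table, the `run_template` helper, and the sample rows are all hypothetical, and this is not a description of how PromptRails performs substitution internally.

```python
import sqlite3

# In-memory SQLite database standing in for a real PostgreSQL source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER, customer_id TEXT, status TEXT, total_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "cust-1", "active", 20.0),
        (2, "cust-1", "cancelled", 35.5),
        (3, "cust-2", "active", 12.0),
    ],
)

# Query template using :name placeholders (illustrative schema).
QUERY_TEMPLATE = """
    SELECT order_id, status, total_amount
    FROM orders
    WHERE customer_id = :customer_id
      AND status = :status
    ORDER BY order_id
    LIMIT :limit
"""


def run_template(params: dict) -> list:
    """Bind named parameters to the template and return all matching rows."""
    return conn.execute(QUERY_TEMPLATE, params).fetchall()


rows = run_template({"customer_id": "cust-1", "status": "active", "limit": 10})
print(rows)  # [(1, 'active', 20.0)]
```

Each key in the dictionary must match a placeholder name in the template; a missing key raises an error in `sqlite3` rather than silently producing an incomplete query.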
Example: PostgreSQL Query
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
  AND status = :status
ORDER BY created_at DESC
LIMIT :limit
Parameters
Each query template defines its parameters:
[
  {
    "name": "customer_id",
    "type": "string",
    "required": true,
    "description": "The customer's unique identifier"
  },
  {
    "name": "status",
    "type": "string",
    "required": false,
    "default": "active",
    "description": "Order status filter"
  },
  {
    "name": "limit",
    "type": "integer",
    "required": false,
    "default": "10",
    "description": "Maximum number of results"
  }
]
Creating a Data Source
Create and test data sources from Studio when you are shaping the query with teammates. Use the SDK when data source creation is part of an internal platform workflow.
Technical details: Create and version data sources with SDKs
Python SDK
# Assumes `client` is an already-configured PromptRails SDK client.

# Create the data source
ds = client.data_sources.create(
    name="Customer Orders",
    description="Query customer order history",
    type="postgresql"
)

# Create a version with query template
version = client.data_sources.create_version(
    data_source_id=ds["data"]["id"],
    credential_id="your-postgresql-credential-id",
    query_template="""
        SELECT order_id, status, total_amount, created_at
        FROM orders
        WHERE customer_id = :customer_id
        ORDER BY created_at DESC
        LIMIT :limit
    """,
    parameters=[
        {"name": "customer_id", "type": "string", "required": True},
        {"name": "limit", "type": "integer", "required": False, "default": "10"}
    ],
    cache_timeout=300,
    message="Initial query for customer orders"
)
JavaScript SDK
const ds = await client.dataSources.create({
  name: 'Customer Orders',
  description: 'Query customer order history',
  type: 'postgresql',
})

const version = await client.dataSources.createVersion(ds.data.id, {
  credentialId: 'your-postgresql-credential-id',
  queryTemplate: `
    SELECT order_id, status, total_amount, created_at
    FROM orders
    WHERE customer_id = :customer_id
    ORDER BY created_at DESC
    LIMIT :limit
  `,
  parameters: [
    { name: 'customer_id', type: 'string', required: true },
    { name: 'limit', type: 'integer', required: false, default: '10' },
  ],
  cacheTimeout: 300,
  message: 'Initial query for customer orders',
})
Version Management
Data sources use the same immutable versioning pattern as agents and prompts:
- Each version captures the query template, parameters, credential, and connection config
- Exactly one version per data source is marked as current
- Promote versions to make them active
- Roll back by promoting a previous version
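The rules in the list above can be modeled with a small in-memory sketch. `Version` and `DataSourceModel` are hypothetical names used only for illustration, and making the first version current automatically is an assumption; none of this reflects the actual PromptRails implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Version:
    """Immutable snapshot: fields cannot be reassigned after creation."""
    version_id: str
    query_template: str
    message: str = ""


@dataclass
class DataSourceModel:
    """Hypothetical model: many versions, exactly one marked as current."""
    versions: Dict[str, Version] = field(default_factory=dict)
    current_id: Optional[str] = None

    def create_version(self, version: Version) -> None:
        if version.version_id in self.versions:
            raise ValueError("versions are immutable; create a new one instead")
        self.versions[version.version_id] = version
        # Assumption: the first version becomes current automatically.
        if self.current_id is None:
            self.current_id = version.version_id

    def promote(self, version_id: str) -> None:
        """Mark an existing version as current; also used for rollback."""
        if version_id not in self.versions:
            raise KeyError(version_id)
        self.current_id = version_id

    @property
    def current(self) -> Version:
        if self.current_id is None:
            raise LookupError("no versions created yet")
        return self.versions[self.current_id]


model = DataSourceModel()
model.create_version(Version("v1", "SELECT 1", "Initial query"))
model.create_version(Version("v2", "SELECT 2", "Add status filter"))
model.promote("v2")  # make v2 active
model.promote("v1")  # roll back by promoting the previous version
print(model.current.version_id)  # v1
```

Because promotion only moves a pointer, rolling back never rewrites history: the superseded version stays available to promote again later.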
# List versions
versions = client.data_sources.list_versions(data_source_id="your-ds-id")

# Promote a version
client.data_sources.promote_version(
    data_source_id="your-ds-id",
    version_id="version-to-promote"
)
Technical details: Connection, cache, and output details
Connection Configuration
The connection_config object varies by database type. For databases that connect via the credential, this may be minimal. For BigQuery or Snowflake, it may include project/dataset/warehouse identifiers.
Cache Timeout
Each version specifies a cache_timeout in seconds:
- 0 -- No caching (every query hits the database)
- 3600 -- Cache results for 1 hour (default)
- Any positive integer -- Cache duration in seconds
Caching is keyed on the rendered query (after parameter substitution), so different parameter values produce independent cache entries.
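To make the keying behavior concrete, here is a minimal sketch of a time-limited cache. `QueryCache`, `key_for`, `get_or_run`, and `fake_run` are hypothetical names, and hashing the template together with its sorted parameters is used here as a stand-in for the rendered query text; the real cache implementation is not documented on this page.

```python
import hashlib
import json
import time


class QueryCache:
    """Hypothetical TTL cache keyed on a query and its parameter values."""

    def __init__(self, cache_timeout: int, clock=time.monotonic):
        self.cache_timeout = cache_timeout  # seconds; 0 disables caching
        self._clock = clock
        self._entries = {}  # key -> (expires_at, result)

    @staticmethod
    def key_for(query_template: str, params: dict) -> str:
        # Template plus sorted parameters stands in for the rendered query.
        payload = json.dumps({"q": query_template, "p": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_run(self, query_template: str, params: dict, run_query):
        if self.cache_timeout == 0:
            return run_query(query_template, params)  # always hit the database
        key = self.key_for(query_template, params)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # unexpired cache hit
        result = run_query(query_template, params)
        self._entries[key] = (now + self.cache_timeout, result)
        return result


calls = []


def fake_run(query, params):
    """Stand-in for a real database call; records each invocation."""
    calls.append(params)
    return [{"customer_id": params["customer_id"]}]


cache = QueryCache(cache_timeout=300)
query = "SELECT * FROM orders WHERE customer_id = :customer_id"
cache.get_or_run(query, {"customer_id": "a"}, fake_run)
cache.get_or_run(query, {"customer_id": "a"}, fake_run)  # served from cache
cache.get_or_run(query, {"customer_id": "b"}, fake_run)  # separate entry
print(len(calls))  # 2
```

A practical consequence is that queries with high-cardinality parameters (for example, one entry per customer) rarely benefit from long timeouts, since each distinct value is cached separately.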
Output Format
Data source versions support configurable output formats:
- json (default) -- Query results as JSON arrays
- csv -- Query results as CSV text
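The difference between the two formats can be sketched with the standard library. The `format_rows` helper and the row shape (one dictionary per row) are assumptions made for illustration; the exact payload PromptRails returns may differ.

```python
import csv
import io
import json


def format_rows(rows: list, output_format: str = "json") -> str:
    """Serialize a list of row dictionaries as JSON or CSV text."""
    if output_format == "json":
        return json.dumps(rows)
    if output_format == "csv":
        if not rows:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported output format: {output_format}")


rows = [
    {"order_id": 1, "status": "active"},
    {"order_id": 2, "status": "shipped"},
]
print(format_rows(rows, "json"))
print(format_rows(rows, "csv"))
```

CSV is generally more compact for wide result sets injected into a prompt, while JSON preserves value types such as numbers and nulls.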
Test Connections
Before using a data source in production, test the connection by executing the query with sample parameters:
result = client.data_sources.execute(
    data_source_id="your-ds-id",
    parameters={
        "customer_id": "test-customer",
        "limit": 5
    }
)
print(f"Status: {result['data']['status']}")
print(f"Duration: {result['data']['duration_ms']}ms")
print(f"Results: {result['data']['result']}")
Using Data Sources in Agents
Link data sources to agents through the agent version configuration. When the agent executes, it can query linked data sources as part of its pipeline:
- The agent receives input
- Parameters from the input are mapped to data source query parameters
- The query is executed and results are returned
- Results are injected into the prompt context for the LLM
This lets agents ground their responses in data queried at execution time, subject to the cache_timeout configured on the data source version.
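The four pipeline steps above can be sketched end to end as follows. Everything here is hypothetical: `resolve_parameters`, `build_prompt`, `run_agent_step`, and `fake_query` are illustrative names, the type casting of string defaults (such as `"10"` for an integer parameter) is an assumption, and the prompt layout is invented for the example.

```python
from typing import Any, Callable, Dict, List

# Parameter definitions in the same shape as a query template's parameters.
PARAM_DEFS = [
    {"name": "customer_id", "type": "string", "required": True},
    {"name": "status", "type": "string", "required": False, "default": "active"},
    {"name": "limit", "type": "integer", "required": False, "default": "10"},
]

# Assumed mapping from declared parameter types to Python casts.
CASTS: Dict[str, Callable[[Any], Any]] = {"string": str, "integer": int}


def resolve_parameters(agent_input: dict, param_defs: List[dict]) -> dict:
    """Map agent input onto query parameters, applying defaults and casts."""
    resolved = {}
    for definition in param_defs:
        name = definition["name"]
        if name in agent_input:
            raw = agent_input[name]
        elif "default" in definition:
            raw = definition["default"]
        elif definition.get("required"):
            raise ValueError(f"missing required parameter: {name}")
        else:
            continue  # optional parameter with no default is omitted
        resolved[name] = CASTS[definition["type"]](raw)
    return resolved


def build_prompt(question: str, rows: List[dict]) -> str:
    """Inject query results into the prompt context for the LLM."""
    context = "\n".join(str(row) for row in rows)
    return f"Context:\n{context}\n\nQuestion: {question}"


def run_agent_step(agent_input: dict, run_query: Callable[[dict], List[dict]]) -> str:
    params = resolve_parameters(agent_input, PARAM_DEFS)  # map input to parameters
    rows = run_query(params)                              # execute the query
    return build_prompt(agent_input["question"], rows)    # inject the results


def fake_query(params: dict) -> List[dict]:
    """Stand-in for the real data source call."""
    return [{"order_id": 1, "status": params["status"]}]


print(run_agent_step({"question": "Where is my order?", "customer_id": "c1"}, fake_query))
```

Validating required parameters before the query runs means a malformed agent input fails fast with a clear error instead of reaching the database.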
Data Source Status
| Status | Description |
|---|---|
| active | Available for use by agents |
| archived | Hidden and cannot be queried |
Related Topics
- Credentials -- Database connection credentials
- Agents -- Using data sources in agent configurations
- Tracing -- Data source queries appear as datasource spans