Data Sources
Connect databases and static files to your agents with versioned query templates and parameterized queries.
Data sources allow your agents to query external databases and files during execution. PromptRails supports multiple database technologies and provides versioned, parameterized query templates that can be safely used by agents.
What Are Data Sources?
A data source represents a connection to an external data store. It defines:
- The database technology (PostgreSQL, MySQL, BigQuery, etc.)
- Connection credentials (linked to an encrypted credential)
- A query template with parameters
- Caching configuration
- Version history for safe iteration
When an agent executes, it can query its linked data sources to retrieve contextual information, look up records, or pull analytics data.
Supported Databases
| Type | Identifier | Description |
|---|---|---|
| PostgreSQL | postgresql | Standard PostgreSQL databases |
| MySQL | mysql | MySQL and MariaDB databases |
| BigQuery | bigquery | Google BigQuery data warehouse |
| Snowflake | snowflake | Snowflake cloud data platform |
| Redshift | redshift | Amazon Redshift data warehouse |
| MSSQL | mssql | Microsoft SQL Server |
| ClickHouse | clickhouse | ClickHouse analytics database |
| Static File | static_file | CSV, JSON, or other static files |
Query Templates
Data source versions include a query template that uses parameter placeholders. Parameters are substituted at execution time from the agent's input.
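As an illustration of how substitution works, here is a minimal sketch of named-placeholder rendering with defaults and required-parameter checks. The render_query helper is an assumption for illustration, not a PromptRails internal; a production implementation would use driver-level bind parameters rather than string substitution.

```python
import re

def render_query(template, parameters, values):
    """Resolve :name placeholders from supplied values, applying defaults
    and rejecting missing required parameters (illustrative only)."""
    resolved = {}
    for p in parameters:
        if p["name"] in values:
            resolved[p["name"]] = values[p["name"]]
        elif "default" in p:
            resolved[p["name"]] = p["default"]
        elif p.get("required"):
            raise ValueError(f"missing required parameter: {p['name']}")
    # Replace each :name placeholder with its resolved value.
    # NOTE: real systems pass values as bind parameters to avoid injection.
    return re.sub(r":(\w+)", lambda m: str(resolved[m.group(1)]), template)

params = [
    {"name": "customer_id", "type": "string", "required": True},
    {"name": "limit", "type": "integer", "required": False, "default": "10"},
]
sql = render_query(
    "SELECT * FROM orders WHERE customer_id = :customer_id LIMIT :limit",
    params,
    {"customer_id": "c-42"},
)
# sql == "SELECT * FROM orders WHERE customer_id = c-42 LIMIT 10"
```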
Example: PostgreSQL Query
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
AND status = :status
ORDER BY created_at DESC
LIMIT :limit
Parameters
Each query template defines its parameters:
[
{
"name": "customer_id",
"type": "string",
"required": true,
"description": "The customer's unique identifier"
},
{
"name": "status",
"type": "string",
"required": false,
"default": "active",
"description": "Order status filter"
},
{
"name": "limit",
"type": "integer",
"required": false,
"default": "10",
"description": "Maximum number of results"
}
]
Creating a Data Source
Python SDK
# Create the data source
ds = client.data_sources.create(
name="Customer Orders",
description="Query customer order history",
type="postgresql"
)
# Create a version with query template
version = client.data_sources.create_version(
data_source_id=ds["data"]["id"],
credential_id="your-postgresql-credential-id",
query_template="""
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
ORDER BY created_at DESC
LIMIT :limit
""",
parameters=[
{"name": "customer_id", "type": "string", "required": True},
{"name": "limit", "type": "integer", "required": False, "default": "10"}
],
cache_timeout=300,
message="Initial query for customer orders"
)
JavaScript SDK
const ds = await client.dataSources.create({
name: 'Customer Orders',
description: 'Query customer order history',
type: 'postgresql',
})
const version = await client.dataSources.createVersion(ds.data.id, {
credentialId: 'your-postgresql-credential-id',
queryTemplate: `
SELECT order_id, status, total_amount, created_at
FROM orders
WHERE customer_id = :customer_id
ORDER BY created_at DESC
LIMIT :limit
`,
parameters: [
{ name: 'customer_id', type: 'string', required: true },
{ name: 'limit', type: 'integer', required: false, default: '10' },
],
cacheTimeout: 300,
message: 'Initial query for customer orders',
})
Version Management
Data sources use the same immutable versioning pattern as agents and prompts:
- Each version captures the query template, parameters, credential, and connection config
- Exactly one version per data source is marked as current
- Promote versions to make them active
- Roll back by promoting a previous version
# List versions
versions = client.data_sources.list_versions(data_source_id="your-ds-id")
# Promote a version
client.data_sources.promote_version(
data_source_id="your-ds-id",
version_id="version-to-promote"
)
Connection Configuration
The connection_config object varies by database type. For databases whose connection details are fully captured by the linked credential, it may be empty or minimal. For BigQuery or Snowflake, it may include project, dataset, or warehouse identifiers.
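As an illustration, connection_config payloads for a few database types might look like the following. The field names here are assumptions for the sake of example, not a reference for the actual schema:

```python
# Hypothetical connection_config payloads -- actual field names may differ.
postgres_config = {}  # connection details live entirely in the credential

bigquery_config = {
    "project": "analytics-prod",  # GCP project to query against
    "dataset": "orders",          # default dataset for unqualified tables
}

snowflake_config = {
    "warehouse": "REPORTING_WH",  # compute warehouse for query execution
    "database": "ANALYTICS",
    "schema": "PUBLIC",
}
```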
Cache Timeout
Each version specifies a cache_timeout in seconds:
- 0 -- No caching (every query hits the database)
- 3600 -- Cache results for 1 hour (default)
- Any positive integer -- Cache duration in seconds
Caching is keyed on the rendered query (after parameter substitution), so different parameter values produce independent cache entries.
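One way to picture this keying scheme is a hash of the rendered SQL. This is a sketch of the idea, not the platform's actual cache implementation:

```python
import hashlib

def cache_key(data_source_id, rendered_query):
    """Key the cache on the fully rendered query, so distinct parameter
    values map to distinct cache entries (illustrative only)."""
    digest = hashlib.sha256(rendered_query.encode("utf-8")).hexdigest()
    return f"{data_source_id}:{digest}"

k1 = cache_key("ds-1", "SELECT * FROM orders WHERE customer_id = 'a'")
k2 = cache_key("ds-1", "SELECT * FROM orders WHERE customer_id = 'b'")
assert k1 != k2  # different parameter values -> independent cache entries
```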
Output Format
Data source versions support configurable output formats:
- json (default) -- Query results as JSON arrays
- csv -- Query results as CSV text
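The difference between the two formats can be pictured with the standard library (a sketch of the shapes involved, not the platform's serializer):

```python
import csv
import io
import json

rows = [
    {"order_id": "o-1", "status": "active", "total_amount": 19.99},
    {"order_id": "o-2", "status": "shipped", "total_amount": 5.00},
]

# json: results as an array of row objects
as_json = json.dumps(rows)

# csv: a header row followed by one line per record
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
as_csv = buf.getvalue()
```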
Test Connections
Before using a data source in production, test the connection:
result = client.data_sources.execute(
data_source_id="your-ds-id",
parameters={
"customer_id": "test-customer",
"limit": 5
}
)
print(f"Status: {result['data']['status']}")
print(f"Duration: {result['data']['duration_ms']}ms")
print(f"Results: {result['data']['result']}")
Using Data Sources in Agents
Link data sources to agents through the agent version configuration. When the agent executes, it can query linked data sources as part of its pipeline:
- The agent receives input
- Parameters from the input are mapped to data source query parameters
- The query is executed and results are returned
- Results are injected into the prompt context for the LLM
This enables agents to provide data-grounded responses based on real-time database queries.
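The pipeline above can be sketched end to end. Here execute_query and build_prompt are placeholders standing in for the platform's internals, and the input-to-parameter mapping is a hard-coded assumption for illustration:

```python
def execute_query(data_source_id, parameters):
    # Placeholder: in PromptRails this would run the data source's
    # current query template against the linked database.
    return [{"order_id": "o-1", "status": "active"}]

def build_prompt(template, context):
    # Placeholder for prompt-context injection.
    return template.format(context=context)

def run_agent(agent_input):
    # 1. The agent receives input
    # 2. Input fields are mapped to data source query parameters
    params = {"customer_id": agent_input["customer_id"], "limit": 5}
    # 3. The query is executed and results are returned
    rows = execute_query("customer-orders", params)
    # 4. Results are injected into the prompt context for the LLM
    return build_prompt("Answer using these orders: {context}", rows)

prompt = run_agent({"customer_id": "c-42"})
```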
Data Source Status
| Status | Description |
|---|---|
| active | Available for use by agents |
| archived | Hidden and cannot be queried |
Related Topics
- Credentials -- Database connection credentials
- Agents -- Using data sources in agent configurations
- Tracing -- Data source queries appear as datasource spans