Prompts
Manage versioned prompts with Jinja2 templating, model assignment, caching, and structured input/output schemas.
Prompts are the instructions that guide LLM behavior. PromptRails provides a full prompt management system with versioning, templating, model assignment, and caching -- all configurable through the UI or API.
Prompt Management Overview
A prompt in PromptRails consists of:
- System prompt -- Instructions that define the LLM's role and behavior
- User prompt -- A template that renders the user's input into a structured message
- Model assignment -- Which LLM model to use (with optional fallback)
- Parameters -- Temperature, max tokens, top_p
- Input/output schemas -- JSON schemas for validation
- Cache timeout -- How long to cache responses
- Version history -- Immutable versions with promotion
Prompts are workspace-scoped and can be linked to one or more agents.
Jinja2 Templating
PromptRails uses Jinja2 templating for prompt rendering. This allows you to create dynamic prompts that incorporate variables from the execution input.
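The rendering behavior can be sketched directly with the jinja2 library (the template and variable names here are illustrative, not part of the PromptRails SDK):

```python
from jinja2 import Template

# A template combining a variable and a conditional, as in the examples below.
template = Template(
    "You are a {{ role }} assistant.\n"
    "{% if language == 'spanish' %}Please respond in Spanish.\n"
    "{% else %}Please respond in English.\n"
    "{% endif %}"
    "User query: {{ message }}"
)

# Values from the execution input are substituted at render time.
rendered = template.render(role="support", language="spanish", message="Hola")
print(rendered)
```

Any variable referenced in the template but absent from the input simply renders as empty unless a default is supplied.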
Basic Variables
```jinja
You are a customer support agent for {{ company_name }}.
The customer's name is {{ customer_name }}.
Please help them with their inquiry:
{{ message }}
```
Conditionals
```jinja
You are a {{ role }} assistant.
{% if language == "spanish" %}
Please respond in Spanish.
{% elif language == "french" %}
Please respond in French.
{% else %}
Please respond in English.
{% endif %}
User query: {{ message }}
```
Loops
```jinja
Here are the relevant documents for context:
{% for doc in documents %}
Document {{ loop.index }}: {{ doc.title }}
Content: {{ doc.content }}
---
{% endfor %}
Based on the above documents, answer: {{ question }}
```
Filters
```jinja
Customer name: {{ name | upper }}
Order date: {{ date | default("Unknown") }}
Summary: {{ long_text | truncate(200) }}
```
Input and Output Schemas
Define JSON schemas to validate inputs before prompt rendering and structure outputs from the LLM.
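How validation behaves can be sketched with the jsonschema library (whether PromptRails validates this way internally is an assumption; the schema below is an abbreviated version of the input schema example):

```python
from jsonschema import ValidationError, validate

# Abbreviated input schema: "message" is required, "language" is an enum.
input_schema = {
    "type": "object",
    "properties": {
        "message": {"type": "string"},
        "language": {"type": "string", "enum": ["en", "es", "fr", "de"]},
    },
    "required": ["message"],
}

def is_valid(payload: dict) -> bool:
    """Return True if the payload satisfies the schema."""
    try:
        validate(instance=payload, schema=input_schema)
        return True
    except ValidationError:
        return False

print(is_valid({"message": "Hi", "language": "es"}))  # satisfies the schema
print(is_valid({"language": "es"}))                   # missing required "message"
```

Inputs that fail schema validation are rejected before the template is rendered, so no LLM call is made for malformed requests.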
Input Schema
```json
{
  "type": "object",
  "properties": {
    "message": {
      "type": "string",
      "description": "The user's message"
    },
    "language": {
      "type": "string",
      "enum": ["en", "es", "fr", "de"],
      "default": "en"
    },
    "context": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["message"]
}
```
Output Schema
```json
{
  "type": "object",
  "properties": {
    "response": { "type": "string" },
    "sentiment": {
      "type": "string",
      "enum": ["positive", "neutral", "negative"]
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1
    }
  }
}
```
Model Assignment
Each prompt version specifies which LLM model to use:
- Primary model -- The default model for execution
- Fallback model -- Used if the primary model fails or is unavailable
Supported providers include OpenAI, Anthropic, Google Gemini, DeepSeek, Fireworks, xAI, and OpenRouter. Models are configured globally and referenced by their PromptRails ID.
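The primary/fallback behavior can be sketched as follows; the `execute_with_fallback` and `flaky_call` functions and the model IDs are hypothetical stand-ins, not part of the PromptRails SDK:

```python
def execute_with_fallback(call_model, prompt, primary, fallback=None):
    """Try the primary model; on failure, retry once with the fallback."""
    try:
        return call_model(primary, prompt)
    except Exception:
        if fallback is None:
            raise
        return call_model(fallback, prompt)

def flaky_call(model_id, prompt):
    # Simulated provider call: the primary model is unavailable.
    if model_id == "primary-model-id":
        raise RuntimeError("model unavailable")
    return f"[{model_id}] {prompt}"

result = execute_with_fallback(flaky_call, "Hello", "primary-model-id", "fallback-model-id")
print(result)
```

If no fallback is configured, a primary-model failure surfaces as an execution error.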
Temperature, Max Tokens, and Top P
| Parameter | Default | Range | Description |
|---|---|---|---|
| temperature | 0.7 | 0.0 - 1.0 | Controls randomness. Lower values produce more deterministic outputs. |
| max_tokens | Provider default | Varies by model | Maximum number of tokens in the response. |
| top_p | Provider default | 0.0 - 1.0 | Nucleus sampling. Controls diversity by limiting the token pool. |
Prompt Caching
PromptRails supports response caching at the prompt version level. When cache_timeout is set to a value greater than 0 (in seconds), identical inputs will return cached responses without making an LLM call.
```python
version = client.prompts.create_version(
    prompt_id="your-prompt-id",
    system_prompt="You are a helpful assistant.",
    user_prompt="Translate '{{ text }}' to {{ target_language }}.",
    temperature=0.3,
    cache_timeout=3600,  # Cache responses for 1 hour
    message="Added caching for translation prompt"
)
```
Caching is keyed on the rendered prompt content (after template variables are substituted), so different inputs produce different cache entries.
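A cache keyed on rendered content can be sketched like this; the hashing scheme is an assumption for illustration, and PromptRails' internal key format may differ:

```python
import hashlib

def cache_key(rendered_system: str, rendered_user: str) -> str:
    # Hash the fully rendered prompt text; identical renders share a key,
    # so a cache hit skips the LLM call entirely.
    payload = rendered_system + "\x00" + rendered_user
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key_a = cache_key("You are a helpful assistant.", "Translate 'hello' to French.")
key_b = cache_key("You are a helpful assistant.", "Translate 'hello' to German.")
```

Because the key is computed after template substitution, two requests with different inputs (key_a vs. key_b above) never collide, while repeated identical requests hit the same entry until cache_timeout expires.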
Creating Prompts
Python SDK
```python
# Create a prompt
prompt = client.prompts.create(
    name="Product Description Generator",
    description="Generates product descriptions from features"
)

# Create the first version
version = client.prompts.create_version(
    prompt_id=prompt["data"]["id"],
    system_prompt="You are an expert copywriter who writes compelling product descriptions.",
    user_prompt="""Write a product description for:
Product: {{ product_name }}
Category: {{ category }}
Features:
{% for feature in features %}
- {{ feature }}
{% endfor %}
The description should be {{ tone }} and approximately {{ word_count }} words.""",
    input_schema={
        "type": "object",
        "properties": {
            "product_name": {"type": "string"},
            "category": {"type": "string"},
            "features": {"type": "array", "items": {"type": "string"}},
            "tone": {"type": "string", "default": "professional"},
            "word_count": {"type": "integer", "default": 150}
        },
        "required": ["product_name", "features"]
    },
    temperature=0.8,
    max_tokens=512,
    message="Initial version"
)
```
JavaScript SDK
```javascript
const prompt = await client.prompts.create({
  name: 'Product Description Generator',
  description: 'Generates product descriptions from features',
})

const version = await client.prompts.createVersion(prompt.data.id, {
  systemPrompt: 'You are an expert copywriter who writes compelling product descriptions.',
  userPrompt: `Write a product description for:
Product: {{ product_name }}
Category: {{ category }}
Features:
{% for feature in features %}
- {{ feature }}
{% endfor %}`,
  temperature: 0.8,
  maxTokens: 512,
  message: 'Initial version',
})
```
Testing Prompts
Execute a prompt directly to test it without going through an agent:
```python
result = client.prompts.execute(
    prompt_id="your-prompt-id",
    input={
        "product_name": "Wireless Earbuds Pro",
        "category": "Electronics",
        "features": [
            "Active noise cancellation",
            "30-hour battery life",
            "IPX5 water resistance"
        ],
        "tone": "enthusiastic"
    }
)
print(result["data"]["output"])
```
Prompt Status
| Status | Description |
|---|---|
| active | The prompt is available for use and execution |
| archived | The prompt is hidden from listings and cannot be executed |
Related Topics
- Prompt Versioning -- Version management and promotion
- Agents -- How agents use prompts
- Tracing -- Prompt rendering appears as prompt spans in traces