# LLM Gateway

> Use PromptRails-hosted models without managing provider keys, while keeping usage, balance, and traces visible.

The LLM Gateway lets a workspace use PromptRails-hosted models without bringing its own provider API keys. It exposes an OpenAI-compatible API, routes requests to the configured upstream provider, and records usage against the workspace.

Use the gateway when you want a simple way to call hosted models, test a model before adding your own credential, or give an agent access to a model without distributing provider secrets across the team.

## How It Fits

Gateway usage is separate from your workspace subscription. It is tracked through the workspace balance in **Billing > Balance**.

The product shows:

- Current hosted-model balance
- Recent balance transactions
- Active hosted models available to the workspace
- Whether a model has a free usage allowance
- Model capabilities such as streaming, tools, or vision when available

The docs intentionally do not list model names, prices, or free-token amounts because those change over time. Use **Billing > Balance** for the current catalog.

## When To Use It

Use the gateway when:

- You want to run PromptRails-hosted models without configuring provider credentials.
- You need an OpenAI-compatible endpoint for a script, backend service, or integration.
- You want usage to be tied to a workspace balance and transaction history.
- You want agents and prompts to use a hosted model while keeping provider credentials managed by PromptRails.

Use your own provider credentials when:

- Your company already has direct provider contracts.
- You need a provider account, region, or deployment controlled by your organization.
- Your security policy requires direct ownership of the upstream key.

## Free Usage Allowances

Some hosted models can include a free monthly usage allowance. The allowance is model-specific and can change as the catalog changes, so it is shown in the product rather than documented as static text.

When a request is within the free allowance, it can be served without reducing the workspace balance. After the allowance is exhausted, usage is deducted from the balance according to the active model catalog.
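
As a rough illustration of that rule, the sketch below splits one request's usage into a free portion and a billable portion. It is a mental model only: the unit (tokens, for instance), the function name `split_usage`, and the idea that a request straddling the limit is split rather than billed whole are all assumptions, not documented PromptRails behavior.

```python
def split_usage(units_used, allowance_remaining):
    """Return (free_units, billable_units) for one request.

    Hypothetical model: usage is covered by the remaining free allowance
    first, and only the overflow would be deducted from the balance.
    """
    if units_used < 0 or allowance_remaining < 0:
        raise ValueError("usage and allowance must be non-negative")
    free_units = min(units_used, allowance_remaining)
    return free_units, units_used - free_units
```

Under these assumptions, `split_usage(500, 200)` returns `(200, 300)`: 200 units are absorbed by the allowance and 300 would be charged at whatever rate the active catalog shows.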

## Balance and Transactions

The **Balance** tab is the operational view for hosted-model usage:

- **Balance** shows the current amount available for hosted model calls.
- **Deposit** adds funds to the workspace balance.
- **Auto-reload** can top up the balance when it drops below a threshold.
- **Recent transactions** shows deposits, usage deductions, refunds, and auto-reloads.
- **Model catalog** shows active hosted models, free allowance status, and capabilities.
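
The auto-reload behavior listed above can be pictured as a simple threshold check. The sketch below is illustrative only: `apply_auto_reload`, its parameters, and the single fixed top-up amount are assumptions, and the real settings are whatever **Billing > Balance** exposes.

```python
from decimal import Decimal


def apply_auto_reload(balance, threshold, reload_amount):
    """Return (new_balance, reloaded) after one threshold check.

    Hypothetical rule: top up once by reload_amount when the balance
    has dropped below the threshold; otherwise leave it unchanged.
    """
    if balance < threshold:
        return balance + reload_amount, True
    return balance, False


# Example: a balance of 4.50 against a threshold of 5.00 triggers a 20.00 top-up.
# Decimal is used so currency amounts do not pick up float rounding errors.
new_balance, reloaded = apply_auto_reload(
    Decimal("4.50"), Decimal("5.00"), Decimal("20.00")
)
```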

Do not use the docs as a pricing source. The product is the source of truth for the active catalog and balance details.

## Calling the Gateway Directly

The gateway accepts OpenAI-compatible chat completion requests. Authenticate each request for the workspace, for example with a PromptRails API key sent as a bearer token, and pass the workspace ID in the `X-Workspace-ID` header.

<TechnicalDetails title="OpenAI-compatible request shape">

```bash
curl https://api.promptrails.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_PROMPTRAILS_API_KEY" \
  -H "X-Workspace-ID: YOUR_WORKSPACE_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pr/<model-slug>",
    "messages": [
      { "role": "system", "content": "You are a concise assistant." },
      { "role": "user", "content": "Summarize this request." }
    ],
    "stream": false
  }'
```
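
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl example above. `build_chat_request` and `send_chat_request` are hypothetical helper names, and the API key, workspace ID, and model slug remain placeholders you must supply.

```python
import json
import urllib.request

GATEWAY_URL = "https://api.promptrails.ai/v1/chat/completions"


def build_chat_request(api_key, workspace_id, model, messages, stream=False):
    """Build (but do not send) a chat completion request for the gateway."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Workspace-ID": workspace_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Passing the result of `build_chat_request(...)` to `send_chat_request` returns the parsed response. In the OpenAI chat completion format, the reply text usually sits at `choices[0].message.content`, though the exact response fields depend on the selected model.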

Supported request features depend on the selected hosted model. The gateway can expose OpenAI-compatible chat completions, streaming responses, tool calls, structured output, and vision-capable messages when the active model supports them.
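
When `"stream": true` is set and the model supports streaming, OpenAI-compatible endpoints conventionally return server-sent events: lines of the form `data: <json chunk>`, ending with `data: [DONE]`. The sketch below assumes the gateway follows that convention (this page does not spell out the wire format) and uses a hypothetical helper, `iter_stream_text`, to pull text fragments out of already-decoded lines.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines.

    Assumes each event is a line 'data: <json>' and the stream ends with
    'data: [DONE]'; this format is an assumption, not confirmed by the docs.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or an empty delta; ignore those.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

With a live HTTP response, each raw line would first be decoded from bytes (for example with `.decode("utf-8")`) before being passed in.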

</TechnicalDetails>

## Using It From Agents

Agents and prompts can use PromptRails-hosted models through the same model selection flow used for other providers. When a hosted model is selected, PromptRails routes the call through the gateway. No workspace-level provider credential is required for that model.

That keeps hosted-model runs connected to the same product surfaces:

- Prompt and agent version history
- Execution traces
- Cost and usage views
- Billing balance and transactions
- Data masking policies when enabled

## Related Topics

- [Billing and Plans](/docs/billing-and-plans) -- Workspace balance, invoices, and payment settings
- [Prompts](/docs/prompts#model-capabilities) -- Capabilities exposed by selected models
- [Credentials](/docs/credentials) -- Bring your own provider credentials
- [Tracing](/docs/tracing) -- Inspect model calls, usage, latency, and errors
- [API Keys and Scopes](/docs/api-keys-and-scopes) -- Create keys for direct API access