Data Masking

Intercept PII in outbound LLM calls and replace it with opaque placeholders before the request leaves your workspace. The agent runtime sees real values; the cloud provider sees only tokens.

Data Masking is PromptRails' answer to the most common enterprise blocker for cloud LLMs: "we can't let customer data leave our perimeter." With masking on, sensitive values in your prompts, datasource results, and tool arguments are replaced with opaque placeholders before the request goes to OpenAI / Anthropic / Gemini / any other provider. The placeholders are restored on the response path, so your agents and tools keep seeing real values — only the provider sees the masked form.

Data Masking is available on the Pro and Enterprise plans. Free and Starter workspaces will see an upgrade prompt on the masking settings page.

What Gets Masked

Twelve built-in detectors run on every outbound LLM call when masking is enabled:

Category	Types
Identity	Email, phone, credit card (Luhn-validated), IBAN (mod-97), US SSN, TC Kimlik No (TR national ID)
Network	IPv4, IPv6
Secrets	JWT tokens, AWS access keys, generic API keys (OpenAI / Anthropic / Stripe / GitHub / Slack / Google), PEM private key blocks

Each detector pairs a pattern with a validator: credit cards must pass the Luhn check, IBANs must pass mod-97, US SSNs in reserved area ranges are rejected, and TC Kimlik numbers must pass their two check digits. This keeps false positives rare.
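
To illustrate the pattern-plus-validator pairing, here is a minimal Python sketch of two of the checksums named above, Luhn and IBAN mod-97. The function names and the candidate regex are assumptions made for this example; they are not PromptRails' actual detectors.

```python
import re

# Hypothetical candidate pattern: 13-19 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(candidate: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


def iban_valid(candidate: str) -> bool:
    """ISO 13616 mod-97 check: move the first four characters to the end,
    map letters to 10..35, and require the resulting integer mod 97 == 1."""
    iban = candidate.replace(" ", "").upper()
    if len(iban) < 5 or not (iban.isascii() and iban.isalnum()):
        return False
    rearranged = iban[4:] + iban[:4]
    as_number = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(as_number) % 97 == 1


def find_cards(text: str) -> list[str]:
    """Keep only pattern matches that also pass the validator."""
    return [m.group(0) for m in CARD_CANDIDATE.finditer(text) if luhn_valid(m.group(0))]
```

In this sketch a 13-digit order number that fails Luhn is ignored, while a well-formed test card number is reported. That is the effect the validators are meant to have on false positives.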

Two additional types — Name and Address — have no regex (their shapes are too ambiguous) and are masked only when you mark them explicitly on a datasource column.

How It Works

Prompt / tool result (real PII)
    │
    ▼
Detect + replace with [PII_TYPE_xxxxxxxx] placeholders
    │
    ▼
Cloud LLM provider (sees only the placeholders)
    │
    ▼
Response with placeholders preserved
    │
    ▼
Restore originals before the agent / tool / chat client reads them

The placeholder format is stable, so the LLM can reason about coreferences ("the email I mentioned earlier"). Tool calls receive the real value when they execute: the masking boundary is strictly the cloud provider.
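
The round trip above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PromptRails engine: the `MaskingSession` class, the email regex, and the concrete placeholder rendering (`[EMAIL_1a2b3c4d]`, one reading of `[PII_TYPE_xxxxxxxx]`) are all hypothetical.

```python
import re
import secrets

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed placeholder shape: [TYPE_ followed by 8 hex characters and ].
PLACEHOLDER_RE = re.compile(r"\[[A-Z_]+_[0-9a-f]{8}\]")


class MaskingSession:
    """Holds the placeholder mapping for one conversation (hypothetical)."""

    def __init__(self) -> None:
        self._by_value: dict[tuple[str, str], str] = {}
        self._by_placeholder: dict[str, str] = {}

    def _placeholder_for(self, pii_type: str, value: str) -> str:
        key = (pii_type, value)
        if key not in self._by_value:
            placeholder = f"[{pii_type}_{secrets.token_hex(4)}]"
            self._by_value[key] = placeholder
            self._by_placeholder[placeholder] = value
        # Repeat mentions reuse the same placeholder, which keeps coreference intact.
        return self._by_value[key]

    def mask(self, text: str) -> str:
        """Outbound path: replace detected values before the provider sees them."""
        return EMAIL_RE.sub(lambda m: self._placeholder_for("EMAIL", m.group(0)), text)

    def restore(self, text: str) -> str:
        """Response path: put the originals back; unknown placeholders are left as-is."""
        return PLACEHOLDER_RE.sub(
            lambda m: self._by_placeholder.get(m.group(0), m.group(0)), text
        )
```

Calling `mask` on a prompt and `restore` on the model's reply with the same session returns the original values to the agent, while the text in between contains only placeholders.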

Workspace Policy

Open Settings → Data Masking in your workspace to turn on the master switch. The page has two controls:

Enabled — off by default. When on, every outbound LLM call from this workspace runs through the masking engine.
Failure mode — what to do if the masking engine itself fails (rare; the engine talks to a workspace-local store):
- Strict (recommended) — abort the request rather than risk leaking PII. Best for compliance-sensitive workspaces.
- Permissive — log a warning and let the request through. Useful in dev workspaces where stability beats strict guarantees.

You can also restrict masking to a subset of detector types — for example, "only mask EMAIL and PHONE" — when you want a targeted policy.
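
The interaction between the master switch, the failure mode, and a restricted type list can be sketched as follows. All names here (`MaskingPolicy`, `prepare_outbound`, `MaskingError`) are hypothetical, and defaulting the failure mode to strict is an assumption; the settings page only states that strict is recommended.

```python
import logging
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

log = logging.getLogger("masking")


class MaskingError(RuntimeError):
    """Raised when masking fails under the strict failure mode."""


@dataclass(frozen=True)
class MaskingPolicy:
    enabled: bool = False                    # off by default
    failure_mode: str = "strict"             # "strict" or "permissive" (default is assumed)
    types: Optional[FrozenSet[str]] = None   # None means "all detector types"


def prepare_outbound(
    text: str,
    policy: MaskingPolicy,
    mask_fn: Callable[[str, Optional[FrozenSet[str]]], str],
) -> str:
    """Apply the workspace policy to one outbound prompt."""
    if not policy.enabled:
        return text
    try:
        return mask_fn(text, policy.types)
    except Exception as exc:
        if policy.failure_mode == "strict":
            # Abort rather than risk sending unmasked PII.
            raise MaskingError("masking failed; request aborted") from exc
        # Permissive: warn and let the request through unmasked.
        log.warning("masking failed; sending unmasked request: %s", exc)
        return text
```

A targeted policy such as "only mask EMAIL and PHONE" would be expressed in this sketch as `MaskingPolicy(enabled=True, types=frozenset({"EMAIL", "PHONE"}))`.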

Per-Agent and Per-Datasource Overrides

The workspace policy is the default. Individual agents and data sources can override it.

In the Studio detail page for an agent or data source, open the PII Masking tab (it's hidden behind the + menu — you opt in to see it). The tab offers a three-way control:

Inherit workspace policy — follow whatever the workspace setting is. Recommended unless this specific resource has a different requirement.
Force on — always mask this resource's outbound calls, even if the workspace policy is off.
Force off — skip masking on this resource even if the workspace policy is on. Use sparingly — typically for a non-PII dev datasource where masking adds noise.

A small chip on the tab label tells you at a glance whether an explicit override is set.
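
The three-way control reduces to a small decision function. The sketch below uses hypothetical names (`Override`, `masking_active`) and covers a single resource only; how an agent override and a datasource override combine is not described here, so it is left out.

```python
from enum import Enum


class Override(Enum):
    INHERIT = "inherit"
    FORCE_ON = "force_on"
    FORCE_OFF = "force_off"


def masking_active(workspace_enabled: bool, override: Override = Override.INHERIT) -> bool:
    """Resolve whether masking runs for one agent or data source."""
    if override is Override.FORCE_ON:
        return True
    if override is Override.FORCE_OFF:
        return False
    # Inherit: follow whatever the workspace policy says.
    return workspace_enabled
```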

Marking Datasource Columns

Detectors catch values whose shape is recognizable (an email always looks like an email). They don't catch names, internal customer IDs, or domain-specific identifiers — but those are usually the values you most want to mask in queries.

Open a credential's detail page in Settings → Credentials. The schema view now carries a PII dropdown per column:

Mark a column as	Behaviour
`Email`, `Phone`, `Credit card`, `IBAN`, `SSN`, `TC Kimlik`, `IP`, `JWT`, `AWS / API / Private key`	Values are masked as the matching built-in type. Useful when you already know what the column holds, even though the detector would also catch the value in raw text.
`Name`, `Address`	High-confidence masking for values no regex would catch.
`none`	Default. The column flows through unmasked unless a detector matches its content.

After saving, every value returned in that column from a datasource tool call gets masked before the prompt that wraps it reaches the LLM. The trace count badge tells you how many fields were intercepted.
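
As a rough picture of what column marks do to a datasource result, the sketch below masks marked columns row by row and returns the number of intercepted fields. The function name, the row shape, and the placeholder rendering are assumptions for illustration. In this simplified version unmarked columns pass through untouched, whereas the real engine would still run its detectors over their content.

```python
import secrets
from typing import Any, Dict, List, Tuple


def mask_rows(
    rows: List[Dict[str, Any]],
    column_marks: Dict[str, str],
    placeholders: Dict[Tuple[str, str], str],
) -> Tuple[List[Dict[str, Any]], int]:
    """Mask marked columns; `placeholders` maps (type, value) to a placeholder
    and is updated in place so repeated values reuse the same placeholder."""
    intercepted = 0
    masked_rows = []
    for row in rows:
        masked = dict(row)  # copy, so the caller's rows keep the real values
        for column, pii_type in column_marks.items():
            value = row.get(column)
            if pii_type == "none" or value is None:
                continue
            key = (pii_type, str(value))
            if key not in placeholders:
                placeholders[key] = f"[{pii_type}_{secrets.token_hex(4)}]"
            masked[column] = placeholders[key]
            intercepted += 1
        masked_rows.append(masked)
    return masked_rows, intercepted
```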

Trace Visibility

Every LLM span in the trace UI shows a small amber "N PII masked" chip in its header when the call had PII intercepted, where N is the number of intercepted values. The trace store never holds the original values themselves, only the count, model name, and the usual timing / token usage. An auditor can therefore verify that masking ran without gaining a path to the raw PII that was masked.

The chip appears on the trace list view too, so you can scan a session for which calls had PII flow through them.

Upstream Hints (API Clients)

When you call the OpenAI-compatible gateway directly (without going through an agent), you can attach structured PII markers with the X-Masking-Hints header. The value is a base64-encoded JSON array:

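# Encode the hints as single-line base64 (GNU base64 wraps long output, so strip newlines).
HINTS=$(printf '%s' '[{"value":"John Doe","type":"NAME"},{"value":"alice@x.com","type":"EMAIL"}]' | base64 | tr -d '\n')

# Send the request with the hints attached; the body is JSON, so declare it as such.
curl https://api.promptrails.ai/v1/chat/completions \
  -H "Authorization: Bearer $PROMPTRAILS_API_KEY" \
  -H "X-Workspace-ID: $WORKSPACE_ID" \
  -H "X-Masking-Hints: $HINTS" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pr/gpt-4o",
    "messages": [{"role":"user","content":"draft an email to John Doe at alice@x.com"}]
  }'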

Hints take precedence over the built-in detectors on overlap — and they're the only way to mask NAME and ADDRESS values for ad-hoc text outside a datasource.
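
If your client is written in Python rather than shell, the header value can be built with the standard library. This is a minimal sketch: the helper names are hypothetical, and the header names are taken from the curl example above. It only constructs the headers and sends no request.

```python
import base64
import json
from typing import Dict, List


def encode_masking_hints(hints: List[Dict[str, str]]) -> str:
    """Serialize the hints as a JSON array and base64-encode it on a single line."""
    payload = json.dumps(hints, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


def build_headers(api_key: str, workspace_id: str, hints: List[Dict[str, str]]) -> Dict[str, str]:
    """Assemble the request headers used in the curl example."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Workspace-ID": workspace_id,
        "X-Masking-Hints": encode_masking_hints(hints),
        "Content-Type": "application/json",
    }
```

`base64.b64encode` never inserts line breaks, so the value is safe to place in a header as-is.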

What Happens On Downgrade

If a workspace was on the Pro plan with masking enabled and then downgrades to Starter, the gateway notices on the next request and stops masking — your settings document is preserved, so re-upgrading restores the prior behaviour. If you actively try to enable masking from a plan without the feature, the API returns 402 Payment Required and the dashboard surfaces an upgrade card with a link to billing.
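
The two downgrade behaviours are easy to confuse, so here is a small sketch that separates them. The plan identifiers, function names, and the `200` success status are assumptions; only the `402` response comes from the description above. The sketch also assumes that turning masking off is always allowed.

```python
# Plans that include Data Masking (lowercase identifiers are assumed).
MASKING_PLANS = {"pro", "enterprise"}


def masking_runs(plan: str, settings_enabled: bool) -> bool:
    """Gateway-side check on each request: the saved setting is kept,
    but it only takes effect while the plan includes the feature."""
    return settings_enabled and plan in MASKING_PLANS


def update_masking_setting(plan: str, enable: bool) -> int:
    """Settings API sketch: returns an HTTP status code."""
    if enable and plan not in MASKING_PLANS:
        return 402  # Payment Required
    return 200  # assumed success status
```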

What Doesn't Change

The agent runtime sees real values. Tools called with a masked email get the real email when they execute.
Coreference within a single conversation works — the LLM is given the same placeholder for repeat mentions, so it can still reason about "the customer", "their email", etc.
Streaming continues to work — placeholders that span chunk boundaries are reassembled before they reach your client.
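
One way to picture the streaming guarantee is a buffer that holds back any unfinished placeholder at the end of a chunk. The sketch below is a simplified illustration, not the gateway's implementation: the placeholder regex, the `max_len` cutoff, and the function name are assumptions.

```python
import re
from typing import Dict, Iterable, Iterator

# Assumed placeholder shape: [TYPE_ followed by 8 hex characters and ].
PLACEHOLDER_RE = re.compile(r"\[[A-Z_]+_[0-9a-f]{8}\]")


def restore_stream(
    chunks: Iterable[str], mapping: Dict[str, str], max_len: int = 64
) -> Iterator[str]:
    """Yield restored text, never emitting a half-received placeholder."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        # Restore every placeholder that is now complete.
        buf = PLACEHOLDER_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), buf)
        # If the tail holds an unclosed "[", it may be the start of a placeholder:
        # keep it for the next chunk, unless it is too long to be one.
        cut = buf.rfind("[")
        if cut != -1 and "]" not in buf[cut:] and len(buf) - cut <= max_len:
            out, buf = buf[:cut], buf[cut:]
        else:
            out, buf = buf, ""
        if out:
            yield out
    if buf:
        # Stream ended with an unfinished bracket; pass it through unchanged.
        yield buf
```

The trade-off in this design is a little added latency on text that happens to contain an opening bracket, in exchange for never leaking a partial placeholder to the client.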

Audit Notes

Masking state is per-workspace; no data crosses between workspaces.
The placeholder mapping store is encrypted at rest. Plaintext PII is never written to the masking store.
Workspace mappings expire one hour after last activity by default.
The trace store only ever sees masked content and the count attribute. There is no path from the trace UI to the original values.
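
The "one hour after last activity" rule is a sliding expiry: every read or write resets the clock for that workspace. The sketch below shows only that timing behaviour with hypothetical names. It keeps values in plain memory for brevity and deliberately omits the encryption at rest described above.

```python
import time
from typing import Callable, Dict, Optional


class WorkspaceMappings:
    """Per-workspace placeholder mappings with a sliding idle timeout (illustrative)."""

    def __init__(self, ttl_seconds: float = 3600.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Dict[str, str]] = {}
        self._last_activity: Dict[str, float] = {}

    def _expire_if_idle(self, workspace_id: str) -> None:
        last = self._last_activity.get(workspace_id)
        if last is not None and self._clock() - last > self._ttl:
            self._data.pop(workspace_id, None)
            self._last_activity.pop(workspace_id, None)

    def put(self, workspace_id: str, placeholder: str, value: str) -> None:
        self._expire_if_idle(workspace_id)
        self._data.setdefault(workspace_id, {})[placeholder] = value
        self._last_activity[workspace_id] = self._clock()

    def get(self, workspace_id: str, placeholder: str) -> Optional[str]:
        self._expire_if_idle(workspace_id)
        mapping = self._data.get(workspace_id)
        if mapping is None:
            return None
        self._last_activity[workspace_id] = self._clock()  # activity resets the timer
        return mapping.get(placeholder)
```

Because lookups are keyed by workspace first, a placeholder issued in one workspace resolves to nothing in another.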

Related

Guardrails — content-safety scanners that run around the LLM call. Composes with masking; both are independently configurable.
Security — workspace isolation, encryption, and authentication that masking builds on.
Billing & Plans — feature-flag matrix per plan tier.
