Privacy Vault Protocol (PVP) v1 — Specification¶
Project: mcp-pvp
Status: Draft (v1)
Scope: Local-first tokenization of sensitive data, policy-gated disclosure, and “deliver mode” injection for MCP tool calls.
Tokenize → Policy → Deliver → Audit
Agents and LLMs operate on references, not raw values.
1. Motivation¶
MCP-based agents frequently handle sensitive inputs (PII, secrets). Sending raw values into LLM prompts, logs, telemetry, or cloud orchestration increases the risk of leakage.
PVP defines a local-first protocol where: - sensitive values are replaced with typed opaque tokens, - raw values are stored only in a local vault session, - disclosure is allowed only to approved sinks under policy, - deliver mode injects sensitive values locally at the tool boundary so raw values never return to the agent/LLM.
2. Definitions¶
2.1 Vault¶
A local service responsible for: - detecting sensitive spans (PII/secrets), - replacing them with tokens, - storing raw values in a short-lived session, - enforcing disclosure policies, - auditing tokenization/disclosure actions, - optionally executing or dispatching tool calls (deliver mode).
2.2 Vault Session (vault_session)¶
A scope container for tokens. Tokens are valid only within a session.
Properties:
- unguessable identifier (e.g., vs_...)
- TTL-based (expires automatically)
- optionally bound to workflow run/step context
2.3 Token Reference (pii_ref)¶
An opaque identifier (e.g., tkn_...) representing a stored sensitive value inside a vault session.
2.4 Sink¶
A destination where sensitive values may be disclosed/delivered:
- tool: a named tool invocation (preferred)
- engine: a cloud workflow engine/orchestrator (default deny)
- llm: a model prompt/response pipeline (default deny)
2.5 Capability (cap)¶
A signed authorization blob that permits disclosure/delivery of a specific token to a specific sink/argument path for a limited time.
Capabilities are the primary defense against:
- prompt injection (“restore everything”)
- token spoofing (LLM hallucinating tkn_...)
- over-broad restoration requests
3. Token Formats¶
PVP defines two standard token representations.
3.1 Text Token (LLM-safe)¶
[[PII:<TYPE>:<REF>]]
Example:
[[PII:EMAIL:tkn_a1b2c3]]
3.2 JSON Token Object (preferred for tool args)¶
{ "$pii_ref": "tkn_a1b2c3", "type": "EMAIL", "cap": "cap_..." }
Rules:
typeis informational; the vault uses stored metadata + capability claims.capis RECOMMENDED and in most deployments treated as REQUIRED for any disclosure.
4. PII Type Registry (v1)¶
Initial recommended types:
EMAILPHONEIPV4CC(credit card-like values)API_KEY(secret-like tokens)
4.1 Replacement Modes¶
Each type may be configured as:
TOKENIZE(reversible): store raw value and return tokenMASK(irreversible default): return masked placeholder without storing raw
Recommended defaults:
EMAIL,PHONE,IPV4:TOKENIZECC,API_KEY:MASK(tokenization allowed only if explicitly enabled)
5. Security Requirements (Normative)¶
A compliant vault MUST:
- Default deny disclosure unless explicitly permitted by policy.
- Deny disclosure to sink kinds
llmandengineunless explicitly enabled. - Bind tokens to a
vault_session; tokens MUST NOT be globally valid. - Enforce session TTL and reject expired sessions.
- Not log raw sensitive values (audit only metadata).
- Validate capabilities (if enabled): signature, expiration, and claim matches.
- Enforce rate/size limits on disclosure to reduce exfiltration blast radius.
6. Policy Model¶
Policy evaluation occurs inside the vault.
6.1 Sink Allow-Lists¶
Policy rules allow specific types to be disclosed to specific tool sinks and optionally restrict to specific argument paths.
Example:
{
"sinks": {
"tool:send_email": {
"allow": [
{ "type": "EMAIL", "arg_paths": ["to", "cc", "bcc"] }
]
}
},
"defaults": { "allow": [] }
}
6.2 Limits¶
Recommended limits (per workflow step):
max_disclosures_per_stepmax_total_disclosed_bytes_per_step
The vault should track disclosures by (workflow_run_id, step_id) when provided.
7. Capability Format¶
7.1 Claims (JSON)¶
A capability binds token usage to a sink and (optionally) to run context.
{
"v": 1,
"vault_session": "vs_...",
"pii_ref": "tkn_...",
"pii_type": "EMAIL",
"sink": { "kind": "tool", "name": "send_email", "arg_path": "to" },
"run": { "workflow_run_id": "wr_...", "step_id": "s2" },
"exp": 1730000000
}
7.2 Encoding (recommended, lightweight)¶
cap = base64url(json_bytes) + "." + base64url(hmac_sha256(secret, json_bytes))
Verification:
- signature matches
expnot exceeded- claims match requested sink/run constraints
8. Operations¶
PVP defines three primary operations:
tokenizeresolve(fallback)deliver(recommended)
All operations return an envelope:
8.1 Response Envelope¶
Success:
{ "ok": true, "result": { ... }, "error": null }
Failure:
{
"ok": false,
"result": null,
"error": {
"code": "ERR_POLICY_DENIED",
"message": "Disclosure denied by policy",
"details": { "rule": "tool:send_email allows EMAIL only for to/cc/bcc" }
}
}
9. tokenize¶
9.1 Purpose¶
Detect sensitive spans in content, replace with tokens, and store raw values locally.
9.2 Request¶
{
"vault_session": null,
"content": "Contact me at mitiku@example.com",
"content_type": "text/plain",
"run": { "workflow_run_id": "wr_...", "step_id": "s1" },
"policy_context": { "org_id": "org_...", "user_id": "u_..." },
"options": {
"token_format": "TEXT",
"include_caps": true,
"types": ["EMAIL", "PHONE", "IPV4", "CC", "API_KEY"]
}
}
9.3 Result¶
{
"vault_session": "vs_01J...",
"redacted": "Contact me at [[PII:EMAIL:tkn_a1]]",
"tokens": [
{
"ref": "tkn_a1",
"type": "EMAIL",
"occurrences": 1,
"caps": [
{
"sink": { "kind": "tool", "name": "send_email", "arg_path": "to" },
"cap": "cap_..."
}
]
}
],
"stats": { "EMAIL": 1 }
}
Notes:
- If
vault_sessionis null, vault creates one. - If a type is configured as
MASK, vault may output a masked marker instead of issuing a token.
10. resolve (Selective Disclosure, fallback)¶
10.1 Purpose¶
Return raw values for specific token refs, only if permitted.
10.2 Request¶
{
"vault_session": "vs_01J...",
"need": [
{ "ref": "tkn_a1", "cap": "cap_..." }
],
"sink": { "kind": "tool", "name": "send_email", "arg_path": "to" },
"run": { "workflow_run_id": "wr_...", "step_id": "s2" }
}
10.3 Result¶
{
"values": { "tkn_a1": "mitiku@example.com" },
"audit_id": "aud_...",
"disclosed": [
{ "ref": "tkn_a1", "type": "EMAIL", "bytes": 17 }
]
}
Security stance:
- Prefer
deliveroverresolveso raw PII does not transit back to agents/cloud.
11. deliver (Preferred)¶
11.1 Purpose¶
Safely inject sensitive values into tool calls locally under policy and capability checks.
11.2 Request¶
{
"vault_session": "vs_01J...",
"tool_call": {
"name": "send_email",
"args": {
"to": { "$pii_ref": "tkn_a1", "type": "EMAIL", "cap": "cap_..." },
"subject": "Hello",
"body": "..."
}
},
"run": { "workflow_run_id": "wr_...", "step_id": "s3" }
}
11.3 Result¶
{
"delivered": true,
"tool_result": { "message_id": "m_..." },
"audit_id": "aud_..."
}
Behavior:
-
For each token object in args:
-
validate session + token existence
- validate cap (if enabled)
- validate policy for
tool:<name>at the argument path - enforce limits
- Inject raw values just-in-time at the tool boundary.
- Never log raw values.
Execution model:
- Vault may execute tools itself or dispatch to a trusted local runner.
12. Error Codes (Standard)¶
ERR_INVALID_REQUESTERR_UNAUTHENTICATEDERR_UNAUTHORIZEDERR_VAULT_SESSION_UNKNOWNERR_VAULT_SESSION_EXPIREDERR_TOKEN_UNKNOWNERR_CAP_INVALIDERR_CAP_EXPIREDERR_POLICY_DENIEDERR_LIMIT_EXCEEDEDERR_INTERNAL
13. Audit Events¶
A compliant implementation should emit audit events locally (no raw values), including:
TOKENIZE: counts by type, session id, run contextRESOLVE: sink/tool, arg_path, types disclosed, allowed/deniedDELIVER: tool name, arg_paths injected, types disclosed, allowed/denied
Audit entries should include correlation fields when available:
vault_sessionworkflow_run_idstep_idaudit_id
14. Detection Engines¶
PVP does not mandate a specific detection engine. Implementations may use:
- Microsoft Presidio (recommended default when installed)
- a lightweight regex fallback for environments without Presidio
Regardless of detector:
- disclosure and delivery semantics remain the same
- masking/tokenization modes per type must be enforced consistently
15. Compatibility Notes¶
-
The protocol is transport-agnostic and can be exposed via:
-
MCP tools:
pvp.tokenize,pvp.resolve,pvp.deliver - HTTP endpoints (localhost binding recommended)
- All integrations should preserve token objects and avoid “restoring into prompts.”
16. Appendix: Minimal End-to-End Example¶
-
Tokenize:
-
input:
Email me at mitiku@example.com -
redacted:
Email me at [[PII:EMAIL:tkn_a1]] -
LLM tool plan references token:
{
"action": "send_email",
"args": { "to": { "$pii_ref": "tkn_a1", "type": "EMAIL", "cap": "cap_..." } }
}
-
Deliver injects locally and executes:
-
returns tool result, not raw PII
{ "delivered": true, "tool_result": { "message_id": "m_123" } }
::contentReference[oaicite:0]{index=0}