The product

Middleware in front of your LLM calls: detect PII, tokenize it, and write tamper-evident audit logs, all before the prompt leaves your infrastructure. Open source, MIT licensed, self-hosted.

3-Pass Detection Pipeline

Three detection layers run in sequence, so each pass can catch PII that the previous one missed.

Pass 1

Regex

High-precision pattern matching for structured data.

EMAIL, SSN, CREDIT_CARD, PHONE, IP_ADDRESS, API_KEY, JWT, IBAN
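To make the regex pass concrete, here is a minimal sketch of pattern-based detection. The three patterns and the regex_pass helper are illustrative assumptions, not CloakLLM's actual rules or API.

```python
import re

# Illustrative patterns only; these are assumptions, not CloakLLM's real rules.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def regex_pass(text):
    """Return (category, start, end, match) tuples sorted by position."""
    hits = []
    for category, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((category, m.start(), m.end(), m.group()))
    return sorted(hits, key=lambda h: h[1])


hits = regex_pass("Mail sarah.j@techcorp.io from 10.0.0.7, SSN 123-45-6789.")
for category, start, end, value in hits:
    print(category, start, end, value)
```

Structured identifiers like these have a fixed shape, which is why a pattern pass can run first with high precision and leave names and free-form context to the later passes.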

Pass 2

spaCy NER

Named entity recognition for names, orgs, and locations. (Python only)

PERSON, ORG, GPE

Pass 3

Ollama LLM

Local LLM-based semantic detection for contextual PII. (opt-in)

ADDRESS, DOB, MEDICAL, FINANCIAL, BIOMETRIC

Before — Plaintext prompt

Help me write a follow-up email
to Sarah Johnson (sarah.j@techcorp.io)
about the Q3 security audit.
Her direct line is +1-555-0142.

After — Cloaked prompt

Help me write a follow-up email
to [PERSON_0] ([EMAIL_0])
about the Q3 security audit.
Her direct line is [PHONE_0].

Everything you need to protect PII

Drop-in middleware that works with your existing LLM stack. No vendor lock-in, no cloud dependencies.

9 Detection Categories

Emails, SSNs, credit cards, phone numbers, API keys, IBANs, JWTs, AWS keys, and IP addresses — all detected out of the box.

Reversible Tokenization

Deterministic [CATEGORY_N] tokens preserve context for the LLM. Desanitize to restore originals in responses.
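The round trip can be sketched in a few lines. This is an illustration of the idea under stated assumptions: the patterns, and the sanitize and desanitize helpers defined here, are hypothetical and are not CloakLLM's implementation.

```python
import re

# Illustrative patterns; a real pipeline would also use NER for names.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+\d{1,3}-\d{3}-\d{4}"),
}


def sanitize(text):
    """Replace matches with [CATEGORY_N] tokens; the same value gets the same token."""
    token_map = {}  # token -> original value
    seen = {}       # original value -> token
    counters = {}   # category -> next index

    for category, pattern in PATTERNS.items():
        def repl(m, category=category):
            value = m.group()
            if value not in seen:
                n = counters.get(category, 0)
                counters[category] = n + 1
                seen[value] = f"[{category}_{n}]"
                token_map[seen[value]] = value
            return seen[value]

        text = pattern.sub(repl, text)
    return text, token_map


def desanitize(text, token_map):
    """Restore the original value for every known token."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text


original = "Email sarah.j@techcorp.io or call +1-555-0142. Again: sarah.j@techcorp.io"
cloaked, token_map = sanitize(original)
print(cloaked)
print(desanitize(cloaked, token_map))
```

Because a repeated value maps to the same token, the model can still tell that two mentions refer to the same entity, and the token map is all that is needed to restore the response.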

Tamper-Evident Audit Logs

Hash-chained JSONL entries with SHA-256 and per-entity metadata. No PII stored — just hashes and counts. EU AI Act Article 12 ready.
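For readers new to hash chaining, the sketch below shows the general technique: each JSONL entry stores the previous entry's hash, so editing any line breaks every link after it. The field names (event, prev_hash, hash) and the all-zero genesis value are assumptions for illustration, not CloakLLM's log schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(body):
    # Canonical JSON so the same entry always produces the same hash.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log_lines, event):
    """Append one JSONL line whose hash covers the previous entry's hash."""
    prev_hash = json.loads(log_lines[-1])["hash"] if log_lines else GENESIS
    body = {"event": event, "prev_hash": prev_hash}
    entry = dict(body, hash=_digest(body))
    log_lines.append(json.dumps(entry, sort_keys=True))


def verify_chain(log_lines):
    """Return True if every entry links to its predecessor and its hash matches."""
    prev_hash = GENESIS
    for line in log_lines:
        entry = json.loads(line)
        body = {"event": entry["event"], "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


log = []
# Only counts are recorded here, never the PII itself.
append_entry(log, {"entity_counts": {"EMAIL": 1, "PERSON": 1}})
append_entry(log, {"entity_counts": {"PHONE": 1}})
print(verify_chain(log))
```

Changing a single count in an earlier line makes verify_chain return False, which is what makes the log tamper-evident rather than tamper-proof.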

One-Line Integration

cloakllm.enable() wraps LiteLLM (Python) or the OpenAI SDK (JS). Works with Vercel AI SDK middleware too.

Multi-Language Detection

13 locales (DE, FR, ES, IT, PT, NL, PL, SE, NO, DK, FI, GB, AU) with country-specific PII patterns for SSNs, tax IDs, and more.

Local LLM Detection

Opt-in Ollama integration catches addresses, medical terms, DOBs, and more. Data never leaves your machine.

Cryptographic Attestation

Ed25519 signed sanitization certificates prove compliance. Merkle tree batch proofs. Cross-language compatible.
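The Merkle part of a batch proof can be sketched with the standard library alone. This is a generic illustration, not CloakLLM's format: the leaf encoding and the choice to duplicate the last node on odd levels are assumptions, and Ed25519 signing of the root is left out because it needs a third-party library.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for a non-empty list of byte-string leaves.

    proof is a list of (sibling_hash, sibling_is_left) pairs, leaf level first.
    """
    level = [_h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node (assumed rule)
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf, proof, root):
    """Recompute the path from one leaf to the root using only the proof."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


certs = [b"cert-a", b"cert-b", b"cert-c"]
root, proof = merkle_root_and_proof(certs, 2)
print(verify_proof(b"cert-c", proof, root))
```

A single signature over the root then covers every certificate in the batch, and each certificate can be checked on its own with a proof whose length grows with the logarithm of the batch size.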

Incremental Streaming

StreamDesanitizer replaces tokens as chunks arrive — no buffering the full response. All middleware paths stream incrementally.
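The tricky case in streaming is a token split across two chunks. The sketch below shows one way to handle it by holding back only a trailing fragment that might still become a token. SketchStreamDesanitizer, its methods, and the token regex are hypothetical stand-ins, not the real StreamDesanitizer API.

```python
import re

# Assumed token shape for illustration: [CATEGORY_N]
TOKEN_RE = re.compile(r"\[[A-Z][A-Z_]*_\d+\]")


class SketchStreamDesanitizer:
    """Hypothetical stand-in that buffers only a possibly incomplete token."""

    def __init__(self, token_map, max_token_len=64):
        self.token_map = token_map
        self.max_token_len = max_token_len
        self.pending = ""

    def feed(self, chunk):
        """Consume one chunk and return the text that is safe to emit now."""
        self.pending += chunk
        cut = self.pending.rfind("[")
        tail_open = cut != -1 and "]" not in self.pending[cut:]
        if tail_open and len(self.pending) - cut <= self.max_token_len:
            # The tail may be the start of a token: hold it back.
            ready, self.pending = self.pending[:cut], self.pending[cut:]
        else:
            ready, self.pending = self.pending, ""
        return TOKEN_RE.sub(lambda m: self.token_map.get(m.group(), m.group()), ready)

    def flush(self):
        """Emit whatever is left once the stream ends."""
        rest, self.pending = self.pending, ""
        return rest


token_map = {"[PERSON_0]": "Sarah Johnson", "[EMAIL_0]": "sarah.j@techcorp.io"}
stream = SketchStreamDesanitizer(token_map)
chunks = ["Hi [PERS", "ON_0], mail [EMAIL_0", "] now"]
out = [stream.feed(c) for c in chunks] + [stream.flush()]
print("".join(out))
```

Only the text after the last unmatched "[" is ever delayed, so output keeps flowing while a partial token waits for its closing bracket.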

Context Risk Analysis

Scores re-identification risk in sanitized text. Detects token density, identifying descriptors, and relationship edges that could reveal identity.

Normalized Token Standard

Formal spec (TOKEN_SPEC.md) with validation utilities, canonical regex, and 62 built-in categories. Both SDKs produce identical tokens.

Pluggable Backends

DetectorBackend base class lets you swap or extend the default regex→NER→LLM pipeline with custom detection stages.
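As a rough picture of what a pluggable stage could look like, here is a sketch built on an abstract base class. SketchBackend, Detection, run_pipeline, and the EMP-123456 identifier format are all invented for illustration; the real DetectorBackend interface may differ.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    category: str
    start: int
    end: int


class SketchBackend(ABC):
    """Hypothetical stand-in for a detector base class."""

    @abstractmethod
    def detect(self, text, already_found):
        """Return new Detection objects, given what earlier stages found."""


class EmailBackend(SketchBackend):
    _pattern = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

    def detect(self, text, already_found):
        return [Detection("EMAIL", m.start(), m.end()) for m in self._pattern.finditer(text)]


class EmployeeIdBackend(SketchBackend):
    """Custom stage for an invented 'EMP-123456' identifier format."""

    _pattern = re.compile(r"\bEMP-\d{6}\b")

    def detect(self, text, already_found):
        return [Detection("EMPLOYEE_ID", m.start(), m.end()) for m in self._pattern.finditer(text)]


def run_pipeline(text, backends):
    """Run stages in order; later stages cannot claim spans already taken."""
    found = []
    for backend in backends:
        for det in backend.detect(text, found):
            if all(det.end <= f.start or det.start >= f.end for f in found):
                found.append(det)
    return sorted(found, key=lambda d: d.start)


text = "Ask EMP-004217 or sarah.j@techcorp.io"
found = run_pipeline(text, [EmailBackend(), EmployeeIdBackend()])
print([(d.category, text[d.start:d.end]) for d in found])
```

Passing earlier results into each stage lets a custom detector skip spans that are already covered, which keeps stage order meaningful when detectors overlap.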

EU AI Act Compliance Reports

Per-article compliance reports (Art 12/19/4a/50) with COMPLIANT / NON_COMPLIANT verdicts and an honest, machine-readable coverage matrix — what CloakLLM provides and what remains your responsibility. JSON, Markdown, or PDF.

Independent Verification

A standalone verifier (cloakllm-verifier, Python + JS) re-checks the hash chain, RFC 3161 timestamps, key provenance, and compliance reports — without the SDK and without trusting our code.

Security Hardened

Ollama SSRF prevention, CLI PII redaction by default, thread-safe internals, and redacted analysis output.

One line to protect your LLM calls

Drop-in middleware for every major LLM framework. No code rewrites needed.

Python

from cloakllm import enable_openai
from openai import OpenAI

client = OpenAI()
enable_openai(client)  # Wraps OpenAI SDK — all calls are now protected

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Help me email Sarah Johnson (sarah.j@techcorp.io)"
    }],
)

# PII automatically restored in the response
print(response.choices[0].message.content)

Get started in seconds

Install the SDK for your language and start protecting PII immediately.

Python

$ pip install "cloakllm[detection]"

$ python -m spacy download en_core_web_sm

The base pip install cloakllm ships without NER (regex detection and audit logging only); the [detection] extra adds spaCy NER (PERSON/ORG/GPE).

JavaScript / TypeScript

$ npm install cloakllm

MCP Server & Verifier

$ pip install cloakllm-mcp

$ pip install cloakllm-verifier

SDK Comparison

Three SDKs, same core protection. Pick the one that fits your stack.

Feature	Python	JavaScript	MCP
Regex PII Detection
spaCy NER (PERSON, ORG, GPE)
Ollama LLM Detection (opt-in)
Reversible Tokenization
Redaction Mode
Hash-Chained Audit Logs
CLI (scan / verify / stats)
Multi-Turn Token Maps
Custom Patterns
Field-Level PII Metadata
Batch Processing
Performance Metrics
Incremental Streaming
Middleware Integration	OpenAI / LiteLLM	OpenAI / Vercel	Claude Desktop
Zero Runtime Dependencies

Using CloakLLM? Preparing for a 2027 audit?

We're talking to teams deploying AI under the EU AI Act — what you're building, what an audit will ask of you, and what's missing. Auditors and certification bodies especially welcome.

Talk to us