CloakLLM Documentation

Open-source PII protection middleware for LLMs. Detect, tokenize, and audit — before prompts leave your infrastructure.

Welcome to the CloakLLM documentation. CloakLLM is open-source PII protection middleware for LLMs that detects sensitive data, replaces it with reversible tokens, and maintains tamper-evident audit logs — all before your prompts leave your infrastructure.

📄 New whitepaper: The Article 12 Paradox — why GDPR and the EU AI Act cannot both be satisfied without PII middleware.

Quick Install

Python:

pip install cloakllm

JavaScript / TypeScript:

npm install cloakllm

MCP Server:

pip install cloakllm-mcp
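To make the detect → tokenize → restore flow concrete before diving into the SDK, here is a minimal, self-contained sketch of the idea in plain Python. It is an illustration only, not the CloakLLM API: the regex, function names, and single-category handling are simplifications, though the `[CATEGORY_N]` token shape mirrors the library's Normalized Token Standard.

```python
import re

# Conceptual sketch of sanitize/restore with deterministic [EMAIL_N] tokens.
# This is NOT the CloakLLM API -- just the underlying idea, for one category.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\.\w{2,}")

def sanitize(text):
    """Replace each distinct email with a deterministic [EMAIL_N] token."""
    mapping = {}
    def repl(match):
        value = match.group(0)
        if value not in mapping:
            mapping[value] = f"[EMAIL_{len(mapping) + 1}]"
        return mapping[value]
    return EMAIL_RE.sub(repl, text), mapping

def desanitize(text, mapping):
    """Restore the original values from the token mapping."""
    for value, token in mapping.items():
        text = text.replace(token, value)
    return text

clean, mapping = sanitize("Contact alice@example.com or bob@example.com.")
# clean == "Contact [EMAIL_1] or [EMAIL_2]."
assert desanitize(clean, mapping) == "Contact alice@example.com or bob@example.com."
```

Because the tokens are deterministic and category-labeled, the LLM still sees that two distinct email addresses are involved, while the real values never leave your infrastructure.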

Key Features

  • 3-pass PII detection: regex, spaCy NER (Python), and optional Ollama LLM
  • Multi-language detection: 13 locales (de, fr, es, it, pt, nl, pl, se, no, dk, fi, gb, au) with locale-specific PII patterns
  • Reversible tokenization: deterministic [CATEGORY_N] tokens preserve context for the LLM
  • Cryptographic attestation: Ed25519 signed sanitization certificates with Merkle tree batch proofs
  • Tamper-evident audit logs: hash-chained JSONL with per-entity metadata, EU AI Act Article 12 ready
  • Incremental streaming: StreamDesanitizer replaces tokens as chunks arrive
  • Context risk analysis: scores re-identification risk in sanitized text by analyzing token density, identifying descriptors, and relationship edges
  • Normalized Token Standard: formal spec with validation utilities (validateToken, parseToken), canonical regex, and 62 built-in categories
  • Pluggable detection backends: DetectorBackend base class for custom detection pipelines; swap or extend the default regex→NER→LLM pipeline
  • Article 12 Compliance Mode: formal EU AI Act compliance profile (v0.6.0) with compliance_summary(), export_compliance_config(), and structured verify_audit(output_format="compliance_report")
  • Enterprise Key Management: opt-in HSM/KMS signing keys (AWS KMS, GCP KMS, Azure Key Vault, HashiCorp Vault) — Python SDK
  • Security hardening: Ollama SSRF prevention, CLI PII redaction, thread-safe internals, redacted analysis output
  • One-line integration: wraps OpenAI SDK, LiteLLM, Vercel AI SDK, and MCP

Next Steps

Read the complete usage guide covering installation, configuration, middleware integration, audit logs, and more.
