🔒 On-Premise Enterprise PII Redaction

Redact Every Name,
ID & Number —
Automatically.

PiiRemover surgically removes PII from PDFs, scanned images, DOCX files, and plain text using a stacked engine of 11 detection strategies. It runs 100% on your servers, with zero cloud exposure.


11 Pattern Engines · ∞ PII Field Types · 100% On-Premise · 0 Cloud Calls

🔒 localhost:7049/admin/tester

🔒 PiiRemover Dashboard 🔍 Tester & Debug Clients PII Fields Logs ⚙ Settings v3.0.11

📂 Drop files here or click to select

PDF · Image · TXT · DOCX — up to 50 MB

PDF patient_record.pdf PDF discharge_summary.pdf

OCR + Redact pipeline · same engine as the API

PDF patient_record.pdf 128 KB 📍OCR: 312ms · ✓ 5 matches

📝 OCR 312ms 🔍 5 matches 🏷 3 fields ⚡ Total 448ms

Patient Name Israeli ID Phone

📝 Original — PII highlighted

00000  Patient: David Cohen
00022  ID: IL-1234567
00035  DOB: 1985-03-14
00049  Phone: 050-123-4567
00065  Diagnosis: Hypertension
00087  Attending: Dr. Sarah Levy

🔒 Redacted Output ⏱ 136ms

00000  Patient: XXXXXXXXXXX
00022  ID: XXXXXXXXXX
00035  DOB: XXXXXXXXXX
00049  Phone: XXXXXXXXXXXX
00065  Diagnosis: Hypertension
00087  Attending: XXXXXXXXXXXXXX

PII Fields Configuration

Define What Gets Redacted.
In Full Detail.

The Fields page is where you create and tune every detection rule — fields, patterns, preserve lists, and name dictionaries. The real admin UI, exactly as it looks.

🔒 localhost:7049/admin/fields

🔒 PiiRemover Dashboard 🔍 Tester PII Fields Logs ⚙ Settings 🚀 Getting Started v3.0.11

PII Fields & Patterns 📖 Pattern Help

💾 Save Backup: Download fields as JSON

📂 Load Backup: Restore from JSON file

Add PII Field (redact matches)

Field name

Replacement char × match length

█

•

🛡 Add Preserve Field (never redacted)

Terms listed here override all rules — institution names, medicine names, etc.

Field name

Initial terms (pipe-separated)

Field Name           Replace   Patterns     Priority
Patient Name         █         3 patterns   10
Israeli ID           █         2 patterns   20
Phone Number         *         1 pattern    30
🛡 Medical Terms      -         Preserve     1
Names Dictionary     █         FileList     5

▾ Patterns for: Patient Name

Type         Pattern                  Notes               Scope / Matches
AfterLabel   Patient:|שם מלא:         Value after label   ✓ 1 hit
FileList     names_il.dat (84,320)    -                   🎯 Scope: pos 0–500 · ✓ 2 hits
ConstList    Dr. Cohen|Dr. Levy       Known doctors       ✓ 1 hit

Getting Started Guide

Zero to First Redaction
in 5 Minutes.

The built-in guide runs live in your browser and walks through every step with real before/after examples and direct links to each section.

🔒 localhost:7049/admin/getting-started


🚀 Getting Started

From zero to your first redacted document in under 5 minutes.

① Define a PII Field ② Add a Pattern ③ Test in Live Tester ④ Protect specific terms ⑤ Call the API ⑥ Name Lists & Scope

Define a PII Field — what should be redacted?

A PII Field is a named category (e.g. "Israeli ID", "Phone", "Patient Name"). Each field has a replacement character that fills the redacted space, preserving document length.

📄 Input document

Patient: David Cohen
ID: 123456789
Phone: 050-1234567
Complaint: Chest pain

→

✅ After redaction

Patient: ███████████
ID: █████████
Phone: ███████████
Complaint: Chest pain

💡 Replacement char █ repeats to match the original text length, so positional layout is preserved. Use *, -, # or any other single character.
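The length-preserving replacement described above can be illustrated with a minimal Python sketch. This is not PiiRemover's implementation; the function name `mask` and its parameters are hypothetical and chosen only for illustration.

```python
def mask(text: str, start: int, length: int, fill: str = "█") -> str:
    """Replace text[start:start+length] with `fill` repeated `length` times.

    Hypothetical helper: shows why a single replacement character keeps
    every later character at the same offset.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    return text[:start] + fill * length + text[start + length:]


doc = "Patient: David Cohen"
redacted = mask(doc, start=9, length=11)

# The redacted string has the same length as the input, so any offsets
# computed on the original text remain valid on the redacted text.
print(redacted)
print(len(redacted) == len(doc))
```

Because the mask is one character repeated, a position recorded against the original document still points at the same place in the redacted output.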

→ Create a field now: Fill in field name → click Add Field.

Add a Pattern — how should the field find its data?

Every field needs at least one pattern. Pick the simplest type — you can stack multiple patterns on one field.

🔤 WholeWord        "David Cohen"        Exact names
📋 ConstList        Cohen|Levy|Sharon    Fixed term list
🔢 NumberSeq        length=9             Israeli ID digits
🏷 AfterLabel       "Patient:"           Value after label
🔍 Regex            \b\d{9}\b            Complex patterns
📁 FileList         names.dat            Large dictionaries
🎯 BetweenMarkers   [START]..[END]       Bracketed values
↩ BeginsWith       +972                 Prefix matching

📖 Full reference for all 11 engines at 📖 Pattern Help
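To make the pattern types more concrete, here is a rough Python sketch of how four of them (WholeWord, ConstList, NumberSeq, AfterLabel) could be expressed as regular expressions. This is only one plausible interpretation, not how PiiRemover's engines are actually built; every function name below is hypothetical.

```python
import re


def whole_word(term):
    """WholeWord: the exact term, bounded by word boundaries."""
    return re.compile(r"\b" + re.escape(term) + r"\b")


def const_list(spec):
    """ConstList: any term from a pipe-separated list."""
    terms = "|".join(re.escape(t) for t in spec.split("|") if t)
    return re.compile(r"\b(?:" + terms + r")\b")


def number_seq(length):
    """NumberSeq: a run of exactly `length` digits."""
    return re.compile(r"(?<!\d)\d{" + str(length) + r"}(?!\d)")


def after_label(label):
    """AfterLabel: the rest of the line following the label."""
    return re.compile(re.escape(label) + r"[ \t]*(?P<value>[^\r\n]+)")


def spans(pattern, text):
    """Return (start, length) for each hit; AfterLabel reports only the value."""
    group = "value" if "value" in pattern.groupindex else 0
    return [(m.start(group), m.end(group) - m.start(group))
            for m in pattern.finditer(text)]


text = "Patient: David Cohen\nID: 123456789\nPhone: 050-1234567"
print(spans(after_label("Patient:"), text))
print(spans(number_seq(9), text))
print(spans(const_list("Cohen|Levy|Sharon"), text))
```

Stacking several patterns on one field then amounts to collecting the spans from each pattern and redacting their union.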

Test in the Live Tester — verify before production

Upload a real document, run the engine, inspect matches visually. The Tester runs the exact same engine as the API.

📤 Upload & Analyze

① Drag a file onto the drop zone
② Click Analyze All
③ OCR pane = raw extracted text
④ Redact pane = replacements in green

🔍 Toolbar tools

📍 Pos — character position markers
⊞ Grid — offset gutter per line
⬇ Redacted — download as .txt
Right-click → Set Scope End / report miss

🛡 Protect specific terms — Preserve Fields

Preserve Fields are a whitelist: any listed term is never redacted, regardless of other rules. Typical entries are hospital names, medicine names, and city names.

⚠ Without Preserve

Admitted to ████████ Hospital
Dr. ████████: Atrial Fibrillation

→

✅ With Preserve: "Hadassah"

Admitted to Hadassah Hospital
Dr. ████████: Atrial Fibrillation
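The override behaviour can be sketched in Python as follows. This is an assumption about how a preserve list might be applied (skip any redaction hit that overlaps a preserved term), not a description of PiiRemover's internals. The function names and the doctor's name "Levy" are made up for the example.

```python
import re


def find_terms(terms, text):
    """Return (start, end) spans of whole-word occurrences of any term."""
    if not terms:
        return []
    alternation = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    return [m.span() for m in pattern.finditer(text)]


def redact(text, redact_terms, preserve_terms, fill="█"):
    """Mask redact_terms, except where a hit overlaps a preserved term."""
    preserved = find_terms(preserve_terms, text)
    chars = list(text)
    for start, end in find_terms(redact_terms, text):
        overlaps = any(start < p_end and p_start < end
                       for p_start, p_end in preserved)
        if overlaps:
            continue  # the preserve list wins over every redaction rule
        chars[start:end] = fill * (end - start)
    return "".join(chars)


text = "Admitted to Hadassah Hospital\nDr. Levy: Atrial Fibrillation"
# "Hadassah" is also a given name, so a name dictionary would hit it.
print(redact(text, ["Hadassah", "Levy"], []))
print(redact(text, ["Hadassah", "Levy"], ["Hadassah"]))
```

In the first call the hospital name is masked along with the doctor's name. In the second, only the doctor's name is masked.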

Call the API from your application

# Upload any file, get redacted text + match positions
curl -X POST https://your-server/api/v1/redact/redact \
  -H "X-Api-Key: YOUR_KEY_HERE" \
  -F "file=@patient_record.pdf"

// Response
{
  "matchCount": 5,
  "fieldsHit": ["Patient Name", "Israeli ID", "Phone"],
  "matches": [{
    "startIndex": 9,
    "length": 11,
    "fieldName": "Patient Name",
    "matchedText": "David Cohen",
    "replacement": "███████████"
  }]
}
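For applications that cannot shell out to curl, the same request could be made from Python using only the standard library. The sketch below assumes the endpoint path, the `X-Api-Key` header, and the `file` form field shown in the curl call. The helper names `build_multipart` and `redact_file` are hypothetical, and error handling is omitted.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def redact_file(base_url, api_key, path):
    """POST a file to the redact endpoint and return the parsed JSON response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{base_url}/api/v1/redact/redact",
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (placeholder server and key, as in the curl example):
# result = redact_file("https://your-server", "YOUR_KEY_HERE", "patient_record.pdf")
```

The multipart body is built by hand to avoid third-party dependencies. A project that already uses an HTTP client library would normally let that library do the encoding.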

📋 Common Field Configurations

What to redact          Field             Engine       Pattern                         Replace
Israeli ID (9 digits)   Israeli ID        Regex        \b[0-9]{9}\b                    █
Israeli mobile          Phone             Regex        0[5-9]\d[-\s]?\d{7}             █
Email address           Email             Regex        [a-z0-9.]+@[a-z]+\.[a-z]{2,}    █
Patient from label      Patient Name      AfterLabel   Patient:|שם:                    █
Institution (keep)      🛡 Institutions    Preserve     Hadassah|Ichilov                never
Drug names (keep)       🛡 Medical Terms   Preserve     Aspirin|Warfarin                never
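Before relying on the Regex rows above, it can help to try them against sample strings. The Python sketch below copies the three regular expressions verbatim and runs them with Python's `re` module. PiiRemover's own regex engine may differ in flags or dialect, so treat the results as indicative only. The `find_hits` helper is hypothetical.

```python
import re

# Patterns copied verbatim from the Regex rows of the table.
PATTERNS = {
    "Israeli ID": re.compile(r"\b[0-9]{9}\b"),
    "Phone": re.compile(r"0[5-9]\d[-\s]?\d{7}"),
    "Email": re.compile(r"[a-z0-9.]+@[a-z]+\.[a-z]{2,}"),
}


def find_hits(text):
    """Return {field name: [matched strings]} for every pattern that hits."""
    hits = {}
    for field, pattern in PATTERNS.items():
        found = [m.group() for m in pattern.finditer(text)]
        if found:
            hits[field] = found
    return hits


sample = "ID: 123456789 Phone: 050-1234567 Email: david.cohen@example.com"
print(find_hits(sample))

# A number written with two hyphens is not covered by the Phone pattern.
print(find_hits("Phone: 050-123-4567"))
```

Under Python's `re`, the second call returns no hits. A number formatted as 050-123-4567 would therefore need an additional or broader pattern on the Phone field.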

Why PiiRemover

Everything Compliance Needs.
Nothing It Doesn't.

Built for healthcare, legal, and financial teams who process sensitive documents at scale.

🏗

Flexible Field Architecture

Define any number of PII field types. Each gets its own name, replace char, priority, and stacked patterns.

Unlimited custom field types
Stack 11 engine types per field
Per-field replacement strategy (█, *, #…)
Preserve whitelist overrides everything

🔐

Zero Cloud Exposure

Runs entirely on your infrastructure. No document ever leaves your network.

100% air-gappable deployment
SQLite — no external DB needed
API key auth per client system
Up to 5 independent admin accounts

📄

Multi-Format OCR Pipeline

From clean digital PDFs to scanned paper. Dual OCR with automatic fallback.

PDF with embedded text
Scanned PDF → dual OCR engines
Image files (JPEG, PNG, TIFF)
Hebrew + English + mixed

🗄

Auto-Backup & Recovery

Scheduled backups run in the background. Browse, download, or restore any point instantly.

Schedule: every 6 hours up to weekly
Keep-last-N pruning
One-click restore with safety copy
Manual backup on demand

📊

Live Dashboard & Audit Log

Every API call logged. 30-day call chart, top matched fields, per-client breakdown.

30-day bar chart with today highlighted
Top PII fields by hit count
Error rate & avg duration KPIs
Filterable full audit log + CSV export

🎯

Scope & Name Dictionaries

Import name lists with 80,000+ entries. Scope limits matching to the document header, which avoids false positives in body text.

Upload .dat/.txt name files
ScopeEndPosition per pattern
Scope End Markers (text triggers)
Right-click in Tester to set scope
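A rough Python sketch of the scope idea: dictionary names are matched only up to a cut-off, which is either a fixed character position or the first occurrence of a marker string. This is an illustrative assumption about how ScopeEndPosition and Scope End Markers might combine, not PiiRemover's documented behaviour. The function names and sample text are hypothetical.

```python
import re


def scope_end(text, end_position=None, end_markers=()):
    """Resolve the cut-off: the earliest of a fixed position or any marker."""
    limit = len(text) if end_position is None else min(end_position, len(text))
    for marker in end_markers:
        idx = text.find(marker)
        if idx != -1:
            limit = min(limit, idx)
    return limit


def names_in_scope(text, names, end_position=None, end_markers=()):
    """Return (start, name) for dictionary names found before the cut-off."""
    if not names:
        return []
    limit = scope_end(text, end_position, end_markers)
    alternation = "|".join(re.escape(n) for n in sorted(names))
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    # finditer's endpos argument stops the search at the scope limit.
    return [(m.start(), m.group()) for m in pattern.finditer(text, 0, limit)]


text = ("Patient: David Cohen\n"
        "Diagnosis: Hypertension\n"
        "Notes: Sharon fruit allergy reported.")
names = {"Cohen", "Sharon"}

print(names_in_scope(text, names))                               # whole document
print(names_in_scope(text, names, end_markers=("Diagnosis:",)))  # header only
```

Without a scope, "Sharon" in the notes line is treated as a name. With the marker, matching stops before the body text and only the header hit remains.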

REST API

One Endpoint.
Infinite Integration.

POST a file, receive sanitized text with full match metadata. Any language, any platform.

🖥 Terminal: POST /api/v1/redact/redact (REST)

curl -X POST https://your-server/api/v1/redact/redact \
  -H "X-Api-Key: sk-••••••••••••" \
  -F "file=@patient_record.pdf"

// 200 OK — full match metadata in response
{
  "ocr": {
    "text": "Patient: David Cohen\n...",
    "durationMs": 312
  },
  "redact": {
    "matchCount": 5,
    "fieldsHit": ["Patient Name", "Israeli ID"],
    "matches": [{
      "startIndex": 9, "length": 11,
      "fieldName": "Patient Name",
      "matchedText": "David Cohen",
      "replacement": "███████████"
    }]
  }
}

🔑

Per-Client API Keys

Issue separate keys for every system. Revoke, rotate, or audit each client independently from the admin panel.

📍

Exact Match Positions

Every response includes startIndex, length, field name, matched text, and replacement — your systems know exactly what and where.
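As an illustration of what a consumer can do with these positions, the Python sketch below rebuilds redacted text from a response shaped like the example. It assumes `startIndex` is a zero-based character offset into `ocr.text`, which the example is consistent with but does not state outright. The `apply_matches` helper is hypothetical.

```python
def apply_matches(text, matches):
    """Apply each match's replacement to `text` at its reported position."""
    chars = list(text)
    for match in matches:
        start, length = match["startIndex"], match["length"]
        # Sanity checks: the offsets should point at the reported text, and
        # the replacement should keep the length so later offsets stay valid.
        if text[start:start + length] != match["matchedText"]:
            raise ValueError(f"offset mismatch for field {match['fieldName']!r}")
        if len(match["replacement"]) != length:
            raise ValueError("replacement does not preserve length")
        chars[start:start + length] = match["replacement"]
    return "".join(chars)


# Sample response in the documented shape (OCR text shortened for the example).
response = {
    "ocr": {"text": "Patient: David Cohen\nDiagnosis: Hypertension"},
    "redact": {
        "matches": [{
            "startIndex": 9, "length": 11,
            "fieldName": "Patient Name",
            "matchedText": "David Cohen",
            "replacement": "█" * 11,
        }],
    },
}

print(apply_matches(response["ocr"]["text"], response["redact"]["matches"]))
```

The same loop could instead draw highlight boxes or build an audit record, since each match carries its position, field name, and original text.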

📈

Quota & Full Audit Trail

Set per-client request quotas. Every call logged with filename, duration, matches, and timestamp. Filter & export CSV for compliance.

⚡

Swagger / OpenAPI Built-in

Interactive API documentation at /swagger — auto-generated, always accurate, no external dependency.
