Typly Anonymizer — safe AI on Polish documents (public demo live)

We've shipped a public demo of Typly Anonimizator — a service that removes PESEL, NIP, IBAN, KRS and 10 more Polish PII categories before your text reaches ChatGPT, Claude or Gemini. Validated on 1,000,000+ documents in INFOSTRATEG IV (NCBR).

In Polish companies, employees paste contracts, invoices and CVs into ChatGPT, Claude and Gemini every day. The compliance officer is the last to find out, if they find out at all. Typly Anonimizator fixes that: 14 categories of Polish PII are removed before the LLM call and restored on the way back.

The public demo is live today at typly.app/anonimizator/.

What we detect

14 categories of Polish identifiers, each with a dedicated check. We don't flag "some 11-digit number"; we flag a PESEL whose checksum actually verifies:

  • PESEL (mod-10), NIP (mod-11), IBAN (mod-97)
  • REGON, KRS, national ID, land register, court case reference
  • Administrative case number (e.g. OS.6220.4.2024)
  • Postal code, phone, e-mail, date
  • Plus personal names and company names detected in Polish linguistic context (Sp. z o.o., S.A., Sp. k.)
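
The checksum rules named in the list are public, so they can be sketched in a few lines of Python. This is a minimal illustration, not Typly's implementation: the function names are hypothetical, the IBAN check assumes the Polish 28-character form (PL plus 26 digits), and a production detector would presumably add further checks, such as the birth date encoded in a PESEL.

```python
import re

# Public checksum weights for the two Polish identifiers.
PESEL_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)


def is_valid_pesel(value: str) -> bool:
    """PESEL: 11 digits, weighted sum of the first 10, check digit mod 10."""
    if not re.fullmatch(r"[0-9]{11}", value):
        return False
    digits = [int(c) for c in value]
    total = sum(w * d for w, d in zip(PESEL_WEIGHTS, digits))
    return (10 - total % 10) % 10 == digits[10]


def is_valid_nip(value: str) -> bool:
    """NIP: 10 digits, weighted sum of the first 9, check digit mod 11."""
    compact = re.sub(r"[\s-]", "", value)  # tolerate 123-456-32-18 style
    if not re.fullmatch(r"[0-9]{10}", compact):
        return False
    digits = [int(c) for c in compact]
    check = sum(w * d for w, d in zip(NIP_WEIGHTS, digits)) % 11
    # A remainder of 10 cannot be written as one digit, so it is never valid.
    return check != 10 and check == digits[9]


def is_valid_pl_iban(value: str) -> bool:
    """Polish IBAN: move the first 4 characters to the end, map letters
    to numbers (A=10 ... Z=35), and require the result mod 97 to equal 1."""
    compact = re.sub(r"\s", "", value).upper()
    if not re.fullmatch(r"PL[0-9]{26}", compact):
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

With these helpers, an 11-digit string is treated as a PESEL candidate only when is_valid_pesel returns True; a value with one wrong digit fails the check and falls through to other categories.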

Each category gets a strategy you choose per request:

  • redact: irreversible placeholder, for BIP publication
  • index: [PERSON_1] with a mapping, for the LLM round-trip
  • hash: deterministic pseudonym derived from your salt, for logs
  • keep_format: structure preserved, value gone, for case numbers
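
To make the four strategies concrete, here is a rough Python sketch of how they could behave on already-detected entities. Everything beyond the strategy names and the [PERSON_1] placeholder is an assumption for illustration: the Entity class, the anonymize and restore functions, the other placeholder formats, and the use of HMAC-SHA256 for the salted pseudonym are hypothetical and do not describe the actual Typly API.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:  # hypothetical shape of one detected entity
    start: int     # offset of the first character
    end: int       # offset one past the last character
    category: str  # e.g. "PERSON", "PESEL"


def keep_format(value: str) -> str:
    """Keep punctuation and length; blank out digits and letters."""
    return "".join(
        "0" if c.isdigit() else "X" if c.isalpha() else c for c in value
    )


def anonymize(text, entities, strategy, salt=b""):
    """Return (anonymized_text, mapping). The mapping is filled only by
    the "index" strategy, the one meant to be reversed later."""
    mapping, seen, counters = {}, {}, {}
    out, cursor = [], 0
    for ent in sorted(entities, key=lambda e: e.start):
        original = text[ent.start:ent.end]
        if strategy == "redact":
            repl = f"[{ent.category}]"
        elif strategy == "index":
            key = (ent.category, original)
            if key not in seen:  # same value -> same placeholder
                counters[ent.category] = counters.get(ent.category, 0) + 1
                seen[key] = f"[{ent.category}_{counters[ent.category]}]"
                mapping[seen[key]] = original
            repl = seen[key]
        elif strategy == "hash":
            digest = hmac.new(salt, original.encode("utf-8"), hashlib.sha256)
            repl = f"[{ent.category}:{digest.hexdigest()[:12]}]"
        elif strategy == "keep_format":
            repl = keep_format(original)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        out.append(text[cursor:ent.start])
        out.append(repl)
        cursor = ent.end
    out.append(text[cursor:])
    return "".join(out), mapping


def restore(text, mapping):
    """Put original values back into an LLM response (index strategy)."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```

In this sketch, the text sent to the LLM contains only placeholders; the mapping stays on your side and is applied to the model's answer with restore. With redact no mapping is produced, so there is nothing to reverse.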

Format-preserving redaction

A user from a public office doesn't want raw text. They want a PDF returned as a PDF, a DOCX as a DOCX. Same layout, tables, fonts, headers — but without personal data. We support:

  • PDF (text and scanned — built-in OCR with a Polish language model)
  • DOCX, PPTX, ODT
  • JPG / PNG / TIFF with sanitized file metadata
  • Plus TXT and EML for pipeline integration

File names containing PII are neutralized (303456Korycki.pdf → dokument-anonimizowany-{hash}.pdf), and PDF metadata (/Author, /Title, /Subject) is overwritten with neutral values.
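
A neutral file name of that shape can be derived in a couple of lines. The sketch below is an assumption, not the product's logic: the post does not say what the {hash} is computed from, so this hypothetical neutral_filename helper hashes the file content and keeps only the extension from the original name.

```python
import hashlib
from pathlib import PurePath


def neutral_filename(original_name: str, content: bytes) -> str:
    """Build dokument-anonimizowany-{hash}{ext} without reusing any part
    of the original name except its extension."""
    digest = hashlib.sha256(content).hexdigest()[:10]
    suffix = PurePath(original_name).suffix.lower()
    return f"dokument-anonimizowany-{digest}{suffix}"
```

Hashing the content rather than the name means a surname in the original file name never enters the hash input; whether the real service does the same is not stated in the source.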

Validated on a real corpus

Typly Anonimizator was validated as part of INFOSTRATEG IV (NCBR) — the Polish strategic programme advancing modern technologies for the public sector. In that project we anonymized over 1,000,000 documents from Polish municipalities: resident inquiries, invoices, administrative decisions, official correspondence, summons, certificates.

A realistic mix, not a synthetic test set. In line with our partnership agreement, we don't disclose the names of partner offices; for your compliance officer, the character of the corpus matters more than a logo.

Compliance as a foundation

  • Pseudonymization compliant with GDPR Art. 4(5) — the mapping enables reversal
  • Anonymization compliant with GDPR Recital 26: the redact strategy without a map = data outside the GDPR scope
  • Audit trail for accountability (Art. 5(2)) — every anonymization returns a list of entities with positions, ready for a decision_log
  • AI Act compliance — PII anonymization is a foundation for high-risk AI system compliance
  • Configurable per-tenant salt in the hash strategy — pseudonyms don't collide between organizations
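
As an illustration of the audit-trail point, the list of entities with positions could be turned into a decision_log record along these lines. The field names, the JSON layout and the decision_log_entry function are hypothetical; the source only states that each anonymization returns entities with positions.

```python
import json
from datetime import datetime, timezone


def decision_log_entry(document_id, strategy, entities, now=None):
    """Serialize one anonymization run as a JSON log line.

    `entities` is an iterable of (category, start, end) tuples. Raw values
    are deliberately left out so the log itself carries no PII.
    """
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "document_id": document_id,
        "timestamp": timestamp,
        "strategy": strategy,
        "entities": [
            {"category": category, "start": start, "end": end}
            for category, start, end in entities
        ],
    }
    return json.dumps(record, ensure_ascii=False)
```

Storing categories and offsets, but not the values, keeps the log useful for accountability under Art. 5(2) without turning it into a second copy of the personal data.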

Demo and deployment

  • Public demo at typly.app/anonimizator/ — 10 anonymizations per day per IP, text or file, no signup. Your content never lands on our disks.
  • Self-service API: 1,000 anonymizations/day with an API key at anon.typly.app
  • On-premise inside your infrastructure — full product offline, for the public sector, banks, insurance, healthcare
  • Hosted in the EEA — TYPLY SP Z O O servers inside the European Economic Area, zero US sub-processors for content you anonymize

Together with Faktura Insight Hub, these are our two flagship B2B products for 2026. Both grew out of NCBR research, both are EU-first in hosting, both address concrete operational gaps in Polish companies and public institutions.

Next step

If your organization is using AI off the books and your compliance officer isn't sleeping — book a 15-minute live demo at calendly.com/krzysztof-typly/30min. We'll show the product on a document from your organization (anonymized in advance for the test). NDA before any demo on your documents. No sales slides.


See also: Typly Anonimizator product page · Faktura Insight Hub
