Introducing OpenAI Privacy Filter

OpenAI: “Our state of the art model for masking personally identifiable information (PII) in text. Today we’re releasing OpenAI Privacy Filter, an open-weight model for detecting and redacting personally identifiable information (PII) in text. This release is part of our broader effort to support a more resilient software ecosystem by providing developers practical infrastructure for building with AI safely, including tools and models that make strong privacy and security protections easier to implement from the start.

  • Privacy Filter is a small model with frontier personal data detection capability. It is designed for high-throughput privacy workflows, and is able to perform context-aware detection of PII in unstructured text. It can run locally, which means that PII can be masked or redacted without leaving your machine. It processes long inputs efficiently, making redaction decisions in a quick, single pass.
  • At OpenAI, we use a fine-tuned version of Privacy Filter in our own privacy-preserving workflows. We developed Privacy Filter because we believe that with the latest AI capabilities, we could raise the standard for privacy beyond what was already on the market. The version of Privacy Filter we are releasing today achieves state-of-the-art performance on the PII-Masking-300k benchmark, when corrected for annotation issues we identified during evaluation.
  • With this release, developers can run Privacy Filter in their own environments, fine tune it to their own use cases, and build stronger privacy protections into training, indexing, logging, and review pipelines…”
Posted in: AI, E-Records, Internet, Knowledge Management, Legal Research, Privacy, Search Engines