Data Loss Prevention (DLP) Module¶
The DLP module detects and prevents sensitive data leakage across browser, email, and Teams interactions. It monitors text input in form fields, contenteditable elements, and file uploads on websites that match your policies; scans email body text and attachments in Microsoft Outlook; and monitors all Teams channel and chat messages server-side via the Microsoft Graph API. It requires an active subscription.
DLP is available in:
- Browser Extension (Chrome) - monitors websites matching your policies
- Outlook Add-in - monitors email compose windows and attachments before send
- Microsoft Teams - monitors all channel and chat messages server-side (requires Azure AD app registration)
Getting Started¶
- Open the DLP section to review groups and patterns.
- Start with available templates where applicable, then adjust enforcement levels and exceptions.
- Communicate changes to users so expectations are clear on specific sites.


DLP Groups¶
DLP patterns are organized into DLP Groups. Each group contains one or more patterns and can be linked to policies to control where scanning is applied.
- Create groups from the DLP section in the portal
- Link groups to one or more policies - scanning only activates on websites that match a policy's rules
- Each group can contain multiple pattern types
- A single policy can have multiple DLP groups attached
Linking Groups to Policies¶
DLP scanning is not global. It is tied to policies through a many-to-many relationship:
- Create your DLP group and add patterns
- Open a policy and attach one or more DLP groups
- When a user visits a site matching that policy, the browser extension receives the DLP patterns and begins scanning
Only enabled patterns from linked groups are sent to the browser extension and Outlook add-in. Disabled patterns are excluded at the API level.
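The many-to-many relationship can be pictured with a minimal sketch. The class and field names below are illustrative assumptions, not the PolicyClue data model; the sketch only shows that one group can be attached to several policies and that disabled patterns are filtered out before delivery.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    pattern_id: str
    regex: str
    enabled: bool = True


@dataclass
class DlpGroup:
    name: str
    patterns: list[Pattern] = field(default_factory=list)


@dataclass
class Policy:
    name: str
    groups: list[DlpGroup] = field(default_factory=list)


def patterns_for_policy(policy: Policy) -> list[Pattern]:
    """Collect enabled patterns from every linked group, skipping duplicates."""
    seen: set[str] = set()
    result: list[Pattern] = []
    for group in policy.groups:
        for pattern in group.patterns:
            if pattern.enabled and pattern.pattern_id not in seen:
                seen.add(pattern.pattern_id)
                result.append(pattern)
    return result


finance = DlpGroup("Finance", [
    Pattern("p1", r"\bIBAN\b"),
    Pattern("p2", r"\bBIC\b", enabled=False),  # disabled: never delivered
])
pii = DlpGroup("PII", [Pattern("p3", r"\bAHV\b")])

# The same group is attached to two policies (many-to-many).
file_sharing = Policy("File sharing", [finance, pii])
webmail = Policy("Webmail", [finance])
```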
Templates¶
PolicyClue ships with predefined DLP group templates that you can import as a starting point. When creating a new DLP group, select Import from template to populate it with curated patterns.
Available templates:
Financial Data¶
Credit cards, IBAN/BIC, VAT numbers, payment references, and CH/LI specifics.
PII (EU/CH)¶
Personal identifiers common in Switzerland and Europe.
IT Information¶
Network and IT identifiers, conservative to reduce false positives.
Patient Data (CH/EU)¶
Typical patient identifiers used in Swiss/European healthcare contexts.
File Type Detection¶
Detect and report file types by extension and MIME type. Uses the extension and mime scan targets. All 25 patterns are enabled and set to report-only (mode 0) by default. Admins can raise enforcement modes as needed.
Security:
- Executable files by extension (.exe, .msi, .bat, .cmd, .ps1, .vbs, etc.)
- Script files by extension (.js, .vbe, .wsh, .reg, .lnk, etc.)
- Archive files by extension (.zip, .rar, .7z, .tar, .gz, etc.)
- Macro-enabled Office documents by extension (.docm, .xlsm, .pptm, etc.)
- Disk image files by extension (.iso, .dmg, .vhd, .vmdk, etc.)
- Database files by extension (.sql, .sqlite, .mdb, .accdb)
- Certificate and key files by extension (.pem, .key, .crt, .p12, .pfx, etc.)
- Windows executables by MIME type (application/x-msdownload, application/x-dosexec, etc.)
- Archive formats by MIME type (application/zip, application/x-rar-compressed, etc.)
- ELF/Mach-O binaries by MIME type (application/x-elf, application/x-mach-binary, etc.)
- SQLite databases by MIME type
- WebAssembly modules by MIME type
Developer and IT:
- Source code files by extension (.py, .java, .go, .ts, .sh, etc.)
- Configuration and environment files by extension (.env, .yml, .conf, .htpasswd, etc.)
- Legacy MS Office documents by MIME type
- Backup and export files by extension (.bak, .dump, .dmp, .snapshot, etc.)
Media:
- Audio files by MIME type (audio/mpeg, audio/flac, audio/ogg, etc.)
- Video files by MIME type (video/mp4, video/webm, etc.)
- Image files by MIME type
- PDF documents by MIME type
Healthcare and medical:
- Healthcare data files by extension (.hl7, .cda, .ccd, .fhir, .dicom, .oru, .adt, etc.)
- Medical imaging files by extension (.dcm, .dicom, .nii, .mha, etc.)
Banking and finance:
- Banking and financial data files by extension (.swift, .mt940, .camt, .pain, .sepa, .ofx, .bai, etc.)
- Swiss banking files by extension (.dta, .lsv, .ezag, .esr, .camt053, .pain001, etc.)
- Accounting and tax files by extension (.xbrl, .saft, .fec, .elster, etc.)
Templates are tracked via an internal template ID so you can see which groups originated from a template.
Features¶
Scan Targets¶
Each DLP pattern has a scan target that controls what the pattern is matched against. This allows DLP groups to contain patterns that scan different aspects of user data.
| Scan Target | Description | Matched Against |
|---|---|---|
| Content (default) | Text content scanning | Form fields, email body text, extracted text from files |
| Extension | File extension matching | The file extension of uploads/attachments (e.g. exe, bat, zip) |
| MIME | File type detection via magic bytes | The detected MIME type from file binary signatures (e.g. application/x-msdownload) |
Content is the default and matches the existing behavior - patterns scan text content.
Extension patterns use regex to match against file extensions. For example, ^(exe|msi|bat|cmd)$ blocks executable file types. This is useful for simple file type restrictions based on the filename.
MIME patterns use regex to match against the MIME type detected from the file's magic bytes (binary signature at the start of the file). This is more reliable than extension matching because it detects the actual file type regardless of the filename. For example, a .txt file that is actually a renamed .exe will still be detected as application/x-msdownload.
Extension and MIME patterns are available on:
- Browser Extension - scans file uploads
- Outlook Add-in - scans email attachments before send
A single DLP group can contain patterns with different scan targets. For example, a "File Type Detection" group might contain both extension patterns (for script files that lack distinctive magic bytes) and MIME patterns (for binary executables where magic byte detection is reliable).
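As a rough illustration of how the three scan targets differ, the sketch below routes a pattern to the right subject (text, file extension, or sniffed MIME type). The function names and the tiny signature table are assumptions made for this example; the actual detector is not documented here and would recognize far more formats.

```python
import re

# A few well-known file signatures (magic bytes), for illustration only.
MAGIC_SIGNATURES = [
    (b"MZ", "application/x-msdownload"),   # Windows PE executables
    (b"PK\x03\x04", "application/zip"),    # ZIP, also OOXML/ODF containers
    (b"%PDF-", "application/pdf"),
    (b"\x7fELF", "application/x-elf"),
]


def detect_mime(data: bytes) -> str:
    """Return a MIME type based on the leading bytes of the file."""
    for signature, mime in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return mime
    return "application/octet-stream"


def file_extension(filename: str) -> str:
    """Lower-cased extension without the dot, or '' if there is none."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


def pattern_matches(regex: str, scan_target: str, *, text: str = "",
                    filename: str = "", data: bytes = b"") -> bool:
    """Match the regex against the subject selected by the scan target."""
    if scan_target == "extension":
        subject = file_extension(filename)
    elif scan_target == "mime":
        subject = detect_mime(data)
    else:  # "content" is the default target
        subject = text
    return re.search(regex, subject) is not None


# An executable renamed to .txt: the extension pattern misses it,
# the MIME pattern still catches it.
renamed_exe = b"MZ\x90\x00" + b"\x00" * 60
```

With this sketch, `pattern_matches(r"^(exe|msi|bat|cmd)$", "extension", filename="invoice.txt", data=renamed_exe)` is false, while the same file matched against `^application/x-msdownload$` with the `mime` target is true.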
Text-based DLP Scanning¶
Detects sensitive data patterns in text content across both the browser extension and the Outlook add-in.
Browser Extension:
- Patterns are defined as regular expressions (regex) with Unicode support
- Supported elements: <input>, <textarea>, and contenteditable elements (e.g. rich text editors)
- Scanning is event-driven - the extension checks content on focus out, input, paste, submit, and Enter key events
- Duplicate alerts for the same pattern and matched value are automatically deduplicated
Outlook Add-in:
- Scans the email body (plain text) when the user clicks Send via the on-send handler
- Duplicate alerts for the same pattern and matched value are automatically deduplicated
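Deduplication by pattern and matched value can be sketched as a set of seen pairs that lives for the duration of a session. The class below is a hypothetical illustration, not the clients' actual code.

```python
class AlertDeduplicator:
    """Remembers (pattern ID, matched value) pairs already reported."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()

    def should_report(self, pattern_id: str, matched_value: str) -> bool:
        """Return True only the first time a given pair is seen."""
        key = (pattern_id, matched_value)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Because scanning fires on several events (input, paste, focus out), the same value would otherwise be reported once per event.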
File-based DLP Scanning¶
Monitors file uploads (browser extension) and email attachments (Outlook add-in) for sensitive content. Files are scanned client-side before upload or send.
Browser Extension:
- Triggered on file input changes and drag-and-drop onto file inputs
- The extension extracts text content and runs the same pattern library against it
Outlook Add-in:
- Scans email attachments before send using the same file format support and pattern library
- Attachment content is retrieved via the Office.js API and scanned locally
Supported file formats:
| Format | Extensions | Method |
|---|---|---|
| PDF | .pdf | Text extraction via PDF.js (up to 5 pages, 200KB cap) |
| Microsoft Office (OOXML) | .docx, .xlsx, .pptx | ZIP decompression, XML text extraction |
| OpenDocument (ODF) | .odt, .ods, .odp | ZIP decompression, XML text extraction |
| Plain text | .txt, .md, .json, .xml, .csv, .log, .yaml, .yml | Direct UTF-8 reading |
| Unknown types | any | Sniff mode - reads first 128KB as text |
Limits:
- PDF scanning is capped at 5 pages and 200,000 characters
- Overall file scanning is capped at 10MB per file
- OOXML extraction reads from word/document.xml, xl/sharedStrings.xml, and ppt/slides/slide1-3.xml (including headers/footers for Word)
- ODF extraction reads from content.xml, styles.xml, and meta.xml
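The "ZIP decompression, XML text extraction" approach for OOXML can be sketched with the Python standard library. This is a simplified illustration for a .docx file only: it reads word/document.xml and strips tags with a regex, whereas the actual clients run in JavaScript and may parse the XML differently.

```python
import io
import re
import zipfile
from html import unescape


def extract_docx_text(data: bytes) -> str:
    """Return the plain text of word/document.xml from a .docx file."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml = archive.read("word/document.xml").decode("utf-8", errors="replace")
    # Replace tags with spaces so text from adjacent elements does not fuse.
    text = re.sub(r"<[^>]+>", " ", xml)
    # Decode XML entities such as &amp; and collapse whitespace.
    return " ".join(unescape(text).split())
```

The extracted text is then scanned with the same content patterns that apply to form fields.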
Enforcement Modes¶
Each pattern has a configurable enforcement mode that controls what happens when a match is detected. The same four modes apply across the browser extension, the Outlook add-in, and Microsoft Teams, although Teams treats modes 2 and 3 identically:
| Mode | Name | Browser Extension | Outlook Add-in | Microsoft Teams |
|---|---|---|---|---|
| 0 | Report | Log the detection silently - no user-visible action | Alert logged silently - send allowed | Alert created, message untouched |
| 1 | Alert | Show a popup notifying the user - no masking | Show a notification in the task pane - send allowed | Alert created, message untouched |
| 2 | Mask & Overridable | Mask the matched text with asterisks, show popup with override option | Show alert with override option - send blocked until user overrides | Policy violation flag - message content blocked (best-effort) |
| 3 | Mask | Mask the matched text with asterisks - no override, but user can submit an appeal reason | Send blocked - user can submit an appeal reason but cannot proceed | Policy violation flag - message content blocked (best-effort) |
Note
Enforcement modes 2 and 3 behave identically in Microsoft Teams. PolicyClue sets a policyViolation flag on the message via Graph API, which blocks access to its content and shows a policy tip. This requires ChannelMessage.UpdatePolicyViolation.All / Chat.UpdatePolicyViolation.All permissions and the Communications DLP service plan (E5 or equivalent). If the tenant lacks this license, enforcement silently degrades to alert-only - the DLP match is still reported as an alert.
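The Teams behavior described in the note can be summarized in a few lines. The enum member names and the returned action strings below are illustrative assumptions; only the mode numbers and the degrade-to-alert rule come from the text above.

```python
from enum import IntEnum


class Mode(IntEnum):
    REPORT = 0
    ALERT = 1
    MASK_OVERRIDABLE = 2
    MASK = 3


def teams_action(mode: Mode, has_dlp_license: bool) -> str:
    """Pick the Teams enforcement action; an alert is created in every case."""
    if mode >= Mode.MASK_OVERRIDABLE and has_dlp_license:
        return "flag_policy_violation"
    # Modes 0 and 1, or a tenant without the Communications DLP plan.
    return "alert_only"
```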
How masking works (browser extension):
- For <input> and <textarea>: matched text is replaced with * characters (same length as the match)
- For contenteditable elements: the extension walks the DOM text nodes and replaces matched content with asterisks
- Masking is applied locally in the browser before the data leaves the field
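Same-length masking for a plain text value can be sketched as a regex substitution. This illustrates the idea only; the extension itself operates on DOM elements in the browser.

```python
import re


def mask_matches(text: str, pattern: str) -> str:
    """Replace every match with asterisks of the same length."""
    return re.sub(pattern, lambda m: "*" * len(m.group(0)), text)
```

For example, masking `card 4111111111111111 ok` with the pattern `\b\d{16}\b` keeps the surrounding words and turns the 16 digits into 16 asterisks.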
How blocking works (Outlook add-in):
- The add-in registers an on-send handler that checks the email body and attachments when the user clicks Send
- In mode 2, the user can override the detection via the task pane and proceed with sending
- In mode 3, sending is blocked - the user can submit an appeal reason but cannot proceed; the appeal is logged as a DLP_BLOCK_APPEAL alert
- If an error occurs during on-send scanning, sending is allowed (fail-open) to avoid permanently blocking the user
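The fail-open rule can be sketched as a wrapper around the scan. The `scan` callable is a hypothetical stand-in that returns the enforcement modes of all matches the user has not overridden; the real add-in uses the Office.js on-send handler rather than Python.

```python
from typing import Callable


def allow_send(scan: Callable[[], list[int]]) -> bool:
    """Return True if the email may be sent."""
    try:
        unresolved_modes = scan()
    except Exception:
        # Fail-open: a scanning error must never block the user permanently.
        return True
    # Modes 0 and 1 only log or notify; modes 2 and 3 block the send.
    return all(mode < 2 for mode in unresolved_modes)
```

The trade-off is deliberate: a scanner fault lets a message through unscanned rather than leaving the user unable to send mail.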
User Overrides¶
When a pattern uses enforcement mode 2 (Mask & Overridable), users can override the detection and proceed. The override popup requires the user to select a reason:
| Reason | Description |
|---|---|
| false_positive | The detection is incorrect |
| business_need | Business justifies sharing this data |
| trusted_destination | The destination is trusted |
| internal_data | The data is internal/non-sensitive |
| public_info | The information is publicly available |
| other | Other reason |
Once overridden, the pattern ID is added to an in-memory override set for the current session (page session in the browser, compose session in Outlook). Subsequent matches for that same pattern will not trigger again in that session.
Each override generates a DLP_OVERRIDE alert with the selected reason, providing an audit trail.
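A minimal sketch of the per-session override set follows. The class and the alert dictionary keys are hypothetical; only the reason codes and the DLP_OVERRIDE alert type come from this page.

```python
OVERRIDE_REASONS = {
    "false_positive", "business_need", "trusted_destination",
    "internal_data", "public_info", "other",
}


class OverrideSession:
    """Tracks overridden pattern IDs for one page or compose session."""

    def __init__(self) -> None:
        self.overridden: set[str] = set()
        self.alerts: list[dict] = []

    def override(self, pattern_id: str, reason: str) -> None:
        if reason not in OVERRIDE_REASONS:
            raise ValueError(f"unknown override reason: {reason}")
        self.overridden.add(pattern_id)
        # Every override leaves an audit record.
        self.alerts.append(
            {"type": "DLP_OVERRIDE", "pattern_id": pattern_id, "reason": reason}
        )

    def should_trigger(self, pattern_id: str) -> bool:
        """Overridden patterns stay silent for the rest of the session."""
        return pattern_id not in self.overridden
```

Since the set is held in memory, a new page load or a new compose window starts with no overrides.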
Block Appeals (Mode 3)¶
When a pattern uses enforcement mode 3 (Mask), the detection is not overridable - the content remains blocked. However, users can submit an appeal explaining why they believe the detection is incorrect or why they need access. The appeal uses the same reason dropdown as overrides.
Submitting an appeal generates a DLP_BLOCK_APPEAL alert with the selected reason and optional justification, providing visibility into blocked detections that users consider problematic. This can help administrators identify patterns that need tuning or exceptions.
Pattern Exceptions¶
Each pattern can have one or more exceptions - string values that, if found within a match, cause that match to be ignored.
For example, if you have a credit card pattern but want to allow the test card number 4111111111111111, add it as an exception. When the regex matches that number, the exception check will see the substring and skip the detection.
- Exceptions use substring matching (not regex)
- Exceptions are managed per pattern in the group editor
- The browser extension and Outlook add-in receive exceptions alongside patterns and apply them client-side
- The Teams server-side scanner applies exceptions identically during pattern matching
This is useful for:
- Known test data (test credit cards, dummy IBANs)
- Legitimate values that match a broad pattern
- Reducing false positives without narrowing the regex
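The exception check can be sketched as a substring test applied to each regex match. The function name is illustrative; the behavior follows the description above (substring matching, applied per match).

```python
import re


def find_matches(text: str, pattern: str, exceptions: list[str]) -> list[str]:
    """Return matched values, skipping any match that contains an exception."""
    results = []
    for match in re.finditer(pattern, text):
        value = match.group(0)
        if any(exception in value for exception in exceptions):
            continue  # substring hit: this match is ignored
        results.append(value)
    return results
```

With the test card 4111111111111111 listed as an exception, a 16-digit pattern still reports other card numbers in the same text.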
Case Insensitive Matching¶
Each pattern has an optional Case Insensitive toggle. When enabled, the pattern matches regardless of upper/lower case (e.g. "John" also matches "john" or "JOHN"). This flag is applied per pattern and respected by both the portal preview and the browser extension.
Internally both clients apply the i flag to the regex alongside the default gmu (global, multiline, Unicode) flags.
CSV Import¶
You can bulk-import patterns from a CSV file directly in the group editor:
- Open a DLP Group and click Import CSV
- Select your CSV file and configure the delimiter
- Choose which column contains the values to match (and optionally a name column)
- Set import options: case insensitive, word boundary matching, enforcement mode, name prefix, and enabled state
- Click Import - patterns are created one by one with a progress bar
Each CSV row becomes an individual pattern. Values are automatically escaped into safe regex. Word boundary matching wraps values with \b...\b so partial matches are avoided.
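The escaping and word boundary step can be sketched as follows. The function and parameter names are assumptions for illustration, and Python's `re.escape` stands in for whatever escaping routine the portal actually uses.

```python
import csv
import io
import re


def patterns_from_csv(csv_text: str, column: str, delimiter: str = ",",
                      word_boundary: bool = True) -> list[str]:
    """Turn each non-empty value in the chosen column into a literal regex."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    patterns = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if not value:
            continue
        escaped = re.escape(value)  # metacharacters such as "." become literals
        patterns.append(rf"\b{escaped}\b" if word_boundary else escaped)
    return patterns
```

Escaping matters because a value such as `A.B` would otherwise also match `AxB`, and the `\b...\b` wrapper keeps `Falcon-X` from matching inside `Falcon-Xtra`.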
Regex Validation¶
All patterns are validated server-side before saving:
- The backend checks that the regex compiles successfully
- ReDoS protection: nested quantifiers (e.g. (a+)+, (a*)+, (\d+){2,}) are blocked to prevent regular expression denial-of-service attacks that could freeze the browser
- Invalid patterns are rejected with an error message
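One way to approximate these two checks is shown below. The heuristic regex is an assumption written for this example, not the backend's actual rule: it flags a group that contains an unbounded quantifier and is itself quantified, and it can produce false positives (for instance on an escaped `\+` inside a group).

```python
import re

# A group whose body contains +, * or {n,} and which is followed by a quantifier.
NESTED_QUANTIFIER = re.compile(
    r"\([^()]*(?:[+*]|\{\d+,\})[^()]*\)(?:[+*]|\{\d+(?:,\d*)?\})"
)


def validate_pattern(pattern: str) -> tuple[bool, str]:
    """Return (is_valid, error_message) for a candidate DLP regex."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return False, f"invalid regex: {exc}"
    if NESTED_QUANTIFIER.search(pattern):
        return False, "nested quantifiers are not allowed"
    return True, ""
```

All three examples from the list above are rejected by this check, while a bounded pattern such as `\b(?:\d{4}[ -]?){3}\d{4}\b` passes.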
Tips for writing patterns:
- Use \b word boundaries to reduce false positives
- Use non-capturing groups (?:...) instead of capturing groups when you don't need backreferences
- Keep patterns as specific as possible - broad patterns generate noise
- Use labeled patterns (e.g. requiring a keyword like "IBAN:" before the number) to reduce false matches on generic number sequences
- Test your patterns in the portal preview before enabling enforcement
Alerts¶
DLP events generate alerts visible in the Alerts section. There are four alert types:
| Alert type | Trigger |
|---|---|
| DLP_MATCH | Sensitive text pattern found in a form field, or file upload allowed with a match (modes 0 and 1) |
| DLP_FILE_MATCH | Sensitive pattern found in an uploaded file and the upload was blocked or overridden (modes 2 and 3) |
| DLP_OVERRIDE | User overrode a DLP detection (mode 2) with a justification reason |
| DLP_BLOCK_APPEAL | User submitted an appeal reason on a blocked detection (mode 3) - the block remains in effect |
Alert Details¶
Each DLP alert includes:
- DLP Group ID and name - which group the pattern belongs to
- DLP Pattern ID and name - which specific pattern triggered
- Matched value - the actual text that was matched
- Filename - for file-based detections, the name of the scanned file (browser: uploaded file, Outlook: attachment name)
- Policy ID - which policy was active
- URL domain - the website where the detection occurred (browser extension), the specific recipient domain whose policy triggered the match (Outlook add-in), or teams.microsoft.com for Teams
- User info - user email, display name (Teams), browser/client name and version, computer name
- Platform - which client platform generated the alert (Browser Extension, Outlook Add-in, or Microsoft Teams)
Webhook Integration¶
DLP alerts can be forwarded to external systems via webhooks. Configure webhooks in Settings > Webhooks. DLP alert payloads include all the fields listed above.
See Alerts for alert management and Webhooks for webhook configuration.
How It Works End-to-End¶
Browser Extension¶
- Policy check: When a user navigates to a website, the browser extension calls the /check_site API to determine if any policies match
- Pattern delivery: If the matched policy has DLP enabled and groups linked, the API returns all enabled patterns (with exceptions) in the response
- Monitoring starts: The extension attaches event listeners to the page for text input monitoring and file upload interception
- Detection: When user input or a file upload matches a pattern (and no exception applies), the configured enforcement action is taken
- Alert reporting: The extension sends an alert to the /report API endpoint, which indexes it for the alerts dashboard and forwards it to any configured webhooks
- Deduplication: Identical alerts (same pattern ID + same matched value) are deduplicated within a page session to avoid alert fatigue
Microsoft Teams¶
- Tenant-wide subscriptions: When an admin configures Azure AD credentials in the portal, PolicyClue automatically creates two Graph API subscriptions - one for all channel messages (/teams/getAllMessages) and one for all chat messages (/chats/getAllMessages)
- Webhook notification: When a message is created or edited anywhere in Teams, Microsoft sends a change notification to PolicyClue's webhook endpoint
- Message retrieval: PolicyClue fetches the full message content via the Graph API, strips HTML tags, and extracts plain text
- Participant domain extraction: PolicyClue fetches the channel or chat member list, extracts each member's email address, and collects the unique domains - the same approach as the Outlook add-in with recipient domains
- Per-domain policy resolution: For each participant domain, PolicyClue finds policies whose hostlist matching rules match that domain and that have the "Microsoft Teams" platform enabled. Only DLP patterns from matching policies are applied
- DLP scanning: The message text is scanned against the resolved DLP patterns
- Enforcement (best-effort): For modes 2/3, PolicyClue sets a policyViolation flag on the message via Graph API, blocking access to its content. This requires the Communications DLP service plan (E5). If unavailable, enforcement is skipped and the alert is still created
- Alert reporting: All matches are indexed as alerts with sender information (name, email, user ID) and the Teams context (channel or chat)
- Subscription lifecycle: A background worker renews the Graph API subscriptions automatically (they expire after 55 minutes and are renewed with a 20-minute safety margin)
Outlook Add-in¶
- Recipient domain extraction: When composing an email, the add-in extracts all recipient email addresses (To, CC, BCC), splits the domain from each address, and deduplicates them. For example, alice@dropbox.com and bob@dropbox.com produce one domain (dropbox.com), while adding carol@gmail.com produces two (dropbox.com, gmail.com)
- Per-domain policy check: The add-in calls /check_site once per unique recipient domain. Each call returns the policies and DLP patterns configured for that domain
- Pattern merging: DLP patterns from all domain responses are combined into a single pattern list for scanning. Each pattern retains its policy_id so matches can be traced back to the originating domain
- On-send check: When the user clicks Send, the on-send handler scans the email body and all attachments. Content patterns match against extracted text, extension patterns match against file extensions, and MIME patterns match against detected file types from magic bytes
- Enforcement: Based on the pattern's enforcement mode, the add-in either logs silently (mode 0), shows a notification (mode 1), prompts for override (mode 2), or blocks the send (mode 3)
- Alert reporting with domain tracing: Each alert is attributed to the specific recipient domain whose policy produced the triggering pattern - not just the first recipient. The add-in traces the matched pattern back to its source domain via the policy_id field. Alerts are sent to the /report API endpoint and include the originating domain, attachment names, and override reasons where applicable
- Fail-open: If an error occurs during the on-send check, sending is allowed to prevent permanently blocking the user
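The first three steps (domain extraction, per-domain lookup, pattern merging) can be sketched as below. The shape of the per-domain responses is a hypothetical simplification of what /check_site returns; only the policy_id field is taken from the description above.

```python
def unique_domains(recipients: list[str]) -> list[str]:
    """Split the domain from each address and deduplicate, keeping order."""
    domains: list[str] = []
    for address in recipients:
        _, at, domain = address.rpartition("@")
        domain = domain.strip().lower()
        if at and domain and domain not in domains:
            domains.append(domain)
    return domains


def merge_patterns(responses: dict[str, list[dict]]) -> tuple[list[dict], dict[str, str]]:
    """Combine patterns from all domains and remember each policy's domain."""
    merged: list[dict] = []
    domain_by_policy: dict[str, str] = {}
    for domain, patterns in responses.items():
        for pattern in patterns:
            merged.append(pattern)
            # First domain seen for a policy is used for alert attribution.
            domain_by_policy.setdefault(pattern["policy_id"], domain)
    return merged, domain_by_policy
```

When a merged pattern matches, looking up its policy_id in `domain_by_policy` gives the recipient domain to put on the alert.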