# Live Data Masking

> Automatically detect and mask sensitive data in query results without writing rules

<Frame>
  <img src="https://mintcdn.com/hoopdev-docs-improve-idp-sso-pages/quDT6RZSG6Ua5zfL/images/learn/features/ai-data-masking.png?fit=max&auto=format&n=quDT6RZSG6Ua5zfL&q=85&s=2ff268616488c0aa85ba5510a9eb3d78" alt="Live Data Masking" width="1408" height="768" data-path="images/learn/features/ai-data-masking.png" />
</Frame>

## What You'll Accomplish

Live Data Masking automatically detects and redacts sensitive data in your query results. Unlike traditional DLP solutions that require complex rule configuration, Hoop's data masking needs no hand-written rules once a DLP provider is connected. With it, you can:

* Automatically detect PII (names, emails, phone numbers, SSNs)
* Mask credit card numbers and financial data
* Redact passwords, API keys, and secrets
* Protect health information to support HIPAA compliance
* No regex patterns to write or maintain

***

## How It Works

<Steps>
  <Step title="Query Executed">
    User runs a query through Hoop
  </Step>

  <Step title="Results Analyzed">
    AI scans the query results for sensitive data patterns
  </Step>

  <Step title="Data Masked">
    Sensitive values are replaced with redacted placeholders
  </Step>

  <Step title="Results Returned">
    User sees the masked results instead of the original values
  </Step>
</Steps>

### Before and After

**Original query result:**

```
| name         | email              | ssn         | phone        |
|--------------|--------------------|-------------|--------------|
| John Smith   | john@example.com   | 123-45-6789 | 555-123-4567 |
| Jane Doe     | jane@company.org   | 987-65-4321 | 555-987-6543 |
```

**With Live Data Masking enabled:**

```
| name         | email              | ssn         | phone        |
|--------------|--------------------|-------------|--------------|
| [REDACTED]   | [REDACTED]         | [REDACTED]  | [REDACTED]   |
| [REDACTED]   | [REDACTED]         | [REDACTED]  | [REDACTED]   |
```
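
To make the detect-and-replace step concrete, here is a toy sketch that masks two easily recognized patterns (US SSNs and email addresses) with `sed`. It only illustrates the shape of the transformation. It is not how Hoop or the DLP provider detects data, and the `mask_row` helper and both regular expressions are assumptions made for this example. The name is left untouched because a simple pattern cannot identify it reliably, which is why detection is delegated to a DLP provider.

```bash
# Toy illustration only: replace SSN-shaped and email-shaped values read from stdin.
# mask_row is a hypothetical helper, not part of Hoop.
mask_row() {
  sed -E \
    -e 's/[0-9]{3}-[0-9]{2}-[0-9]{4}/[REDACTED]/g' \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[REDACTED]/g'
}

# Usage: pipe one row of a result through the filter.
echo '| John Smith | john@example.com | 123-45-6789 |' | mask_row
# prints: | John Smith | [REDACTED] | [REDACTED] |
```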

***

## Quick Start

### Prerequisites

Before you start, make sure you have:

* An account in [our managed instance](https://use.hoop.dev) or [your own hoop.dev deployment](/setup/deployment/overview)
* Administrator access to that account, which the steps below require
* A DLP provider (Microsoft Presidio or GCP DLP), which Step 1 covers

### Step 1: Set Up a DLP Provider

Choose and deploy one of the supported providers:

<CardGroup cols={2}>
  <Card title="Microsoft Presidio" icon="microsoft" href="https://microsoft.github.io/presidio/">
    Open-source, self-hosted PII detection
  </Card>

  <Card title="Google Cloud DLP" icon="google" href="https://cloud.google.com/dlp">
    Managed service with advanced detection
  </Card>
</CardGroup>

See [Live Data Masking Configuration](/setup/configuration/live-data-masking/get-started) for detailed setup instructions.

### Step 2: Configure the Gateway

Set the required environment variables:

**For Microsoft Presidio:**

```bash theme={null}
DLP_PROVIDER=mspresidio
DLP_MODE=best-effort
MSPRESIDIO_ANALYZER_URL=http://presidio-analyzer:5001
MSPRESIDIO_ANONYMIZER_URL=http://presidio-anonymizer:5002
```

**For Google Cloud DLP:**

```bash theme={null}
DLP_PROVIDER=gcp
DLP_MODE=best-effort
GCP_PROJECT_ID=your-project-id
```
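
Before restarting the gateway, a quick check that the variables for the chosen provider are present can save a debugging round trip. The helper below is a sketch under assumptions: the `check_dlp_env` name and its messages are made up for this example, it accepts only the two `DLP_PROVIDER` values shown above, and it verifies only that the listed variables are non-empty. It does not contact the provider.

```bash
# Hypothetical pre-flight check: are the variables for the selected provider set?
check_dlp_env() {
  case "${DLP_PROVIDER:-}" in
    mspresidio) required="MSPRESIDIO_ANALYZER_URL MSPRESIDIO_ANONYMIZER_URL" ;;
    gcp)        required="GCP_PROJECT_ID" ;;
    *)          echo "DLP_PROVIDER must be mspresidio or gcp" >&2; return 1 ;;
  esac

  status=0
  for name in DLP_MODE $required; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

Run `check_dlp_env && echo "DLP variables look complete"` in the same environment the gateway starts in.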

### Step 3: Enable on a Connection

1. Go to **Connections** in the Web App
2. Select a connection and click **Configure**
3. Enable **Live Data Masking**
4. Click **Save**

### Step 4: Test It

Run a query that returns sensitive data:

```sql theme={null}
SELECT name, email, phone FROM customers LIMIT 5;
```

The `name`, `email`, and `phone` values should come back masked rather than in clear text.

***

## Supported Data Types

Live Data Masking detects these sensitive data types by default:

### Personal Information

| Type             | Example          | Masked As  |
| ---------------- | ---------------- | ---------- |
| Person Name      | John Smith       | \[PERSON]  |
| Email Address    | john@example.com | \[EMAIL]   |
| Phone Number     | 555-123-4567     | \[PHONE]   |
| Physical Address | 123 Main St      | \[ADDRESS] |

### Government IDs

| Type             | Example     | Masked As   |
| ---------------- | ----------- | ----------- |
| SSN (US)         | 123-45-6789 | \[SSN]      |
| Passport Number  | AB1234567   | \[PASSPORT] |
| Driver's License | D1234567    | \[LICENSE]  |

### Financial Data

| Type         | Example                | Masked As        |
| ------------ | ---------------------- | ---------------- |
| Credit Card  | 4111-1111-1111-1111    | \[CREDIT\_CARD]  |
| Bank Account | 123456789012           | \[BANK\_ACCOUNT] |
| IBAN         | GB82WEST12345698765432 | \[IBAN]          |

### Credentials

| Type     | Example             | Masked As   |
| -------- | ------------------- | ----------- |
| API Key  | sk\_live\_abc123... | \[API\_KEY] |
| Password | password123         | \[PASSWORD] |
| AWS Key  | AKIA...             | \[AWS\_KEY] |

### Health Information

| Type           | Example   | Masked As          |
| -------------- | --------- | ------------------ |
| Medical Record | MRN-12345 | \[MEDICAL\_RECORD] |
| Health Plan ID | HPL-98765 | \[HEALTH\_ID]      |

***

## Configuration Options

### DLP Mode

| Mode          | Behavior                                                              |
| ------------- | --------------------------------------------------------------------- |
| `best-effort` | Mask detected fields and return the result even if some masking fails |
| `strict`      | Block the entire result if any masking fails                          |

**Recommendation:** Start with `best-effort` to avoid blocking legitimate queries.
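
The difference between the two modes comes down to what happens when masking part of a result fails. As a rough mental model, the sketch below takes a mode and a count of masking failures and reports what the user would receive. The `deliver_result` function and its output labels are hypothetical and are not Hoop's gateway code.

```bash
# Mental model only: what the user receives for a given DLP_MODE.
# $1 = mode (best-effort | strict), $2 = number of fields that failed to mask
deliver_result() {
  mode=$1
  failures=$2
  if [ "$failures" -gt 0 ] && [ "$mode" = "strict" ]; then
    echo "blocked"            # strict: withhold the entire result
    return 1
  fi
  if [ "$failures" -gt 0 ]; then
    echo "partially-masked"   # best-effort: return what could be masked
  else
    echo "masked"
  fi
}

deliver_result best-effort 2   # prints: partially-masked
```

Read this way, `best-effort` trades completeness for availability, so `strict` may be the better fit for connections where unmasked values must never be returned.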

### Custom Fields

Add or remove fields from detection. See [Supported Fields](/setup/configuration/live-data-masking/fields) for the complete list.

### Per-Connection Settings

Enable or disable masking on individual connections:

* Enable on production databases with real customer data
* Disable on development databases with synthetic data

***

## Use Cases

### 1. Developer Access to Production

Developers need to debug production issues but shouldn't see customer PII:

* Enable Live Data Masking on production connections
* Developers can run diagnostic queries
* Customer data is automatically protected

### 2. Analytics Without Exposure

Data analysts need aggregate insights but not individual records:

* Masking protects individual-level PII
* Aggregations (COUNT, SUM, AVG) work normally
* Compliance requirements are met

### 3. Support Team Access

Support teams need to look up customer records:

* Enable masking on support-facing connections
* They can verify account status without seeing SSNs
* Audit trail shows who accessed what

### 4. Third-Party Contractor Access

External contractors need database access:

* Create a connection with masking enabled
* Grant access to contractors
* Detected sensitive data is masked before contractors see it

***

## Troubleshooting

### Data Not Being Masked

**Check:**

1. Live Data Masking is enabled on the connection
2. DLP provider is running and accessible
3. Gateway environment variables are set correctly
4. The data type is in the [supported fields list](/setup/configuration/live-data-masking/fields)

**Test the DLP provider directly:**

```bash theme={null}
curl -X POST http://presidio-analyzer:5001/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "John Smith, SSN 123-45-6789", "language": "en"}'
```
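
If the analyzer responds but masking still fails, it can help to confirm that both Presidio services are reachable at the exact URLs the gateway is configured with. The sketch below assumes that the `MSPRESIDIO_ANALYZER_URL` and `MSPRESIDIO_ANONYMIZER_URL` variables from Step 2 are set in the current shell, that `curl` is available, and that both services expose Presidio's `/health` route. `presidio_url` and `check_presidio` are hypothetical helper names.

```bash
# Build endpoint URLs from the same variables the gateway reads,
# so the check hits exactly what is configured.
presidio_url() {
  # $1 = base URL, $2 = path; trim one slash on each side to avoid "//".
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

# Query the health route of both services; stop at the first failure.
check_presidio() {
  for base in "$MSPRESIDIO_ANALYZER_URL" "$MSPRESIDIO_ANONYMIZER_URL"; do
    # -f: fail on HTTP errors, -sS: quiet but show errors, --max-time: do not hang
    curl -fsS --max-time 5 "$(presidio_url "$base" health)" || return 1
    echo
  done
}
```

Run `check_presidio` from a host that can resolve the Presidio service names, such as the gateway's own container or network.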

### Too Much Data Being Masked

If non-sensitive data is being masked by mistake:

1. Identify which field type is producing the false positives
2. Disable that field type in the configuration
3. Alternatively, use [Guardrails](/learn/features/guardrails) for more precise control

### Performance Impact

Live Data Masking adds latency to query results:

| Result Size   | Typical Latency |
| ------------- | --------------- |
| \< 100 rows   | 50-100ms        |
| 100-1000 rows | 100-500ms       |
| > 1000 rows   | 500ms+          |

**To reduce latency:**

* Use `LIMIT` clauses in queries
* Select only needed columns (avoid `SELECT *`)
* Consider disabling masking for high-volume analytics

***

## Compliance

Live Data Masking helps meet requirements for:

* **GDPR** - Protect the personal data of individuals in the EU
* **HIPAA** - Mask protected health information
* **PCI DSS** - Redact credit card numbers
* **SOC 2** - Demonstrate data protection controls
* **CCPA** - Protect California consumer data

<Note>
  Live Data Masking is one layer of a defense-in-depth strategy. Combine with [Access Control](/learn/features/access-control) and [Guardrails](/learn/features/guardrails) for comprehensive protection.
</Note>

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Configuration Guide" icon="gear" href="/setup/configuration/live-data-masking/get-started">
    Set up Microsoft Presidio or GCP DLP
  </Card>

  <Card title="Supported Fields" icon="list" href="/setup/configuration/live-data-masking/fields">
    See all detectable data types
  </Card>

  <Card title="Guardrails" icon="shield" href="/learn/features/guardrails">
    Block queries before they execute
  </Card>

  <Card title="Access Control" icon="lock" href="/learn/features/access-control">
    Control who can access connections
  </Card>
</CardGroup>
