Data Control

Privacy controls, not privacy code. Data Control provides field-level privacy controls that protect sensitive data through configuration instead of custom engineering work. Configure once, apply everywhere.
Data Control runs inside the ordered controls pipeline after input normalization and before output formatting. You can mix privacy filters with data transforms in the same object definition.
For a complete Virtual API Configuration, start with version: "0.3" and define controls under objects.<objectName>.controls. See Virtual APIs if you need a refresher on object definitions.

Quick example

Use Data Control when you need to protect specific fields but still preserve the rest of the upstream shape:
version: "0.3"
objects:
  customers:
    controls:
      - type: redact
        fields: [ssn]
      - type: tokenize
        fields: [email]
      - type: mask
        fields: [phone]
Each control has two required properties:
  • type — the operation to apply (redact, tokenize, anonymize, mask, hash, allow)
  • fields — one or more field names to apply it to
You can also add required: true when a missing field should fail the request instead of being skipped.
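As a minimal sketch of the required flag, the fragment below marks one control as required and leaves another optional. It assumes required sits on the control entry next to type and fields, which is how the sentence above reads; the object and field names are illustrative.

```yaml
version: "0.3"
objects:
  customers:
    controls:
      # Assumption: `required` is set per control, next to `type` and `fields`.
      # If `ssn` is missing from the payload, the request fails.
      - type: redact
        fields: [ssn]
        required: true
      # No `required` flag: if `phone` is missing, this control is skipped.
      - type: mask
        fields: [phone]
```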

Choosing a control

| Control type | Use it when you need to |
| --- | --- |
| redact | Blank out a value but keep the field present |
| tokenize | Preserve correlation across records without exposing raw identifiers |
| anonymize | Produce non-correlatable stand-ins for demos, testing, or ML |
| mask | Preserve some visual context while hiding most of the value |
| hash | Create a deterministic one-way value for deduplication or joins |
| allow | Be explicit about fields that should pass through unchanged |

Control types

Redact

Blank out sensitive values. redact keeps the field in the response and only changes its value, so the original data is removed but the response shape stays the same. If you want the field itself to disappear, use delete from Data Transform.
controls:
  - type: redact
    fields: [ssn, date_of_birth, drivers_license]

Tokenize

Replace sensitive values with consistent, syntax-preserving tokens. The same input always produces the same token within the same Virtual API and matched object definition — enabling analytics without exposing raw data.
controls:
  - type: tokenize
    fields: [email, phone]
Tokenization is scope-bound: different object definitions or different Virtual APIs can produce different tokens for the same input.
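To make the scope rule concrete, here is a sketch with two object definitions that both tokenize email. The object names customers and orders are hypothetical.

```yaml
version: "0.3"
objects:
  customers:
    controls:
      # Every occurrence of the same email in `customers` responses
      # maps to the same token.
      - type: tokenize
        fields: [email]
  orders:
    controls:
      # A separate object definition is a separate scope: the same email
      # may map to a different token here than it does under `customers`.
      - type: tokenize
        fields: [email]
```

Under a configuration like this, tokens can be used to correlate records within customers responses, but should not be relied on to join customers to orders.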

Anonymize

Generate randomized replacements that preserve data structure. Each call produces a different value, so the replacement cannot be correlated across records or linked back to the original value.
controls:
  - type: anonymize
    fields: [email, name]

Mask

Partially redact values while preserving enough structure for context. Useful in support, debugging, and audit log scenarios where you need to understand the data shape without revealing full values.
controls:
  - type: mask
    fields: [ssn, phone]
Masking behavior varies by data type:
| Type | Rule | Example |
| --- | --- | --- |
| string | Keeps the first and last N characters visible (N=2 for strings longer than 4 characters, N=1 otherwise). Strings of 2 or fewer characters are fully masked. Whitespace is preserved. | "John Smith" → "Jo** ***th" |
| double | Zeros leading digits, keeps last 2 visible per segment | 123456.78 → 000056.78 |
| long | Zeros leading digits, keeps last 2 visible | 5551234567 → 0000000067 |
| bool | Always returns false | true → false |
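Applying the string rule by hand to a few short inputs shows where the length thresholds fall. The outputs in the comments are derived from the rule as stated above, not captured from a live response, and the field names are hypothetical.

```yaml
controls:
  - type: mask
    fields: [nickname, code, initials]

# Worked by hand from the string rule (not captured output):
#   "abcde" -> "ab*de"   longer than 4 characters, so N=2
#   "abcd"  -> "a**d"    not longer than 4 characters, so N=1
#   "abc"   -> "a*c"     N=1
#   "ab"    -> "**"      2 or fewer characters, fully masked
```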

Hash

Replace values with a one-way SHA-256 digest. Consumers can correlate and deduplicate records without seeing real values — the same input within the same Virtual API and matched object definition always produces the same hash.
controls:
  - type: hash
    fields: [email, customerId]
Hash behavior varies by data type:
| Type | Output | Notes |
| --- | --- | --- |
| string | 64-character hex digest | Full SHA-256 hex of scope:input |
| double | Deterministic double | Derived from hash bytes |
| long | Deterministic long | From first 8 hash bytes |
| bool | Deterministic boolean | Based on first hash byte |
Hashing is scope-bound — different object definitions or different Virtual APIs produce different hashes for the same input value, preventing broad cross-context correlation.

Allow

Pass data through unchanged. Useful when you want to be explicit about which fields are permitted, even though fields not referenced by any control already pass through unchanged.
controls:
  - type: allow
    fields: [publicName]

Working with the controls pipeline

  • Controls execute top-to-bottom. Later controls see the values produced by earlier controls.
  • If you are transforming and filtering the same field, place the transform first so later controls operate on the transformed value.
  • Filter controls target scalar values only: strings, numbers, and booleans. If a path resolves to an object or array, the request fails.
  • If the matched payload is a root array, the control list runs against each object element automatically. Non-object elements are skipped.
  • Set required: true when a missing field should fail instead of being skipped.
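The ordering rule can be seen with two controls on the same field. This sketch uses only control types from this page; pairing mask and hash on one field is chosen purely to illustrate ordering, not as a recommended pattern.

```yaml
controls:
  # Runs first: `phone` is replaced with its masked form.
  - type: mask
    fields: [phone]
  # Runs second: sees the masked value produced above, so the result is
  # a hash of the masked value, not of the original phone number.
  - type: hash
    fields: [phone]
```

Swapping the two entries would hash the original value first and then mask the resulting hash.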

Targeting fields

Use dot notation for nested fields and bracket segments for arrays:
controls:
  - type: redact
    fields: [user.ssn, user.address.street, "[contacts].email"]
See Field Targeting for the shared path rules used by Data Control and Data Transform.
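As an illustration of how those paths line up with a payload, the sketch below shows a hypothetical upstream response, written as YAML for readability, with the targeted values marked. Treating "[contacts].email" as matching email on every element of the contacts array is an assumption here; Field Targeting has the authoritative rules.

```yaml
# Hypothetical upstream payload (shown as YAML for readability)
user:
  ssn: "123-45-6789"        # targeted by user.ssn
  name: "John Smith"        # not listed, passes through unchanged
  address:
    street: "1 Main St"     # targeted by user.address.street
    city: "Springfield"     # not listed, passes through unchanged
contacts:
  - email: "a@example.com"  # targeted by "[contacts].email" (assumed: each element)
  - email: "b@example.com"  # targeted by "[contacts].email" (assumed: each element)
```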

When to reach for reference docs

Use this page to choose the right privacy behavior. Use the reference docs, such as the Control Block Reference listed under Next steps, when you need exhaustive option rules or edge-case behavior.

Next steps

  • Field Targeting: learn how controls target nested fields and arrays
  • Data Transform: combine, coalesce, and delete fields in the same pipeline
  • Control Block Reference: review detailed behavior and config options