Data Pipeline
Different sources in. One governed response out.
Every DataHarbor request follows the same pipeline. The important part is that governance happens on a canonical JSON model, not on raw upstream bytes.
The request flow
Authenticate and authorize
DataHarbor validates the caller’s key, visibility rules, expiration, and other access controls before the upstream request is processed.
Fetch the upstream payload
DataHarbor calls the enrolled source using the request path and source credentials configured for the Virtual API.
Normalize into canonical JSON
JSON stays JSON. CSV becomes arrays of objects. YAML becomes JSON-compatible objects and arrays. Markdown becomes a document object with content and optional frontmatter.
See Input Normalization.
Match the object definition
Once the payload is normalized, DataHarbor matches the request path to the correct object definition inside objects.<objectName>. That determines which ordered controls list will run.
See Virtual APIs.
Run the ordered controls pipeline
Data Control and Data Transform both live in the same controls array. They execute top-to-bottom, so later steps see values produced by earlier ones.
Typical sequences look like this:
- create a derived field with combine
- hash or anonymize the derived field
- redact the original source field
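The sequence above can be sketched in Python. This is a minimal illustration, not DataHarbor's implementation: the control names (combine, hash, redact) follow the text, but the dictionary shape of each control, the handler functions, the mask string, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib

# Hypothetical control handlers; the control dict shapes are illustrative only.
def combine(record, control):
    # Join source fields into a new derived field.
    values = [str(record[name]) for name in control["fields"]]
    record[control["target"]] = control.get("separator", " ").join(values)

def hash_field(record, control):
    # Replace a field with its SHA-256 hex digest (assumed algorithm).
    value = str(record[control["field"]]).encode("utf-8")
    record[control["field"]] = hashlib.sha256(value).hexdigest()

def redact(record, control):
    # Overwrite the original value with a fixed mask.
    record[control["field"]] = "[REDACTED]"

HANDLERS = {"combine": combine, "hash": hash_field, "redact": redact}

def run_controls(record, controls):
    """Apply controls top-to-bottom on a copy of the record."""
    result = dict(record)
    for control in controls:
        HANDLERS[control["type"]](result, control)
    return result

controls = [
    {"type": "combine", "fields": ["first_name", "last_name"], "target": "full_name"},
    {"type": "hash", "field": "full_name"},
    {"type": "redact", "field": "last_name"},
]

record = {"first_name": "Ada", "last_name": "Lovelace"}
governed = run_controls(record, controls)
```

Order matters in this sketch: if the redact step ran before combine, full_name would be built from the mask rather than from the real last name.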
Format the governed result
After controls finish, DataHarbor can return JSON, Markdown, CSV, or YAML. Formatting is the final rendering step; it does not change the governance logic.
See Output Formatting.
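To illustrate that formatting is only a rendering step, the sketch below takes one already-governed list of records and renders it as JSON and as CSV with Python's standard library. The function names, the format keys, and the sample data are assumptions for the example, not DataHarbor APIs.

```python
import csv
import io
import json

def render_json(records):
    return json.dumps(records, indent=2)

def render_csv(records):
    if not records:
        return ""
    # Use the keys of the first record as the header row.
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical format registry; both renderers receive the same governed data.
RENDERERS = {"json": render_json, "csv": render_csv}

# Sample output of the controls step (already governed).
governed = [
    {"id": "1", "ssn": "[REDACTED]"},
    {"id": "2", "ssn": "[REDACTED]"},
]

as_json = RENDERERS["json"](governed)
as_csv = RENDERERS["csv"](governed)
```

Neither renderer inspects or changes field values, so the redaction applied earlier appears identically in both outputs.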
Why the pipeline matters
- You write one set of control rules even when the upstream source is not JSON.
- You can combine privacy controls and transforms in one ordered pipeline.
- Output formatting stays separate from governance, so the same rules apply whether the caller wants JSON, Markdown, CSV, or YAML.
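The first point can be illustrated with a small sketch: a CSV payload and a JSON payload are normalized into the same list-of-objects shape, so a single rule works on both. The function names and content-type strings are hypothetical; the actual normalization rules are covered in Input Normalization.

```python
import csv
import io
import json

def normalize(payload, content_type):
    """Turn a raw upstream payload into JSON-compatible Python data."""
    if content_type == "application/json":
        return json.loads(payload)  # JSON stays JSON
    if content_type == "text/csv":
        # Each CSV row becomes an object keyed by the header row.
        return [dict(row) for row in csv.DictReader(io.StringIO(payload))]
    raise ValueError(f"unsupported content type: {content_type}")

def redact_ssn(records):
    # One rule, written once against the canonical model.
    return [{**r, "ssn": "[REDACTED]"} for r in records]

csv_payload = "id,ssn\n1,123-45-6789\n"
json_payload = '[{"id": "1", "ssn": "123-45-6789"}]'

from_csv = redact_ssn(normalize(csv_payload, "text/csv"))
from_json = redact_ssn(normalize(json_payload, "application/json"))
```

Both calls produce the same governed list, even though the upstream payloads arrived in different formats.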
Example
For a CSV source with an object named customers, the pipeline looks like this:
- Parse CSV rows into objects.
- Match the customers object definition.
- Build full_name, tokenize email, and redact ssn.
- Render the final governed payload as Markdown.
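The four steps above can be sketched end to end. Everything specific here is illustrative: the sample CSV, the first_name and last_name columns, the tok_ token format derived from SHA-256, the object registry, and the Markdown table layout are assumptions, since the example does not specify them.

```python
import csv
import hashlib
import io

RAW_CSV = (
    "first_name,last_name,email,ssn\n"
    "Ada,Lovelace,ada@example.com,123-45-6789\n"
)

def parse_csv(text):
    # Step 1: CSV rows become objects.
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

def govern_customer(row):
    # Step 3: build full_name, tokenize email, redact ssn (in that order).
    out = dict(row)
    out["full_name"] = f"{out['first_name']} {out['last_name']}"
    digest = hashlib.sha256(out["email"].encode("utf-8")).hexdigest()
    out["email"] = "tok_" + digest[:12]  # assumed token format
    out["ssn"] = "[REDACTED]"
    return out

# Step 2: look up the object definition by name (hypothetical registry).
OBJECTS = {"customers": govern_customer}

def to_markdown(rows, columns):
    # Step 4: render the governed rows as a Markdown table.
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(row[c] for c in columns) + " |")
    return "\n".join(lines)

rows = [OBJECTS["customers"](r) for r in parse_csv(RAW_CSV)]
markdown = to_markdown(rows, ["full_name", "email", "ssn"])
print(markdown)
```

The rendered table contains the derived full_name, a token in place of the email address, and the mask in place of the ssn value.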
Next steps
Virtual APIs
Understand how object definitions and controls are configured
Input Normalization
See how JSON, CSV, YAML, and Markdown are parsed
Govern Data
Explore privacy controls, transforms, and field targeting
Output Formatting
Learn how the final governed payload is rendered

