Input Normalization
Different source formats. One controls pipeline.
Before DataHarbor can apply privacy controls or transforms, it normalizes the upstream payload into a canonical JSON model. Normalization is the stage that makes one controls pipeline work across JSON, CSV, YAML, and Markdown. See Data Pipeline for the full request flow.
Supported input formats
| Format | Content-Type | Normalized shape |
|---|---|---|
| JSON | application/json, application/*+json | Parsed directly |
| CSV | text/csv | Array of objects keyed by header columns |
| YAML | text/yaml, application/yaml, application/x-yaml | JSON-compatible mappings, sequences, and scalars |
| Markdown | text/markdown | Document object with content and optional frontmatter |
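As a rough illustration of the CSV row in the table above, the sketch below turns headered CSV text into an array of objects keyed by header columns. It uses Python's standard csv module. The helper name csv_to_records is hypothetical, and this is not DataHarbor's implementation. Values are left as strings because the source does not say whether types are inferred.

```python
import csv
import io
import json


def csv_to_records(text: str) -> list[dict[str, str]]:
    """Hypothetical helper: the first row names the columns, each later row becomes an object."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


sample = "id,email\n1,ana@example.com\n2,bo@example.com\n"
records = csv_to_records(sample)
print(json.dumps(records, indent=2))
```

Every value stays a string in this sketch. Any type coercion would be a separate, explicit step.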
When to set input_format
Most of the time, DataHarbor can infer the source format from the upstream Content-Type header.
Set input_format only when the upstream service returns a missing, generic, or incorrect content type.
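The precedence described above can be sketched as a small lookup: an explicit input_format wins, otherwise the Content-Type header is matched against the table of supported formats. The function name resolve_format and the return values ("json", "csv", "yaml", "markdown") are assumptions made for illustration. The source does not list the values that input_format accepts.

```python
from typing import Optional

# Content types taken from the "Supported input formats" table.
CONTENT_TYPE_FORMATS = {
    "application/json": "json",
    "text/csv": "csv",
    "text/yaml": "yaml",
    "application/yaml": "yaml",
    "application/x-yaml": "yaml",
    "text/markdown": "markdown",
}


def resolve_format(content_type: Optional[str], input_format: Optional[str] = None) -> Optional[str]:
    """Hypothetical resolution order: explicit input_format first, then Content-Type."""
    if input_format:
        return input_format
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in CONTENT_TYPE_FORMATS:
        return CONTENT_TYPE_FORMATS[media_type]
    # application/*+json covers suffixed types such as application/problem+json.
    if media_type.startswith("application/") and media_type.endswith("+json"):
        return "json"
    return None  # unrecognized: a caller would fall back to inspecting the body


print(resolve_format("text/plain", input_format="csv"))
```

In this sketch a generic type such as text/plain resolves to nothing on its own, which is the situation where setting input_format helps.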
CSV normalization
CSV input is parsed as headered rectangular data. The first row defines the column names, and each later row becomes an object.
YAML normalization
YAML input is normalized using the JSON-compatible subset of YAML.
Markdown normalization
Markdown input is handled in document mode. DataHarbor preserves the body as content and extracts YAML front matter into frontmatter when present.
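To make the document shape concrete, here is a minimal sketch that splits front matter from the body. It assumes front matter is delimited by `---` lines at the very start of the input. To stay within the standard library it keeps the front matter as raw text instead of parsing it as YAML. The function name split_markdown is hypothetical, and the real normalized shape may carry more detail than the two keys named above.

```python
from typing import Optional, TypedDict


class MarkdownDocument(TypedDict):
    content: str
    frontmatter: Optional[str]  # raw YAML text in this sketch; a real parser would decode it


def split_markdown(text: str) -> MarkdownDocument:
    """Hypothetical helper: separate a leading '---' block from the Markdown body."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return {
                    "content": "\n".join(lines[i + 1:]).strip(),
                    "frontmatter": "\n".join(lines[1:i]),
                }
    # No front matter, or no closing delimiter: the whole text is the body.
    return {"content": text.strip(), "frontmatter": None}


doc = split_markdown("---\ntitle: Q3 report\n---\n# Revenue\nUp 4%.\n")
print(doc)
```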
Body sniffing
When the upstream Content-Type header is missing or unrecognized and no input_format is declared, DataHarbor inspects the response body to infer the format.
- Starts with { or [ → JSON
- Two rows with matching comma-separated field counts → CSV
- Starts with --- or a key: mapping → YAML
- Starts with valid front matter plus non-empty body, or starts with # → Markdown
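The heuristics above could be approximated as follows. This is a sketch under several assumptions, not DataHarbor's detector: the source does not state the order in which the checks run, so the order here is chosen only to keep the example inputs unambiguous (front matter with a body is tested before plain YAML because both begin with `---`). The front matter check looks for delimiters only and does not validate the YAML inside. All function names are hypothetical.

```python
import csv
import io
import re
from itertools import islice
from typing import Optional


def has_front_matter_and_body(text: str) -> bool:
    """True when text opens with a closed '---' block followed by non-empty content."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return any(line.strip() for line in lines[i + 1:])
    return False


def looks_like_csv(text: str) -> bool:
    """True when the first two rows have the same field count (more than one field)."""
    try:
        rows = list(islice(csv.reader(io.StringIO(text)), 2))
    except csv.Error:
        return False
    return len(rows) == 2 and len(rows[0]) > 1 and len(rows[0]) == len(rows[1])


def sniff_format(body: str) -> Optional[str]:
    """Hypothetical sniffer; the check order is an assumption, not documented behavior."""
    text = body.lstrip()
    if not text:
        return None
    if text[0] in "{[":
        return "json"
    if text.startswith("#") or has_front_matter_and_body(text):
        return "markdown"
    first_line = text.splitlines()[0]
    if text.startswith("---") or re.match(r"[\w-]+:(\s|$)", first_line):
        return "yaml"
    if looks_like_csv(text):
        return "csv"
    return None


print(sniff_format("id,email\n1,ana@example.com\n"))
```

Heuristics like these can misfire on ambiguous bodies, which is one reason to declare input_format when the upstream content type is unreliable.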
Why normalization matters
- Controls always run against normalized JSON, not raw upstream bytes.
- The same field targeting rules work across source formats.
- Output formatting happens later, so you can normalize from one format and respond in another.
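To illustrate the points above, the sketch below runs one hypothetical control, mask_field, unchanged on records that began as CSV and as JSON, then renders the result as JSON. The control and its masking rule are invented for illustration and do not represent DataHarbor's control syntax.

```python
import csv
import io
import json


def mask_field(records: list[dict], field: str) -> list[dict]:
    """Hypothetical control: replace one field's value in every record that has it."""
    return [{**r, field: "***"} if field in r else r for r in records]


# Two upstream payloads in different formats that carry the same data.
from_csv = [dict(row) for row in csv.DictReader(io.StringIO("id,email\n1,ana@example.com\n"))]
from_json = json.loads('[{"id": "1", "email": "ana@example.com"}]')

# Both normalize to the same shape, so the same field targeting applies to each.
masked_csv = mask_field(from_csv, "email")
masked_json = mask_field(from_json, "email")

# Output formatting is a separate step: CSV came in, JSON goes out.
print(json.dumps(masked_csv))
```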
Next steps
- Data Pipeline: See where normalization fits in the request flow.
- Output Formatting: Learn how the final governed payload is rendered.
- Markdown Input: Dive into Markdown-specific rules and limitations.

