Core Concepts
Virtual API
A Virtual API is a published, governed view of an existing dataset. It’s the fundamental unit in DataHarbor. Each Virtual API:
- Points to a single enrolled data source
- Applies data controls, transforms, and access rules
- Can be delivered via REST, MCP, or Data Lake
- Is versioned, auditable, and revocable
Object Definitions
Virtual APIs use a YAML configuration with an objects key containing named object definitions. Each object definition specifies the ordered controls pipeline for that resource type.
See Virtual APIs for a full example and Data Pipeline for how object matching fits into request processing.
URL Matching
DataHarbor automatically applies the correct object definition based on the API URL. Object names must match URL segments exactly. For a request to /properties/{propertyId}/inspections/{inspectionId}/issues, the system:
- Parses the URL path segments
- Checks whether the last path segment matches a defined object name (in this case, issues)
- If the last segment does not match, checks the second-to-last path segment instead
If neither segment matches, _default is used when present.
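The matching steps above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not DataHarbor's implementation: the function name match_object and the idea of passing the defined object names as a set are assumptions made for the example, and query strings are not handled.

```python
from typing import Optional


def match_object(path: str, object_names: set[str]) -> Optional[str]:
    """Return the name of the object definition for a request path.

    Hypothetical helper: checks the last path segment, then the
    second-to-last, then falls back to "_default" if it is defined.
    """
    # Split the path into its non-empty segments.
    segments = [s for s in path.split("/") if s]

    # Look at the last segment first, then the one before it.
    for candidate in reversed(segments[-2:]):
        if candidate in object_names:  # exact match only
            return candidate

    # No segment matched: use the catch-all when present.
    return "_default" if "_default" in object_names else None
```

For a path such as /properties/p1/inspections/i9/issues with an issues object defined, the last segment matches directly. For /properties/p1/inspections/i9, the last segment is an ID, so the second-to-last segment (inspections) is the one that can match.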
The _default Object
_default is an optional catch-all for any object not explicitly defined.
When a request does not match any named object, the _default controls apply. When a request matches a named object, only that named object’s controls apply.
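The point that control lists are selected rather than merged can be shown with a small sketch. The objects mapping and the control labels in it are hypothetical placeholders, not DataHarbor's actual configuration schema.

```python
from typing import Optional


def select_controls(matched_name: Optional[str], objects: dict) -> list:
    """Pick one ordered control list for a request; lists are never merged."""
    # A named object wins outright: its controls are the only ones used.
    if matched_name is not None and matched_name in objects:
        return objects[matched_name]
    # Otherwise use the catch-all, or nothing if _default is not defined.
    return objects.get("_default", [])


# Hypothetical object definitions, written as a Python dict for illustration.
example_objects = {
    "issues": ["redact reporter_email"],
    "_default": ["mask phone"],
}
```

With these definitions, a request matched to issues gets only the issues list; the _default entry contributes nothing to it.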
Control Blocks
Control Blocks are declarative rules applied to a Virtual API. There are three categories:

| Block | Purpose |
|---|---|
| Data Control | Privacy operations: redact, tokenize, anonymize, mask, hash |
| Data Transform | Field transformations: combine, coalesce, delete |
| Access Control | Access rules: geo restrictions, expiration, shutdown |
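To make the Data Control and Data Transform categories concrete, here is a minimal sketch of a few operations from the table applied in order to a single record. The function names, the "[REDACTED]" placeholder, and the use of unsalted SHA-256 for hash are all assumptions for illustration; DataHarbor's actual controls are configured declaratively and may behave differently.

```python
import hashlib
from typing import Callable

Record = dict
Control = Callable[[Record], Record]


def redact(field: str) -> Control:
    """Data Control: replace a field's value with a fixed placeholder."""
    def apply(record: Record) -> Record:
        out = dict(record)  # copy so the input record is not mutated
        if field in out:
            out[field] = "[REDACTED]"
        return out
    return apply


def hash_value(field: str) -> Control:
    """Data Control: replace a field's value with its SHA-256 hex digest."""
    def apply(record: Record) -> Record:
        out = dict(record)
        if field in out:
            out[field] = hashlib.sha256(str(out[field]).encode("utf-8")).hexdigest()
        return out
    return apply


def combine(fields: list, target: str, sep: str = " ") -> Control:
    """Data Transform: join several fields into one new field."""
    def apply(record: Record) -> Record:
        out = dict(record)
        out[target] = sep.join(str(out[f]) for f in fields if f in out)
        return out
    return apply


def delete(field: str) -> Control:
    """Data Transform: drop a field if it exists."""
    def apply(record: Record) -> Record:
        out = dict(record)
        out.pop(field, None)
        return out
    return apply


def run_controls(record: Record, controls: list) -> Record:
    """Apply each control in order; later controls see earlier results."""
    for control in controls:
        record = control(record)
    return record
```

Order matters in a pipeline like this: combining first and last into name has to happen before those source fields are deleted.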
Data Pipeline
Every request flows through the same high-level pipeline:
- Authenticate and authorize the caller
- Fetch the upstream payload
- Normalize it into DataHarbor’s canonical JSON model
- Match the correct object definition
- Run the ordered controls pipeline
- Format the result for delivery
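The pipeline stages above can be read as a simple function composition. The sketch below wires the six steps together with caller-supplied functions; handle_request and every parameter name are hypothetical, and the real stages are internal to DataHarbor.

```python
from typing import Any, Callable


def handle_request(
    request: dict,
    *,
    authorize: Callable[[dict], bool],
    fetch: Callable[[dict], Any],
    normalize: Callable[[Any], Any],
    match: Callable[[str], list],
    fmt: Callable[[Any], Any],
) -> Any:
    """Run one request through the six stages, in the order listed."""
    # 1. Authenticate and authorize the caller.
    if not authorize(request):
        raise PermissionError("caller is not authorized")
    # 2. Fetch the upstream payload.
    raw = fetch(request)
    # 3. Normalize it into a canonical in-memory model.
    data = normalize(raw)
    # 4. Match the object definition, yielding its ordered controls.
    controls = match(request["path"])
    # 5. Run the controls in order.
    for control in controls:
        data = control(data)
    # 6. Format the result for delivery.
    return fmt(data)
```

Because authorization runs first, an unauthorized request fails before any upstream data is fetched.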
Control Set
A Control Set is the complete declarative configuration of a Virtual API: source, controls, delivery, and access rules combined.
Enrollment
Enrollment is the process of connecting an existing data source to DataHarbor. DataHarbor currently supports REST APIs, including JSON, CSV, YAML, and Markdown payloads; data lakes are coming soon.
Delivery
Delivery is how consumers access a Virtual API:
- REST API: Standard HTTP endpoint with API key auth
- MCP Server: Model Context Protocol endpoint for AI agents
- Data Lake: Scheduled sync to Fabric, Databricks, Snowflake, BigQuery
Governance
Every Virtual API includes governance primitives:
- Versioning: Track what changed, when, under which policy
- Expiration: Set end dates for access
- Revocation: Cut off access instantly
- Audit trail: Correlate access with policy versions
- Organization Authorizations: Authorize partner organizations for relay access to a Virtual API by org ID. The partner’s existing marketplace key works automatically; usage is billed to their quota.
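As a rough illustration of how expiration, revocation, and organization authorization could combine into a single access decision, consider the sketch below. The AccessPolicy class, its field names, and the order of the checks are assumptions made for this example, not DataHarbor's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class AccessPolicy:
    """Hypothetical governance state for one Virtual API."""
    expires_at: Optional[datetime] = None  # None means no end date
    revoked: bool = False
    authorized_org_ids: set = field(default_factory=set)


def is_access_allowed(policy: AccessPolicy, org_id: str, now: datetime) -> bool:
    """Return True only if access is not revoked, not expired, and the org is authorized."""
    # Revocation cuts off access regardless of anything else.
    if policy.revoked:
        return False
    # Expiration: access ends at the configured end date.
    if policy.expires_at is not None and now >= policy.expires_at:
        return False
    # Organization authorization is checked by org ID.
    return org_id in policy.authorized_org_ids
```

Passing now in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.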

