Data Lake Delivery is coming soon. This page describes planned functionality.

Data Lake Delivery

Governed data, wherever it needs to go.

Mirror governed Virtual API data to modern data platforms. Privacy controls execute before data leaves DataHarbor, so your lake receives already-governed data.

Supported destinations

Platform      Status
Azure Fabric  Coming Soon
Databricks    Coming Soon
Snowflake     Coming Soon
BigQuery      Coming Soon

How it works

virtual_api:
  name: analytics-export
  source: customer-api

  controls:
    - type: tokenize
      fields: [email, phone]
    - type: redact
      fields: [ssn]

  delivery:
    type: data_lake
    destination:
      type: databricks
      catalog: main
      schema: governed_data
      table: customers
    
    sync:
      schedule: daily
      time: "02:00"
      timezone: UTC

Sync options

Scheduled sync

Run on a schedule:
sync:
  schedule: daily  # one of: hourly, daily, weekly
  time: "02:00"
  timezone: UTC
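
To make the schedule fields concrete, here is a minimal Python sketch of how a next run time could be derived from `schedule`, `time`, and `timezone`. It is not DataHarbor's scheduler. The function names are hypothetical, and two behaviors are assumptions the page does not state: hourly runs fire at the minute given in `time`, and weekly runs fire on Mondays.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def resolve_timezone(name: str):
    # "UTC" is handled without the tz database so the sketch runs anywhere.
    return timezone.utc if name.upper() == "UTC" else ZoneInfo(name)


def next_run(now: datetime, schedule: str, at: str = "00:00", tz: str = "UTC") -> datetime:
    """Return the first run strictly after `now` (which must be timezone-aware)."""
    local_now = now.astimezone(resolve_timezone(tz))
    hour, minute = (int(part) for part in at.split(":"))

    if schedule == "hourly":
        # Assumption: hourly runs use only the minute component of `time`.
        candidate = local_now.replace(minute=minute, second=0, microsecond=0)
        step = timedelta(hours=1)
    elif schedule == "daily":
        candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        step = timedelta(days=1)
    elif schedule == "weekly":
        # Assumption: weekly runs fire on Mondays; the page does not name a day.
        candidate = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        candidate -= timedelta(days=candidate.weekday())
        step = timedelta(weeks=1)
    else:
        raise ValueError(f"unsupported schedule: {schedule!r}")

    # Step forward until the candidate lies in the future.
    while candidate <= local_now:
        candidate += step
    return candidate
```

For example, `next_run(datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc), "daily", "02:00", "UTC")` returns 02:00 UTC on 2 January 2024, because that day's 02:00 slot has already passed.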

On-demand sync

Trigger manually via API:
POST /v1/virtual-apis/analytics-export/sync
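
As a rough illustration, the request above could be built in Python with the standard library. Only the path comes from this page. The base URL, the bearer-token header, and the function name `build_sync_request` are assumptions made for the sketch.

```python
import urllib.request
from urllib.parse import quote


def build_sync_request(base_url: str, api_name: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that triggers a sync."""
    # Path as documented above; the Virtual API name is URL-escaped.
    url = f"{base_url.rstrip('/')}/v1/virtual-apis/{quote(api_name, safe='')}/sync"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the page shows no request payload
        method="POST",
        # Assumption: bearer-token auth. The page does not describe authentication.
        headers={"Authorization": f"Bearer {api_token}"},
    )
```

Passing the result to `urllib.request.urlopen(...)` would send it. The response format is not described on this page, so the sketch stops at building the request.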

Incremental updates

Sync only what’s changed:
sync:
  mode: incremental
  watermark_field: updated_at
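
The general idea behind a watermark-based incremental sync can be sketched in a few lines of Python. This shows the common technique, not DataHarbor's internals. The function name and the in-memory row format are hypothetical.

```python
from datetime import datetime, timezone


def select_changed(rows, watermark, watermark_field="updated_at"):
    """Return rows changed since `watermark`, plus the watermark for the next run."""
    # Keep only rows whose watermark field is strictly newer than the last sync.
    changed = [row for row in rows if row[watermark_field] > watermark]
    # Advance the watermark to the newest value seen; keep it if nothing changed.
    new_watermark = max((row[watermark_field] for row in changed), default=watermark)
    return changed, new_watermark


rows = [
    {"id": "1", "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "2", "updated_at": datetime(2024, 1, 3, tzinfo=timezone.utc)},
]
last_sync = datetime(2024, 1, 2, tzinfo=timezone.utc)
changed, last_sync = select_changed(rows, last_sync)
```

After this run, `changed` holds only the row with id `"2"`, and `last_sync` has advanced to that row's `updated_at`. A strict `>` comparison can miss rows that share the watermark timestamp but are written after the sync reads, which is worth keeping in mind when choosing the watermark field.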

Schema mapping

DataHarbor automatically translates your Virtual API schema to native table structures:
delivery:
  type: data_lake
  destination:
    type: snowflake
    database: ANALYTICS
    schema: GOVERNED
    table: CUSTOMERS
  
  schema_mapping:
    id: { type: VARCHAR, primary_key: true }
    email: { type: VARCHAR }  # Contains tokens, not raw emails
    created_at: { type: TIMESTAMP }
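
One way to picture the translation is a function that turns a `schema_mapping` like the one above into a `CREATE TABLE` statement. The Python sketch below is illustrative only. The page does not describe the DDL DataHarbor generates, the function name is hypothetical, and identifier quoting is left out for brevity.

```python
def create_table_ddl(table: str, schema_mapping: dict) -> str:
    """Render a CREATE TABLE statement from a schema_mapping-style dict."""
    # One "name TYPE" entry per mapped field, in declaration order.
    columns = [f"{name} {spec['type']}" for name, spec in schema_mapping.items()]
    # Collect fields flagged as primary keys into a single table constraint.
    keys = [name for name, spec in schema_mapping.items() if spec.get("primary_key")]
    if keys:
        columns.append(f"PRIMARY KEY ({', '.join(keys)})")
    return f"CREATE TABLE {table} (\n  " + ",\n  ".join(columns) + "\n)"


mapping = {
    "id": {"type": "VARCHAR", "primary_key": True},
    "email": {"type": "VARCHAR"},  # holds tokens, not raw emails
    "created_at": {"type": "TIMESTAMP"},
}
ddl = create_table_ddl("ANALYTICS.GOVERNED.CUSTOMERS", mapping)
```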

Pre-applied controls

Controls are applied before data lands in your lake:
Source field  Control    Lake receives
email         tokenize   tok_abc123
ssn           redact     ""
phone         anonymize  555-0123 (random)
Your data lake never sees raw sensitive data.
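
The effect of applying controls before delivery can be sketched in Python as follows. The control types and the `fields` layout mirror the configuration on this page. Everything else is an assumption: the page does not say how tokens are derived, so the sketch uses a truncated HMAC-SHA256 digest, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def tokenize(value: str, key: bytes) -> str:
    # Assumption: a keyed, deterministic token; the real scheme is not documented here.
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:12]


def redact(value: str, key: bytes) -> str:
    # Redaction replaces the value with an empty string, as in the table above.
    return ""


def anonymize(value: str, key: bytes) -> str:
    # Assumption: a random placeholder phone number in the 555 range.
    return f"555-{secrets.randbelow(10000):04d}"


CONTROL_FUNCTIONS = {"tokenize": tokenize, "redact": redact, "anonymize": anonymize}


def apply_controls(record: dict, controls: list, key: bytes) -> dict:
    """Return a governed copy of `record`; the input is left untouched."""
    governed = dict(record)
    for control in controls:
        transform = CONTROL_FUNCTIONS[control["type"]]
        for field in control["fields"]:
            if field in governed:
                governed[field] = transform(governed[field], key)
    return governed
```

Only the output of `apply_controls` would be handed to the delivery step, so the raw `email`, `ssn`, and `phone` values never reach the destination table.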

Connection configuration

Databricks

destination:
  type: databricks
  host: adb-123456789.azuredatabricks.net
  http_path: /sql/1.0/warehouses/abc123
  token: ${DATABRICKS_TOKEN}
  catalog: main
  schema: governed_data

Snowflake

destination:
  type: snowflake
  account: xy12345.us-east-1
  warehouse: COMPUTE_WH
  database: ANALYTICS
  schema: GOVERNED
  user: ${SNOWFLAKE_USER}
  password: ${SNOWFLAKE_PASSWORD}
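
The `${...}` placeholders in both connection examples keep credentials out of the config file. This page does not say how they are resolved. The Python sketch below assumes they are filled from environment variables and that a missing variable should fail loudly. The function name is hypothetical.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_secrets(config: dict, env=None) -> dict:
    """Return a copy of `config` with ${NAME} placeholders replaced from `env`."""
    env = os.environ if env is None else env

    def lookup(name: str) -> str:
        # Fail fast rather than sending an unresolved placeholder as a credential.
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    def substitute(value):
        if isinstance(value, dict):
            return {key: substitute(item) for key, item in value.items()}
        if isinstance(value, str):
            return _PLACEHOLDER.sub(lambda match: lookup(match.group(1)), value)
        return value  # non-string scalars pass through unchanged

    return substitute(config)
```

Calling `resolve_secrets({"type": "databricks", "token": "${DATABRICKS_TOKEN}"})` with `DATABRICKS_TOKEN` exported in the shell returns the same mapping with the real token in place.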

Next steps