Early Access — Now Available

Document infrastructure
for the machine age

SDF bundles structured data and a human-readable PDF into a single, signed, machine-verifiable file. Build workflows that eliminate OCR and manual data entry.

Free tier includes 100 uploads/mo · No credit card required

Open Format · Zero Dependencies

One file.
PDF, data, and proof.

Every .sdf is a self-contained capsule — human-readable PDF and machine-readable JSON, cryptographically linked. Offline-ready. Verifiable forever.

Open specification · Works with any language · Self-hosted option

No more extraction pipelines

Zero OCR.
Zero re-entry.

Stop rebuilding data you already have. SDF embeds structured data where the document lives. One parse call returns typed, validated business data — instantly.

SAP · Oracle · Any ERP — native connectors included

parse-invoice.ts
import { SdfClient } from '@etapsky/cloud-sdk'

const sdf = new SdfClient({ apiKey: process.env.SDF_API_KEY })

// Parse structured data — zero OCR, zero guessing
const { data, meta } = await sdf.documents.parse(invoiceFile)

console.log(data.invoice.total)         // → 12500
console.log(data.invoice.issuer.name)   // → "Acme Corp"
console.log(meta.verified)              // → true
invoice_2026_03.sdf
ZIP archive · 284 KB · ECDSA-P256 · Verified
├── visual.pdf      Human layer
├── data.json       Machine layer
├── schema.json     Validation rules
├── meta.json       Identity & metadata
└── signature.sig   Cryptographic proof
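To make "cryptographically linked" concrete, here is a minimal sketch of offline verification using Node's built-in crypto module and an ECDSA P-256 key pair. It rests on assumptions: the signed payload layout (each entry's name and length followed by its bytes, in a fixed order) and the names `SdfEntries`, `SIGNED_ENTRIES`, and `signedPayload` are hypothetical and not taken from the SDF specification, which defines the real signing rules.

```typescript
import { generateKeyPairSync, sign, verify } from 'node:crypto'

// Hypothetical in-memory view of an unpacked .sdf archive (entry name -> bytes).
type SdfEntries = Record<string, Buffer>

// ASSUMPTION: the signature covers these entries in this fixed order.
// The actual SDF specification may define the signed payload differently.
const SIGNED_ENTRIES = ['visual.pdf', 'data.json', 'schema.json', 'meta.json']

// Build the byte string to sign: each entry is prefixed with its name and length.
function signedPayload(entries: SdfEntries): Buffer {
  const parts: Buffer[] = []
  for (const name of SIGNED_ENTRIES) {
    const body = entries[name]
    if (!body) throw new Error(`missing entry: ${name}`)
    parts.push(Buffer.from(`${name}:${body.length}\n`), body)
  }
  return Buffer.concat(parts)
}

// Demo key pair on the P-256 curve (OpenSSL calls it prime256v1).
const { publicKey, privateKey } = generateKeyPairSync('ec', { namedCurve: 'prime256v1' })

// Placeholder contents standing in for a real archive.
const entries: SdfEntries = {
  'visual.pdf': Buffer.from('%PDF-1.7 placeholder bytes'),
  'data.json': Buffer.from(JSON.stringify({ invoice: { total: 12500 } })),
  'schema.json': Buffer.from('{}'),
  'meta.json': Buffer.from('{}'),
}

// This value plays the role of signature.sig in the sketch.
const signature = sign('sha256', signedPayload(entries), privateKey)
const intact = verify('sha256', signedPayload(entries), publicKey, signature)

// Changing a single value in data.json invalidates the signature.
const tampered: SdfEntries = {
  ...entries,
  'data.json': Buffer.from(JSON.stringify({ invoice: { total: 99999 } })),
}
const stillValid = verify('sha256', signedPayload(tampered), publicKey, signature)

console.log(intact, stillValid) // → true false
```

Because verification needs only the archive contents and the signer's public key, a check of this kind can run with no network access.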

Before · PDF world
1. Receive PDF
2. Run OCR engine
3. Clean noisy output
4. Parse & map fields
5. Handle errors & retries
6. Re-enter into ERP
Hours · Error-prone

After · SDF world
1. Receive .sdf file
2. Call .parse()
3. Typed data, ready
< 50ms · Verified

< 50ms · Parse latency (p99)
99.99% · Uptime SLA
5 SDKs · Languages supported
Open spec · Format specification

Native apps

Etapsky Workstation — desktop power

The full SDF experience on your machine: open and produce .sdf files, sync with your tenant on Etapsky Cloud, and stay current with signed, auto-updating releases.

All releases on GitHub · BUSL-1.1

The enterprise document gap

Every enterprise document workflow is caught between two incompatible formats — wasting engineering hours on OCR, reconciliation, and duplicate pipelines.

PDF

Human-readable. Machine-hostile.

PDF was designed for print. Extracting data requires fragile OCR pipelines, manual data entry teams, and constant reconciliation. Every vendor formats their PDF differently.

Pain: OCR & manual entry

JSON / XML

Machine-readable. Legally useless.

Structured data is great for systems. But a JSON invoice has no legal standing — you still need to produce a separate PDF representation, doubling your pipeline complexity.

Pain: Two formats to maintain

SDF

Both. In one signed file.

SDF embeds structured, schema-validated data and a human-readable PDF in a single signed file. One file serves both machines and humans — verified, tamper-proof, one format forever.

Solution: SDF

From document to data in three steps

SDF eliminates the entire OCR layer from your document workflow. Produce once, parse anywhere — with type safety from the start.

01

Produce

Use our SDK or CLI to create SDF documents. Provide your structured data and an existing PDF — or let SDF generate the PDF from your data automatically.

TypeScript, Python, CLI, REST API

02

Distribute

Upload to Etapsky for hosted storage, sharing, and validation. Or self-host with sdf-server. Webhooks fire on every upload, parse, or verification event.

S3-compatible storage · Webhooks · CDN
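
As an illustration of consuming those webhooks, the sketch below validates an incoming request body and dispatches on the event type. The payload shape is an assumption: the field names `type`, `documentId`, and `occurredAt`, and the helpers `parseWebhookBody` and `describeEvent`, are hypothetical and not taken from the Etapsky API reference. Only the three event kinds (upload, parse, verification) come from the text above.

```typescript
// ASSUMPTION: hypothetical event shape, not the documented Etapsky payload.
const EVENT_TYPES = ['upload', 'parse', 'verification'] as const
type SdfEventType = (typeof EVENT_TYPES)[number]

interface SdfWebhookEvent {
  type: SdfEventType
  documentId: string
  occurredAt: string // ISO 8601 timestamp
}

// Narrow an unknown value to one of the three known event types.
function isEventType(value: unknown): value is SdfEventType {
  return EVENT_TYPES.some((t) => t === value)
}

// Validate the raw request body before trusting any field in it.
function parseWebhookBody(raw: string): SdfWebhookEvent {
  const body: unknown = JSON.parse(raw)
  if (typeof body !== 'object' || body === null) {
    throw new Error('webhook body must be a JSON object')
  }
  const { type, documentId, occurredAt } = body as Record<string, unknown>
  if (!isEventType(type)) {
    throw new Error(`unknown event type: ${String(type)}`)
  }
  if (typeof documentId !== 'string' || typeof occurredAt !== 'string') {
    throw new Error('documentId and occurredAt must be strings')
  }
  return { type, documentId, occurredAt }
}

// Dispatch on the event type; the switch covers every member of the union.
function describeEvent(event: SdfWebhookEvent): string {
  switch (event.type) {
    case 'upload':
      return `stored ${event.documentId}`
    case 'parse':
      return `parsed ${event.documentId}`
    case 'verification':
      return `verified ${event.documentId}`
  }
}

const event = parseWebhookBody(
  '{"type":"parse","documentId":"doc_123","occurredAt":"2026-03-20T09:00:00Z"}',
)
console.log(describeEvent(event)) // → parsed doc_123
```

Validating the body up front keeps the dispatch code free of defensive checks, and adding a new event type later means extending one array and one switch.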

03

Parse

Recipients call `parse()` once and get fully typed, schema-validated structured data back. No OCR, no regex, no guesswork — pure deterministic extraction.

Schema validation · Type-safe · < 50ms
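
To show what "typed, schema-validated" can mean on the consuming side, here is a minimal hand-written sketch: a type guard plus a consistency check on the invoice total. It is an illustration under assumptions, not how the SDK validates. The real rules live in each file's schema.json, and the names `isInvoice`, `isLineItem`, and `totalMatchesItems` are hypothetical. The field names mirror the invoice fields used in the produce example on this page.

```typescript
interface LineItem {
  description: string
  quantity: number
  unitPrice: number
}

interface Invoice {
  id: string
  items: LineItem[]
  total: number
  currency: string
}

// Runtime check that an unknown value has the shape of a line item.
function isLineItem(value: unknown): value is LineItem {
  if (typeof value !== 'object' || value === null) return false
  const o = value as Record<string, unknown>
  return (
    typeof o.description === 'string' &&
    typeof o.quantity === 'number' &&
    typeof o.unitPrice === 'number'
  )
}

// Runtime check that an unknown value has the shape of an invoice.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== 'object' || value === null) return false
  const o = value as Record<string, unknown>
  return (
    typeof o.id === 'string' &&
    typeof o.currency === 'string' &&
    typeof o.total === 'number' &&
    Array.isArray(o.items) &&
    o.items.every(isLineItem)
  )
}

// Business rule: the stated total must equal the sum of the line items
// (compared with a small tolerance to absorb floating-point rounding).
function totalMatchesItems(invoice: Invoice): boolean {
  const sum = invoice.items.reduce((acc, item) => acc + item.quantity * item.unitPrice, 0)
  return Math.abs(sum - invoice.total) < 0.005
}

const candidate: unknown = {
  id: 'INV-2026-001',
  items: [{ description: 'SDF Cloud Pro License', quantity: 1, unitPrice: 12_500 }],
  total: 12_500,
  currency: 'USD',
}

if (isInvoice(candidate)) {
  // Inside this branch the compiler treats candidate as Invoice.
  console.log(candidate.id, totalMatchesItems(candidate)) // → INV-2026-001 true
}
```

The guard is deterministic: the same input always produces the same verdict.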

One SDK. Every platform.

The same ergonomic API across TypeScript, Python, Go, and the CLI. Produce, parse, validate, and sign SDF documents with full type safety in minutes.

  • Type-safe structured data extraction
  • Schema validation against the SDF registry
  • Cryptographic document signing & verification
  • Streaming support for large documents
  • Webhook integration for async workflows
produce-invoice.ts
import { SdfClient } from '@etapsky/cloud-sdk'

const sdf = new SdfClient({ apiKey: process.env.SDF_API_KEY })

// Produce a signed SDF document
const doc = await sdf.documents.produce({
  schema: 'invoice@1.0',
  data: {
    id: 'INV-2026-001',
    issuer: { name: 'Acme Corp', taxId: 'TR-12345678' },
    recipient: { name: 'Beta Ltd', taxId: 'TR-98765432' },
    items: [
      { description: 'SDF Cloud Pro License', quantity: 1, unitPrice: 12_500 },
    ],
    total: 12_500,
    currency: 'USD',
    issuedAt: '2026-03-20T09:00:00Z',
  },
  pdf: existingPdfBuffer, // optional — auto-generated if omitted
  sign: true,
})

console.log(doc.id)        // → "doc_01JNKX9P4M..."
console.log(doc.url)       // → "https://cdn.etapsky.com/..."
console.log(doc.verified)  // → true

Documents that work for everyone

Any document that needs to be both human-readable and machine-parseable is a perfect fit for SDF.

~40hrs saved per week per AP clerk

E-Invoicing

Produce legally compliant invoices that contain both the official PDF representation and machine-readable line items, totals, VAT, and counterparty data. Ingest supplier invoices automatically — no data entry team required.

Finance · Accounts Payable · ERP Integration

100% structured data on first parse

HR & Nominations

Signed offer letters, onboarding documents, and nomination forms — all in one verifiable SDF file. Automate candidate data extraction and maintain a tamper-proof audit trail from application to hire.

HR · Compliance · Audit Trail

Zero OCR errors in extraction

Government Forms

Submit structured data alongside the official PDF form. Public authorities receive both the human-readable document and machine-parseable fields — eliminating re-keying from paper submissions.

Public Sector · e-Government · Compliance

< 50ms parse latency p99

Supply Chain

Packing lists, bills of lading, and customs declarations as verified SDF documents. Each file carries signed structured data alongside the human-readable PDF — machine-verifiable at every border crossing.

Logistics · Customs · Trade Finance

SDF vs everything else

No single format checked all the boxes — until SDF.

Formats compared: PDF (Adobe / ISO 32000) · JSON / XML (Structured data only) · SDF (Smart Document Format)

Features compared:

  • Machine-readable structured data
  • Human-readable (PDF-quality)
  • Cryptographic signature & verification
  • Schema validation
  • Zero-OCR data extraction
  • Single distributable file
  • Multi-language SDKs
  • Tamper-proof audit trail
  • Open specification

Legend: Yes · Partial · No

Start building with SDF today

The free tier includes 100 uploads per month, 500 MB storage, and full SDK access. No credit card required.

Free tier — no card
Up and running in < 5 min
Open-source spec