
ACORD Form Extraction

The @openinsure/acord package provides typed data models for ACORD standard forms and an AI extraction pipeline that converts uploaded PDFs and images into structured submission data.


| Form | Title | Use |
| --- | --- | --- |
| ACORD 125 | Commercial Lines Application | GL, Property, Umbrella new business |
| ACORD 200 | Commercial Lines — Workers Compensation | WC new business |
| ACORD 126 | Commercial General Liability Section | Standalone GL supplement |
| Supplemental | Program-specific supplements | Additional risk questions per program |

Every extracted form produces a typed object:

import type { ACORD125, ACORD200, ExtractionResult } from '@openinsure/acord';

// Shape of the exported ACORD125 type, shown here for reference.
interface ACORD125 {
  // Applicant
  namedInsured: string;
  mailingAddress: Address;
  naicsCode?: string;
  annualRevenue?: number;
  yearsInBusiness?: number;

  // Coverage requested
  requestedLines: ('GL' | 'PROPERTY' | 'UMBRELLA' | 'CRIME' | 'INLAND_MARINE')[];
  occurrenceLimit: number;
  aggregateLimit: number;
  deductible: number;
  effectiveDate: string;

  // Loss history
  lossHistory: LossYear[];

  // Extraction metadata
  extractionConfidence: number; // 0.0 – 1.0
  rawText: string;              // Full OCR text for manual review fallback
  sourceFile: string;           // R2 key of the uploaded PDF
}

extractionConfidence is an aggregate score (0–1): a confidence value is calculated for each field, and the values are averaged across the form. A score below 0.75 triggers a review flag in the UW queue, and an underwriter manually verifies the highlighted fields before the submission is created.
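
The aggregation described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `FieldConfidence` shape, the `summarizeConfidence` name, and the choice of an unweighted mean are assumptions. Only the 0.75 threshold comes from the text.

```typescript
// Threshold taken from the text above; everything else is illustrative.
const REVIEW_THRESHOLD = 0.75;

// Hypothetical shape: field name -> confidence in [0, 1].
type FieldConfidence = Record<string, number>;

interface ConfidenceSummary {
  extractionConfidence: number; // unweighted mean of the per-field scores (assumption)
  reviewRequired: boolean;      // true when the mean falls below the threshold
}

function summarizeConfidence(fields: FieldConfidence): ConfidenceSummary {
  const scores = Object.values(fields);
  if (scores.length === 0) {
    // Nothing was extracted, so treat the form as needing review.
    return { extractionConfidence: 0, reviewRequired: true };
  }
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return { extractionConfidence: mean, reviewRequired: mean < REVIEW_THRESHOLD };
}
```

Note that a form sitting exactly at 0.75 is not flagged in this sketch, which matches the "below 0.75" wording.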


  1. Upload: Producer uploads a PDF via the portal, or it arrives via the mailbox ingest cron
  2. OCR: The PDF is rendered to images via the Browser binding and passed to Workers AI (Llama 3 vision) for OCR text extraction
  3. Field extraction: Claude (claude-haiku-4-5) maps OCR text to the ACORD125 / ACORD200 schema using a structured output prompt
  4. Validation: Zod schema validation catches missing required fields and type mismatches
  5. Confidence scoring: Per-field confidence is calculated based on regex match strength and Claude’s self-reported uncertainty
  6. Result: Structured ExtractionResult stored in PlanetScale; source PDF stored in R2
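
Steps 2 through 6 can be pictured as a chain of injected stages. The sketch below is an assumption-heavy outline: `PipelineStages`, `PipelineResult`, and `runExtraction` are hypothetical names, and the real package may wire these steps differently. Each stage is passed in as a function so the flow can be exercised without Workers AI, Claude, Zod, or a database.

```typescript
// Hypothetical result shape for illustration; not the package's ExtractionResult.
interface PipelineResult<T> {
  data: T;
  fieldConfidence: Record<string, number>;
  rawText: string;
}

// Hypothetical stage signatures; none of these names come from @openinsure/acord.
interface PipelineStages<T> {
  ocr: (pdf: Uint8Array) => Promise<string>;             // step 2
  extractFields: (ocrText: string) => Promise<unknown>;  // step 3
  validate: (candidate: unknown) => T;                   // step 4, expected to throw on invalid input
  scoreFields: (data: T, ocrText: string) => Record<string, number>; // step 5
  store: (result: PipelineResult<T>) => Promise<void>;   // step 6
}

async function runExtraction<T>(
  pdf: Uint8Array,
  stages: PipelineStages<T>,
): Promise<PipelineResult<T>> {
  const rawText = await stages.ocr(pdf);
  const candidate = await stages.extractFields(rawText);
  // Validation runs before scoring so that malformed model output never gets a score.
  const data = stages.validate(candidate);
  const fieldConfidence = stages.scoreFields(data, rawText);
  const result: PipelineResult<T> = { data, fieldConfidence, rawText };
  await stages.store(result);
  return result;
}
```

Keeping the stages injectable also makes fixture-based tests straightforward, since each one can be replaced with a stub.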

POST /v1/submissions/:id/extract-acord
Authorization: Bearer <token>
Content-Type: multipart/form-data
file=@acord125.pdf
formType=ACORD125

Response:

{
  "extractionId": "ext_01J8...",
  "status": "processing",
  "estimatedSeconds": 15
}
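
A client could assemble the upload request roughly as follows. This is a sketch under assumptions: `buildExtractRequest` and `ExtractRequest` are hypothetical names, and the base URL is supplied by the caller. The path, the bearer header, and the `file` and `formType` fields come from the request shown above.

```typescript
// Hypothetical request description; it can be spread into a fetch() call.
interface ExtractRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: FormData;
}

function buildExtractRequest(
  baseUrl: string,
  submissionId: string,
  token: string,
  file: Blob,
  fileName: string,
  formType: string,
): ExtractRequest {
  const body = new FormData();
  body.append('file', file, fileName);
  body.append('formType', formType);
  return {
    url: `${baseUrl}/v1/submissions/${encodeURIComponent(submissionId)}/extract-acord`,
    method: 'POST',
    // Content-Type is deliberately omitted: fetch adds the multipart boundary
    // itself when the body is a FormData instance.
    headers: { Authorization: `Bearer ${token}` },
    body,
  };
}
```

Usage would look like `const { url, ...init } = buildExtractRequest(...)` followed by `await fetch(url, init)`.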

Poll for the result:

GET /v1/submissions/:id/extract-acord/:extractionId

Response:

{
  "status": "complete",
  "extractionConfidence": 0.87,
  "reviewRequired": false,
  "data": {
    "namedInsured": "Acme Roofing LLC",
    "naicsCode": "238160",
    "annualRevenue": 2500000,
    ...
  }
}
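
Because extraction is asynchronous, a client has to poll until the status leaves "processing". The helper below is a minimal sketch: `pollExtraction`, `PollOptions`, and the default interval and attempt count are assumptions, and since only the "processing" and "complete" statuses appear above, any other status is treated as terminal and returned to the caller.

```typescript
interface ExtractionStatus {
  status: string; // "processing" and "complete" are documented above; other values are not
  extractionConfidence?: number;
  reviewRequired?: boolean;
  data?: Record<string, unknown>;
}

interface PollOptions {
  intervalMs?: number;                   // delay between attempts (assumed default)
  maxAttempts?: number;                  // give up after this many requests (assumed default)
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not have to wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollExtraction(
  getStatus: () => Promise<ExtractionStatus>,
  { intervalMs = 2000, maxAttempts = 30, sleep = defaultSleep }: PollOptions = {},
): Promise<ExtractionStatus> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await getStatus();
    // Anything other than "processing" ends the loop, including undocumented failure states.
    if (current.status !== 'processing') return current;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Extraction still processing after ${maxAttempts} attempts`);
}
```

Here `getStatus` would wrap the GET request shown above and parse its JSON body.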

Once extraction completes with confidence ≥ 0.75, apply it to auto-populate the submission fields:

POST /v1/submissions/:id/apply-extraction/:extractionId

This merges the extracted data into the draft submission. Low-confidence fields are marked with ⚠ Review in the producer portal form.
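
The merge can be pictured with the sketch below. It rests on several assumptions that the text does not confirm: extracted values overwrite draft values, a per-field confidence map is available, and the per-field cutoff reuses the form-level 0.75 threshold. The names `applyExtraction` and `MergeResult` are hypothetical.

```typescript
type Draft = Record<string, unknown>;

interface MergeResult {
  merged: Draft;
  needsReview: string[]; // field names the portal would mark for review
}

function applyExtraction(
  draft: Draft,
  extracted: Draft,
  fieldConfidence: Record<string, number>,
  fieldThreshold = 0.75, // assumption: same cutoff as the form-level score
): MergeResult {
  const merged: Draft = { ...draft };
  const needsReview: string[] = [];
  for (const [field, value] of Object.entries(extracted)) {
    // Keep the draft value when the extraction produced nothing for this field.
    if (value === undefined || value === null) continue;
    merged[field] = value; // assumption: extracted data wins over existing draft data
    // A field with no recorded confidence is treated as low-confidence.
    if ((fieldConfidence[field] ?? 0) < fieldThreshold) needsReview.push(field);
  }
  return { merged, needsReview };
}
```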


Extractions with confidence < 0.75 appear in the UW Review Queue under Extraction Review. The UW:

  1. Sees the original PDF alongside the extracted fields
  2. Corrects any flagged fields inline
  3. Clicks Approve — the corrected data is merged and the submission proceeds

If no PDF is uploaded, producers can enter ACORD data directly through the submission form in the producer portal. The form follows the ACORD 125/200 field layout and validates against the same schema.


  1. Add the form’s TypeScript interface in packages/acord/src/types/{form}.ts
  2. Create the extraction prompt in packages/acord/src/prompts/{form}.ts
  3. Add the form to the SUPPORTED_FORMS registry in packages/acord/src/index.ts
  4. Write extraction tests using fixtures in packages/acord/src/__fixtures__/ (sample PDFs + expected output)
  5. Update this page with the new form in the Supported Forms table
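
As an illustration of step 3, a registry entry might look like the sketch below. The `FormDefinition` shape, the `EXAMPLE_FORM` key, and the `isSupportedForm` helper are all hypothetical. The actual structure of SUPPORTED_FORMS in packages/acord/src/index.ts is not shown on this page and may differ.

```typescript
// Hypothetical entry shape; the real registry may carry different fields.
interface FormDefinition {
  title: string;
  buildPrompt: (ocrText: string) => string; // would normally be imported from src/prompts/{form}.ts
}

const SUPPORTED_FORMS: Record<string, FormDefinition> = {
  ACORD125: {
    title: 'Commercial Lines Application',
    buildPrompt: (ocrText) => `Map this OCR text to the ACORD125 schema:\n${ocrText}`,
  },
  // Placeholder for a newly added form (step 3).
  EXAMPLE_FORM: {
    title: 'Example Form (placeholder)',
    buildPrompt: (ocrText) => `Map this OCR text to the EXAMPLE_FORM schema:\n${ocrText}`,
  },
};

function isSupportedForm(formType: string): boolean {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(SUPPORTED_FORMS, formType);
}
```

A fixture test for step 4 could then look up the entry by form type, run the extraction against a sample PDF, and compare the result with the expected output.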