Introduction
The ParseGrid API converts unstructured documents (PDFs, scans, and screenshots) into structured JSON. Endpoints are served over HTTPS and return JSON with predictable response shapes.
All requests must be made over HTTPS. Calls to plain HTTP are rejected.
The API is versioned by URL prefix (/v1). Breaking changes are released under a new prefix; non-breaking additions ship inside the current version with backward compatibility.
Authentication
Authenticate every request with a bearer token sent in the Authorization header. Tokens are issued per project from the dashboard and should be treated as secrets.
Requests with a missing or invalid token return 401 Unauthorized. Rotate tokens regularly; the dashboard allows multiple tokens to be active at once, so you can rotate without downtime.
| Scope | Access | Description |
|---|---|---|
| extract | read/write | Submit documents and retrieve results. |
| admin | read/write | Manage project settings, models, and usage. |
| readonly | read | Retrieve results and metrics; cannot submit jobs. |
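
As a minimal sketch of the authentication scheme described above, the helper below attaches a bearer token to a request using only the Python standard library. The base URL `https://api.parsegrid.example/v1` and the function name `build_request` are assumptions made for illustration; the source only specifies the /v1 prefix, the bearer token, and the Authorization header.

```python
import urllib.request

# Hypothetical base URL (assumption): the source only specifies the /v1 prefix.
BASE_URL = "https://api.parsegrid.example/v1"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build an HTTPS request that carries the bearer token in the Authorization header."""
    if not token:
        # A missing token would be rejected with 401, so fail early instead.
        raise ValueError("a bearer token is required")
    url = f"{BASE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )
```

The request object is only constructed here, not sent. Load the token from a secret store or environment variable rather than hard-coding it.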
Errors
ParseGrid uses standard HTTP status codes. Successful requests return 2xx; client errors return 4xx; server errors return 5xx. Every error response includes a machine-readable code and a human-readable message.
| Status | Code | Meaning |
|---|---|---|
| 400 | invalid_request | The request payload is malformed or missing required fields. |
| 401 | unauthorized | The Authorization header is absent or the token is invalid. |
| 404 | not_found | The referenced job, document, or project does not exist. |
| 413 | payload_too_large | The document exceeds the 25 MB upload limit. |
| 429 | rate_limited | You have exceeded your plan's per-second rate limit. |
| 500 | internal_error | An unexpected server-side failure. Please retry with backoff. |
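
The table recommends retrying 500 responses with backoff, and a 429 usually clears once the rate window passes. The sketch below shows one common way to do this: capped exponential backoff with optional full jitter. Treating 429 as retryable, the helper names, and the default base and cap values are assumptions, not documented ParseGrid behavior.

```python
import random

# Assumption: only rate limiting (429) and server failures (500) are worth retrying;
# the other 4xx codes in the table indicate a request that must be fixed first.
RETRYABLE_STATUSES = {429, 500}


def is_retryable(status: int) -> bool:
    """Return True when a failed request may succeed if sent again later."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0,
                   jitter: bool = True) -> list[float]:
    """Return the wait time in seconds before each retry attempt.

    The delay doubles on every attempt and is capped at `cap`. With jitter
    enabled, each delay is drawn uniformly from [0, delay] so that many
    clients do not retry at the same instant.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

A caller would sleep for each delay in turn and stop retrying once the list is exhausted or a non-retryable status comes back.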
Parse Document
Submit an unstructured document (PDF, PNG, or JPG) to the extraction engine. ParseGrid performs layout analysis, OCR, and table reconstruction, then returns the extracted data as structured JSON.
| Parameter | Type | Description |
|---|---|---|
| file | binary | The document to parse (PDF, PNG, or JPG). Maximum size 25 MB. |
| model_id | string | Optional. Specify a custom-trained model for extraction. |
| ocr_engine | string | Optional. One of standard or high_res. When omitted, the project's default_ocr setting applies. |
Our engine detects complex grids and preserves nested data hierarchies. By default, it attempts to classify the document type and apply the relevant schema mapping.
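
Checking a document locally before upload avoids a round trip that would end in 400 or 413. The sketch below validates the parameters listed above and returns the optional form fields to send alongside the file. The function name, the reading of 25 MB as 25,000,000 bytes, and the file-extension check are assumptions; the source does not say how the limit is measured or how the file type is detected.

```python
from pathlib import Path
from typing import Optional

# Assumption: "25 MB" is read as 25,000,000 bytes; the server may count differently.
MAX_UPLOAD_BYTES = 25 * 1000 * 1000
# Assumption: file type is judged by extension; only the documented types are listed.
ALLOWED_SUFFIXES = {".pdf", ".png", ".jpg"}
OCR_ENGINES = {"standard", "high_res"}


def build_parse_fields(filename: str, size_bytes: int,
                       ocr_engine: Optional[str] = None,
                       model_id: Optional[str] = None) -> dict:
    """Validate a parse request locally and return its optional form fields."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("document exceeds the 25 MB upload limit")
    if ocr_engine is not None and ocr_engine not in OCR_ENGINES:
        raise ValueError(f"ocr_engine must be one of {sorted(OCR_ENGINES)}")

    # Only include optional parameters the caller actually set.
    fields = {}
    if ocr_engine is not None:
        fields["ocr_engine"] = ocr_engine
    if model_id is not None:
        fields["model_id"] = model_id
    return fields
```

For example, `build_parse_fields("invoice.pdf", 1_200_000, ocr_engine="high_res")` returns `{"ocr_engine": "high_res"}`, while a 30,000,000-byte file raises `ValueError` before anything is uploaded.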
Retrieve Result
Fetch the structured result of a previously submitted extraction job. Use this for asynchronous workflows where the parse request returns a job ID rather than inline data.
| Parameter | Type | Description |
|---|---|---|
| job_id | string | The identifier returned by POST /v1/parse. |
The status field is one of queued, processing, success, or failed. Poll every 1–2 seconds until the status is success or failed, or configure a webhook from the project settings to receive a push notification when processing completes.
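
The polling workflow can be sketched as a loop that stops on a terminal status or after a timeout. To avoid guessing at the endpoint path, the HTTP call is passed in as `fetch_job`, a hypothetical callable that takes a job ID and returns the decoded JSON response as a dict containing a `status` key. The function name, the timeout, and the 1.5-second default interval are assumptions.

```python
import time
from typing import Callable

# The two statuses after which a job no longer changes.
TERMINAL_STATUSES = {"success", "failed"}


def wait_for_result(fetch_job: Callable[[str], dict], job_id: str,
                    interval: float = 1.5, timeout: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a job until it reaches success or failed, or until the timeout passes.

    `fetch_job` is a caller-supplied function (assumption) that performs the
    actual HTTP request and returns the parsed response body.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

The caller still has to inspect the returned status, since a failed job is returned rather than raised. For high volumes, a webhook avoids the repeated requests altogether.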
Project Settings
Read and update project-level configuration: default OCR engine, retention window, allowed file types, and trusted callback URLs.
| Field | Type | Description |
|---|---|---|
| default_ocr | string | Engine used when a request omits ocr_engine. |
| retention_days | integer | How long uploaded documents and results are retained. |
| callback_url | string | Webhook URL that receives extraction completion events. |
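
A settings update can be assembled from only the fields that change. The sketch below builds such a payload from the three fields in the table. It assumes the API accepts partial updates, that default_ocr takes the same values as ocr_engine, and that retention_days must be at least 1; none of these is stated in the source, and the function name is illustrative.

```python
from typing import Optional

# Assumption: default_ocr accepts the same values as the ocr_engine parameter.
OCR_ENGINES = {"standard", "high_res"}


def build_settings_update(default_ocr: Optional[str] = None,
                          retention_days: Optional[int] = None,
                          callback_url: Optional[str] = None) -> dict:
    """Return a payload containing only the settings the caller wants to change."""
    update = {}
    if default_ocr is not None:
        if default_ocr not in OCR_ENGINES:
            raise ValueError(f"default_ocr must be one of {sorted(OCR_ENGINES)}")
        update["default_ocr"] = default_ocr
    if retention_days is not None:
        # Assumption: a retention window shorter than one day is not meaningful.
        if retention_days < 1:
            raise ValueError("retention_days must be a positive integer")
        update["retention_days"] = retention_days
    if callback_url is not None:
        update["callback_url"] = callback_url
    return update
```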
Usage Metrics
Query usage for the current billing period, broken down by day. Useful for surfacing consumption in your own internal dashboards or for setting up early-warning alerts before plan limits are reached.
| Parameter | Type | Description |
|---|---|---|
| from | string | ISO 8601 date. Defaults to the start of the current billing period. |
| to | string | ISO 8601 date. Defaults to today. |
| granularity | string | One of day or hour. Defaults to day. |
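
The query parameters above can be encoded as follows. This sketch builds only the query string, since the source does not give the endpoint path. The function name and the check that the start date does not follow the end date are assumptions.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

GRANULARITIES = {"day", "hour"}


def usage_query(start: Optional[date] = None, end: Optional[date] = None,
                granularity: str = "day") -> str:
    """Encode the usage query parameters; omitted dates fall back to server defaults."""
    if granularity not in GRANULARITIES:
        raise ValueError("granularity must be 'day' or 'hour'")
    if start is not None and end is not None and start > end:
        raise ValueError("'from' must not be later than 'to'")

    # `from` is a reserved word in Python, so the parameters live in a dict
    # instead of being passed as keyword arguments.
    params = {}
    if start is not None:
        params["from"] = start.isoformat()  # ISO 8601 date, e.g. 2024-03-01
    if end is not None:
        params["to"] = end.isoformat()
    params["granularity"] = granularity
    return urlencode(params)
```

Calling `usage_query(date(2024, 3, 1), date(2024, 3, 15), "hour")` yields `from=2024-03-01&to=2024-03-15&granularity=hour`, which can be appended to the metrics URL after a `?`.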