BharatParse API
Extract structured JSON from any Indian invoice, bill, or receipt with a single POST request. No cloud setup, no IAM config, no per-metric billing.
Overview
BharatParse is a document extraction API built specifically for Indian businesses. Send any invoice, bill, or receipt as a base64-encoded file and receive clean, validated JSON in return - with confidence scores, field-level warnings, and GST-specific validation built in.
Set schema: "auto" and BharatParse will identify the document type and extract accordingly.
Why BharatParse vs AWS Textract / Google Vision
| Feature | BharatParse | AWS Textract | Google Vision |
|---|---|---|---|
| Setup time | 5 minutes | Days (IAM, VPC, SDK) | Days |
| Handwritten receipts | Yes (LLM-powered) | Limited | Limited |
| GST/GSTIN validation | Built-in | No | No |
| India-specific schemas | 13 schemas | Generic only | Generic only |
| Pricing | Flat monthly | Per page + per query | Per 1000 units |
| Confidence scores | Document-level + warnings | Per word | Per word |
Quick Start
Get your first extraction in under 2 minutes.
1. Get your API key
Subscribe to any plan on RapidAPI. Your API key will be shown in the dashboard.
2. Make your first request
# Python example
import base64
import requests
# Load your document and base64-encode it
with open("invoice.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()
# Call the API
response = requests.post(
    "https://bharatparse-indian-invoice-bill.p.rapidapi.com/v1/extract",
    headers={
        "X-RapidAPI-Key": "YOUR_RAPIDAPI_KEY",
        "X-RapidAPI-Host": "bharatparse-indian-invoice-bill.p.rapidapi.com",
        "Content-Type": "application/json",
    },
    json={
        "file_b64": b64,
        "file_type": "pdf",
        "schema": "auto",
        "country": "IN",
    },
)
print(response.json())
3. That's it
You'll receive structured JSON with all extracted fields, a confidence score, and any warnings about low-quality fields.
Authentication
BharatParse uses RapidAPI's standard authentication. Include your RapidAPI key in every request header:
| Header | Value |
|---|---|
| X-RapidAPI-Key | Your RapidAPI subscription key |
| X-RapidAPI-Host | bharatparse-indian-invoice-bill.p.rapidapi.com |
| Content-Type | application/json |
Supported File Formats
BharatParse accepts both PDFs and images. You can send a scanned document, a phone photograph, a screenshot, or a downloaded PDF - all are handled by the same endpoint.
| Format | file_type value | Best for |
|---|---|---|
| PDF | pdf | Downloaded invoices, e-tickets, multi-page bills (Jio, bank statements) |
| JPEG / JPG | jpeg or jpg | Phone photos of receipts, restaurant bills, fuel pump memos |
| PNG | png | Screenshots of digital invoices, e-commerce order pages |
| WebP | webp | WhatsApp-shared images, modern browser screenshots |
| TIFF / TIF | tiff or tif | High-resolution scanner output, archival document scans |
File size limit
Maximum 20MB per request. Most invoices and bills are well under 1MB. Bank statements and multi-page PDFs are typically 2–5MB. If your file exceeds 20MB, compress the PDF or reduce image resolution before sending.
Endpoint
POST https://bharatparse-indian-invoice-bill.p.rapidapi.com/v1/extract
Request Parameters
All parameters are sent as a JSON body.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_b64 | string | Required | Base64-encoded file content. Max 20MB. |
| file_type | string | Required | File format: pdf, jpeg, jpg, png, webp, tiff, tif |
| schema | string | Optional | Document type. Default: auto. See schemas for all options. |
| country | string | Optional | Country code. Default: IN. Set a different code for non-Indian documents. |
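The parameters above can be assembled with a small helper. This is a minimal sketch, not part of any official SDK: build_payload, ALLOWED_FILE_TYPES, and MAX_BYTES are hypothetical names, and it assumes the 20MB limit applies to the raw file before base64 encoding, which the documentation does not specify.

```python
import base64
from pathlib import Path

# Values taken from the tables above; the names are illustrative only.
ALLOWED_FILE_TYPES = {"pdf", "jpeg", "jpg", "png", "webp", "tiff", "tif"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: the limit applies to the raw file


def build_payload(filename, raw, schema="auto", country="IN"):
    """Return the JSON body for an extraction request."""
    # Infer file_type from the file extension, e.g. "Bill.JPG" -> "jpg".
    file_type = Path(filename).suffix.lower().lstrip(".")
    if file_type not in ALLOWED_FILE_TYPES:
        raise ValueError(f"Unsupported file_type {file_type!r}")
    if len(raw) > MAX_BYTES:
        raise ValueError("File exceeds 20MB; compress it before sending")
    return {
        "file_b64": base64.b64encode(raw).decode("ascii"),
        "file_type": file_type,
        "schema": schema,
        "country": country,
    }
```

The returned dictionary can be passed as the json argument of requests.post, as in the Quick Start example. Checking the extension and size locally avoids a round trip that would end in a 400 error.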
Response Format
All successful responses (HTTP 200) follow this structure:
{
  "schema_detected": "restaurant",   // Document type identified
  "confidence": 0.92,                // 0.0–1.0 extraction quality score
  "data": { ... },                   // Extracted fields (schema-specific)
  "warnings": [                      // Field-level issues (empty if none)
    "Invoice date not visible in scan"
  ],
  "processing_ms": 1843              // Processing time in milliseconds
}
Confidence Score Guide
| Score | Meaning | Recommended action |
|---|---|---|
| 0.90 – 1.00 | Excellent - clean document | Use directly |
| 0.70 – 0.89 | Good - minor issues | Review warnings |
| 0.50 – 0.69 | Fair - poor scan quality | Human review recommended |
| Below 0.50 | Low - heavily degraded | Re-scan or manual entry |
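The guide above can be turned into a simple routing rule on the client side. The following is a sketch under assumptions: recommended_action and the action labels are hypothetical names, and each band's lower bound is treated as a threshold so that scores between bands (for example 0.895) fall into the lower band.

```python
def recommended_action(confidence):
    """Map a confidence score to the action suggested in the guide."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= 0.90:
        return "use_directly"
    if confidence >= 0.70:
        return "review_warnings"
    if confidence >= 0.50:
        return "human_review"
    return "rescan_or_manual_entry"
```

A high score does not guarantee an empty warnings array, so it may be worth inspecting warnings even when the result is "use_directly".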
Document Schemas
Use the schema parameter to specify the document type, or use auto to let BharatParse detect it automatically.
Example: Restaurant Bill
Input: a blurry, angled phone scan of a Starbucks receipt. The OCR text was heavily degraded, yet all key fields were extracted correctly.
Input document
Extracted JSON
{
  "schema_detected": "restaurant",
  "confidence": 0.92,
  "data": {
    "restaurant_name": "Starbucks",
    "hsn_code": "996331",
    "line_items": [
      { "name": "Tall Cold Coffee", "quantity": 1, "total": 320.00 }
    ],
    "taxable_value": 320.00,
    "cgst_rate": 2.5, "cgst_amount": 8.00,
    "sgst_rate": 2.5, "sgst_amount": 8.00,
    "grand_total": 336.00,
    "payment": { "mode": "starbucks_card", "card_last4": "1821" }
  },
  "warnings": ["Invoice date not visible in scan"],
  "processing_ms": 1843
}
Example: Handwritten Fuel Receipt
Input: a handwritten BPCL cash memo where the quantity "94:14" was written with a colon instead of a decimal point. BharatParse interpreted the context and extracted the correct value.
Input document
Extracted JSON
# Response for handwritten BPCL receipt
{
  "schema_detected": "fuel",
  "confidence": 0.90,
  "data": {
    "dealer_name": "N. M. Shamsuddin & Sons",
    "oil_company": "BPCL",
    "invoice_date": "2025-06-05",
    "fuel_items": [
      {
        "fuel_type": "Speed",
        "litres": 94.14,
        "rate_per_litre": 21.24,
        "amount": 2000.00
      }
    ],
    "total_amount": 2000.00
  },
  "warnings": ["Litres value '94:14' is handwritten and interpreted as 94.14"],
  "processing_ms": 8598
}
Example: Telecom Bill
Input: a 7-page Jio Fiber PDF bill. BharatParse extracted only the billing summary - ignoring 80 rows of itemised data usage - and returned the fields that actually matter for accounting.
Input document
Extracted JSON
# Response for Jio Fiber bill
{
  "schema_detected": "telecom",
  "confidence": 1.0,
  "data": {
    "provider": "Jio",
    "customer_name": "Mr. Rajesh Kumar",
    "account_number": "XXXXXXXXXXXX9305",
    "due_date": "2025-09-30",
    "plan_name": "Postpaid_399_6M: Unlimited Data @ 30 Mbps",
    "vendor_gstin": "24AABCI6363G1ZP",
    "charges": {
      "current_taxable_charges": 399.00,
      "cgst_rate": 9.0, "cgst_amount": 35.91,
      "sgst_rate": 9.0, "sgst_amount": 35.91
    },
    "total_payable": 470.82
  },
  "warnings": [],
  "processing_ms": 24406
}
Example: IRCTC Train Ticket
Input: an IRCTC ERS PDF with the GST invoice on page 2. BharatParse extracted passenger details from page 1 and the full GST breakdown from page 2 in a single API call.
Input document
Extracted JSON
# Response for IRCTC Tejas Express ticket
{
  "schema_detected": "travel",
  "confidence": 0.95,
  "data": {
    "pnr": "8543381796",
    "train_number": "82902",
    "train_name": "IRCTC TEJAS EXP",
    "journey_date": "2026-01-24",
    "from_station": "AHMEDABAD JN (ADI)",
    "boarding_station": "VADODARA JN (BRC)",
    "to_station": "BORIVALI (BVI)",
    "passengers": [
      { "name": "RAJESH KUMAR", "age": 67, "current_status": "WL/44" }
    ],
    "fare": { "ticket_fare": 1680.00, "total_fare": 1715.40 },
    "gst": { "igst_rate": 5.0, "igst_amount": 80.00, "total_tax": 80.00 }
  },
  "warnings": [],
  "processing_ms": 11990
}
Example: GST Invoice
Full B2B tax invoice extraction with automatic GSTIN checksum validation.
# Response for B2B GST invoice
{
  "schema_detected": "gst_invoice",
  "confidence": 0.97,
  "data": {
    "invoice_number": "INV-2024-009182",
    "invoice_date": "2024-11-15",
    "vendor": {
      "name": "Tata Power Ltd",
      "gstin": "27AAACT2727Q1ZW",
      "pan": "AAACT2727Q"
    },
    "line_items": [
      {
        "description": "IT Services",
        "hsn_sac": "998313",
        "taxable_amount": 50000.00,
        "cgst_rate": 9, "cgst_amount": 4500.00,
        "sgst_rate": 9, "sgst_amount": 4500.00
      }
    ],
    "totals": { "taxable_value": 50000, "grand_total": 59000 }
  },
  "warnings": [],
  "processing_ms": 2100
}
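The amounts in the response above are arithmetically consistent: 50000 + 4500 + 4500 = 59000. A client can repeat that check before posting an invoice to its books. The sketch below is illustrative only: check_invoice_totals is a hypothetical helper, the field names come from the sample response, and the optional igst_amount key on line items is an assumption about how inter-state invoices might be returned.

```python
from decimal import Decimal


def check_invoice_totals(data, tolerance=Decimal("0.01")):
    """Return a list of mismatch messages; an empty list means the sums agree."""

    def d(value):
        # Go through str() so floats such as 4500.0 convert exactly.
        return Decimal(str(value))

    items = data["line_items"]
    taxable = sum((d(i["taxable_amount"]) for i in items), Decimal(0))
    tax = sum(
        (d(i.get(key, 0)) for i in items
         for key in ("cgst_amount", "sgst_amount", "igst_amount")),
        Decimal(0),
    )

    problems = []
    totals = data["totals"]
    if abs(taxable - d(totals["taxable_value"])) > tolerance:
        problems.append(f"taxable_value mismatch: line items sum to {taxable}")
    if abs(taxable + tax - d(totals["grand_total"])) > tolerance:
        problems.append(f"grand_total mismatch: expected {taxable + tax}")
    return problems
```

Passing the data object from the sample response returns an empty list. Decimal is used instead of float so that rupee amounts are not affected by binary rounding.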
Error Codes
| HTTP Code | Error | Cause & Fix |
|---|---|---|
| 400 | Invalid file_type | Use one of: pdf, jpeg, jpg, png, webp, tiff, tif |
| 400 | Invalid base64 | Ensure file_b64 is valid base64 encoded content |
| 400 | File too large | Max file size is 20MB |
| 429 | Rate limit exceeded | You've hit your monthly quota. Upgrade your plan. |
| 502 | Gemini API error | Upstream model error. Retry after a few seconds. |
Error Response Format
{
  "detail": "Unsupported file_type 'bmp'. Use: pdf, jpeg, jpg, png, webp, tiff, tif"
}
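The error table implies different handling per status code: a 400 needs a corrected request, a 429 needs a plan change, and only a 502 is worth retrying. One way to sketch this is shown below. ExtractionError and post_with_retry are hypothetical names, the attempt count and delays are arbitrary choices, and the send callable is an abstraction introduced here so the logic does not depend on a particular HTTP library.

```python
import time


class ExtractionError(Exception):
    """Raised for errors that a retry will not fix (400, 429, and others)."""

    def __init__(self, status, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def post_with_retry(send, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only on HTTP 502.

    send is any zero-argument callable returning (status_code, json_body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        detail = body.get("detail", "") if isinstance(body, dict) else ""
        if status == 502 and attempt < max_attempts:
            sleep(base_delay * attempt)  # wait a little longer each time
            continue
        raise ExtractionError(status, detail)
```

With requests, send could wrap the Quick Start call and return (response.status_code, response.json()). The detail attribute carries the message from the error response format shown above.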
Plans & Pricing
Subscribe at rapidapi.com. All plans include all 13 document schemas, confidence scores, and GST validation.
FAQ
What file formats are supported?
BharatParse accepts both PDFs and images. Supported formats: PDF, JPEG (.jpeg or .jpg), PNG, WebP, and TIFF (.tiff or .tif). Maximum file size is 20MB per request. This means you can send a phone photo of a receipt (JPEG), a scanned document (PDF or TIFF), or a screenshot (PNG) to the same endpoint. For best accuracy, use PDF or high-resolution JPEG at 150 DPI or above.
Does it work with handwritten receipts?
Yes. BharatParse uses an LLM rather than traditional OCR. This means it reads documents the way a person would - understanding context rather than just recognising characters. The fuel schema is specifically tuned for handwritten pump memos. Confidence scores will reflect extraction uncertainty.
How accurate is GSTIN validation?
Every GSTIN goes through full 15-character format validation and checksum verification automatically. Invalid GSTINs are flagged in the warnings array rather than silently passed through.
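For readers who want to pre-check GSTINs on their own side, the check character can be computed with the widely documented base-36 scheme. The sketch below is not BharatParse's implementation, which is not published here. The function names are hypothetical, and the regular expression assumes the common regular-taxpayer layout (state code, PAN, entity code, the letter Z, check character). Both GSTINs in the sample responses above pass this check.

```python
import re

_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# 2-digit state code, 10-character PAN, entity code, 'Z', check character
_GSTIN_RE = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][0-9A-Z]Z[0-9A-Z]$")


def gstin_check_char(first14):
    """Compute the 15th character from the first 14 characters."""
    total = 0
    for index, char in enumerate(first14):
        # Weights alternate 1, 2, 1, 2, ... from the left.
        product = _ALPHABET.index(char) * (2 if index % 2 else 1)
        # Add the base-36 "digits" of the product.
        total += product // 36 + product % 36
    return _ALPHABET[(36 - total % 36) % 36]


def is_valid_gstin(gstin):
    """Check the 15-character format and the check character."""
    gstin = gstin.strip().upper()
    if not _GSTIN_RE.match(gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]
```

A passing checksum only shows that the string is well formed; it does not confirm that the GSTIN is registered or active.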
What happens when the document type is unclear?
Use schema: "auto". The model will identify the document type and apply the appropriate extraction schema. The detected type is returned in schema_detected.
Can I use it for non-Indian documents?
Yes - set the country parameter to your ISO country code (e.g., AE for UAE, SG for Singapore). The extraction adapts tax field names for that country, though India-specific checks like GSTIN validation won't apply.
Is my data stored?
No. Documents are processed in memory and discarded immediately after extraction. Nothing is logged or stored.
What's the average response time?
Typically 2–10 seconds, depending on document complexity and size. Simple single-page bills average around 2 seconds. Multi-page PDFs such as bank statements take longer: the 7-page Jio Fiber bill in the examples above took about 24 seconds (processing_ms: 24406).