BharatParse API
Extract structured JSON from any Indian invoice, bill, or receipt with a single POST request. No cloud setup, no IAM config, no per-metric billing.
Overview
BharatParse is a specialized document extraction API built for Indian businesses. Send any invoice, bill, or receipt as a base64-encoded file and receive clean, validated JSON in return — with confidence scores, field-level warnings, and GST-specific validation built in.
Set schema: "auto" and BharatParse will identify the document type and extract accordingly.
Why BharatParse vs AWS Textract / Google Vision
| Feature | BharatParse | AWS Textract | Google Vision |
|---|---|---|---|
| Setup time | 5 minutes | Days (IAM, VPC, SDK) | Days |
| Handwritten receipts | Yes (LLM-powered) | Limited | Limited |
| GST/GSTIN validation | Built-in | No | No |
| India-specific schemas | 13 schemas | Generic only | Generic only |
| Pricing | Flat monthly | Per page + per query | Per 1000 units |
| Confidence scores | Document-level + warnings | Per word | Per word |
Quick Start
Get your first extraction in under 2 minutes.
1. Get your API key
Subscribe to any plan on RapidAPI. Your API key will be shown in the dashboard.
2. Make your first request
# Python example
import base64

import requests

# Load your document and base64-encode it
with open("invoice.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

# Call the API
response = requests.post(
    "https://bharatparse-indian-invoice-bill.p.rapidapi.com/v1/extract",
    headers={
        "X-RapidAPI-Key": "YOUR_RAPIDAPI_KEY",
        "X-RapidAPI-Host": "bharatparse-indian-invoice-bill.p.rapidapi.com",
        "Content-Type": "application/json",
    },
    json={
        "file_b64": b64,
        "file_type": "pdf",
        "schema": "auto",
        "country": "IN",
    },
)
print(response.json())
3. That's it
You'll receive structured JSON with all extracted fields, a confidence score, and any warnings about low-quality fields.
Authentication
BharatParse uses RapidAPI's standard authentication. Include your RapidAPI key in every request header:
| Header | Value |
|---|---|
| X-RapidAPI-Key | Your RapidAPI subscription key |
| X-RapidAPI-Host | bharatparse-indian-invoice-bill.p.rapidapi.com |
| Content-Type | application/json |
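A small helper can keep these three headers in one place. This is a sketch rather than part of BharatParse: the function name `build_headers` and the environment variable `RAPIDAPI_KEY` are assumptions chosen for illustration.

```python
import os

RAPIDAPI_HOST = "bharatparse-indian-invoice-bill.p.rapidapi.com"


def build_headers(api_key=None):
    """Return the three headers listed in the table above.

    Falls back to the RAPIDAPI_KEY environment variable (a hypothetical
    name) so the key does not have to be hard-coded in source files.
    """
    key = api_key or os.environ.get("RAPIDAPI_KEY")
    if not key:
        raise ValueError("No RapidAPI key supplied")
    return {
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": RAPIDAPI_HOST,
        "Content-Type": "application/json",
    }
```

Pass the result as the `headers` argument of `requests.post`, as in the Quick Start example.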
Supported File Formats
BharatParse accepts both PDFs and images. You can send a scanned document, a phone photograph, a screenshot, or a downloaded PDF — all are handled by the same endpoint.
| Format | file_type value | Best for |
|---|---|---|
| PDF | pdf | Downloaded invoices, e-tickets, multi-page bills (Jio, bank statements) |
| JPEG / JPG | jpeg or jpg | Phone photos of receipts, restaurant bills, fuel pump memos |
| PNG | png | Screenshots of digital invoices, e-commerce order pages |
| WebP | webp | WhatsApp-shared images, modern browser screenshots |
| TIFF / TIF | tiff or tif | High-resolution scanner output, archival document scans |
File size limit
Maximum 20MB per request. Most invoices and bills are well under 1MB. Bank statements and multi-page PDFs are typically 2–5MB. If your file exceeds 20MB, compress the PDF or reduce image resolution before sending.
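Checking the size locally avoids a round trip that would end in a 400. The sketch below assumes that "20MB" means 20 × 1024 × 1024 bytes and that the limit applies to the raw file rather than to the base64 text; neither is confirmed here. Base64 output is about one third larger than its input, so the second helper shows how large the encoded body will be.

```python
MAX_BYTES = 20 * 1024 * 1024  # assumption: 20MB measured on the raw file


def fits_size_limit(data: bytes, limit: int = MAX_BYTES) -> bool:
    """Return True when the raw (pre-base64) file is within the limit."""
    return len(data) <= limit


def base64_length(raw_size: int) -> int:
    """Length of the padded base64 text for raw_size bytes (4 chars per 3 bytes)."""
    return 4 * ((raw_size + 2) // 3)
```

If the limit turns out to apply to the encoded payload, compare `base64_length(len(data))` against the limit instead.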
Endpoint
POST https://bharatparse-indian-invoice-bill.p.rapidapi.com/v1/extract
Request Parameters
All parameters are sent as a JSON body.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_b64 | string | Required | Base64-encoded file content. Max 20MB. |
| file_type | string | Required | File format: pdf, jpeg, jpg, png, webp, tiff, or tif. |
| schema | string | Optional | Document type. Default: auto. See Document Schemas for the options. |
| country | string | Optional | ISO country code. Default: IN. Set another code for non-Indian documents. |
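The four parameters can be assembled from a file on disk. The helper below is an illustrative sketch: `build_payload` is a hypothetical name, and inferring `file_type` from the file extension is a convenience of this example, not something the API does for you.

```python
import base64
from pathlib import Path

ALLOWED_TYPES = {"pdf", "jpeg", "jpg", "png", "webp", "tiff", "tif"}


def build_payload(path, schema="auto", country="IN"):
    """Build the JSON body for a document on disk.

    file_type is inferred from the file extension; anything outside the
    documented list is rejected locally before a request is made.
    """
    file_path = Path(path)
    file_type = file_path.suffix.lstrip(".").lower()
    if file_type not in ALLOWED_TYPES:
        raise ValueError(f"Unsupported file_type {file_type!r}")
    return {
        "file_b64": base64.b64encode(file_path.read_bytes()).decode("ascii"),
        "file_type": file_type,
        "schema": schema,
        "country": country,
    }
```

For example, `build_payload("bill.jpg", schema="restaurant")` returns a dictionary that can be passed as the `json` argument of `requests.post`.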
Response Format
All successful responses (HTTP 200) follow this structure:
{
  "schema_detected": "restaurant",  // Document type identified
  "confidence": 0.92,               // 0.0–1.0 extraction quality score
  "data": { ... },                  // Extracted fields (schema-specific)
  "warnings": [                     // Field-level issues (empty if none)
    "Invoice date not visible in scan"
  ],
  "processing_ms": 1843             // Processing time in milliseconds
}
Confidence Score Guide
| Score | Meaning | Recommended action |
|---|---|---|
| 0.90 – 1.00 | Excellent — clean document | Use directly |
| 0.70 – 0.89 | Good — minor issues | Review warnings |
| 0.50 – 0.69 | Fair — poor scan quality | Human review recommended |
| Below 0.50 | Low — heavily degraded | Re-scan or manual entry |
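The guide can be turned into a routing rule in client code. In this sketch the band boundaries are treated as lower bounds (so 0.895 falls in the "Good" band), and the returned labels are hypothetical names, not values the API returns.

```python
def recommended_action(confidence: float) -> str:
    """Map a confidence score to the action in the guide above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= 0.90:
        return "use_directly"
    if confidence >= 0.70:
        return "review_warnings"
    if confidence >= 0.50:
        return "human_review"
    return "rescan_or_manual_entry"
```

A typical call is `recommended_action(result["confidence"])`, where `result` is the parsed response body.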
Document Schemas
Use the schema parameter to specify the document type, or use auto to let BharatParse detect it automatically.
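With auto, your code only learns the document type from schema_detected, so downstream logic usually branches on that value. The sketch below covers three of the schemas shown in the examples on this page; the field names are taken from those example responses, and `summarize` is a hypothetical helper.

```python
# Field names come from the example responses in this document;
# schemas without an entry here are simply not handled.
SUMMARY_FIELDS = {
    "restaurant": ("restaurant_name", "grand_total"),
    "fuel": ("dealer_name", "total_amount"),
    "telecom": ("provider", "total_payable"),
}


def summarize(result):
    """Return (party, total) for a response, or None for an unmapped schema."""
    fields = SUMMARY_FIELDS.get(result.get("schema_detected"))
    if fields is None:
        return None
    name_key, total_key = fields
    data = result.get("data", {})
    return data.get(name_key), data.get(total_key)
```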
Example: Restaurant Bill
Works with blurry photos, angled scans, and degraded images of bills from Starbucks, local restaurants, and food delivery printouts.
# Request
{
  "file_b64": "<base64>",
  "file_type": "jpeg",
  "schema": "restaurant"
}
# Response
{
  "schema_detected": "restaurant",
  "confidence": 0.92,
  "data": {
    "restaurant_name": "Starbucks",
    "hsn_code": "996331",
    "line_items": [
      { "name": "Tall Cold Coffee", "quantity": 1, "total": 320.00 }
    ],
    "taxable_value": 320.00,
    "cgst_rate": 2.5, "cgst_amount": 8.00,
    "sgst_rate": 2.5, "sgst_amount": 8.00,
    "grand_total": 336.00,
    "payment": { "mode": "starbucks_card", "card_last4": "1821" }
  },
  "warnings": ["Invoice date not visible in scan"],
  "processing_ms": 1843
}
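Because the response carries both the tax components and the grand total, a client can cross-check the arithmetic before trusting a scan. The sketch below assumes the restaurant field names shown in this example and a tolerance of one paisa; `totals_match` is a hypothetical helper, not an API feature.

```python
from decimal import Decimal


def totals_match(data, tolerance="0.01"):
    """Check taxable_value + cgst_amount + sgst_amount against grand_total."""
    parts = ("taxable_value", "cgst_amount", "sgst_amount")
    # Convert via str() so float inputs do not carry binary rounding noise.
    expected = sum(Decimal(str(data[key])) for key in parts)
    return abs(expected - Decimal(str(data["grand_total"]))) <= Decimal(tolerance)
```

For the response above, 320.00 + 8.00 + 8.00 equals the reported 336.00, so the check passes.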
Example: Fuel Receipt
Handles handwritten petrol pump cash memos including BPCL Speed, HPCL Power, and Indian Oil branded receipts.
# Response for handwritten BPCL receipt
{
  "schema_detected": "fuel",
  "confidence": 0.90,
  "data": {
    "dealer_name": "N. M. Shamsuddin & Sons",
    "oil_company": "BPCL",
    "invoice_date": "2025-06-05",
    "fuel_items": [
      {
        "fuel_type": "Speed",
        "litres": 94.14,
        "rate_per_litre": 21.24,
        "amount": 2000.00
      }
    ],
    "total_amount": 2000.00
  },
  "warnings": ["Litres value '94:14' is handwritten and interpreted as 94.14"],
  "processing_ms": 8598
}
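Handwritten values are the ones most likely to be misread, so it can help to check that litres multiplied by the rate lands near the stated amount. This is a client-side sketch under assumed names: `fuel_item_consistent` is hypothetical, and the one-rupee tolerance is an arbitrary choice to absorb rounding on the memo.

```python
from decimal import Decimal


def fuel_item_consistent(item, tolerance="1.00"):
    """Return True when litres x rate_per_litre is within tolerance of amount."""
    computed = Decimal(str(item["litres"])) * Decimal(str(item["rate_per_litre"]))
    return abs(computed - Decimal(str(item["amount"]))) <= Decimal(tolerance)
```

For the item above, 94.14 × 21.24 is roughly 1999.53, which is within one rupee of 2000.00. A misplaced decimal point in either value would fail the check.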
Example: Telecom Bill
Extracts the billing summary from multi-page Jio, Airtel, BSNL, and Vi PDF bills. Itemised usage tables are ignored; the output covers the plan, charges, taxes, due date, and total payable.
# Response for Jio Fiber bill
{
  "schema_detected": "telecom",
  "confidence": 1.0,
  "data": {
    "provider": "Jio",
    "customer_name": "Mr. Shyam Arjandas Warialani",
    "account_number": "411252569305",
    "due_date": "2025-09-30",
    "plan_name": "Postpaid_399_6M: Unlimited Data @ 30 Mbps",
    "vendor_gstin": "24AABCI6363G1ZP",
    "charges": {
      "current_taxable_charges": 399.00,
      "cgst_rate": 9.0, "cgst_amount": 35.91,
      "sgst_rate": 9.0, "sgst_amount": 35.91
    },
    "total_payable": 470.82
  },
  "warnings": [],
  "processing_ms": 24406
}
Example: IRCTC Train Ticket
Extracts PNR, passenger details, journey info, and the GST invoice from IRCTC e-ticket PDFs (ERS format).
# Response for IRCTC Tejas Express ticket
{
  "schema_detected": "travel",
  "confidence": 0.95,
  "data": {
    "pnr": "8543381796",
    "train_number": "82902",
    "train_name": "IRCTC TEJAS EXP",
    "journey_date": "2026-01-24",
    "from_station": "AHMEDABAD JN (ADI)",
    "boarding_station": "VADODARA JN (BRC)",
    "to_station": "BORIVALI (BVI)",
    "passengers": [
      { "name": "SHYAM WARIALANI", "age": 67, "current_status": "WL/44" }
    ],
    "fare": { "ticket_fare": 1680.00, "total_fare": 1715.40 },
    "gst": { "igst_rate": 5.0, "igst_amount": 80.00, "total_tax": 80.00 }
  },
  "warnings": [],
  "processing_ms": 11990
}
Example: GST Invoice
Full B2B tax invoice extraction with automatic GSTIN checksum validation.
# Response for B2B GST invoice
{
  "schema_detected": "gst_invoice",
  "confidence": 0.97,
  "data": {
    "invoice_number": "INV-2024-009182",
    "invoice_date": "2024-11-15",
    "vendor": {
      "name": "Tata Power Ltd",
      "gstin": "27AAACT2727Q1ZW",
      "pan": "AAACT2727Q"
    },
    "line_items": [
      {
        "description": "IT Services",
        "hsn_sac": "998313",
        "taxable_amount": 50000.00,
        "cgst_rate": 9, "cgst_amount": 4500.00,
        "sgst_rate": 9, "sgst_amount": 4500.00
      }
    ],
    "totals": { "taxable_value": 50000, "grand_total": 59000 }
  },
  "warnings": [],
  "processing_ms": 2100
}
Error Codes
| HTTP Code | Error | Cause & Fix |
|---|---|---|
| 400 | Invalid file_type | Use one of: pdf, jpeg, jpg, png, webp, tiff, tif |
| 400 | Invalid base64 | Ensure file_b64 is valid base64 encoded content |
| 400 | File too large | Max file size is 20MB |
| 429 | Rate limit exceeded | You've hit your monthly quota. Upgrade your plan. |
| 502 | Gemini API error | Upstream model error. Retry after a few seconds. |
Error Response Format
{
"detail": "Unsupported file_type 'bmp'. Use: pdf, jpeg, jpg, png, webp, tiff, tif"
}
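Of the errors listed above, only the 502 is described as worth retrying; the 400 and 429 cases need a change on the caller's side. The sketch below wraps any request in a retry loop with exponential backoff. The name `call_with_retry`, the three-attempt limit, and the one-second base delay are all assumptions of this example.

```python
import time


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry only on HTTP 502, with exponential backoff.

    send is any zero-argument callable returning an object with
    .status_code and .json(), such as a requests.Response.
    """
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code == 200:
            return response.json()
        if response.status_code == 502 and attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        # 400 and 429 will not succeed on retry; surface the "detail" message.
        try:
            detail = response.json().get("detail", "")
        except ValueError:
            detail = ""
        raise RuntimeError(f"HTTP {response.status_code}: {detail}")
```

With the requests library this could be called as `call_with_retry(lambda: requests.post(url, headers=headers, json=payload, timeout=60))`, where `url`, `headers`, and `payload` are your own variables and the 60-second timeout is an illustrative value.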
Plans & Pricing
Subscribe at rapidapi.com. All plans include all 13 document schemas, confidence scores, and GST validation.
FAQ
What file formats are supported?
BharatParse accepts PDFs and images: PDF, JPEG (.jpeg or .jpg), PNG, WebP, and TIFF (.tiff or .tif), up to 20MB per request. A phone photo of a receipt (JPEG), a scanned document (PDF or TIFF), and a screenshot (PNG) all go through the same endpoint. For best accuracy, use PDF or high-resolution JPEG at 150 DPI or above.
Does it work with handwritten receipts?
Yes. BharatParse uses a large language model rather than traditional OCR, which means it can interpret handwritten values in context. The fuel schema is specifically tuned for handwritten pump memos. Confidence scores will reflect extraction uncertainty.
How accurate is GSTIN validation?
BharatParse performs full 15-character format validation plus checksum verification on every extracted GSTIN. Invalid GSTINs are flagged in the warnings array rather than silently passed through.
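If you want to run the same kind of check on your side, for example on GSTINs typed in by users, the commonly published check-character scheme can be sketched as follows. This is not BharatParse's own code, and its implementation may differ. The sketch assumes the usual layout for regular taxpayers: a two-digit state code, a ten-character PAN, an entity code, the letter Z, and a final check character computed over the first 14 characters in base 36 with alternating weights of 1 and 2.

```python
import re

_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def gstin_check_char(first14: str) -> str:
    """Compute the 15th character from the first 14 (base 36, weights 1,2,1,2...)."""
    total = 0
    for index, char in enumerate(first14):
        product = _ALPHABET.index(char) * (2 if index % 2 else 1)
        # Add the quotient and remainder of the weighted value in base 36.
        total += product // 36 + product % 36
    return _ALPHABET[(36 - total % 36) % 36]


def is_valid_gstin(gstin: str) -> bool:
    """Return True when the format matches and the check character agrees."""
    if not _GSTIN_RE.fullmatch(gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]
```

Both GSTINs that appear in the examples on this page, 27AAACT2727Q1ZW and 24AABCI6363G1ZP, pass this check.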
What happens when the document type is unclear?
Use schema: "auto". The model will identify the document type and apply the appropriate extraction schema. The detected type is returned in schema_detected.
Can I use it for non-Indian documents?
Yes. Set the country parameter to the document's ISO country code (e.g., AE for UAE, SG for Singapore). The extraction adapts tax field names accordingly, though India-specific validation (GSTIN, HSN) will not apply.
Is my data stored?
No. Documents are processed in memory and immediately discarded. No document content is logged or stored at rest.
What's the average response time?
Typically 2–10 seconds depending on document complexity and size. Simple single-page bills average around 2 seconds. Multi-page PDFs such as telecom bills and bank statements take longer: the Jio example above reports a processing_ms of 24406, about 24 seconds.