BharatParse API

Extract structured JSON from any Indian invoice, bill, or receipt with a single POST request. No cloud setup, no IAM config, no per-metric billing.

13 Document Types · GST Validated · PDFs & Images · Handwriting Support · Gemini 2.5 Flash · India-first

Overview

BharatParse is a specialized document extraction API built for Indian businesses. Send any invoice, bill, or receipt as a base64-encoded file and receive clean, validated JSON in return — with confidence scores, field-level warnings, and GST-specific validation built in.

Auto-detect mode: Not sure which schema to use? Set schema: "auto" and BharatParse will identify the document type and extract accordingly.

Why BharatParse vs AWS Textract / Google Vision

| Feature | BharatParse | AWS Textract | Google Vision |
| --- | --- | --- | --- |
| Setup time | 5 minutes | Days (IAM, VPC, SDK) | Days |
| Handwritten receipts | Yes (LLM-powered) | Limited | Limited |
| GST/GSTIN validation | Built-in | No | No |
| India-specific schemas | 13 schemas | Generic only | Generic only |
| Pricing | Flat monthly | Per page + per query | Per 1000 units |
| Confidence scores | Document-level + warnings | Per word | Per word |

Quick Start

Get your first extraction in under 2 minutes.

1. Get your API key

Subscribe to any plan on RapidAPI. Your API key will be shown in the dashboard.

2. Make your first request

# Python example
import base64

import requests

# Load your document and base64-encode it
with open("invoice.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("ascii")

# Call the API
response = requests.post(
    "https://bharatparse-indian-invoice-bill.p.rapidapi.com/v1/extract",
    headers={
        "X-RapidAPI-Key": "YOUR_RAPIDAPI_KEY",
        "X-RapidAPI-Host": "bharatparse-indian-invoice-bill.p.rapidapi.com",
        "Content-Type": "application/json"
    },
    json={
        "file_b64": b64,
        "file_type": "pdf",
        "schema": "auto",
        "country": "IN"
    },
    timeout=60,  # multi-page PDFs can take tens of seconds
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())

3. That's it

You'll receive structured JSON with all extracted fields, a confidence score, and any warnings about low-quality fields.

Authentication

BharatParse uses RapidAPI's standard authentication. Include your RapidAPI key in every request header:

| Header | Value |
| --- | --- |
| X-RapidAPI-Key | Your RapidAPI subscription key |
| X-RapidAPI-Host | bharatparse-indian-invoice-bill.p.rapidapi.com |
| Content-Type | application/json |

Keep your key secret. Never expose it in client-side code or public repositories. Rotate it immediately from your RapidAPI dashboard if compromised.
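
One way to keep the key out of source code is to read it from the environment when building the headers. The sketch below assumes an environment variable named RAPIDAPI_KEY, which is a hypothetical name chosen for illustration and not something BharatParse or RapidAPI requires.

```python
import os

API_HOST = "bharatparse-indian-invoice-bill.p.rapidapi.com"


def build_headers(env=os.environ):
    """Return the three required headers, reading the key from the environment.

    RAPIDAPI_KEY is a hypothetical variable name used for this example.
    """
    key = env.get("RAPIDAPI_KEY")
    if not key:
        raise RuntimeError("Set RAPIDAPI_KEY before calling the API")
    return {
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": API_HOST,
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of the HTTP client you already use.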

Supported File Formats

BharatParse accepts both PDFs and images. You can send a scanned document, a phone photograph, a screenshot, or a downloaded PDF — all are handled by the same endpoint.

| Format | file_type value | Best for |
| --- | --- | --- |
| PDF | pdf | Downloaded invoices, e-tickets, multi-page bills (Jio, bank statements) |
| JPEG / JPG | jpeg or jpg | Phone photos of receipts, restaurant bills, fuel pump memos |
| PNG | png | Screenshots of digital invoices, e-commerce order pages |
| WebP | webp | WhatsApp-shared images, modern browser screenshots |
| TIFF / TIF | tiff or tif | High-resolution scanner output, archival document scans |

Tip for best results: For phone photos, ensure good lighting and hold the camera parallel to the document. The API handles tilt, blur, and partial shadows well, but extreme angles or very dark images reduce confidence scores.
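
When files arrive with mixed extensions, the file_type value can be derived from the file name. This is a minimal sketch: the helper name file_type_for is an assumption for illustration, and it trusts the extension and does not inspect the file contents.

```python
from pathlib import Path

# The file_type values listed in the table above.
SUPPORTED_FILE_TYPES = {"pdf", "jpeg", "jpg", "png", "webp", "tiff", "tif"}


def file_type_for(path):
    """Map a file name to a file_type value based on its extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FILE_TYPES:
        raise ValueError(
            f"Unsupported extension {ext!r}; use one of {sorted(SUPPORTED_FILE_TYPES)}"
        )
    return ext
```

Rejecting unsupported extensions locally avoids spending a request on a call that would come back as a 400 error.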

File size limit

Maximum 20MB per request. Most invoices and bills are well under 1MB. Bank statements and multi-page PDFs are typically 2–5MB. If your file exceeds 20MB, compress the PDF or reduce image resolution before sending.
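
Base64 encoding inflates a file by roughly one third, so a pre-flight size check is worth doing before upload. The sketch below makes two assumptions that the text above does not settle: that "20MB" means 20 × 1024 × 1024 bytes, and that the safest reading is to keep the encoded payload under that figure as well.

```python
# Assumption: "20MB" is read as 20 MiB. Adjust if the service counts differently.
MAX_BYTES = 20 * 1024 * 1024


def base64_length(raw_bytes):
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def fits_limit(raw_bytes, limit=MAX_BYTES):
    """Conservative check: the base64-encoded payload must fit under the limit."""
    return base64_length(raw_bytes) <= limit
```

Under these assumptions a raw file of up to 15 MiB passes the check. If the limit turns out to apply to the raw file only, compare the raw size against MAX_BYTES directly.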

Endpoint

POST /v1/extract
Extract structured JSON from any Indian invoice, bill, or receipt. Accepts PDFs and images (JPEG, PNG, WebP, TIFF) up to 20MB.
Base URL: https://bharatparse-indian-invoice-bill.p.rapidapi.com

Request Parameters

All parameters are sent as a JSON body.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file_b64 | string | Required | Base64-encoded file content. Max 20MB. |
| file_type | string | Required | File format: pdf, jpeg, jpg, png, webp, tiff, tif |
| schema | string | Optional | Document type. Default: auto. See Document Schemas for all options. |
| country | string | Optional | ISO country code. Default: IN. Set a different code for non-Indian documents. |

Response Format

All successful responses (HTTP 200) follow this structure:

{
  "schema_detected": "restaurant",     // Document type identified
  "confidence": 0.92,               // 0.0–1.0 extraction quality score
  "data": { ... },                   // Extracted fields (schema-specific)
  "warnings": [                      // Field-level issues (empty if none)
    "Invoice date not visible in scan"
  ],
  "processing_ms": 1843            // Processing time in milliseconds
}
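
On the client side, the response can be loaded into a small typed container so that missing optional fields are handled in one place. This is a sketch: ExtractionResult and parse_result are hypothetical names, and treating data, warnings, and processing_ms as optional is a defensive assumption, not documented behaviour.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ExtractionResult:
    schema_detected: str
    confidence: float
    data: Dict[str, Any] = field(default_factory=dict)
    warnings: List[str] = field(default_factory=list)
    processing_ms: Optional[int] = None


def parse_result(body):
    """Build an ExtractionResult from the decoded JSON body of a 200 response."""
    return ExtractionResult(
        schema_detected=body["schema_detected"],
        confidence=float(body["confidence"]),
        data=body.get("data") or {},
        warnings=list(body.get("warnings") or []),
        processing_ms=body.get("processing_ms"),
    )
```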

Confidence Score Guide

| Score | Meaning | Recommended action |
| --- | --- | --- |
| 0.90 – 1.00 | Excellent: clean document | Use directly |
| 0.70 – 0.89 | Good: minor issues | Review warnings |
| 0.50 – 0.69 | Fair: poor scan quality | Human review recommended |
| Below 0.50 | Low: heavily degraded | Re-scan or manual entry |
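
The guide above can be turned into a simple triage step in a processing pipeline. In this sketch the returned labels are hypothetical names chosen for illustration, and each band's lower bound is treated as its threshold, so a score such as 0.895 falls into the "review warnings" band.

```python
def recommended_action(confidence):
    """Map a confidence score (0.0 to 1.0) to the action suggested in the guide."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0.0 and 1.0, got {confidence}")
    if confidence >= 0.90:
        return "use_directly"
    if confidence >= 0.70:
        return "review_warnings"
    if confidence >= 0.50:
        return "human_review"
    return "rescan_or_manual_entry"
```

For example, the restaurant response shown in this documentation (confidence 0.92) maps to "use_directly", although its warnings array is still worth logging.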

Document Schemas

Use the schema parameter to specify the document type, or use auto to let BharatParse detect it automatically.

auto
Auto-detects document type. Best for mixed document pipelines.
gst_invoice
B2B GST tax invoices with GSTIN, HSN/SAC codes, and tax breakdowns.
restaurant
Restaurant and cafe bills including Starbucks, Zomato, Swiggy printouts.
fuel
Petrol/diesel/CNG receipts — including handwritten pump memos (BPCL, HPCL, IOC).
telecom
Mobile and broadband bills — Jio, Airtel, BSNL, Vi, ACT Fibernet.
travel
IRCTC e-tickets (ERS), train bookings, bus tickets. Extracts PNR, GST invoice and passenger details.
utility
Electricity bills (BESCOM, MSEDCL, CESC), gas (Mahanagar Gas, IGL), water bills.
medical
Pharmacy/chemist bills, hospital invoices, diagnostic lab reports. Insurance claim ready.
ecommerce
Amazon, Flipkart, Meesho, Myntra, Nykaa order invoices including seller GSTIN.
rent
Rent receipts with landlord PAN extraction. HRA claim ready for Section 10(13A).
bank_statement
Bank account statements — HDFC, SBI, ICICI, Axis, Kotak. Full transaction extraction.
credit_card
Credit card monthly statements with transaction categorization and reward points.
generic
Fallback for any other Indian bill or receipt not covered by specific schemas.
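
In a mixed-document pipeline that sends schema "auto", downstream code usually branches on schema_detected. The following sketch shows one way to do that. The dispatch function and the handler callables are hypothetical, and it assumes the caller always registers a "generic" handler as the fallback.

```python
# The 13 schema values listed above.
SCHEMAS = (
    "auto", "gst_invoice", "restaurant", "fuel", "telecom", "travel", "utility",
    "medical", "ecommerce", "rent", "bank_statement", "credit_card", "generic",
)


def validate_schema(name):
    """Return name unchanged if it is a known schema value, else raise ValueError."""
    if name not in SCHEMAS:
        raise ValueError(f"Unknown schema {name!r}")
    return name


def dispatch(result, handlers):
    """Call the handler registered for result['schema_detected'].

    Falls back to handlers['generic'] when no specific handler is registered.
    """
    handler = handlers.get(result["schema_detected"]) or handlers["generic"]
    return handler(result["data"])
```

Handlers receive only the data object, so each one can focus on the fields of its own schema.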

Example: Restaurant Bill

Works with blurry photos, angled scans, and OCR-degraded images from Starbucks, local restaurants, and food delivery printouts.

# Request
{
  "file_b64": "<base64>",
  "file_type": "jpeg",
  "schema": "restaurant"
}

# Response
{
  "schema_detected": "restaurant",
  "confidence": 0.92,
  "data": {
    "restaurant_name": "Starbucks",
    "hsn_code": "996331",
    "line_items": [
      { "name": "Tall Cold Coffee", "quantity": 1, "total": 320.00 }
    ],
    "taxable_value": 320.00,
    "cgst_rate": 2.5, "cgst_amount": 8.00,
    "sgst_rate": 2.5, "sgst_amount": 8.00,
    "grand_total": 336.00,
    "payment": { "mode": "starbucks_card", "card_last4": "1821" }
  },
  "warnings": ["Invoice date not visible in scan"],
  "processing_ms": 1843
}
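
Because the response carries both the tax components and the grand total, a client can run its own arithmetic sanity check before posting a bill to accounts. This is a sketch of such a client-side check, not something the API is documented to do. The one-paisa tolerance is an assumption, and the field names are taken from the restaurant response above.

```python
from decimal import Decimal


def totals_match(data, tolerance=Decimal("0.01")):
    """Check that taxable_value + cgst_amount + sgst_amount equals grand_total."""
    def amount(key):
        # Convert through str() so floats such as 8.0 become exact decimals.
        return Decimal(str(data.get(key, 0)))

    expected = amount("taxable_value") + amount("cgst_amount") + amount("sgst_amount")
    return abs(expected - amount("grand_total")) <= tolerance
```

For the Starbucks bill above, 320.00 + 8.00 + 8.00 equals the grand total of 336.00, so the check passes. A mismatch is a reasonable trigger for human review even when the confidence score is high.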

Example: Fuel Receipt

Handles handwritten petrol pump cash memos including BPCL Speed, HPCL Power, and Indian Oil branded receipts.

# Response for handwritten BPCL receipt
{
  "schema_detected": "fuel",
  "confidence": 0.90,
  "data": {
    "dealer_name": "N. M. Shamsuddin & Sons",
    "oil_company": "BPCL",
    "invoice_date": "2025-06-05",
    "fuel_items": [
      {
        "fuel_type": "Speed",
        "litres": 94.14,
        "rate_per_litre": 21.24,
        "amount": 2000.00
      }
    ],
    "total_amount": 2000.00
  },
  "warnings": ["Litres value '94:14' is handwritten and interpreted as 94.14"],
  "processing_ms": 8598
}

Example: Telecom Bill

Extracts billing summary from multi-page Jio, Airtel, BSNL, and Vi PDF bills. Ignores itemised usage tables and focuses on what matters.

# Response for Jio Fiber bill
{
  "schema_detected": "telecom",
  "confidence": 1.0,
  "data": {
    "provider": "Jio",
    "customer_name": "Mr. Shyam Arjandas Warialani",
    "account_number": "411252569305",
    "due_date": "2025-09-30",
    "plan_name": "Postpaid_399_6M: Unlimited Data @ 30 Mbps",
    "vendor_gstin": "24AABCI6363G1ZP",
    "charges": {
      "current_taxable_charges": 399.00,
      "cgst_rate": 9.0, "cgst_amount": 35.91,
      "sgst_rate": 9.0, "sgst_amount": 35.91
    },
    "total_payable": 470.82
  },
  "warnings": [],
  "processing_ms": 24406
}

Example: IRCTC Train Ticket

Extracts PNR, passenger details, journey info, and the GST invoice from IRCTC e-ticket PDFs (ERS format).

# Response for IRCTC Tejas Express ticket
{
  "schema_detected": "travel",
  "confidence": 0.95,
  "data": {
    "pnr": "8543381796",
    "train_number": "82902",
    "train_name": "IRCTC TEJAS EXP",
    "journey_date": "2026-01-24",
    "from_station": "AHMEDABAD JN (ADI)",
    "boarding_station": "VADODARA JN (BRC)",
    "to_station": "BORIVALI (BVI)",
    "passengers": [
      { "name": "SHYAM WARIALANI", "age": 67, "current_status": "WL/44" }
    ],
    "fare": { "ticket_fare": 1680.00, "total_fare": 1715.40 },
    "gst": { "igst_rate": 5.0, "igst_amount": 80.00, "total_tax": 80.00 }
  },
  "warnings": [],
  "processing_ms": 11990
}

Example: GST Invoice

Full B2B tax invoice extraction with automatic GSTIN checksum validation.

# Response for B2B GST invoice
{
  "schema_detected": "gst_invoice",
  "confidence": 0.97,
  "data": {
    "invoice_number": "INV-2024-009182",
    "invoice_date": "2024-11-15",
    "vendor": {
      "name": "Tata Power Ltd",
      "gstin": "27AAACT2727Q1ZW",
      "pan": "AAACT2727Q"
    },
    "line_items": [
      {
        "description": "IT Services",
        "hsn_sac": "998313",
        "taxable_amount": 50000.00,
        "cgst_rate": 9, "cgst_amount": 4500.00,
        "sgst_rate": 9, "sgst_amount": 4500.00
      }
    ],
    "totals": { "taxable_value": 50000, "grand_total": 59000 }
  },
  "warnings": [],
  "processing_ms": 2100
}

Error Codes

| HTTP Code | Error | Cause & Fix |
| --- | --- | --- |
| 400 | Invalid file_type | Use one of: pdf, jpeg, jpg, png, webp, tiff, tif |
| 400 | Invalid base64 | Ensure file_b64 is valid base64 encoded content |
| 400 | File too large | Max file size is 20MB |
| 429 | Rate limit exceeded | You've hit your monthly quota. Upgrade your plan. |
| 502 | Gemini API error | Upstream model error. Retry after a few seconds. |

Error Response Format

{
  "detail": "Unsupported file_type 'bmp'. Use: pdf, jpeg, jpg, png, webp, tiff, tif"
}
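
The error table suggests a simple client policy: retry 502 after a short pause, and surface 400 and 429 immediately, since repeating the same request will not help. The sketch below implements that policy around an injected send callable, so it works with any HTTP client. BharatParseError, extract_with_retry, the attempt count, and the backoff values are all assumptions made for illustration.

```python
import time


class BharatParseError(Exception):
    """Raised when the API returns a non-200 status that is not retried."""

    def __init__(self, status, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


RETRYABLE_STATUSES = {502}  # upstream model errors, per the table above


def extract_with_retry(send, max_attempts=3, backoff_s=2.0, sleep=time.sleep):
    """Call send() until it succeeds or a non-retryable error occurs.

    send must return a (status_code, decoded_json_body) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        detail = body.get("detail", "") if isinstance(body, dict) else str(body)
        if status in RETRYABLE_STATUSES and attempt < max_attempts:
            sleep(backoff_s * attempt)  # wait a little longer after each failure
            continue
        raise BharatParseError(status, detail)
```

With the requests library, send can be a small function that posts the payload and returns response.status_code together with response.json().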

Plans & Pricing

| Plan | Price | Requests per month |
| --- | --- | --- |
| Basic | $0 (free forever) | 50 |
| Pro | $29 per month | 500 |
| Mega | $199 per month | 10,000 |

Subscribe at rapidapi.com. All plans include all 13 document schemas, confidence scores, and GST validation.

FAQ

What file formats are supported?

BharatParse accepts PDFs and images: PDF, JPEG (.jpeg or .jpg), PNG, WebP, and TIFF (.tiff or .tif), up to 20MB per request. You can send a phone photo of a receipt (JPEG), a scanned document (PDF or TIFF), or a screenshot (PNG) to the same endpoint. For best accuracy, use a PDF or a high-resolution JPEG at 150 DPI or above.

Does it work with handwritten receipts?

Yes. BharatParse uses a large language model rather than traditional OCR, which means it can interpret handwritten values in context. The fuel schema is specifically tuned for handwritten pump memos. Confidence scores will reflect extraction uncertainty.

How accurate is GSTIN validation?

BharatParse performs full 15-character format validation plus checksum verification on every extracted GSTIN. Invalid GSTINs are flagged in the warnings array rather than silently passed through.
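
For readers who want to re-check a GSTIN on their own side, the check character can be recomputed with the commonly described mod-36 scheme. This is an independent sketch, not BharatParse's implementation. The regular expression encodes the usual layout (2-digit state code, 10-character PAN, entity code, the letter Z, check character), and the fixed Z in position 14 is an assumption you may need to relax.

```python
import re

_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Assumed layout: state code, PAN, entity code, literal "Z", check character.
_GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def gstin_check_char(first14):
    """Compute the 15th character from the first 14 characters of a GSTIN."""
    total = 0
    for i, ch in enumerate(first14):
        # Weights alternate 1, 2, 1, 2, ... starting from the first character.
        product = _ALPHABET.index(ch) * (2 if i % 2 else 1)
        total += product // 36 + product % 36
    return _ALPHABET[(36 - total % 36) % 36]


def is_valid_gstin(gstin):
    """Return True if gstin matches the assumed layout and its check character."""
    gstin = gstin.strip().upper()
    if not _GSTIN_RE.fullmatch(gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]
```

Both GSTINs that appear in the example responses in this documentation, 24AABCI6363G1ZP and 27AAACT2727Q1ZW, pass this check.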

What happens when the document type is unclear?

Use schema: "auto". The model will identify the document type and apply the appropriate extraction schema. The detected type is returned in schema_detected.

Can I use it for non-Indian documents?

Yes — set the country parameter to your ISO country code (e.g., AE for UAE, SG for Singapore). The extraction adapts tax field names accordingly, though India-specific validation (GSTIN, HSN) will not apply.

Is my data stored?

No. Documents are processed in memory and immediately discarded. No document content is logged or stored at rest.

What's the average response time?

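Typically 2–10 seconds, depending on document complexity and size. Simple single-page bills average around 2 seconds. Multi-page PDFs such as telecom bills and bank statements can take longer: the Jio Fiber example above took about 24 seconds, so set client timeouts accordingly.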