OCR Model

OpenTyphoon.ai offers a specialized OCR model optimized for Thai text recognition and document processing. The model is designed to handle many kinds of documents, images, and forms with high accuracy.

Available Models

Currently, we offer the following OCR model:

Model ID    | Description                              | Rate Limits         | Release Date
typhoon-ocr | Specialized OCR & document parsing model | 2 req/s, 20 req/min | 2025-05-19
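
When many pages are submitted in a loop, it can help to throttle calls on the client side so they stay within the limits listed above. The following is a minimal sketch, not part of the typhoon-ocr package: the `SlidingWindowLimiter` class and its parameters are hypothetical names introduced here, and the limit values are simply copied from the table.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side limiter for several (max_calls, period_seconds) limits."""

    def __init__(self, limits=((2, 1.0), (20, 60.0)),
                 clock=time.monotonic, sleep=time.sleep):
        self.limits = limits      # defaults mirror 2 req/s and 20 req/min
        self.clock = clock        # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()      # timestamps of recorded calls, oldest first

    def wait(self):
        """Block until one more call fits inside every limit, then record it."""
        longest = max(period for _, period in self.limits)
        while True:
            now = self.clock()
            # Forget calls that are older than the longest window.
            while self.calls and now - self.calls[0] >= longest:
                self.calls.popleft()
            delay = 0.0
            for max_calls, period in self.limits:
                recent = [t for t in self.calls if now - t < period]
                if len(recent) >= max_calls:
                    # Wait until the oldest blocking call leaves this window.
                    delay = max(delay, recent[-max_calls] + period - now)
            if delay <= 0:
                self.calls.append(now)
                return
            self.sleep(delay)
```

Calling `limiter.wait()` immediately before each OCR request would space requests out; the server may still reject requests for other reasons, so this does not replace error handling.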

Getting Started

To use our OCR model, you’ll need to:

  1. Install the required package:
     pip install typhoon-ocr
  2. Set up your API key as an environment variable:
     export TYPHOON_OCR_API_KEY=your_api_key_here
  3. Start using the OCR function:
from typhoon_ocr import ocr_document

markdown = ocr_document(
    pdf_or_image_path="document.pdf",  # Works with PDFs or images
    task_type="default",               # Choose between "default" or "structure"
    page_num=2,                        # Process page 2 of a PDF (default is 1, always 1 for images)
)

# Or with an image
markdown = ocr_document(
    pdf_or_image_path="scan.jpg",      # Works with PDFs or images
    task_type="default",               # Choose between "default" or "structure"
)
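
A missing API key is an easy mistake to make, so a script may want to check for it before making any OCR call. The sketch below is an illustration only: `require_api_key` is a hypothetical helper, not a function of the typhoon-ocr package, and it assumes the key is read from the `TYPHOON_OCR_API_KEY` environment variable set in the previous step.

```python
import os

API_KEY_VAR = "TYPHOON_OCR_API_KEY"  # variable name taken from the setup step


def require_api_key(environ=os.environ):
    """Return the API key, or raise a clear error if it is missing or blank.

    Hypothetical helper for illustration; not part of typhoon-ocr.
    """
    key = environ.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{API_KEY_VAR} is not set; export it before calling ocr_document"
        )
    return key
```

Calling `require_api_key()` once at startup makes a configuration problem fail early with a readable message.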

Supported File Types

The typhoon-ocr model accepts the following file formats and document sources:

  • Images: PNG, JPEG
  • Documents: PDF
  • Scanned documents
  • Photos of documents
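
Checking a file's type locally avoids spending a rate-limited request on an input that cannot be processed. The following sketch assumes that the listed formats correspond to the extensions `.png`, `.jpg`, `.jpeg`, and `.pdf`; that mapping and the `classify_input` helper are assumptions made for illustration, not part of the typhoon-ocr package.

```python
from pathlib import Path

# Assumed mapping from file extension to input kind, based on the list above.
SUPPORTED_EXTENSIONS = {
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".pdf": "pdf",
}


def classify_input(path):
    """Return "image" or "pdf" for a supported path, else raise ValueError."""
    suffix = Path(path).suffix.lower()  # case-insensitive: "SCAN.JPG" is accepted
    try:
        return SUPPORTED_EXTENSIONS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type {suffix!r} for {path}") from None
```

The returned kind is also useful for deciding whether a page number other than 1 makes sense, since images always have a single page.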

Features

Our OCR model includes the following capabilities:

  • Convert images to PDFs for unified processing
  • Extract text and layout information from PDFs and images
  • Generate OCR-ready messages for API processing with the Typhoon OCR model
  • Built-in prompt templates for different document processing tasks
  • Process specific pages from multi-page PDF documents
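
Because a call handles one page at a time, processing several pages of a PDF means calling the OCR function once per page. The sketch below shows one way to do that. `ocr_pages` is a hypothetical helper, and the OCR function is passed in as `ocr_fn` so that the loop does not depend on the package being installed; it assumes `ocr_fn` accepts the same keyword arguments as `ocr_document` in the examples on this page.

```python
def ocr_pages(path, page_nums, ocr_fn, task_type="default"):
    """Run ocr_fn once per requested page and return {page_num: markdown}.

    Hypothetical helper for illustration; not part of typhoon-ocr.
    """
    results = {}
    for page_num in page_nums:
        results[page_num] = ocr_fn(
            pdf_or_image_path=path,
            task_type=task_type,
            page_num=page_num,
        )
    return results


# Possible usage with the real function (requires the package and an API key):
#   from typhoon_ocr import ocr_document
#   pages = ocr_pages("document.pdf", [1, 2, 3], ocr_document)
```

The caller supplies the page numbers explicitly, so no assumption is made about how the page count of a PDF is obtained. With the rate limits in mind, long documents are best processed with a pause between calls.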

Example Usage

Here’s a more detailed example of using the OCR model:

from typhoon_ocr import ocr_document

# Process a specific page from a PDF
markdown = ocr_document(
    pdf_or_image_path="document.pdf",
    task_type="default",
    page_num=2,
)
print(markdown)

# Process an image with structured output
markdown = ocr_document(
    pdf_or_image_path="invoice.jpg",
    task_type="structure",
)
print(markdown)

Next Steps