Version: 1.0.0

OCR General Document

Optical Character Recognition (OCR), also called image-to-text, is the process of extracting text from image-based file formats (PDF, PNG, JPEG, etc.), whose content cannot be edited, searched, or counted, and converting it into machine-readable text. Once the text is machine-readable, a computer can process it more easily and accurately. For example, combining OCR with information retrieval enables document search over PDF files.

Base Model - VISAI OCR General Document (TH-EN)

Provider: VISAI.ai

The model is trained on a dataset that combines synthesized data with official PDF documents scraped from the web. The synthesized data was generated to mimic real document formats and to make the model robust to varied generation conditions. Several augmentation techniques were also applied to improve robustness further.

Authentication

OCR General Document requires an API key for API requests. Go to VISAI Console - API Key to create and retrieve your API key.

  • X-API-Key
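
As a rough illustration of how the key might be attached to a request, here is a minimal Python sketch using only the standard library. It assumes that X-API-Key is sent as an HTTP request header. The endpoint URL, the raw-bytes request body, and the Content-Type value are placeholders invented for this example, not values from the VISAI documentation, so check the API reference for the real ones. The function only builds the request and does not send it.

```python
import urllib.request

# Hypothetical placeholder: the real endpoint URL is not given in this section.
OCR_ENDPOINT = "https://api.example.com/ocr/general-document"


def build_ocr_request(api_key: str, image_bytes: bytes,
                      url: str = OCR_ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the API key."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return urllib.request.Request(
        url,
        data=image_bytes,  # assumed body format; the real API may expect multipart form data
        headers={
            "X-API-Key": api_key,  # header name taken from the list above
            "Content-Type": "application/octet-stream",  # assumed, not documented here
        },
        method="POST",
    )


# urllib stores header names in capitalized form ("X-api-key").
# HTTP header names are case-insensitive, so the server sees the same header.
request = build_ocr_request("YOUR_API_KEY", b"<image bytes>")
print(request.get_header("X-api-key"))
```

Once the real endpoint and body format are known, the request can be sent with `urllib.request.urlopen(request)`. Keeping the key in an environment variable rather than in source code avoids committing it by accident.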