Optical Character Recognition
Overview
The OCR plugin allows you to run OCR on image assets in your project. The plugin will draw a bounding box around each OCR token, and add the recognized token text in the metadata of each box.
Plugin Functionality
The Optical Character Recognition (OCR) Plugin automatically extracts text from images, converts it into structured annotations, and adds the recognized text to the metadata of each annotation. You can choose the level of detection (block, paragraph, line, or word) and specify the language for recognition.

Supported Data Types
Image
Supported Annotation Tools
Bounding Box
Plugin Configuration
The Overwrite setting in model plugins controls whether existing annotations are replaced or kept. When enabled, the plugin replaces all existing annotations with new model predictions; when disabled, it simply adds the new results without deleting what’s already there.
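The effect of the Overwrite setting can be pictured with a minimal sketch. The function name apply_predictions and the dictionary shape of an annotation are hypothetical, chosen only for illustration; the plugin's internal data model is not described on this page.

```python
from typing import Dict, List

# Hypothetical annotation shape, used only for this illustration.
Annotation = Dict[str, object]


def apply_predictions(
    existing: List[Annotation],
    predictions: List[Annotation],
    overwrite: bool,
) -> List[Annotation]:
    """Return the annotations an asset would hold after a plugin run."""
    if overwrite:
        # Overwrite enabled: existing annotations are replaced by predictions.
        return list(predictions)
    # Overwrite disabled: predictions are added next to what is already there.
    return list(existing) + list(predictions)


existing = [{"text": "manual label"}]
predictions = [{"text": "Invoice"}, {"text": "Total"}]

print(len(apply_predictions(existing, predictions, overwrite=True)))   # 2
print(len(apply_predictions(existing, predictions, overwrite=False)))  # 3
```

With Overwrite disabled, running the plugin twice on the same asset would, under this model, leave two sets of OCR boxes, so enabling it is usually the safer choice for reruns.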
The Class Mapping setting defines how the model's predicted classes are linked to your project's label schema. Follow these steps to prepare your class mapping:
1. In the "Class Mapping" field, open the left dropdown and pick one of the classes the plugin can detect.
2. Open the right dropdown and pick one of the tools you have created in your project.
3. Click the "plus" button to finalize the pairing. The object class and your bounding box tool are now linked, and the plugin will use the selected bounding box tool to label the selected category.
4. Repeat from Step 1 to link as many tools to categories as needed.

You can adjust the OCR settings in the Config JSON field. Each option is detailed below:
{
  "ocr_level": "paragraph",
  "selected_language": "eng"
}

"ocr_level": Defines the granularity of text extraction during OCR.
Options:
"block"
"paragraph"
"line"
"word"
Example:
"ocr_level": "paragraph"

"selected_language": Specifies the language used for OCR text detection and recognition. Accepts three-letter ISO 639 language codes.
Options:
"ara"
"deu"
"eng"
"fra"
"tur"
...
Example:
"selected_language": "eng"
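
Before pasting a configuration into the Config JSON field, it can help to check it locally. The sketch below is one way to do that in Python; the check_config helper is hypothetical and not part of the plugin. It checks "ocr_level" against the four documented options. Because the full language list is not reproduced here, it only checks that "selected_language" looks like a three-letter lowercase code, which is an assumption based on the examples above.

```python
import json

# The four OCR levels documented above.
OCR_LEVELS = {"block", "paragraph", "line", "word"}


def check_config(raw: str) -> dict:
    """Parse a Config JSON string and check it against the documented options."""
    config = json.loads(raw)

    level = config.get("ocr_level")
    if level not in OCR_LEVELS:
        raise ValueError(f"unsupported ocr_level: {level!r}")

    language = config.get("selected_language")
    # Assumption: codes are three lowercase ASCII letters, as in the examples.
    # This does not confirm that the plugin supports the language.
    looks_valid = (
        isinstance(language, str)
        and len(language) == 3
        and language.isascii()
        and language.isalpha()
        and language.islower()
    )
    if not looks_valid:
        raise ValueError(f"unexpected selected_language: {language!r}")

    return config


config = check_config('{"ocr_level": "paragraph", "selected_language": "eng"}')
print(config["ocr_level"])  # paragraph
```

A typo such as "ocr_level": "sentence" raises a ValueError in this sketch, which is easier to spot than a failed plugin run.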