Class OCRTesseract

Namespace
OpenCvSharp.Text
Assembly
OpenCvSharp.dll

Recognize text using the tesseract-ocr API.

Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

public sealed class OCRTesseract : BaseOCR, IDisposable, ICvPtrHolder
Inheritance
OCRTesseract
Implements
IDisposable
ICvPtrHolder

Methods

Create(string?, string?, string?, int, int)

Creates an instance of the OCRTesseract class. Initializes Tesseract.

public static OCRTesseract Create(string? datapath = null, string? language = null, string? charWhitelist = null, int oem = 3, int psmode = 3)

Parameters

datapath string

The name of the parent directory of tessdata, ending with "/", or null to use the system's default directory.

language string

An ISO 639-3 language code; null defaults to "eng".

charWhitelist string

Specifies the list of characters used for recognition; null defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".

oem int

tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT (value 3) is used. See the tesseract-ocr API documentation for other possible values.

psmode int

tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (value 3, fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values.

Returns

OCRTesseract
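
As a minimal sketch of the factory call described above, the following creates an engine restricted to digits. The class and method names are hypothetical, and the "eng" language and digit-only whitelist are illustrative choices. The sketch assumes the OpenCvSharp build includes the text module and that Tesseract language data is installed in the system's default location.

```csharp
#nullable enable
using OpenCvSharp.Text;

// Hypothetical helper class, for illustration only.
public static class OcrFactoryExample
{
    // Creates an engine that recognizes only the digits 0-9.
    public static OCRTesseract CreateDigitEngine()
    {
        return OCRTesseract.Create(
            datapath: null,               // use the system's default tessdata location
            language: "eng",              // ISO 639-3 language code
            charWhitelist: "0123456789",  // restrict recognition to digits
            oem: 3,                       // tesseract::OEM_DEFAULT
            psmode: 3);                   // tesseract::PSM_AUTO
    }
}
```

Because OCRTesseract implements IDisposable, callers would normally hold the returned instance in a using statement or declaration so that it is released promptly.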

DisposeManaged()

Releases managed resources

protected override void DisposeManaged()

Run(Mat, Mat, out string, out Rect[], out string?[], out float[], ComponentLevels)

Recognizes text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

public override void Run(Mat image, Mat mask, out string outputText, out Rect[] componentRects, out string?[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)

Parameters

image Mat

Input image of type CV_8UC1 or CV_8UC3.

mask Mat

outputText string

Text recognized by tesseract-ocr.

componentRects Rect[]

If provided, the method outputs a list of Rects for the individual text elements found (e.g. words or text lines).

componentTexts string[]

If provided, the method outputs a list of text strings for the individual text elements recognized (e.g. words or text lines).

componentConfidences float[]

If provided, the method outputs a list of confidence values for the individual text elements recognized (e.g. words or text lines).

componentLevel ComponentLevels

Level of the text elements to report: OCR_LEVEL_WORD (ComponentLevels.Word, the default) or OCR_LEVEL_TEXT_LINE.

Run(Mat, out string, out Rect[], out string?[], out float[], ComponentLevels)

Recognizes text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

public override void Run(Mat image, out string outputText, out Rect[] componentRects, out string?[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)

Parameters

image Mat

Input image of type CV_8UC1 or CV_8UC3.

outputText string

Text recognized by tesseract-ocr.

componentRects Rect[]

If provided, the method outputs a list of Rects for the individual text elements found (e.g. words or text lines).

componentTexts string[]

If provided, the method outputs a list of text strings for the individual text elements recognized (e.g. words or text lines).

componentConfidences float[]

If provided, the method outputs a list of confidence values for the individual text elements recognized (e.g. words or text lines).

componentLevel ComponentLevels

Level of the text elements to report: OCR_LEVEL_WORD (ComponentLevels.Word, the default) or OCR_LEVEL_TEXT_LINE.
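
The following sketch shows one way the mask-free Run overload might be called and its per-element outputs consumed. The class, the method names, the imagePath argument and the confidence threshold are all hypothetical. This page does not state the scale of the confidence values, so the threshold is left to the caller.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using OpenCvSharp;
using OpenCvSharp.Text;

// Hypothetical helper class, for illustration only.
public static class OcrRunExample
{
    // Keeps the non-null element texts whose confidence is at or above minConfidence.
    public static List<string> KeepConfident(
        string?[] texts, float[] confidences, float minConfidence)
    {
        var kept = new List<string>();
        int count = Math.Min(texts.Length, confidences.Length);
        for (int i = 0; i < count; i++)
        {
            string? text = texts[i];
            if (text != null && confidences[i] >= minConfidence)
            {
                kept.Add(text);
            }
        }
        return kept;
    }

    // imagePath is a caller-supplied file name (an assumption of this sketch).
    public static List<string> ReadWords(string imagePath, float minConfidence)
    {
        // Loading as grayscale yields a CV_8UC1 image, one of the accepted input types.
        using var image = Cv2.ImRead(imagePath, ImreadModes.Grayscale);
        if (image.Empty())
        {
            throw new ArgumentException($"Could not load image: {imagePath}");
        }

        using var ocr = OCRTesseract.Create();
        ocr.Run(image, out string outputText, out Rect[] componentRects,
                out string?[] componentTexts, out float[] componentConfidences,
                ComponentLevels.Word);

        Console.WriteLine(outputText); // full recognized text
        return KeepConfident(componentTexts, componentConfidences, minConfidence);
    }
}
```

The filtering step is kept separate from the OCR call so that it can be exercised without an image or a Tesseract installation.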

SetWhiteList(string)

public void SetWhiteList(string charWhitelist)

Parameters

charWhitelist string
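
This page gives no description for SetWhiteList. Judging by its name and parameter, it presumably replaces the character whitelist of an existing engine. The sketch below relies on that assumption; the class name, the method name and the hexadecimal character set are hypothetical.

```csharp
#nullable enable
using OpenCvSharp.Text;

// Hypothetical helper class, for illustration only.
public static class WhiteListExample
{
    // Creates a default engine, then narrows its whitelist to hexadecimal digits
    // (assumes SetWhiteList overrides the whitelist passed to Create).
    public static OCRTesseract CreateHexEngine()
    {
        var ocr = OCRTesseract.Create();
        try
        {
            ocr.SetWhiteList("0123456789ABCDEF");
            return ocr;
        }
        catch
        {
            // Release the native engine if configuration fails.
            ocr.Dispose();
            throw;
        }
    }
}
```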