Class OCRTesseract

Recognize text using the tesseract-ocr API.

Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects of the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

Inheritance
System.Object
DisposableObject
DisposableCvObject
BaseOCR
OCRTesseract
Implements
ICvPtrHolder
Inherited Members
DisposableCvObject.ptr
DisposableCvObject.DisposeUnmanaged()
DisposableCvObject.CvPtr
DisposableObject.DataHandle
DisposableObject.IsDisposed
DisposableObject.IsEnabledDispose
DisposableObject.AllocatedMemory
DisposableObject.AllocatedMemorySize
DisposableObject.Dispose()
DisposableObject.Dispose(Boolean)
DisposableObject.AllocGCHandle(Object)
DisposableObject.AllocMemory(Int32)
DisposableObject.NotifyMemoryPressure(Int64)
DisposableObject.ThrowIfDisposed()
Namespace: OpenCvSharp.Text
Assembly: OpenCvSharp.dll
Syntax
public sealed class OCRTesseract : BaseOCR, ICvPtrHolder

Methods

Create(String, String, String, Int32, Int32)

Creates an instance of the OCRTesseract class. Initializes Tesseract.

Declaration
public static OCRTesseract Create(string datapath = null, string language = null, string charWhitelist = null, int oem = 3, int psmode = 3)
Parameters
Type Name Description
System.String datapath

The name of the parent directory of tessdata, ending with "/", or null to use the system's default directory.

System.String language

An ISO 639-3 language code; null defaults to "eng".

System.String charWhitelist

Specifies the list of characters used for recognition; null defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".

System.Int32 oem

tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values.

System.Int32 psmode

tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values.

Returns
Type Description
OCRTesseract
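
The parameters above can be combined as in the following minimal sketch. It assumes a directory `./ocr-data/` (a hypothetical path) that contains a `tessdata` folder with `eng.traineddata`; the helper `WithTrailingSlash` is not part of OpenCvSharp and is only there to satisfy the trailing "/" requirement described for datapath. The values `oem: 3` and `psmode: 3` are simply the defaults from the declaration, written out for clarity.

```csharp
using System;
using OpenCvSharp.Text;

public static class CreateExample
{
    // Hypothetical helper: datapath is documented as ending with "/",
    // so append one when the caller left it out.
    public static string WithTrailingSlash(string dir)
    {
        return dir.EndsWith("/") ? dir : dir + "/";
    }

    public static void Main()
    {
        // Assumption: "./ocr-data/tessdata/eng.traineddata" exists on disk.
        string datapath = WithTrailingSlash("./ocr-data");

        // OCRTesseract is disposable, so a using block releases it.
        using (OCRTesseract ocr = OCRTesseract.Create(
            datapath: datapath,
            language: "eng",
            charWhitelist: "0123456789", // restrict recognition to digits
            oem: 3,                      // declared default
            psmode: 3))                  // declared default
        {
            Console.WriteLine("Created engine, disposed = " + ocr.IsDisposed);
        }
    }
}
```

Calling `OCRTesseract.Create()` with no arguments should behave the same way with the system's default data directory, "eng", and the default whitelist.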

DisposeManaged()

Releases managed resources

Declaration
protected override void DisposeManaged()
Overrides
DisposableObject.DisposeManaged()

Run(Mat, Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)

Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects of the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

Declaration
public override void Run(Mat image, Mat mask, out string outputText, out Rect[] componentRects, out string[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
Type Name Description
Mat image

Input image, of type CV_8UC1 or CV_8UC3.

Mat mask
System.String outputText

Text recognized by tesseract-ocr.

OpenCvSharp.Rect[] componentRects

If provided, the method will output a list of Rects for the individual text elements found (e.g. words or text lines).

System.String[] componentTexts

If provided, the method will output a list of text strings for the recognition of individual text elements found (e.g. words or text lines).

System.Single[] componentConfidences

If provided, the method will output a list of confidence values for the recognition of individual text elements found (e.g. words or text lines).

ComponentLevels componentLevel

OCR_LEVEL_WORD (the default, exposed here as ComponentLevels.Word) or OCR_LEVEL_TEXT_LINE.

Overrides
BaseOCR.Run(Mat, Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)

Run(Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)

Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects of the individual text elements found (e.g. words), and the list of those text elements with their confidence values.

Declaration
public override void Run(Mat image, out string outputText, out Rect[] componentRects, out string[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
Type Name Description
Mat image

Input image, of type CV_8UC1 or CV_8UC3.

System.String outputText

Text recognized by tesseract-ocr.

OpenCvSharp.Rect[] componentRects

If provided, the method will output a list of Rects for the individual text elements found (e.g. words or text lines).

System.String[] componentTexts

If provided, the method will output a list of text strings for the recognition of individual text elements found (e.g. words or text lines).

System.Single[] componentConfidences

If provided, the method will output a list of confidence values for the recognition of individual text elements found (e.g. words or text lines).

ComponentLevels componentLevel

OCR_LEVEL_WORD (the default, exposed here as ComponentLevels.Word) or OCR_LEVEL_TEXT_LINE.

Overrides
BaseOCR.Run(Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)
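
A call to this overload might look like the sketch below. It assumes an image file named `sample.png` (hypothetical) and a Tesseract installation whose default data directory is usable. `KeepConfident` and the threshold `60f` are illustrative and not part of OpenCvSharp; check the scale of the confidence values your build returns before choosing a threshold.

```csharp
using System;
using System.Collections.Generic;
using OpenCvSharp;
using OpenCvSharp.Text;

public static class RunExample
{
    // Hypothetical helper: keeps the element texts whose confidence
    // is at least minConfidence.
    public static List<string> KeepConfident(
        string[] texts, float[] confidences, float minConfidence)
    {
        var kept = new List<string>();
        int count = Math.Min(texts.Length, confidences.Length);
        for (int i = 0; i < count; i++)
        {
            if (confidences[i] >= minConfidence)
            {
                kept.Add(texts[i]);
            }
        }
        return kept;
    }

    public static void Main()
    {
        // Assumption: "sample.png" exists. Grayscale loading gives CV_8UC1,
        // one of the two input types listed for the image parameter.
        using (Mat image = Cv2.ImRead("sample.png", ImreadModes.Grayscale))
        {
            if (image.Empty())
            {
                Console.WriteLine("Could not load sample.png");
                return;
            }

            using (OCRTesseract ocr = OCRTesseract.Create())
            {
                ocr.Run(image,
                        out string outputText,
                        out Rect[] componentRects,
                        out string[] componentTexts,
                        out float[] componentConfidences,
                        ComponentLevels.Word);

                Console.WriteLine("Full text: " + outputText);
                Console.WriteLine("Elements found: " + componentRects.Length);

                // 60f is an arbitrary illustrative threshold.
                foreach (string word in KeepConfident(
                    componentTexts, componentConfidences, 60f))
                {
                    Console.WriteLine("Confident element: " + word);
                }
            }
        }
    }
}
```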

SetWhiteList(String)

Declaration
public void SetWhiteList(string charWhitelist)
Parameters
Type Name Description
System.String charWhitelist
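
This page gives no description for this method. Judging by the parameter name, it presumably replaces the character whitelist on an existing instance, mirroring the charWhitelist argument of Create. Under that assumption, a minimal sketch could look like this; `BuildWhitelist` is a hypothetical helper and not part of OpenCvSharp.

```csharp
using System;
using System.Linq;
using OpenCvSharp.Text;

public static class WhiteListExample
{
    // Hypothetical helper: builds a string holding every character
    // from first to last, inclusive.
    public static string BuildWhitelist(char first, char last)
    {
        return new string(
            Enumerable.Range(first, last - first + 1)
                      .Select(code => (char)code)
                      .ToArray());
    }

    public static void Main()
    {
        // Assumption: the default tessdata directory is usable.
        using (OCRTesseract ocr = OCRTesseract.Create())
        {
            // Assumed effect: later Run calls only recognize digits.
            ocr.SetWhiteList(BuildWhitelist('0', '9'));
            Console.WriteLine("Whitelist set to digits.");
        }
    }
}
```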

Implements

ICvPtrHolder