Class OCRTesseract
- Namespace
- OpenCvSharp.Text
- Assembly
- OpenCvSharp.dll
Recognize text using the tesseract-ocr API.
Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
public sealed class OCRTesseract : BaseOCR, IDisposable, ICvPtrHolder
- Inheritance
- BaseOCR
- OCRTesseract
- Implements
- IDisposable
- ICvPtrHolder
Methods
Create(string?, string?, string?, int, int)
Creates an instance of the OCRTesseract class. Initializes Tesseract.
public static OCRTesseract Create(string? datapath = null, string? language = null, string? charWhitelist = null, int oem = 3, int psmode = 3)
Parameters
datapath (string): The name of the parent directory of tessdata, ending with "/", or null to use the system's default directory.
language (string): An ISO 639-3 language code; null defaults to "eng".
charWhitelist (string): Specifies the list of characters used for recognition. null defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
oem (int): tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values.
psmode (int): tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values.
Returns
OCRTesseract
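A minimal sketch of calling Create with every parameter spelled out, assuming the OpenCvSharp package and its native runtime are installed. The directory "./ocr-data/" is a hypothetical path standing in for the directory described under datapath, and the class name CreateSketch is made up for illustration. The oem and psmode values simply repeat the defaults from the signature above.

```csharp
using System;
using OpenCvSharp.Text;

internal static class CreateSketch
{
    private static void Main()
    {
        // Hypothetical path: replace with the directory described under datapath.
        const string dataPath = "./ocr-data/";

        // OCRTesseract implements IDisposable, so a using declaration releases it.
        using var ocr = OCRTesseract.Create(
            datapath: dataPath,
            language: "eng",             // ISO 639-3 code
            charWhitelist: "0123456789", // restrict recognition to digits
            oem: 3,                      // same as the default in the signature
            psmode: 3);                  // same as the default in the signature

        Console.WriteLine("OCRTesseract instance created.");
    }
}
```

Calling OCRTesseract.Create() with no arguments should behave the same way, except that it falls back to the system's default data directory, "eng", and the default whitelist.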
DisposeManaged()
Releases managed resources.
protected override void DisposeManaged()
Run(Mat, Mat, out string, out Rect[], out string?[], out float[], ComponentLevels)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
public override void Run(Mat image, Mat mask, out string outputText, out Rect[] componentRects, out string?[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
image (Mat): Input image, CV_8UC1 or CV_8UC3.
mask (Mat)
outputText (string): Output text of the tesseract-ocr.
componentRects (Rect[]): Output list of Rects for the individual text elements found (e.g. words or text lines).
componentTexts (string[]): Output list of text strings for the recognition of the individual text elements found (e.g. words or text lines).
componentConfidences (float[]): Output list of confidence values for the recognition of the individual text elements found (e.g. words or text lines).
componentLevel (ComponentLevels): OCR_LEVEL_WORD (ComponentLevels.Word, the default) or OCR_LEVEL_TEXT_LINE.
Run(Mat, out string, out Rect[], out string?[], out float[], ComponentLevels)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
public override void Run(Mat image, out string outputText, out Rect[] componentRects, out string?[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
image (Mat): Input image, CV_8UC1 or CV_8UC3.
outputText (string): Output text of the tesseract-ocr.
componentRects (Rect[]): Output list of Rects for the individual text elements found (e.g. words or text lines).
componentTexts (string[]): Output list of text strings for the recognition of the individual text elements found (e.g. words or text lines).
componentConfidences (float[]): Output list of confidence values for the recognition of the individual text elements found (e.g. words or text lines).
componentLevel (ComponentLevels): OCR_LEVEL_WORD (ComponentLevels.Word, the default) or OCR_LEVEL_TEXT_LINE.
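The following sketch shows how the maskless Run overload might be used end to end. It assumes that OpenCvSharp and Tesseract language data are installed, that "sample.png" is a hypothetical image file in the working directory, and that the three output arrays are parallel (same length and order); the source does not state the last point explicitly, so the loop guards against a mismatch.

```csharp
using System;
using OpenCvSharp;
using OpenCvSharp.Text;

internal static class RunSketch
{
    private static void Main()
    {
        // Hypothetical input file, loaded as a single-channel (CV_8UC1) image.
        using var image = Cv2.ImRead("sample.png", ImreadModes.Grayscale);
        if (image.Empty())
        {
            Console.Error.WriteLine("Could not load sample.png");
            return;
        }

        // Defaults: system data directory, "eng", default whitelist.
        using var ocr = OCRTesseract.Create();

        ocr.Run(
            image,
            out var outputText,
            out var componentRects,
            out var componentTexts,
            out var componentConfidences,
            ComponentLevels.Word);

        Console.WriteLine(outputText);

        // Assumption: the arrays are parallel; use the shortest length to be safe.
        int count = Math.Min(
            componentRects.Length,
            Math.Min(componentTexts.Length, componentConfidences.Length));

        for (int i = 0; i < count; i++)
        {
            Console.WriteLine(
                $"{componentTexts[i]} at {componentRects[i]} (confidence {componentConfidences[i]:F1})");
        }
    }
}
```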
SetWhiteList(string)
public void SetWhiteList(string charWhitelist)
Parameters
charWhitelist (string)
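SetWhiteList is undocumented here. By analogy with the charWhitelist parameter of Create, it presumably replaces the set of characters used for recognition on an existing instance. The sketch below relies on that assumption; the class name WhiteListSketch is hypothetical.

```csharp
using System;
using OpenCvSharp.Text;

internal static class WhiteListSketch
{
    private static void Main()
    {
        using var ocr = OCRTesseract.Create();

        // Assumption: subsequent Run calls consider only these characters.
        ocr.SetWhiteList("0123456789");

        Console.WriteLine("Whitelist set to digits only.");
    }
}
```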