Class OCRTesseract
Recognize text using the tesseract-ocr API.
Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
Implements
Inherited Members
Namespace: OpenCvSharp.Text
Assembly: OpenCvSharp.dll
Syntax
public sealed class OCRTesseract : BaseOCR, ICvPtrHolder
Methods
Create(String, String, String, Int32, Int32)
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
public static OCRTesseract Create(string datapath = null, string language = null, string charWhitelist = null, int oem = 3, int psmode = 3)
Parameters
Type | Name | Description |
---|---|---|
System.String | datapath | The name of the parent directory of tessdata, ending with "/", or null to use the system's default directory. |
System.String | language | An ISO 639-3 language code; null defaults to "eng". |
System.String | charWhitelist | Specifies the list of characters used for recognition. null defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ". |
System.Int32 | oem | tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values. |
System.Int32 | psmode | tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values. |
Returns
Type | Description |
---|---|
OCRTesseract |
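
The parameters above can be exercised with a minimal sketch like the following. It assumes that a `tessdata` folder containing `eng.traineddata` sits in the working directory, and `WithTrailingSlash` is a hypothetical helper (not part of OpenCvSharp) that only enforces the trailing "/" the datapath parameter expects.

```csharp
using System;
using OpenCvSharp.Text;

public static class CreateExample
{
    // Hypothetical helper: appends the trailing "/" expected by datapath.
    public static string WithTrailingSlash(string dir)
    {
        return dir.EndsWith("/", StringComparison.Ordinal) ? dir : dir + "/";
    }

    public static void Main()
    {
        // Assumption: ./tessdata/eng.traineddata exists, so the parent directory is "./".
        string datapath = WithTrailingSlash(".");

        // oem: 3 and psmode: 3 are the declared defaults, written out for clarity.
        using (OCRTesseract ocr = OCRTesseract.Create(
            datapath, "eng", "0123456789", oem: 3, psmode: 3))
        {
            Console.WriteLine("Tesseract initialized with datapath " + datapath);
        }
    }
}
```

Wrapping the instance in `using` releases the underlying native resources once recognition is finished.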
DisposeManaged()
Releases managed resources
Declaration
protected override void DisposeManaged()
Overrides
Run(Mat, Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
Declaration
public override void Run(Mat image, Mat mask, out string outputText, out Rect[] componentRects, out string[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
Type | Name | Description |
---|---|---|
Mat | image | Input image of type CV_8UC1 or CV_8UC3. |
Mat | mask | |
System.String | outputText | Output text of the tesseract-ocr. |
OpenCvSharp.Rect[] | componentRects | If provided, the method outputs a list of Rects for the individual text elements found (e.g. words or text lines). |
System.String[] | componentTexts | If provided, the method outputs a list of text strings for the recognition of the individual text elements found (e.g. words or text lines). |
System.Single[] | componentConfidences | If provided, the method outputs a list of confidence values for the recognition of the individual text elements found (e.g. words or text lines). |
ComponentLevels | componentLevel | OCR_LEVEL_WORD (the default) or OCR_LEVEL_TEXT_LINE. |
Overrides
Run(Mat, out String, out Rect[], out String[], out Single[], ComponentLevels)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text in the outputText parameter. Optionally also provides the Rects for the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
Declaration
public override void Run(Mat image, out string outputText, out Rect[] componentRects, out string[] componentTexts, out float[] componentConfidences, ComponentLevels componentLevel = ComponentLevels.Word)
Parameters
Type | Name | Description |
---|---|---|
Mat | image | Input image of type CV_8UC1 or CV_8UC3. |
System.String | outputText | Output text of the tesseract-ocr. |
OpenCvSharp.Rect[] | componentRects | If provided, the method outputs a list of Rects for the individual text elements found (e.g. words or text lines). |
System.String[] | componentTexts | If provided, the method outputs a list of text strings for the recognition of the individual text elements found (e.g. words or text lines). |
System.Single[] | componentConfidences | If provided, the method outputs a list of confidence values for the recognition of the individual text elements found (e.g. words or text lines). |
ComponentLevels | componentLevel | OCR_LEVEL_WORD (the default) or OCR_LEVEL_TEXT_LINE. |
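
As an illustration of how the out parameters line up, the sketch below runs recognition at word level and prints each component with its rectangle and confidence. The file name `sample.png`, the `./tessdata` location, and the threshold of 60 are assumptions for the example; `KeepConfident` is a hypothetical helper, and the scale of the confidence values is assumed here to be 0 to 100, which this page does not state.

```csharp
using System;
using System.Collections.Generic;
using OpenCvSharp;
using OpenCvSharp.Text;

public static class RunExample
{
    // Hypothetical helper: keeps the texts whose confidence is at least minConfidence.
    public static List<string> KeepConfident(
        string[] texts, float[] confidences, float minConfidence)
    {
        var kept = new List<string>();
        int n = Math.Min(texts.Length, confidences.Length);
        for (int i = 0; i < n; i++)
        {
            if (confidences[i] >= minConfidence)
            {
                kept.Add(texts[i]);
            }
        }
        return kept;
    }

    public static void Main()
    {
        // Assumptions: sample.png exists and ./tessdata/eng.traineddata is installed.
        // Cv2.ImRead loads a 3-channel 8-bit image by default (CV_8UC3).
        using (Mat image = Cv2.ImRead("sample.png"))
        using (OCRTesseract ocr = OCRTesseract.Create("./", "eng"))
        {
            ocr.Run(image, out string text, out Rect[] rects,
                    out string[] words, out float[] confidences,
                    ComponentLevels.Word);

            Console.WriteLine(text);

            // Index i of each array is read as describing the same text element.
            int count = Math.Min(rects.Length, Math.Min(words.Length, confidences.Length));
            for (int i = 0; i < count; i++)
            {
                Console.WriteLine("{0} at ({1},{2}) {3}x{4}, confidence {5}",
                    words[i], rects[i].X, rects[i].Y,
                    rects[i].Width, rects[i].Height, confidences[i]);
            }

            // Threshold of 60 is arbitrary and assumes a 0-100 confidence scale.
            List<string> reliable = KeepConfident(words, confidences, 60f);
            Console.WriteLine(string.Join(" ", reliable));
        }
    }
}
```

Keeping the filtering in a separate method makes it easy to adjust the threshold without touching the recognition call.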
Overrides
SetWhiteList(String)
Declaration
public void SetWhiteList(string charWhitelist)
Parameters
Type | Name | Description |
---|---|---|
System.String | charWhitelist |
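
This page gives no description for SetWhiteList. Going by the charWhitelist parameter of Create, a plausible use is to change the recognized character set on an existing instance. The sketch below assumes that behavior; `WhitelistExample`, `Digits`, and `RestrictToDigits` are hypothetical names.

```csharp
using OpenCvSharp.Text;

public static class WhitelistExample
{
    // Characters to recognize; here only decimal digits.
    public const string Digits = "0123456789";

    // Assumption: SetWhiteList replaces the whitelist passed to Create.
    public static void RestrictToDigits(OCRTesseract ocr)
    {
        ocr.SetWhiteList(Digits);
    }
}
```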