LEADTOOLS Getting Started Tutorial: Using LEADTOOLS .NET OCR - zhuhm - ChinaUnix Blog

LEADTOOLS Getting Started Tutorial: Using LEADTOOLS .NET OCR

Category: C#/.NET

2013-12-12 10:30:32

LEADTOOLS OCR provides a way to integrate optical character recognition (OCR) into an application. OCR converts a bitmap image into text.

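Once the LEADTOOLS .NET OCR toolkit is installed on the system, you can use LEADTOOLS OCR in your programs. Note that OCR support must be unlocked before any OCR properties, methods, or events can be used.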
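The post does not show the unlock call itself. As a minimal sketch, assuming a LEADTOOLS 18 installation in which toolkit features are unlocked with a license file and developer key through RasterSupport.SetLicense (the class name, file path, and key below are placeholders for illustration, not real values):

```csharp
using Leadtools;

static class OcrLicense
{
    // Placeholder values (assumptions): substitute the license file and
    // developer key supplied with your own LEADTOOLS installation.
    const string LicenseFile = @"C:\path\to\LEADTOOLS.LIC";
    const string DeveloperKey = "your-developer-key";

    // Call once at application startup, before creating any OCR engine.
    public static void Unlock()
    {
        RasterSupport.SetLicense(LicenseFile, DeveloperKey);
    }
}
```

Calling OcrLicense.Unlock() before OcrEngineManager.CreateEngine keeps the licensing step in one place. Other toolkit versions may use a different unlocking call, so check the documentation for the version you have installed.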

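To start using LEADTOOLS for .NET OCR, add references to the Leadtools.Forms.Ocr.dll and Leadtools.Forms.DocumentWriters.dll assemblies. These assemblies contain the interfaces, classes, structures, and delegates you program against.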

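Because the LEADTOOLS OCR toolkit supports multiple engines, the code that actually talks to a given engine lives in a separate assembly that is loaded dynamically when an IOcrEngine instance is created. You must therefore make sure the assembly for the engine you plan to use sits next to Leadtools.Forms.Ocr.dll. If you want dependencies to be detected automatically, you can add the engine assembly as a reference to your project.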
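Since the engine assembly is loaded dynamically, a missing file only shows up at run time. The helper below is a small sketch of a pre-flight check using nothing but System.IO; the class and method names are hypothetical, and the engine assembly file name you pass in is something you need to confirm against your own Bin folder.

```csharp
using System.IO;

static class EngineAssemblyCheck
{
    // Returns true when a file named engineAssemblyName exists in the same
    // folder as the assembly at ocrAssemblyPath (Leadtools.Forms.Ocr.dll).
    public static bool IsEngineAssemblyPresent(string ocrAssemblyPath, string engineAssemblyName)
    {
        string folder = Path.GetDirectoryName(Path.GetFullPath(ocrAssemblyPath));
        return File.Exists(Path.Combine(folder, engineAssemblyName));
    }
}
```

A call might look like EngineAssemblyCheck.IsEngineAssemblyPresent(typeof(IOcrEngine).Assembly.Location, "Leadtools.Forms.Ocr.Advantage.dll"), where the engine file name is an assumption about how the Advantage engine assembly is named.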

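LEADTOOLS provides methods to:

Recognize and export text in a variety of text, word-processing, database, or spreadsheet document formats;
Perform OCR in single-threaded or multi-threaded environments;
Select the language of the document to be recognized, such as English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, or Swedish;
Segment complex pages, automatically or manually, into text zones, image zones, table zones, lines, headers, and footers;
Set an accuracy threshold before recognition to control recognition accuracy;
Automatically detect fax, dot-matrix, and other degraded documents;
Save results in a range of document formats, such as Adobe PDF, PDF/A, MS Word, MS Excel, and Unicode text;
Process both text and graphics.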

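You interact with the OCR engine, and with an OCR document that holds a list of pages, through an OCR handle. The OCR handle is a communication session between LEADTOOLS OCR and the OCR engine installed on the system. It is an internal structure that holds all the information needed for recognition, for getting and setting information, and for text verification.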

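The steps for recognizing one or more pages are:

1. Select the engine type you need and create an instance of the IOcrEngine interface;

2. Start the OCR engine with the IOcrEngine.Startup method;

3. Create an OCR document with one or more pages;

4. Create the page zones, either manually or automatically;

5. (Optional) Set the active languages to be used by the OCR engine;

6. (Optional) Set the spell-checking language;

7. (Optional) Set any special recognition module options;

8. Recognize;

9. Save the recognition results;

10. Shut down the OCR engine.

Steps 4, 5, 6, and 7 can be performed in any order, as long as they come after the OCR engine has been started and before the pages are recognized.

The following example shows how to carry out these steps: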

Visual Basic

' Assuming you added "Imports Leadtools.Forms.Ocr" and "Imports Leadtools.Forms.DocumentWriters" at the beginning of this class
' *** Step 1: Select the engine type and create an instance of the IOcrEngine interface.
' We will use the LEADTOOLS OCR Advantage engine (matching the OcrAdvantageRuntime folder passed to Startup) and use it in the same process
Dim ocrEngine As IOcrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, False)

' *** Step 2: Startup the engine.

' Use the default parameters
ocrEngine.Startup(Nothing, Nothing, Nothing, "C:\LEADTOOLS 18\Bin\Common\OcrAdvantageRuntime")

' *** Step 3: Create an OCR document with one or more pages.

Dim ocrDocument As IOcrDocument = ocrEngine.DocumentManager.CreateDocument()

' Add all the pages of a multi-page TIF image to the document
ocrDocument.Pages.AddPages("C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1, -1, Nothing)

' *** Step 4: Establish zones on the page(s), either manually or automatically

' Automatic zoning
ocrDocument.Pages.AutoZone(Nothing)

' *** Step 5: (Optional) Set the active languages to be used by the OCR engine

' Enable English and German languages
ocrEngine.LanguageManager.EnableLanguages(New String() {"en", "de"})

' *** Step 6: (Optional) Set the spell checking language
' Enable the spell checking system and set English as the spell language
ocrEngine.SpellCheckManager.SpellCheckEngine = OcrSpellCheckEngine.Native
ocrEngine.SpellCheckManager.SpellLanguage = "en"

' *** Step 7: (Optional) Set any special recognition module options

' Change the fill method for the first zone in the first page to be default
Dim ocrZone As OcrZone = ocrDocument.Pages(0).Zones(0)
ocrZone.FillMethod = OcrZoneFillMethod.Default
ocrDocument.Pages(0).Zones(0) = ocrZone

' *** Step 8: Recognize

ocrDocument.Pages.Recognize(Nothing)

' *** Step 9: Save recognition results

' Save the results to a PDF file
ocrDocument.Save("C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, Nothing)
ocrDocument.Dispose()

' *** Step 10: Shut down the OCR engine when finished
ocrEngine.Shutdown()
ocrEngine.Dispose()

C#

// Assuming you added "using Leadtools.Codecs;", "using Leadtools.Forms.Ocr;" and "using Leadtools.Forms.DocumentWriters;" at the beginning of this class
// *** Step 1: Select the engine type and create an instance of the IOcrEngine interface.

// We will use the LEADTOOLS OCR Advantage engine and use it in the same process
IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);

// *** Step 2: Startup the engine.

// Use the default parameters
ocrEngine.Startup(null, null, null, @"C:\LEADTOOLS 18\Bin\Common\OcrAdvantageRuntime");

// *** Step 3: Create an OCR document with one or more pages.

IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument();

// Add all the pages of a multi-page TIF image to the document
ocrDocument.Pages.AddPages(@"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", 1, -1, null);

// *** Step 4: Establish zones on the page(s), either manually or automatically

// Automatic zoning
ocrDocument.Pages.AutoZone(null);

// *** Step 5: (Optional) Set the active languages to be used by the OCR engine

// Enable English and German languages
ocrEngine.LanguageManager.EnableLanguages(new string[] { "en", "de" });

// *** Step 6: (Optional) Set the spell checking language

// Enable the spell checking system and set English as the spell language
ocrEngine.SpellCheckManager.SpellCheckEngine = OcrSpellCheckEngine.Native;
ocrEngine.SpellCheckManager.SpellLanguage = "en";

// *** Step 7: (Optional) Set any special recognition module options

// Change the fill method for the first zone in the first page to be default
OcrZone ocrZone = ocrDocument.Pages[0].Zones[0];
ocrZone.FillMethod = OcrZoneFillMethod.Default;
ocrDocument.Pages[0].Zones[0] = ocrZone;

// *** Step 8: Recognize

ocrDocument.Pages.Recognize(null);

// *** Step 9: Save recognition results

// Save the results to a PDF file
ocrDocument.Save(@"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf", DocumentFormat.Pdf, null);
ocrDocument.Dispose();

// *** Step 10: Shut down the OCR engine when finished
ocrEngine.Shutdown();
ocrEngine.Dispose();
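The listings above skip the shutdown and dispose calls if any step throws. The variant below is a sketch of the same sequence with that cleanup guaranteed. It uses only the calls already shown above, and assumes that IOcrEngine and IOcrDocument implement IDisposable, which the explicit Dispose() calls suggest. The class name and the three parameters are hypothetical, and the optional steps 5 to 7 are left out for brevity.

```csharp
using Leadtools.Forms.Ocr;
using Leadtools.Forms.DocumentWriters;

static class OcrToPdf
{
    // runtimeDir: folder holding the OCR engine runtime files
    // inputTif:   single- or multi-page TIF to recognize
    // outputPdf:  path of the PDF file to write
    public static void Convert(string runtimeDir, string inputTif, string outputPdf)
    {
        // Steps 1 and 10: "using" disposes the engine even if a later step throws
        using (IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false))
        {
            // Step 2: start the engine with default parameters
            ocrEngine.Startup(null, null, null, runtimeDir);
            try
            {
                // Step 3: create the document; "using" disposes it afterwards
                using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
                {
                    ocrDocument.Pages.AddPages(inputTif, 1, -1, null);       // add all pages
                    ocrDocument.Pages.AutoZone(null);                        // Step 4
                    ocrDocument.Pages.Recognize(null);                       // Step 8
                    ocrDocument.Save(outputPdf, DocumentFormat.Pdf, null);   // Step 9
                }
            }
            finally
            {
                // Step 10: shut the engine down whether or not recognition succeeded
                ocrEngine.Shutdown();
            }
        }
    }
}
```

With the paths from the listing above, a call would be OcrToPdf.Convert(@"C:\LEADTOOLS 18\Bin\Common\OcrAdvantageRuntime", @"C:\Users\Public\Documents\LEADTOOLS Images\Ocr.tif", @"C:\Users\Public\Documents\LEADTOOLS Images\Document.pdf").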
