  • Atitit: A Roundup of OCR Frameworks and Libraries (attilax's summary)


     

    Tesseract

    Asprise JavaOCR

     

     

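    With some time on my hands, I noticed that Baidu offers an OCR text-recognition API. It looked interesting, so I decided to look into it.

    Baidu's description of the service: text recognition is Baidu's natural-scene OCR service. Built on Baidu's OCR algorithms, which Baidu describes as industry-leading, it provides whole-image text detection and recognition, whole-image text recognition, whole-image text-line localization, and single-character image recognition.

    Enough talk; let's go straight to the demo!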
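    The demo code itself is not included in this post. As a stand-in, here is a minimal sketch of one step that HTTP-based OCR services commonly require: turning the image bytes into a form-encoded request body. It assumes the service accepts a Base64-encoded image in a form parameter named `image`; the parameter name, the class name `OcrRequestBodySketch`, and the method `buildBody` are assumptions for illustration, not something taken from the original demo. The endpoint and credentials are deliberately left out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper: prepares the body of an OCR HTTP request.
final class OcrRequestBodySketch {

    // Base64-encodes the image, then URL-encodes the result so that
    // '+', '/' and '=' survive an application/x-www-form-urlencoded POST.
    // The parameter name "image" is an assumption, not a confirmed API detail.
    static String buildBody(byte[] imageBytes) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        return "image=" + URLEncoder.encode(base64, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // With a file argument, encode that file; otherwise encode two sample bytes.
        byte[] bytes = args.length > 0
                ? Files.readAllBytes(Path.of(args[0]))
                : new byte[] {(byte) 0xFB, (byte) 0xFF};
        System.out.println(buildBody(bytes));
    }
}
```

    The resulting string would be sent as the POST body with the content type `application/x-www-form-urlencoded`; check the service's current documentation for the actual URL, authentication scheme, and parameter names.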

     

     

    java4less

    The J4L OCR tools are a set of components for adding OCR capabilities to Java applications. That means you can receive faxes, PDF files, or scanned documents and extract business information from the images. The three main components are:

    A Java wrapper for the Tesseract OCR engine. The Tesseract engine itself is distributed under the Apache 2.0 license, and we support a version compiled for Windows only.

    A PDF-to-text converter.

    A text document parser.

    The document recognition process can therefore be divided into two steps:

    In the first step, the component takes an image file (TIFF, PNG, JPG, ...) or a PDF file and returns the text contained in it. The Java wrapper performs this operation using Tesseract; alternatively, you can use any other OCR engine. If the input is a PDF file, however, you use our PDF-to-text converter instead.

    In the second step, your Java application needs to understand the text returned by the OCR engine or the PDF converter. This is the job of the document parser. The parser takes as input a text string (the data) and an XML file that describes the structure of the document, and its output is a business document, either as a Java object or as an XML file.
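    The two steps can be sketched in plain Java without the J4L components. This is only an illustration under stated assumptions: step 1 shells out to a `tesseract` executable assumed to be on the PATH (passing `stdout` as the output base so the text is printed rather than written to a file), and step 2 replaces J4L's XML-driven parser with a simple regular expression over a hypothetical "Label: value" layout. The class name `TwoStepOcrSketch`, the method names, and the field labels are all made up for this example; none of this is the J4L API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical two-step pipeline: image -> text -> business fields.
final class TwoStepOcrSketch {

    // Hypothetical document layout: one "Label: value" pair per line.
    private static final Pattern FIELD = Pattern.compile(
            "(?m)^[ \\t]*(Invoice No|Date|Total)[ \\t]*:[ \\t]*(.+?)[ \\t]*$");

    // Step 1: run the Tesseract command-line tool and capture the recognized text.
    // Assumes `tesseract` is installed and on the PATH.
    static String recognize(String imagePath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("tesseract", imagePath, "stdout")
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (process.waitFor() != 0) {
            throw new IOException("tesseract exited with a non-zero status");
        }
        return text;
    }

    // Step 2: turn the raw text into a simple business document (label -> value).
    static Map<String, String> parse(String text) {
        Map<String, String> document = new LinkedHashMap<>();
        Matcher matcher = FIELD.matcher(text);
        while (matcher.find()) {
            document.put(matcher.group(1), matcher.group(2));
        }
        return document;
    }

    public static void main(String[] args) throws Exception {
        // With an image path, run both steps; otherwise parse a canned sample.
        String text = args.length > 0
                ? recognize(args[0])
                : "Invoice No: A-1001\nTotal: 42.50\n";
        System.out.println(parse(text));
    }
}
```

    Keeping recognition and parsing in separate methods mirrors the split described above: the OCR engine can be swapped out (or replaced by a PDF-to-text converter) without touching the code that interprets the text.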

     

     

     

    Implementing Baidu OCR Text Recognition in Java - 张荣珍's column - Blog Channel - CSDN.NET.html

    Author: nickname 老哇的爪子 (full name: Attilax Akbar Al Rapanui, 阿提拉克斯 阿克巴 阿尔 拉帕努伊)

    Chinese name: 艾提拉 (艾龙)   Email: 1466519819@qq.com

    Please credit the source when reposting: http://www.cnblogs.com/attilax/


     

  • Original URL: https://www.cnblogs.com/attilax/p/6021585.html