  • [Repost] Mat to PIX when integrating OpenCV with Tesseract

    2013-01-29

     Reposted from: http://hepeng421.blog.163.com/blog/static/11948517201302911311745/
     
    When integrating OpenCV with Tesseract, I ran into the following problem:
    I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage.
    I found out that I can use two methods for text recognition in tesseract, but so far I wasn't able to find a working solution.
     
    A.) How can I convert a cv::Mat into a PIX*? (PIX* is a datatype of leptonica)
     
    Based on vasile's code below, this is essentially my current code:
     
     cv::Mat image = cv::imread("c:/image.png"); 
     cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)); 
     
     int depth;
     if(subImage.depth() == CV_8U)
        depth = 8;
     //other cases not considered yet
     
     PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
     pix->data = (l_uint32*) subImage.data; 
     
     tesseract::TessBaseAPI tess;
     STRING text; 
     if(tess.ProcessPage(pix, 0, 0, &text))
     {
        std::cout << text.string(); 
     }   
    While it doesn't crash, the OCR result is still wrong. It should recognize the one word in my sample image, but instead it returns unreadable characters.
     
    The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?
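
A likely reason the pointer assignment above produces garbage is that Leptonica does not store pixels as plain byte rows: each row is padded to a whole number of 32-bit words, and pixels are packed starting from the most significant byte of each word. The sketch below uses only the standard library to show that repacking for an 8-bit grayscale buffer. The names `wordsPerLine` and `packGrayRows` are hypothetical helpers, not Leptonica functions, and multi-channel images are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 32-bit words per row for a given width and bit depth;
// each row is padded up to a whole word.
inline int wordsPerLine(int width, int depthBits) {
    return (width * depthBits + 31) / 32;
}

// Repack 8-bit grayscale rows (row stride given in bytes) into 32-bit words,
// placing the first pixel in the most significant byte of each word.
// Hypothetical helper for illustration only.
inline std::vector<std::uint32_t> packGrayRows(const std::uint8_t* src,
                                               int width, int height,
                                               std::size_t strideBytes) {
    assert(src != nullptr && width > 0 && height > 0);
    assert(strideBytes >= static_cast<std::size_t>(width));
    const int wpl = wordsPerLine(width, 8);
    std::vector<std::uint32_t> words(static_cast<std::size_t>(wpl) * height, 0u);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + y * strideBytes;
        std::uint32_t* line = words.data() + static_cast<std::size_t>(y) * wpl;
        for (int x = 0; x < width; ++x) {
            const int shift = 24 - 8 * (x % 4);  // pixel 0 goes to bits 31..24
            line[x / 4] |= static_cast<std::uint32_t>(row[x]) << shift;
        }
    }
    return words;
}
```

With the real libraries, the same effect is usually obtained by allocating the image with `pixCreate(width, height, 8)` and writing each value with `pixSetPixel`, rather than pointing `pix->data` at the `cv::Mat` buffer.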
     
    B.) How can I use cv::Mat for TesseractRect() ?
     
    Tesseract offers another method for text recognition with this signature:
     
    char * TessBaseAPI::TesseractRect   (   
        const UINT8 *   imagedata,
        int     bytes_per_pixel,
        int     bytes_per_line,
        int     left,
        int     top,
        int     width,
        int     height   
    )   
    Currently I am using the following code, but it also returns unreadable characters (although different ones than the code above).
     
    char* cr = tess.TesseractRect(
               subImage.data, 
               subImage.channels(), 
               subImage.channels() * subImage.size().width, 
               0, 
               0, 
               subImage.size().width, 
               subImage.size().height);   
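
One probable cause of the garbage output here is the `bytes_per_line` argument. `subImage` is a view into `image`, so consecutive rows are separated by the parent image's row stride, not by `channels * width` of the sub-rectangle. The sketch below illustrates this with a plain buffer; `RoiView`, `makeRoi` and `pixelAt` are hypothetical stand-ins, and the parent buffer is assumed to be tightly packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a cv::Mat sub-rectangle: a pointer into the
// parent buffer plus the parent's row stride.
struct RoiView {
    const std::uint8_t* data;  // first byte of the ROI's top-left pixel
    int width;
    int height;
    int channels;
    std::size_t strideBytes;   // distance between row starts, in bytes
};

// Build a view on a tightly packed parent image (assumption: no row padding).
inline RoiView makeRoi(const std::uint8_t* parent, int parentWidth, int parentHeight,
                       int channels, int left, int top, int width, int height) {
    assert(parent != nullptr && channels > 0);
    assert(left >= 0 && top >= 0 && width > 0 && height > 0);
    assert(left + width <= parentWidth && top + height <= parentHeight);
    const std::size_t stride = static_cast<std::size_t>(parentWidth) * channels;
    return RoiView{parent + top * stride + static_cast<std::size_t>(left) * channels,
                   width, height, channels, stride};
}

// Read the first channel of pixel (x, y) using the stride, the way a
// consumer of (data, bytes_per_pixel, bytes_per_line) has to.
inline std::uint8_t pixelAt(const RoiView& v, int x, int y) {
    return v.data[y * v.strideBytes + static_cast<std::size_t>(x) * v.channels];
}
```

With a `cv::Mat`, the corresponding value is `subImage.step`, which is the row stride in bytes and equals `channels * width` only when the matrix is continuous.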
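    C.) Sample code:
    tesseract::TessBaseAPI tess; 
    tess.Init(NULL, "eng");  // the API must be initialized first; "eng" is an assumed language
    cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
    // the last argument is the row stride; for 8-bit data step1() equals the stride in bytes
    tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
    tess.Recognize(0);
    const char* out = tess.GetUTF8Text();  // the caller must release it with delete []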
     
    D.) The solution in detail:
    First, make a deep copy of your subImage, so that it is stored in a continuous memory block:
     
    cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)).clone(); 
    Then, initialize a PIX header (I don't know how) with the correct parameters.
     
    // ???? Put your own constructor here. 
    PIX* pix = new PIX_HEADER(width, height, channels, depth); 
    OR, create it manually:
     
    PIX pix;
    pix.w = subImage.cols;   // cv::Mat exposes cols/rows; the PIX fields are w/h/d
    ...
    Then set the pix data pointer to the subImage data pointer:
     
    pix.data = (l_uint32*) subImage.data;
    Finally, make sure your subImage object does not go out of scope before you finish your work with pix.
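
To make the first step more concrete, the sketch below shows what a deep copy of a sub-rectangle amounts to: each row is copied out of the strided parent buffer into a tightly packed one. It uses only the standard library, and `copyRowsContiguous` is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a strided region into a tightly packed buffer, row by row.
// rowBytes is width * channels of the region; strideBytes is the distance
// between row starts in the source buffer.
inline std::vector<std::uint8_t> copyRowsContiguous(const std::uint8_t* src,
                                                    int rowBytes, int height,
                                                    std::size_t strideBytes) {
    assert(src != nullptr && rowBytes > 0 && height > 0);
    assert(strideBytes >= static_cast<std::size_t>(rowBytes));
    std::vector<std::uint8_t> out(static_cast<std::size_t>(rowBytes) * height);
    for (int y = 0; y < height; ++y) {
        std::memcpy(out.data() + static_cast<std::size_t>(y) * rowBytes,
                    src + y * strideBytes,
                    static_cast<std::size_t>(rowBytes));
    }
    return out;
}
```

The returned vector owns the copied pixels, so the lifetime concern is the same as for the cloned subImage: anything that keeps a raw pointer into it must be finished before the vector is destroyed.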
     
    E.) A related question found on the internet:
    I have not yet managed to apply the solution provided, because linking the two (Tesseract OCR and OpenCV) has proved to be a challenge.
     
    I have attached an image that gives a clear idea of the kind of input involved.
    The numbers are marked only for ease of understanding; they simply indicate the locations of the text.
     
    Suppose it is an input image in which I have to read the contents within the box.
     
    If there is nothing but the rectangle with text, you can pass the whole image to Tesseract.
    If there is something around the rectangle with text (e.g. you want to ignore everything outside the rectangle), you need to:
      identify the rectangle coordinates (with some OpenCV function, or maybe with GetConnectedComponents from the Tesseract API)
      call SetRectangle after SetImage from the Tesseract API
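
SetRectangle takes its arguments as (left, top, width, height), with the origin at the top-left corner of the image. If the rectangle was found as two corner points, it has to be converted first. The sketch below does that conversion and clamps the result to the image bounds; `OcrRect` and `rectFromCorners` are hypothetical names, and the right and bottom edges are assumed to be exclusive.

```cpp
#include <algorithm>
#include <cassert>

// Rectangle in the (left, top, width, height) form that SetRectangle expects.
struct OcrRect {
    int left;
    int top;
    int width;
    int height;
};

// Convert two opposite corners (in any order) into left/top/width/height,
// clamped to an image of size imgW x imgH.
inline OcrRect rectFromCorners(int x1, int y1, int x2, int y2, int imgW, int imgH) {
    assert(imgW > 0 && imgH > 0);
    const int l = std::max(0, std::min(x1, x2));
    const int t = std::max(0, std::min(y1, y2));
    const int r = std::min(imgW, std::max(x1, x2));
    const int b = std::min(imgH, std::max(y1, y2));
    return OcrRect{l, t, std::max(0, r - l), std::max(0, b - t)};
}
```

The four fields can then be passed straight through, for example `api->SetRectangle(r.left, r.top, r.width, r.height)`.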
     
    Then how do I code that? I am not using any of the Python Tesseract libraries, because the programming language in my case is C++.
     
    It would be a great help if somebody could come up with a snippet to achieve this.
     
    Do you mean something like this?
     
    #include <cstdio>
    #include <tesseract/baseapi.h>
    #include <leptonica/allheaders.h>
    #include <opencv2/opencv.hpp>
     
    int main() {
        tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
        if (api->Init("/usr/src/tesseract-ocr/", "eng"))  {
          fprintf(stderr, "Could not initialize tesseract.\n");
          delete api;
          return 1;
        }
     
        // cv::imread replaces the legacy cvLoadImage/IplImage C API
        cv::Mat img = cv::imread("/home/user/sampleimage.png");
        if (img.empty()) {
          fprintf(stderr, "Cannot load input file!\n");
          api->End();
          delete api;
          return 1;
        }
        // the last argument is the row stride in bytes
        api->SetImage(img.data, img.cols, img.rows, img.channels(), (int)img.step);
     
        // SetRectangle takes (left, top, width, height); the origin is the top-left corner
        api->SetRectangle(129, 184, 484, 108);
        char* outText = api->GetUTF8Text();
        // never pass OCR output as the format string itself
        printf("OCR output: %s\n", outText);
     
        delete [] outText;
        api->End();
        delete api;
     
        return 0;
    }
  • Original post: https://www.cnblogs.com/Crysaty/p/6529669.html