  • [Repost] Mat to PIX when integrating OpenCV with tesseract

    Mat to PIX when integrating OpenCV with tesseract

    2013-01-29

     Reposted from: http://hepeng421.blog.163.com/blog/static/11948517201302911311745/
     
    This is the problem I ran into when integrating OpenCV with tesseract:
    I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR on this subimage.
    I found that tesseract offers two methods I could use for text recognition, but so far I haven't been able to find a working solution.
     
    A.) How can I convert a cv::Mat into a PIX*? (PIX is a Leptonica data type.)
     
    Based on vasile's code below, this is essentially my current code:
     
     cv::Mat image = cv::imread("c:/image.png"); 
     cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)); 
     
     int depth;
     if(subImage.depth() == CV_8U)
        depth = 8;
     //other cases not considered yet
     
     PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
     pix->data = (l_uint32*) subImage.data; 
     
     tesseract::TessBaseAPI tess;
     STRING text; 
     if(tess.ProcessPage(pix, 0, 0, &text))
     {
        std::cout << text.string(); 
     }   
    While it doesn't crash, the OCR result is still wrong. It should recognize the one word in my sample image, but instead it returns unreadable characters.
     
    The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?
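     
    A note on why pointing pix->data at the Mat buffer tends to produce garbage: Leptonica documents that each PIX row is padded to a whole number of 32-bit words and that pixels are packed starting from the most significant byte of each word. As far as I know, pixCreateHeader has no channel argument because the channel count is implied by the depth (a colour PIX is 32 bits deep). The sketch below is not Leptonica code; it is a standard-library model of that layout for 8-bit grayscale, and packGrayRows is a hypothetical name introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Leptonica or OpenCV): packs 8-bit
// grayscale rows into 32-bit words the way Leptonica lays out a PIX.
// srcStride is the byte distance between source rows (cv::Mat::step).
std::vector<std::uint32_t> packGrayRows(const std::uint8_t* src, int width,
                                        int height, std::size_t srcStride) {
    assert(src != nullptr && width > 0 && height > 0);
    assert(srcStride >= static_cast<std::size_t>(width));
    // Words per line: each row is rounded up to a multiple of 4 bytes.
    const std::size_t wpl = (static_cast<std::size_t>(width) + 3) / 4;
    std::vector<std::uint32_t> words(wpl * height, 0u);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + y * srcStride;
        std::uint32_t* line = words.data() + y * wpl;
        for (int x = 0; x < width; ++x) {
            // Pixel 0 goes into the most significant byte of the word.
            line[x / 4] |= static_cast<std::uint32_t>(row[x]) << (8 * (3 - x % 4));
        }
    }
    return words;
}
```

    With the real library, the same effect is normally obtained by allocating the image with pixCreate and writing pixels through Leptonica's own accessors, rather than assigning pix->data directly.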
     
    B.) How can I use cv::Mat with TesseractRect()?
     
    Tesseract offers another method for text recognition with this signature:
     
    char * TessBaseAPI::TesseractRect   (   
        const UINT8 *   imagedata,
        int     bytes_per_pixel,
        int     bytes_per_line,
        int     left,
        int     top,
        int     width,
        int     height   
    )   
    Currently I am using the following code, but it also returns unreadable characters (although different ones than the code above).
     
    char* cr = tess.TesseractRect(
               subImage.data, 
               subImage.channels(), 
               subImage.channels() * subImage.size().width, 
               0, 
               0, 
               subImage.size().width, 
               subImage.size().height);   
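     
    One likely culprit in the call above: subImage is a view into image, so consecutive rows are separated by the parent's row size (subImage.step in OpenCV), not by channels * width. The following standard-library sketch models this; RoiView, makeRoi and at are hypothetical names used only for illustration, not OpenCV or tesseract API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a cv::Mat region of interest: it points
// into the parent's buffer and keeps the parent's row stride.
struct RoiView {
    const std::uint8_t* data;  // first byte of the ROI's top-left pixel
    int width;
    int height;
    int channels;
    std::size_t step;          // bytes between rows, inherited from the parent
};

RoiView makeRoi(const std::uint8_t* parent, int parentWidth, int parentHeight,
                int channels, int x, int y, int w, int h) {
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= parentWidth && y + h <= parentHeight);
    const std::size_t step = static_cast<std::size_t>(parentWidth) * channels;
    return RoiView{parent + y * step + static_cast<std::size_t>(x) * channels,
                   w, h, channels, step};
}

// Reads channel c of pixel (x, y) using the stride, not width * channels.
std::uint8_t at(const RoiView& v, int x, int y, int c = 0) {
    assert(x >= 0 && x < v.width && y >= 0 && y < v.height);
    return v.data[y * v.step + static_cast<std::size_t>(x) * v.channels + c];
}
```

    In the TesseractRect call this would correspond to passing subImage.step as bytes_per_line, assuming an 8-bit image.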
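    C.) Sample code:
    tesseract::TessBaseAPI tess; 
    // tess.Init(...) must have been called successfully before recognition.
    cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
    // The last argument is the row stride; for an 8-bit Mat, step1() equals the stride in bytes.
    tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
    tess.Recognize(0);
    const char* out = tess.GetUTF8Text();  // the caller must free this with delete [] out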
     
    D.) The solution in detail:
    First, make a deep copy of your subImage, so that it is stored in a continuous memory block:
     
    cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)).clone(); 
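     
    What the deep copy buys can be sketched without OpenCV: the rows of the view are copied back to back, so that the bytes per line equal width * channels afterwards. copyRowsContiguous is a hypothetical name used only for this illustration; with a real cv::Mat, clone() does the copy for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `height` rows of `rowBytes` bytes each from a
// strided source into one tightly packed buffer (new stride == rowBytes).
std::vector<std::uint8_t> copyRowsContiguous(const std::uint8_t* src,
                                             std::size_t srcStep,
                                             std::size_t rowBytes,
                                             int height) {
    assert(src != nullptr && height > 0 && rowBytes > 0);
    assert(srcStep >= rowBytes);
    std::vector<std::uint8_t> out(rowBytes * height);
    for (int y = 0; y < height; ++y) {
        // Skip the parent's padding by advancing srcStep bytes per row.
        std::memcpy(out.data() + y * rowBytes, src + y * srcStep, rowBytes);
    }
    return out;
}
```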
    Then, initialize a PIX header (I don't know how) with the correct parameters.
     
    // ???? Put your own constructor here. 
    PIX* pix = new PIX_HEADER(width, height, channels, depth); 
    Or create it manually:
     
    PIX pix;
    pix.width = subImage.width;
    ...
    Then set the pix data pointer to the subImage data pointer:
     
    pix.data = subImage.data;
    Finally, make sure your subImage object does not go out of scope before you finish your work with pix.
     
    E.) The solution found on the internet:
    The solution provided above has not worked for me yet, because linking the two (tesseract OCR and OpenCV) has become a challenge.
     
    So I have attached an image that gives you a clear idea of what kind of input is involved.
    The numbers are marked only for ease of understanding; they simply indicate the locations of the available text.
     
    Suppose it is an input image in which I have to read the contents within the box.
     
    If there is nothing but the rectangle with text, you can pass the image to tesseract directly.
    If there is something around the rectangle with text (e.g. you want to ignore everything outside the rectangle), you need to:
      identify the rectangle coordinates (with some OpenCV function, or maybe with GetConnectedComponents from the tesseract API);
      call SetRectangle after SetImage from the tesseract API.
     
    Then how do I code that? I am not using any of the Python tesseract libraries, since the programming language in my case is C++.
     
    It would be a great help if somebody could come up with a snippet to achieve this.
     
    Do you mean something like this?
     
    #include <cstdio>
    #include <tesseract/baseapi.h>
    #include <leptonica/allheaders.h>
    #include <opencv2/opencv.hpp>
     
    int main() {
        tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
        // Init returns non-zero on failure.
        if (api->Init("/usr/src/tesseract-ocr/", "eng"))  {
          fprintf(stderr, "Could not initialize tesseract.\n");
          delete api;
          return 1;
        }
     
        // cv::imread replaces the legacy cvLoadImage/IplImage C API.
        cv::Mat img = cv::imread("/home/user/sampleimage.png");
        if (img.empty()) {
          fprintf(stderr, "Cannot load input file!\n");
          api->End();
          delete api;
          return 1;
        }
        // img.step is the number of bytes per row, including any padding.
        api->SetImage(img.data, img.cols, img.rows,
                      img.channels(), static_cast<int>(img.step));
     
        // be aware of tesseract coord systems starting at left top corner!
        // Arguments are left, top, width, height.
        api->SetRectangle(129, 184, 484, 108);
        char* outText = api->GetUTF8Text();
        // Never pass OCR output as the format string itself.
        printf("OCR output: %s\n", outText);
     
        delete [] outText;
        api->Clear();
        api->End();
        delete api;
     
        return 0;
    }
  • Original article: https://www.cnblogs.com/Crysaty/p/6529669.html