  • Learning Python 3: trying out Baidu OCR and Tencent OCR

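    I needed OCR for a small feature, so I started with two providers, Baidu and Tencent. How to sign up, create an application, and obtain the keys is not covered here.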

    Baidu's is fairly simple: importing a single AipOcr class handles everything. The code is as follows:

    from aip import AipOcr
    
    # Replace the three values below with your own
    APP_ID = '1111118'
    API_KEY = 'r011111111iAfy'
    SECRET_KEY = 'ZKca1111111DK5XZrq'
    
    aipOcr = AipOcr(APP_ID, API_KEY, SECRET_KEY)
    
    # Read the image as raw bytes
    filePath = "d:/temp/0001.png"
    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()
    
    # Request options
    options = {
      'detect_direction': 'true',
      'language_type': 'CHN_ENG',
    }
    
    # Call the general text recognition endpoint (high-accuracy version)
    result = aipOcr.basicAccurate(get_file_content(filePath), options)
    
    print(result)
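The printed result is a dictionary. As a minimal sketch, assuming the response has the shape described in Baidu's documentation (a `words_result` list whose items each carry a `words` string, or `error_code` / `error_msg` on failure), the recognized lines could be pulled out like this. The helper name `extract_lines` and the sample dictionary are made up for illustration and are not part of the SDK.

```python
def extract_lines(result):
    """Return the recognized text lines from a Baidu OCR result dict.

    Assumed shape: {'words_result': [{'words': '...'}, ...]} on success,
    or {'error_code': ..., 'error_msg': ...} on failure.
    """
    if 'error_code' in result:
        raise RuntimeError('OCR failed: {} {}'.format(
            result.get('error_code'), result.get('error_msg')))
    return [item['words'] for item in result.get('words_result', [])]


# Hypothetical sample response, not real API output
sample = {
    'words_result_num': 2,
    'words_result': [{'words': 'Total'}, {'words': 'USD 100.00'}],
}
lines = extract_lines(sample)
print('\n'.join(lines))
```

With the real script, `extract_lines(result)` would take the place of the bare `print(result)`.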

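    Tencent's is more of a headache. There is a Python library, but only a 2.0 version. That is not the real problem, though. The real problem is that the library covers the other kinds of recognition but not printed text, so printed-text OCR has to be requested over plain HTTP. Infuriating.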

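    Maybe because I had only just started learning Python, the signature for the OCR request cost me a whole day. The signing examples I found online were all for other services. In short, it nearly did me in.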

    In the end I downloaded their Java SDK, read its signing code, compared my output against its results, and finally got it working.

    The full code is as follows:

    import requests
    import hmac
    import hashlib
    import base64
    import time
    import random
    
    
    appid = '12111173'
    bucket = ""
    secret_id = 'AKIDI111RAjYU'  # see the official documentation
    secret_key = 'S2iRe011111iM6xlHo'  # same as above
    
    # Unix timestamps in whole seconds; the signature is valid for 30 days
    current = int(time.time())
    expired = current + 2592000
    # 10-digit random string
    rdm = ''.join(random.choice("0123456789") for i in range(10))
    # Plaintext that gets signed
    info = "a=" + appid + "&b=" + bucket + "&k=" + secret_id + "&e=" + str(expired) + "&t=" + str(current) + "&r=" + str(rdm) + "&u=0&f="
    print(info)
    signature = bytes(info, encoding='utf-8')
    secretkey = bytes(secret_key, encoding='utf-8')
    # HMAC-SHA1 over the plaintext, then Base64 of (digest + plaintext)
    my_sign = hmac.new(secretkey, signature, hashlib.sha1).digest()
    bb = my_sign + signature
    sign1 = base64.b64encode(bb)
    sign2 = str(sign1, 'utf-8')
    print(sign2)
    url = "http://recognition.image.myqcloud.com/ocr/general"
    headers = {'Host': 'recognition.image.myqcloud.com',
               "Authorization": sign2,
               }
    # Open the image in a with block so the file is closed after the request
    with open('d:/temp/0001.png', 'rb') as image_file:
        files = {'appid': (None, appid),
                 'bucket': (None, bucket),
                 'image': ('0001.png', image_file, 'image/png')
                 }
        r = requests.post(url, files=files, headers=headers)
    responseinfo = r.content
    
    print(responseinfo)
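The signing steps in the script above (build the plaintext, HMAC-SHA1 it with the secret key, then Base64-encode the digest followed by the plaintext) can be wrapped in a function so they are easier to check against another SDK's output. This is only a sketch: the name `make_sign` and the `ttl`, `now` and `rdm` parameters are made up for illustration, and truncating the timestamps to whole seconds is an assumption about what the service expects.

```python
import base64
import hashlib
import hmac
import random
import time


def make_sign(appid, bucket, secret_id, secret_key,
              ttl=2592000, now=None, rdm=None):
    """Build the Authorization value following the steps in the script above."""
    # Timestamps as whole seconds (assumption); `now` can be fixed for testing
    current = int(time.time()) if now is None else int(now)
    expired = current + ttl
    if rdm is None:
        rdm = ''.join(random.choice('0123456789') for _ in range(10))
    plain = 'a={}&b={}&k={}&e={}&t={}&r={}&u=0&f='.format(
        appid, bucket, secret_id, expired, current, rdm)
    plain_bytes = plain.encode('utf-8')
    digest = hmac.new(secret_key.encode('utf-8'), plain_bytes,
                      hashlib.sha1).digest()
    # Base64 of the raw digest followed by the plaintext
    return base64.b64encode(digest + plain_bytes).decode('utf-8')


# Fixed time and random string so the output can be compared run to run
sign = make_sign('12111173', '', 'AKIDI111RAjYU', 'S2iRe011111iM6xlHo',
                 now=1533000000, rdm='0123456789')
print(sign)
```

Pinning `now` and `rdm` is what makes a side-by-side comparison possible: given the same inputs, two implementations should produce exactly the same string.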

    On the same image, Baidu surprisingly came out worse: it read a perfectly clear "USD" as "JSD". Ugh...

  • Original article: https://www.cnblogs.com/szyicol/p/9413516.html