Text Recognition with Baidu AI (Series)

About

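The Baidu AI platform offers a wide range of APIs, including face recognition, text recognition (OCR), and speech recognition, and they are easy to call.
To use the platform you first need a Baidu account. Then go to the Baidu AI Open Platform and create an application:

Open the text recognition page. You may be asked to log in at this point. After logging in, go to your console, choose text recognition, and create an application, filling in the name and description as appropriate. Note the App ID, API Key, and Secret Key shown for the application; they are needed later.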
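
The three values are secrets, so it may be worth keeping them out of the source file. Below is a minimal sketch that reads them from environment variables. The variable names `BAIDU_APP_ID`, `BAIDU_API_KEY`, and `BAIDU_SECRET_KEY` and the helper `load_credentials` are assumptions made for this illustration, not something the platform prescribes.

```python
import os


def load_credentials(env=os.environ):
    """Return (app_id, api_key, secret_key) read from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of sending empty credentials
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

Once the SDK is installed, the tuple can be unpacked straight into a client constructor, for example `AipOcr(*load_credentials())`.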

General text recognition

First, install the package:

pip install baidu-aip

Image:

Code:

from aip import AipOcr


def initial():
    """Create the OCR client."""
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    return AipOcr(APP_ID, API_KEY, SECRET_KEY)


def get_file_content(filePath):
    """Read an image file as bytes."""
    with open(filePath, 'rb') as f:
        return f.read()


if __name__ == '__main__':
    client = initial()
    image = get_file_content('img3.png')
    res1 = client.basicGeneral(image)  # general text recognition on a local image
    res2 = client.basicAccurate(image)  # general text recognition (high-accuracy version)
    # general text recognition on a remote image given by URL
    res3 = client.basicGeneralUrl('https://img2018.cnblogs.com/blog/1168165/201906/1168165-20190623215706582-962703809.png')
    print(res1)  # raw response
    # 'words_result' is absent when the call fails, so fall back to an empty list
    for text in res1.get('words_result', []):
        print(text['words'])
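
When a call fails, the response is a dict carrying `error_code` and `error_msg` instead of `words_result`. The sketch below shows one way to separate the two cases. The helper name `extract_lines` is hypothetical, and the sample dict is hand-written to mimic the response shape used above; it is not real API output.

```python
def extract_lines(result):
    """Return the recognized text lines from an OCR response dict.

    Raises RuntimeError if the response carries an error code.
    """
    if 'error_code' in result:
        raise RuntimeError('OCR failed: {} {}'.format(
            result['error_code'], result.get('error_msg', '')))
    return [item['words'] for item in result.get('words_result', [])]


# Hand-written sample in the shape of a successful response
sample = {
    'words_result_num': 2,
    'words_result': [{'words': 'hello'}, {'words': 'world'}],
}
print(extract_lines(sample))
```

In the script above this would be called as `extract_lines(res1)`.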

License plate recognition

Image:

Code:

from aip import AipOcr


def initial():
    """Create the OCR client."""
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    return AipOcr(APP_ID, API_KEY, SECRET_KEY)


def get_file_content(filePath):
    """Read an image file as bytes."""
    with open(filePath, 'rb') as f:
        return f.read()


if __name__ == '__main__':
    client = initial()
    image = get_file_content('car1.jpg')
    option = {
        # 'true' detects every plate in the image, 'false' only one;
        # passed as a lowercase string, as in Baidu's SDK examples
        'multi_detect': 'true'
    }
    res3 = client.licensePlate(image, option)
    # print(res3)  # raw response
    for item in res3.get('words_result', []):
        print(item['number'])
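
The loop above relies on `words_result` being a list, which is the case with multi-plate detection turned on. With it turned off, the field is, as far as I know, a single dict rather than a list, so a small normalizing helper can make the code work in both modes. This is a sketch under that assumption; `plate_numbers` is a hypothetical name and the sample dicts are hand-written, not real API output.

```python
def plate_numbers(result):
    """Return all plate numbers from a licensePlate response dict."""
    if 'error_code' in result:
        raise RuntimeError('plate recognition failed: {} {}'.format(
            result['error_code'], result.get('error_msg', '')))
    plates = result.get('words_result', [])
    if isinstance(plates, dict):
        # Single-plate mode: wrap the lone result so callers always get a list
        plates = [plates]
    return [plate['number'] for plate in plates]


# Hand-written samples with made-up plate numbers
single = {'words_result': {'color': 'blue', 'number': 'A12345'}}
multi = {'words_result': [{'number': 'A12345'}, {'number': 'B67890'}]}
print(plate_numbers(single))
print(plate_numbers(multi))
```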

English-to-Chinese translation in four lines of code

First, install the translation library:

pip install translate

Here is the code:

from translate import Translator
translator = Translator(to_lang='zh')  # language code, as in the translate package's documentation; the source language defaults to English
translation = translator.translate('good morning')
print(translation)
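
If the input may be either English or Chinese, the translation direction can be chosen from the text itself. The following is a rough sketch: `guess_direction` is a hypothetical helper, and it simply treats any text containing a character from the basic CJK Unified Ideographs range as Chinese, which is a simplification.

```python
def guess_direction(text):
    """Return (from_lang, to_lang) language codes for the given text."""
    # U+4E00..U+9FFF is the basic CJK Unified Ideographs block
    has_cjk = any('\u4e00' <= ch <= '\u9fff' for ch in text)
    return ('zh', 'en') if has_cjk else ('en', 'zh')


print(guess_direction('good morning'))
print(guess_direction('西大街怎么走？'))
```

The pair can then be passed on as `Translator(from_lang=src, to_lang=dst)`.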

Translation with automatic playback

First, install the Baidu AI module:

pip install baidu-aip

Code:

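import os
import time
from translate import Translator
from aip import AipSpeech


def initial():
    """Create the speech client."""
    # Use your own credentials here and never publish real keys
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    return AipSpeech(APP_ID, API_KEY, SECRET_KEY)


client = initial()

# Language codes, as in the translate package's documentation
translator = Translator(from_lang='zh', to_lang='en')
translation = translator.translate('hello, 西大街怎么走？')
print(translation)

result = client.synthesis(translation, 'zh', 1, {
    'vol': 5,  # volume, 0-15, default 5 (medium)
    'pit': 5,  # pitch, 0-9, default 5 (medium)
    'spd': 4,  # speed, 0-9, default 5 (medium)
    'per': 0,  # voice: 0 female, 1 male, 3 emotional synthesis (Du Xiaoyao), 4 emotional synthesis (Du Yaya); default is the standard female voice
})

# On success synthesis() returns the audio as bytes; on failure it returns a dict with the error code
if not isinstance(result, dict):
    with open('audio.mp3', 'wb') as f:
        f.write(result)
    os.system('audio.mp3')  # Windows only: opens the file with the default player
    time.sleep(5)
    os.system('taskkill /F /IM wmplayer.exe')  # Windows only: closes Windows Media Player
else:
    print(result)  # show the error instead of trying to play a missing file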
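
The playback step above only works on Windows. A possible cross-platform variant is sketched below. It assumes the usual openers are available (`start` via `cmd` on Windows, `open` on macOS, `xdg-open` on a Linux desktop); `player_command` and `play` are hypothetical helper names.

```python
import subprocess
import sys


def player_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the system default application."""
    if platform.startswith('win'):
        # `start` is a cmd.exe built-in; the empty string is the window title
        return ['cmd', '/c', 'start', '', path]
    if platform == 'darwin':
        return ['open', path]
    # Assumes a Linux desktop with xdg-utils installed
    return ['xdg-open', path]


def play(path):
    """Open the audio file with the default player and return immediately."""
    subprocess.run(player_command(path), check=False)
```

Calling `play('audio.mp3')` would replace the `os.system('audio.mp3')` line. Closing the player afterwards remains platform specific and is left out of this sketch.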