  • paddlehub -- an out-of-the-box model library

    PaddleHub

    https://github.com/PaddlePaddle/PaddleHub

    Awesome pre-trained models toolkit based on PaddlePaddle (260+ models covering Image, Text, Audio and Video, with easy inference and serving deployment).

    Introduction

    • PaddleHub aims to provide developers with rich, high-quality, directly usable pre-trained models.
    • No deep learning background is required: you can start using AI models quickly and benefit from advances in artificial intelligence.
    • Models cover four major categories (Image, Text, Audio, and Video) and support one-click prediction, easy service deployment, and transfer learning.
    • All models are open source and free to download and use, including in offline scenarios.

    Specific models serve specific scenarios

    https://www.paddlepaddle.org.cn/hub

    PaddleHub
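    PaddleHub makes it convenient to obtain pre-trained models from the PaddlePaddle ecosystem, manage them, and run one-click prediction. Combined with the Fine-tune API, it lets you quickly perform transfer learning on top of large-scale pre-trained models, so that they better serve your own application scenarios.
    • Apply a model with one click, with no data or training required
    • Turn a model into a service with one click
    • Easy-to-use transfer learning
    • A rich set of pre-trained models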

    Install the two libraries (the leading ! is notebook syntax; drop it in a regular shell)

    !pip install --upgrade paddlepaddle -i https://mirror.baidu.com/pypi/simple
    !pip install --upgrade paddlehub -i https://mirror.baidu.com/pypi/simple
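
    After installation, a quick check that both packages are visible to the current interpreter can save some confusion. The sketch below uses only the standard library; the helper name missing_packages is hypothetical, and it assumes the import names are paddle (for the paddlepaddle distribution) and paddlehub.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that the interpreter cannot locate.

    find_spec only looks the package up; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: paddlepaddle is imported as "paddle", paddlehub as "paddlehub".
missing = missing_packages(["paddle", "paddlehub"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("paddle and paddlehub are both importable.")
```

    This only confirms that the packages can be located, not that the installed versions are compatible with each other.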

    Example

    It takes only a few lines of code.

    The following uses the Chinese word segmentation tool:

    import paddlehub as hub
    
    lac = hub.Module(name="lac")
    test_text = ["今天是个好天气。"]
    
    results = lac.cut(text=test_text, use_gpu=False, batch_size=1, return_tag=True)
    print(results)
    #{'word': ['今天', '是', '个', '好天气', '。'], 'tag': ['TIME', 'v', 'q', 'n', 'w']}
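
    The word and tag lists in the output above are parallel, so a common next step is to pair them up. The sketch below works on the sample dict copied from the comment and does not call PaddleHub; the helper names are hypothetical. Depending on the PaddleHub version, lac.cut may return a list of such dicts (one per input text), in which case the helper would be applied to each element.

```python
def pair_words_with_tags(result):
    """Zip the parallel 'word' and 'tag' lists into (word, tag) tuples."""
    words, tags = result["word"], result["tag"]
    if len(words) != len(tags):
        raise ValueError("'word' and 'tag' must have the same length")
    return list(zip(words, tags))


def words_with_tag(result, tag):
    """Return only the words labelled with the given tag."""
    return [word for word, t in pair_words_with_tags(result) if t == tag]


# Sample result copied from the output shown above.
sample = {'word': ['今天', '是', '个', '好天气', '。'],
          'tag': ['TIME', 'v', 'q', 'n', 'w']}

pairs = pair_words_with_tags(sample)
nouns = words_with_tag(sample, "n")
print(pairs)
print(nouns)
```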

    Model library (modelbase)

    https://www.paddlepaddle.org.cn/modelbase

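    • Computer vision (PaddleCV)
      • Image classification
      • Object detection
      • Image segmentation
      • Keypoint detection
      • Image generation
      • Scene text recognition
      • Metric learning
      • Video
    • Natural language processing (PaddleNLP)
      • Basic NLP techniques
      • Core NLP techniques
      • NLP system applications
    • Recommendation (PaddleRec)
    • Speech (PaddleSpeech)
    • Other models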
     

    chinese_ocr_db_crnn_mobile

    https://www.paddlepaddle.org.cn/hubdetail?name=chinese_ocr_db_crnn_mobile&en_category=TextRecognition

    An OCR model that supports Chinese.

    It can be used in three ways.

    Command-line prediction

    $ hub run chinese_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"

    API call

    import paddlehub as hub
    import cv2
    
    ocr = hub.Module(name="chinese_ocr_db_crnn_mobile")
    result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
    
    # or
    # result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])

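    Service deployment

    Start PaddleHub Serving

    Run the start command:

    $ hub serving start -m chinese_ocr_db_crnn_mobile

    Send a prediction request

    With the server configured, the few lines of code below send a prediction request and fetch the result: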

    import requests
    import json
    import cv2
    import base64
    
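    def cv2_to_base64(image):
        # Encode the image as JPEG, then base64-encode the raw bytes.
        # tobytes() replaces the deprecated ndarray.tostring().
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')
    
    # Send the HTTP request
    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}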
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_mobile"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))
    
    # Print the prediction results
    print(r.json()["results"])
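
    The request body above is a JSON object whose images field holds base64 strings. As a rough sketch of that payload format without the cv2 dependency, the hypothetical helper below base64-encodes image file bytes directly. It assumes the server can decode whatever encoded image bytes it receives; the original example always re-encodes to JPEG first, so that remains the safer path.

```python
import base64
import json


def build_payload(encoded_images):
    """Build the JSON request body from already-encoded image bytes.

    encoded_images: iterable of bytes objects, each the content of an
    image file (for example, the bytes read from a .jpg file).
    """
    images = [base64.b64encode(raw).decode("utf8") for raw in encoded_images]
    return json.dumps({"images": images})


# Placeholder bytes standing in for a real image file; not a valid image.
fake_image = b"\xff\xd8\xff\xe0 placeholder bytes"
payload = build_payload([fake_image])

# Round trip: the server side would base64-decode each entry the same way.
decoded = base64.b64decode(json.loads(payload)["images"][0])
print(decoded == fake_image)
```

    The resulting string can be passed as data= to requests.post together with the same headers and url used above.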

    DEMO

    https://github.com/fanqingsong/code_snippet/blob/master/machine_learning/paddle/ocr.py

    Extract the digits from a CAPTCHA image

    import paddlehub as hub
    import cv2
    
    ocr = hub.Module(name="chinese_ocr_db_crnn_mobile")
    result = ocr.recognize_text(images=[cv2.imread('./test2.png')])
    
    print(result)

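    The printed output follows. The extracted digits (bold in the original post) are the 'text' value, 6067, in the last line.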

     WARNING: Logging before InitGoogleLogging() is written to STDERR
    W0310 14:03:05.659210 11502 default_variables.cpp:429] Fail to open /proc/self/io: No such file or directory [2]
    /root/.pyenv/versions/3.6.8/lib/python3.6/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
      import imp
    [2021-03-10 14:03:14,697] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object
    W0310 14:03:14.708413 11502 analysis_predictor.cc:1145] Deprecated. Please use CreatePredictor instead.
    [2021-03-10 14:03:15,063] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object
    [{'save_path': '', 'data': [{'text': '6067', 'confidence': 0.8805994987487793, 'text_box_position': [[9, 2], [52, 2], [52, 16], [9, 16]]}]}]
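
    To pull just the digits out of a result shaped like the last line above, something along these lines may work. It is a sketch run against the sample result copied from this output, not against a live model; the helper names and the 0.5 confidence threshold are hypothetical choices.

```python
import string


def extract_texts(results, min_confidence=0.5):
    """Collect recognized strings whose confidence meets the threshold.

    results: list with one dict per input image, each holding a 'data'
    list of {'text', 'confidence', 'text_box_position'} entries.
    """
    texts = []
    for image_result in results:
        for item in image_result.get("data", []):
            if item["confidence"] >= min_confidence:
                texts.append(item["text"])
    return texts


def digits_only(text):
    """Keep only ASCII digits, dropping any other recognized characters."""
    return "".join(ch for ch in text if ch in string.digits)


# Sample result copied from the output shown above.
sample = [{'save_path': '', 'data': [{'text': '6067',
           'confidence': 0.8805994987487793,
           'text_box_position': [[9, 2], [52, 2], [52, 16], [9, 16]]}]}]

codes = [digits_only(text) for text in extract_texts(sample)]
print(codes)  # ['6067']
```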

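    Source: http://www.cnblogs.com/lightsong/ The copyright of this article is shared by the author and cnblogs. Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be given in a prominent position on the page.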
  • Original article: https://www.cnblogs.com/lightsong/p/14511461.html