  • TensorFlow CAPTCHA Recognition

     The following material comes from GeekTime (极客时间) course materials.

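    • Preparing the model development environment

    Third-party dependencies
    Pillow (PIL Fork)
     
      PIL (Python Imaging Library) adds image processing capabilities to the Python interpreter.
    However, after version 1.1.7 was released in 2009, the community stopped updating and
    maintaining it.
     
      Pillow is an imaging library forked from PIL, developed and maintained by Alex Clark and
    community contributors. The community remains very active, and Pillow is still iterating quickly.
     
      Pillow provides extensive file format support, an efficient internal representation, and
    fairly powerful image processing capabilities. The core image library is designed for fast
    access to data stored in a few basic pixel formats, which makes it a solid foundation for
    general image processing tools.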
    captcha
     
      captcha is an open-source library for generating image and audio CAPTCHAs.
    from captcha.image import ImageCaptcha
    from captcha.audio import AudioCaptcha

    image = ImageCaptcha(fonts=['/path/A.ttf', '/path/B.ttf'])
    data = image.generate('1234')
    image.write('1234', 'out.png')

    audio = AudioCaptcha(voicedir='/path/to/voices')
    data = audio.generate('1234')
    audio.write('1234', 'out.wav')
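    pydot
     
      pydot is a pure-Python interface to GraphViz. It can parse and dump the DOT language
    (graph description language) used by GraphViz. It depends mainly on two tools, pyparsing
    and GraphViz.
     
      pyparsing: used only to load DOT files; installed automatically along with pydot.
     
      GraphViz: renders graphs to files in formats such as PDF, PNG, and SVG; must be installed separately.
    flask
     
      flask is a Python web application framework built on Werkzeug and jinja2 and released under
    the BSD license. It keeps the framework core minimal while remaining extensible.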

     

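    • Generating the CAPTCHA dataset

    Introduction to CAPTCHA
     
      A Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA)
    is a public, fully automated program that distinguishes whether a user is a computer or
    a human. In a CAPTCHA test, the computer acting as the server automatically generates a
    question for the user to answer. The question can be generated and graded by a computer,
    but it must be one that only a human can answer. Because a computer cannot answer a
    CAPTCHA question, a user who answers it correctly can be regarded as human.
     
      A common CAPTCHA test asks the user to type the letters or digits shown in a distorted
    image. The distortion keeps computer programs such as optical character recognition
    (OCR) from reading the characters automatically, which would defeat the test. Because
    here a computer tests a human, rather than a human testing a computer as in the standard
    Turing test, CAPTCHA is sometimes called a reverse Turing test.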
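    Breaking CAPTCHAs
     
      Some CAPTCHA systems that were or still are in use have been broken.
     
      These include EZ-Gimpy, an early version of Yahoo's CAPTCHA; the CAPTCHA used by PayPal;
    those used by LiveJournal and phpBB; the online-banking CAPTCHAs used by many financial
    institutions (mainly banks); and those used by many other websites.
     
      In 2006, a Russian hacker group broke Yahoo's CAPTCHA with automatic recognition software.
    The accuracy was only about 15%, but the attackers could make 100,000 attempts per day,
    so the cost was relatively low. In 2008, Google's CAPTCHA was also broken by Russian
    hackers. The attackers used two different computers to tune the cracking process, possibly
    with the second computer learning from the first one's attempts or monitoring its
    success rate.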
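    Evolution of CAPTCHAs
     
    Generating CAPTCHAs
     
    Generating CAPTCHA images with Pillow (PIL Fork) and the captcha library:
     
    PIL.Image.open(fp, mode='r') - opens and identifies the given image file
     
    captcha.image.ImageCaptcha(width, height) – creates an ImageCaptcha instance
    captcha.image.ImageCaptcha.write('1234', 'out.png') – generates a CAPTCHA and saves it to a file
    captcha.image.ImageCaptcha.generate('1234') – generates a CAPTCHA image as an in-memory byte stream

    Code implementation: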

    Creating the CAPTCHA dataset
    Import third-party packages
    from captcha.image import ImageCaptcha
    
    import random
    import numpy as np
    
    import tensorflow.gfile as gfile
    import matplotlib.pyplot as plt
    import PIL.Image as Image
    
    Define constants and character sets
    NUMBER = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    LOWERCASE = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
                'v', 'w', 'x', 'y', 'z']
    UPPERCASE = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U',
               'V', 'W', 'X', 'Y', 'Z']
    
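    CAPTCHA_CHARSET = NUMBER   # CAPTCHA character set
    CAPTCHA_LEN = 4            # CAPTCHA length (number of characters)
    CAPTCHA_HEIGHT = 60        # CAPTCHA image height
    CAPTCHA_WIDTH = 160        # CAPTCHA image width
    
    
    TRAIN_DATASET_SIZE = 5000     # number of training CAPTCHAs to generate
    TEST_DATASET_SIZE = 1000      # number of test CAPTCHAs to generate
    TRAIN_DATA_DIR = './train-data/' # training set directory
    TEST_DATA_DIR = './test-data/'   # test set directory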
    
    Helper that generates random text
    def gen_random_text(charset=CAPTCHA_CHARSET, length=CAPTCHA_LEN):
        text = [random.choice(charset) for _ in range(length)]
        return ''.join(text)
    
    Helper that creates and saves a CAPTCHA dataset
    def create_captcha_dataset(size=100,
                               data_dir='./data/',
                               height=60,                           
                               width=160,
                               image_format='.png'):
    
        # Clear data_dir before saving the CAPTCHA images
        if gfile.Exists(data_dir):
            gfile.DeleteRecursively(data_dir)
        gfile.MakeDirs(data_dir)
        
        # Create the ImageCaptcha instance
        captcha = ImageCaptcha(width=width, height=height)
    
        for _ in range(size):
            # Generate random CAPTCHA text
            text = gen_random_text(CAPTCHA_CHARSET, CAPTCHA_LEN)
            # Files are named after their text, so a repeated text overwrites the earlier image
            captcha.write(text, data_dir + text + image_format)
            
        return None
    
    Create and save the training set
    create_captcha_dataset(TRAIN_DATASET_SIZE, TRAIN_DATA_DIR)
    
    Create and save the test set
    create_captcha_dataset(TEST_DATASET_SIZE, TEST_DATA_DIR)
    
    Helper that generates and returns a CAPTCHA dataset
    def gen_captcha_dataset(size=100,
                            height=60,                           
                            width=160,
                            image_format='.png'):
    
        # Create the ImageCaptcha instance
        captcha = ImageCaptcha(width=width, height=height)
    
        # Allocate the image and text lists
        images, texts = [None]*size, [None]*size
        for i in range(size):
            # Generate random CAPTCHA text
            texts[i] = gen_random_text(CAPTCHA_CHARSET, CAPTCHA_LEN)
            # Open the newly generated CAPTCHA image with PIL.Image.open(),
            # then convert it to a NumPy array of shape (CAPTCHA_HEIGHT, CAPTCHA_WIDTH, 3)
            images[i] = np.array(Image.open(captcha.generate(texts[i])))
            
        return images, texts
    
    Generate 100 CAPTCHA images and their text
    images, texts = gen_captcha_dataset()
    
    plt.figure()
    for i in range(20):
        plt.subplot(5,4,i+1) # plot the first 20 CAPTCHAs as a 5x4 grid of subplots
        plt.tight_layout() # fit the subplots to the figure automatically
        plt.imshow(images[i])
        plt.title("Label: {}".format(texts[i])) # use the label as the subplot title
        plt.xticks([]) # remove x-axis ticks
        plt.yticks([]) # remove y-axis ticks
    plt.show()
    
    

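    • Input and output data processing

    Input data processing
     
    Image processing: RGB image -> grayscale image -> normalized data
     
    Adapting to the Keras image data format: "channels_first" or "channels_last"
     
    Output data processing
     
    One-hot encoding: CAPTCHA text to vector
    Decoding: model output vector to CAPTCHA text

     

    Code implementation:

    Data processing
    
    Import third-party packages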
    from PIL import Image
    from keras import backend as K
    
    import random
    import glob
    
    import numpy as np
    import tensorflow.gfile as gfile
    import matplotlib.pyplot as plt
    
    Define hyperparameters and character sets
    NUMBER = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    LOWERCASE = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
                'v', 'w', 'x', 'y', 'z']
    UPPERCASE = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U',
               'V', 'W', 'X', 'Y', 'Z']
    
    CAPTCHA_CHARSET = NUMBER   # CAPTCHA character set
    CAPTCHA_LEN = 4            # CAPTCHA length (number of characters)
    CAPTCHA_HEIGHT = 60        # CAPTCHA image height
    CAPTCHA_WIDTH = 160        # CAPTCHA image width
    
    TRAIN_DATA_DIR = './train-data/' # training set directory
    
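    Read the first 100 training images and parse each CAPTCHA (label) from its file name
    image = []
    text = []
    count = 0
    for filename in glob.glob(TRAIN_DATA_DIR + '*.png'):
        image.append(np.array(Image.open(filename)))
        # Slice off the directory prefix and the '.png' suffix; str.lstrip/rstrip
        # strip character sets rather than substrings and can eat label characters
        text.append(filename[len(TRAIN_DATA_DIR):-len('.png')])
        count += 1
        if count >= 100:
            break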
    
    text[0]
    '''
    '0005'
    '''
    
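    Visualize the data
    plt.figure()
    for i in range(20):
        plt.subplot(5,4,i+1) # plot the first 20 CAPTCHAs as a 5x4 grid of subplots
        plt.tight_layout() # fit the subplots to the figure automatically
        plt.imshow(image[i])
        plt.title("Label: {}".format(text[i])) # use the label as the subplot title
        plt.xticks([]) # remove x-axis ticks
        plt.yticks([]) # remove y-axis ticks
    plt.show()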
    
    
    image = np.array(image, dtype=np.float32)
    print(image.shape)
    '''
    (100, 60, 160, 3)
    '''
    
    Convert the RGB CAPTCHA images to grayscale
    def rgb2gray(img):
        # Y' = 0.299 R + 0.587 G + 0.114 B 
        # https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
        return np.dot(img[...,:3], [0.299, 0.587, 0.114])
    
    image = rgb2gray(image)
    
    print(image.shape)
    '''
    (100, 60, 160)
    '''
    
    image[0]
    '''
    array([[250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766],
           [250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766],
           [250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766],
           ...,
           [250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766],
           [250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766],
           [250.766, 250.766, 250.766, ..., 250.766, 250.766, 250.766]])
    '''
    
    plt.figure()
    for i in range(20):
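        plt.subplot(5,4,i+1) # plot the first 20 CAPTCHAs as a 5x4 grid of subplots
        plt.tight_layout() # fit the subplots to the figure automatically
        plt.imshow(image[i], cmap='Greys')
        plt.title("Label: {}".format(text[i])) # use the label as the subplot title
        plt.xticks([]) # remove x-axis ticks
        plt.yticks([]) # remove y-axis ticks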
    plt.show()
    
    
    Normalize the data
    image = image / 255
    image[0]
    '''
    array([[0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608],
           [0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608],
           [0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608],
           ...,
           [0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608],
           [0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608],
           [0.98339608, 0.98339608, 0.98339608, ..., 0.98339608, 0.98339608,
            0.98339608]])
    '''
    
    image.shape[0]
    '''
    100
    '''
    
    image.shape
    '''
    (100, 60, 160)
    '''
    
    Adapt to the Keras image data format
    def fit_keras_channels(batch, rows=CAPTCHA_HEIGHT, cols=CAPTCHA_WIDTH):
        if K.image_data_format() == 'channels_first':
            batch = batch.reshape(batch.shape[0], 1, rows, cols)
            input_shape = (1, rows, cols)
        else:
            batch = batch.reshape(batch.shape[0], rows, cols, 1)
            input_shape = (rows, cols, 1)
        
        return batch, input_shape
    
    image, input_shape = fit_keras_channels(image)
    print(image.shape)
    print(input_shape)
    '''
    (100, 60, 160, 1)
    (60, 160, 1)
    '''
    
    type(image)
    '''
    numpy.ndarray
    '''
    
    One-hot encode each character of the CAPTCHA
    def text2vec(text, length=CAPTCHA_LEN, charset=CAPTCHA_CHARSET):
        text_len = len(text)
        # Validate the CAPTCHA length
        if text_len != length:
            raise ValueError('Error: length of captcha should be {}, but got {}'.format(length, text_len))
        
        # Create a 1-D vector of shape (CAPTCHA_LEN * len(CAPTCHA_CHARSET),)
        # For example, a 4-digit numeric CAPTCHA yields a vector of shape (4*10,)
        vec = np.zeros(length * len(charset))
        for i in range(length):
            # One-hot encode each character of the CAPTCHA
            # Position of the hot bit = index in charset + offset of the i-th character
            vec[charset.index(text[i]) + i*len(charset)] = 1
        return vec
    
    text = list(text)
    vec = [None]*len(text)
    
    for i in range(len(vec)):
        vec[i] = text2vec(text[i])
    
    vec[0]
    '''
    array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 1., 0., 0., 0., 0.])
    '''
    
    text[0]
    '''
    '0005'
    '''
    
    Decode a CAPTCHA vector back into its characters
    def vec2text(vector):
        if not isinstance(vector, np.ndarray):
            vector = np.asarray(vector)
        vector = np.reshape(vector, [CAPTCHA_LEN, -1])
        text = ''
        for item in vector:
            text += CAPTCHA_CHARSET[np.argmax(item)]
        return text
    
    # Model output for the CAPTCHA '3935'
    yy_vec = np.array([[2.0792404e-10, 4.3756086e-07, 3.1140310e-10, 9.9823320e-01,
                        5.1135743e-15, 3.7417038e-05, 1.0556480e-08, 9.0933657e-13,
                        2.7573466e-07, 1.7286760e-03, 1.1030550e-07, 1.1852034e-07,
                        7.9457263e-10, 3.4533365e-09, 6.6065012e-14, 2.8996323e-05,
                        7.6345885e-13, 3.1817032e-16, 3.9540555e-05, 9.9993122e-01,
                        5.3814397e-13, 1.2061575e-10, 1.6408040e-03, 9.9833637e-01,
                        6.5149628e-08, 5.2246549e-12, 1.1365444e-08, 9.5700288e-12,
                        2.2725430e-05, 5.2195204e-10, 3.2457771e-13, 2.1413280e-07,
                        7.3547295e-14, 4.4094882e-06, 3.8390007e-07, 9.9230206e-01,
                        6.4467136e-03, 3.9224533e-11, 1.2461344e-03, 1.1253484e-07]],
                      dtype=np.float32)
    
    yy = vec2text(yy_vec)
    yy
    '''
    '3935'
    '''
    
    img = rgb2gray(np.array(Image.open('3935.png')))
    
    plt.figure()
    plt.imshow(img, cmap='Greys')
    plt.title("Label: {}".format(yy)) # use the label as the figure title
    plt.xticks([]) # remove x-axis ticks
    plt.yticks([]) # remove y-axis ticks
    plt.show()

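    • Model architecture design

    Classification problems

     

    Image classification model: AlexNet

     

    Feature extraction with convolution

     

    Image classification model: VGG-16

     

    CAPTCHA recognition model architecture

     

    CAPTCHA recognition model implementation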

     

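    • Model loss function design

    Cross-Entropy (CE)
     
      We use cross-entropy as the loss function for this model.
      Categorical CE and Binary CE are the more commonly used loss functions, but both are
    variants of CE.
    CE is defined as follows: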
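The formula that followed here was an image and did not survive in this copy. For reference, the standard cross-entropy definition is sketched below. The symbol names are assumptions: $t_i$ is the ground-truth value and $s_i$ the predicted score for class $i$, over $C$ classes.

```latex
CE = -\sum_{i=1}^{C} t_i \log(s_i)
```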

     

    For a binary classification problem (C' = 2), CE is defined as follows:
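The formula image is also missing at this point. A standard form of the two-class case is sketched below, assuming $t_1$ and $s_1$ are the ground truth and predicted score for the first class, so that $t_2 = 1 - t_1$ and $s_2 = 1 - s_1$ (these symbol names are assumptions, not taken from the original figure).

```latex
CE = -\sum_{i=1}^{C'=2} t_i \log(s_i)
   = -t_1 \log(s_1) - (1 - t_1) \log(1 - s_1)
```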

     

     
     
     
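    Categorical CE Loss (Softmax Loss)
     
      Commonly used in multi-class classification models whose output is a one-hot vector.

     

    Binary CE Loss (Sigmoid CE Loss)
     
      Unlike Softmax Loss, Binary CE Loss is independent for each vector component (class):
    the loss computed for one component is not affected by the others.
    It is therefore often used in multi-label classification models.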
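The difference between the two losses can be illustrated with a small NumPy sketch. The function names, the 3-class example, and the `eps` clipping value are assumptions made for illustration; averaging the binary loss over components follows what Keras's `binary_crossentropy` does, to my understanding.

```python
import numpy as np

def categorical_ce(t, s, eps=1e-7):
    """Categorical CE for one one-hot target t and one probability vector s."""
    s = np.clip(s, eps, 1.0)
    return float(-np.sum(t * np.log(s)))

def binary_ce(t, s, eps=1e-7):
    """Binary CE computed independently per component, then averaged."""
    s = np.clip(s, eps, 1.0 - eps)
    per_component = -(t * np.log(s) + (1.0 - t) * np.log(1.0 - s))
    return float(np.mean(per_component))

# Hypothetical 3-class example
t = np.array([0.0, 1.0, 0.0])
s = np.array([0.2, 0.7, 0.1])

cat = categorical_ce(t, s)  # only the true class contributes: -log(0.7)
bce = binary_ce(t, s)       # mean of -log(0.8), -log(0.7), -log(0.9)
print(cat, bce)
```

With categorical CE only the score of the true class enters the loss, while binary CE also penalizes every non-zero score assigned to a wrong class.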

     

     Code implementation:

    Training the model
    Import third-party packages
    from PIL import Image
    from keras import backend as K
    from keras.utils import plot_model
    from keras.models import *
    from keras.layers import *
    
    import glob
    import pickle
    
    import numpy as np
    import tensorflow.gfile as gfile
    import matplotlib.pyplot as plt
    
    Define hyperparameters and character sets
    NUMBER = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    LOWERCASE = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
                'v', 'w', 'x', 'y', 'z']
    UPPERCASE = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U',
               'V', 'W', 'X', 'Y', 'Z']
    
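    CAPTCHA_CHARSET = NUMBER   # CAPTCHA character set
    CAPTCHA_LEN = 4            # CAPTCHA length (number of characters)
    CAPTCHA_HEIGHT = 60        # CAPTCHA image height
    CAPTCHA_WIDTH = 160        # CAPTCHA image width
    
    TRAIN_DATA_DIR = './train-data/' # training set directory
    TEST_DATA_DIR = './test-data/'   # test set directory
    
    BATCH_SIZE = 100 # number of samples per training batch
    EPOCHS = 10 # number of training epochs (the recorded run below used 10)
    OPT = 'adam' # optimizer
    LOSS = 'binary_crossentropy' # model loss function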
    
    # Where to save files after training
    MODEL_DIR = './model/train_demo/'
    MODEL_FORMAT = '.h5' # model file format
    HISTORY_DIR = './history/train_demo/' # directory for the training history
    HISTORY_FORMAT = '.history'
    
    # Output file name format
    filename_str = "{}captcha_{}_{}_bs_{}_epochs_{}{}"
    
    # Model architecture diagram file
    MODEL_VIS_FILE = 'captcha_classfication' + '.png'
    # Model file
    MODEL_FILE = filename_str.format(MODEL_DIR, OPT, LOSS, str(BATCH_SIZE), str(EPOCHS), MODEL_FORMAT)
    # Training history file
    HISTORY_FILE = filename_str.format(HISTORY_DIR, OPT, LOSS, str(BATCH_SIZE), str(EPOCHS), HISTORY_FORMAT)
    
    Convert the RGB CAPTCHA images to grayscale
    def rgb2gray(img):
        # Y' = 0.299 R + 0.587 G + 0.114 B 
        # https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
        return np.dot(img[...,:3], [0.299, 0.587, 0.114])
    
    One-hot encode each character of the CAPTCHA
    def text2vec(text, length=CAPTCHA_LEN, charset=CAPTCHA_CHARSET):
        text_len = len(text)
        # Validate the CAPTCHA length
        if text_len != length:
            raise ValueError('Error: length of captcha should be {}, but got {}'.format(length, text_len))
        
        # Create a 1-D vector of shape (CAPTCHA_LEN * len(CAPTCHA_CHARSET),)
        # For example, a 4-digit numeric CAPTCHA yields a vector of shape (4*10,)
        vec = np.zeros(length * len(charset))
        for i in range(length):
            # One-hot encode each character of the CAPTCHA
            # Position of the hot bit = index in charset + offset of the i-th character
            vec[charset.index(text[i]) + i*len(charset)] = 1
        return vec
    
    Decode a CAPTCHA vector back into its characters
    def vec2text(vector):
        if not isinstance(vector, np.ndarray):
            vector = np.asarray(vector)
        vector = np.reshape(vector, [CAPTCHA_LEN, -1])
        text = ''
        for item in vector:
            text += CAPTCHA_CHARSET[np.argmax(item)]
        return text
    
    Adapt to the Keras image data format
    def fit_keras_channels(batch, rows=CAPTCHA_HEIGHT, cols=CAPTCHA_WIDTH):
        if K.image_data_format() == 'channels_first':
            batch = batch.reshape(batch.shape[0], 1, rows, cols)
            input_shape = (1, rows, cols)
        else:
            batch = batch.reshape(batch.shape[0], rows, cols, 1)
            input_shape = (rows, cols, 1)
        
        return batch, input_shape
    
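    Read the training set
    X_train = []
    Y_train = []
    for filename in glob.glob(TRAIN_DATA_DIR + '*.png'):
        X_train.append(np.array(Image.open(filename)))
        # Slice off the directory prefix and the '.png' suffix; str.lstrip/rstrip
        # strip character sets rather than substrings and can eat label characters
        Y_train.append(filename[len(TRAIN_DATA_DIR):-len('.png')])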
    
    X_train[0][1][1]
    '''
    array([253, 249, 254], dtype=uint8)
    '''
    
    Y_train[0]
    '''
    '0005'
    '''
    
    Process the training set images
    # list -> rgb(numpy)
    X_train = np.array(X_train, dtype=np.float32)
    # rgb -> gray
    X_train = rgb2gray(X_train)
    # normalize
    X_train = X_train / 255
    # Fit keras channels
    X_train, input_shape = fit_keras_channels(X_train)
    
    print(X_train.shape, type(X_train))
    print(input_shape)
    '''
    (3919, 60, 160, 1) <class 'numpy.ndarray'>
    (60, 160, 1)
    '''
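The 3,919 training samples loaded here are fewer than TRAIN_DATASET_SIZE = 5000. A likely explanation (an inference, not something the source states) is that create_captcha_dataset names each file after its text, so a text drawn more than once overwrites the earlier file. Under that assumption, the expected number of distinct files can be estimated as follows; the function name is hypothetical.

```python
def expected_unique(n_possible, n_draws):
    """Expected number of distinct values after n_draws uniform draws with replacement."""
    return n_possible * (1.0 - (1.0 - 1.0 / n_possible) ** n_draws)

n_possible = 10 ** 4  # 4 characters, each drawn from a 10-digit charset
expected_train = expected_unique(n_possible, 5000)
expected_test = expected_unique(n_possible, 1000)
print(round(expected_train), round(expected_test))
```

The estimates come out at roughly 3,935 and 952 distinct files, in the same range as the 3,919 training and 958 test samples reported in this walkthrough.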
    
    Process the training set labels
    Y_train = list(Y_train)
    
    for i in range(len(Y_train)):
        Y_train[i] = text2vec(Y_train[i])
    
    Y_train = np.asarray(Y_train)
    
    print(Y_train.shape, type(Y_train))
    '''
    (3919, 40) <class 'numpy.ndarray'>
    '''
    
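    Read the test set and process its images and labels
    X_test = []
    Y_test = []
    for filename in glob.glob(TEST_DATA_DIR + '*.png'):
        X_test.append(np.array(Image.open(filename)))
        # Slice off the directory prefix and the '.png' suffix; str.lstrip/rstrip
        # strip character sets rather than substrings and can eat label characters
        Y_test.append(filename[len(TEST_DATA_DIR):-len('.png')])
    
    # list -> rgb -> gray -> normalization -> fit keras 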
    X_test = np.array(X_test, dtype=np.float32)
    X_test = rgb2gray(X_test)
    X_test = X_test / 255
    X_test, _ = fit_keras_channels(X_test)
    
    Y_test = list(Y_test)
    for i in range(len(Y_test)):
        Y_test[i] = text2vec(Y_test[i])
    
    Y_test = np.asarray(Y_test)
    
    print(X_test.shape, type(X_test))
    print(Y_test.shape, type(Y_test))
    '''
    (958, 60, 160, 1) <class 'numpy.ndarray'>
    (958, 40) <class 'numpy.ndarray'>
    '''
    
    Build the CAPTCHA recognition model
    # Input layer
    inputs = Input(shape = input_shape, name = "inputs")
    
    # Convolution layer 1
    conv1 = Conv2D(32, (3, 3), name = "conv1")(inputs)
    relu1 = Activation('relu', name="relu1")(conv1)
    
    # Convolution layer 2
    conv2 = Conv2D(32, (3, 3), name = "conv2")(relu1)
    relu2 = Activation('relu', name="relu2")(conv2)
    pool2 = MaxPooling2D(pool_size=(2,2), padding='same', name="pool2")(relu2)
    
    # Convolution layer 3
    conv3 = Conv2D(64, (3, 3), name = "conv3")(pool2)
    relu3 = Activation('relu', name="relu3")(conv3)
    pool3 = MaxPooling2D(pool_size=(2,2), padding='same', name="pool3")(relu3)
    
    # Flatten the pooled feature map and feed it to the fully connected layers
    x = Flatten()(pool3)
    
    # Dropout
    x = Dropout(0.25)(x)
    
    # Four fully connected layers, each a 10-way classifier for one of the 4 characters
    x = [Dense(10, activation='softmax', name='fc%d'%(i+1))(x) for i in range(4)]
    
    # Concatenate the 4 character vectors so the output matches the label vector layout
    outs = Concatenate()(x)
    
    # Define the model's input and output
    model = Model(inputs=inputs, outputs=outs)
    model.compile(optimizer=OPT, loss=LOSS, metrics=['accuracy'])
    
    View the model summary
    model.summary() # print the model summary
    '''
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    inputs (InputLayer)             (None, 60, 160, 1)   0                                            
    __________________________________________________________________________________________________
    conv1 (Conv2D)                  (None, 58, 158, 32)  320         inputs[0][0]                     
    __________________________________________________________________________________________________
    relu1 (Activation)              (None, 58, 158, 32)  0           conv1[0][0]                      
    __________________________________________________________________________________________________
    conv2 (Conv2D)                  (None, 56, 156, 32)  9248        relu1[0][0]                      
    __________________________________________________________________________________________________
    relu2 (Activation)              (None, 56, 156, 32)  0           conv2[0][0]                      
    __________________________________________________________________________________________________
    pool2 (MaxPooling2D)            (None, 28, 78, 32)   0           relu2[0][0]                      
    __________________________________________________________________________________________________
    conv3 (Conv2D)                  (None, 26, 76, 64)   18496       pool2[0][0]                      
    __________________________________________________________________________________________________
    relu3 (Activation)              (None, 26, 76, 64)   0           conv3[0][0]                      
    __________________________________________________________________________________________________
    pool3 (MaxPooling2D)            (None, 13, 38, 64)   0           relu3[0][0]                      
    __________________________________________________________________________________________________
    flatten_1 (Flatten)             (None, 31616)        0           pool3[0][0]                      
    __________________________________________________________________________________________________
    dropout_1 (Dropout)             (None, 31616)        0           flatten_1[0][0]                  
    __________________________________________________________________________________________________
    fc1 (Dense)                     (None, 10)           316170      dropout_1[0][0]                  
    __________________________________________________________________________________________________
    fc2 (Dense)                     (None, 10)           316170      dropout_1[0][0]                  
    __________________________________________________________________________________________________
    fc3 (Dense)                     (None, 10)           316170      dropout_1[0][0]                  
    __________________________________________________________________________________________________
    fc4 (Dense)                     (None, 10)           316170      dropout_1[0][0]                  
    __________________________________________________________________________________________________
    concatenate_1 (Concatenate)     (None, 40)           0           fc1[0][0]                        
                                                                     fc2[0][0]                        
                                                                     fc3[0][0]                        
                                                                     fc4[0][0]                        
    ==================================================================================================
    Total params: 1,292,744
    Trainable params: 1,292,744
    Non-trainable params: 0
    __________________________________________________________________________________________________
    '''
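The shapes and parameter counts in this summary can be checked by hand. The sketch below assumes the Keras defaults used by the model code (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling with stride 2 and 'same' padding); the helper names are hypothetical.

```python
def conv_valid(h, w, k=3):
    """Output size of a k x k convolution with 'valid' padding and stride 1."""
    return h - k + 1, w - k + 1

def pool_same(h, w, p=2):
    """Output size of p x p max pooling with stride p and 'same' padding (ceil division)."""
    return -(-h // p), -(-w // p)

def conv_params(k, c_in, c_out):
    """k*k*c_in weights plus one bias, per output filter."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """n_in weights plus one bias, per output unit."""
    return (n_in + 1) * n_out

h, w = 60, 160
h, w = conv_valid(h, w)   # conv1 -> (58, 158)
h, w = conv_valid(h, w)   # conv2 -> (56, 156)
h, w = pool_same(h, w)    # pool2 -> (28, 78)
h, w = conv_valid(h, w)   # conv3 -> (26, 76)
h, w = pool_same(h, w)    # pool3 -> (13, 38)
flat = h * w * 64         # flattened feature vector length

total = (conv_params(3, 1, 32) + conv_params(3, 32, 32)
         + conv_params(3, 32, 64) + 4 * dense_params(flat, 10))
print((h, w), flat, total)  # (13, 38) 31616 1292744
```

Almost all of the 1,292,744 parameters sit in the four Dense(10) heads (316,170 each), because each one is connected to the full 31,616-element flattened feature map.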
    
    Visualize the model
    #import os
    #os.environ["PATH"] += os.pathsep + r'D:\Program Files\Python37\Graphviz2.38\bin'
    plot_model(model, to_file=MODEL_VIS_FILE, show_shapes=True, show_layer_names=True)
    
    Train the model
    history = model.fit(X_train,
                        Y_train,
                        batch_size=BATCH_SIZE,
                        epochs=EPOCHS,
                        verbose=2,
                        validation_data=(X_test, Y_test))
    '''
    Train on 3919 samples, validate on 958 samples
    Epoch 1/10
     - 45s - loss: 0.3256 - acc: 0.9000 - val_loss: 0.3249 - val_acc: 0.9000
    Epoch 2/10
     - 48s - loss: 0.3242 - acc: 0.9000 - val_loss: 0.3229 - val_acc: 0.9000
    Epoch 3/10
     - 47s - loss: 0.3075 - acc: 0.9001 - val_loss: 0.2962 - val_acc: 0.9007
    Epoch 4/10
     - 50s - loss: 0.2374 - acc: 0.9126 - val_loss: 0.2463 - val_acc: 0.9120
    Epoch 5/10
     - 51s - loss: 0.1729 - acc: 0.9367 - val_loss: 0.2253 - val_acc: 0.9190
    Epoch 6/10
     - 48s - loss: 0.1363 - acc: 0.9508 - val_loss: 0.2114 - val_acc: 0.9230
    Epoch 7/10
     - 48s - loss: 0.1136 - acc: 0.9589 - val_loss: 0.2175 - val_acc: 0.9236
    Epoch 8/10
     - 48s - loss: 0.0943 - acc: 0.9666 - val_loss: 0.2242 - val_acc: 0.9234
    Epoch 9/10
     - 48s - loss: 0.0825 - acc: 0.9702 - val_loss: 0.2185 - val_acc: 0.9241
    Epoch 10/10
     - 48s - loss: 0.0742 - acc: 0.9735 - val_loss: 0.2321 - val_acc: 0.9245
    '''
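The accuracy in this log starts at exactly 0.9000, which deserves a note. To my understanding, Keras 2.x resolves metrics=['accuracy'] to binary accuracy when the loss is binary_crossentropy, so each of the 40 output components is scored separately after thresholding at 0.5. A model whose outputs are all below 0.5 already matches the 36 zeros of every label, giving 36/40 = 0.9. A minimal NumPy sketch of that metric next to a stricter whole-CAPTCHA accuracy follows; the function names and the sample data are assumptions for illustration.

```python
import numpy as np

CAPTCHA_LEN, N_CLASSES = 4, 10

def binary_accuracy(y_true, y_pred):
    """Fraction of the 40 components that match after thresholding at 0.5."""
    return float(np.mean(y_true == (y_pred > 0.5)))

def captcha_accuracy(y_true, y_pred):
    """Fraction of samples for which all 4 decoded characters match."""
    t = y_true.reshape(-1, CAPTCHA_LEN, N_CLASSES).argmax(axis=-1)
    p = y_pred.reshape(-1, CAPTCHA_LEN, N_CLASSES).argmax(axis=-1)
    return float(np.mean(np.all(t == p, axis=1)))

# Hypothetical sample: the label '0005' and an uninformative uniform output
y_true = np.zeros((1, 40))
y_true[0, [0, 10, 20, 35]] = 1
y_pred = np.full((1, 40), 0.1)

print(binary_accuracy(y_true, y_pred))   # 0.9
print(captcha_accuracy(y_true, y_pred))  # 0.0
```

With the arrays from this walkthrough, the stricter number would be obtained with captcha_accuracy(Y_test, model.predict(X_test)).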
    
    Prediction example
    print(vec2text(Y_test[3]))
    '''
    0030
    '''
    
    yy = model.predict(X_test[6].reshape(1, 60, 160, 1))
    print(vec2text(yy))
    '''
    0080
    '''
    
    Save the model
    if not gfile.Exists(MODEL_DIR):
        gfile.MakeDirs(MODEL_DIR)
    
    model.save(MODEL_FILE)
    print('Saved trained model at %s ' % MODEL_FILE)
    '''
    Saved trained model at ./model/train_demo/captcha_adam_binary_crossentropy_bs_100_epochs_10.h5 
    '''
    
    Save the training history
    history.history['acc']
    '''
    [0.8999999165534973,
     0.8999999165534973,
     0.9001274999990607,
     0.9125924786763339,
     0.9367058057966328,
     0.950835685807243,
     0.9589372451556645,
     0.9666305299953831,
     0.970170951324569,
     0.9735455595738245]
    '''
    
    history.history['loss']
    '''
    [0.3255925034649307,
     0.3241707802077403,
     0.30746189193264445,
     0.23740254261567295,
     0.17286433247575106,
     0.1362645993939344,
     0.11359802067363466,
     0.09430851856910565,
     0.08249860131624981,
     0.07421532272722199]
    '''
    
    history.history.keys()
    '''
    dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])
    '''
    
    if not gfile.Exists(HISTORY_DIR):
        gfile.MakeDirs(HISTORY_DIR)
    
    with open(HISTORY_FILE, 'wb') as f:
        pickle.dump(history.history, f)
    
    print(HISTORY_FILE)
    '''
    ./history/train_demo/captcha_adam_binary_crossentropy_bs_100_epochs_10.history
    '''

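    • Analyzing the model training process

    Model training process

     

    Learning rate
    The learning rate is directly related to how the loss changes (how fast the model converges).
     
    When to increase the learning rate
    • Early in training, the loss barely moves
     
    When to decrease the learning rate
    • Early in training, the loss explodes or becomes NaN
    • The loss drops quickly at first, then stays flat for a long time
    • Late in training, the loss keeps oscillating up and down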
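The "decrease" rules above can be sketched as a small pure-Python helper. This is only an illustration under stated assumptions: the function name, `patience`, `min_delta`, and the halving `factor` are hypothetical choices, and it covers only the NaN and plateau cases. Keras also ships a ReduceLROnPlateau callback built on a similar idea.

```python
def adjust_learning_rate(lr, losses, patience=3, min_delta=1e-3, factor=0.5):
    """Shrink lr when the latest loss is NaN, or when the last `patience`
    epochs did not improve on the earlier best loss by at least min_delta."""
    if losses and losses[-1] != losses[-1]:  # NaN is the only value not equal to itself
        return lr * factor
    if len(losses) > patience:
        best_before = min(losses[:-patience])
        best_recent = min(losses[-patience:])
        if best_recent > best_before - min_delta:
            return lr * factor
    return lr

still_improving = adjust_learning_rate(0.01, [1.0, 0.8, 0.6, 0.4, 0.2, 0.1])
plateaued = adjust_learning_rate(0.01, [1.0, 0.5, 0.3, 0.3, 0.3, 0.3])
diverged = adjust_learning_rate(0.01, [1.0, float('nan')])
print(still_improving, plateaued, diverged)
```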
    Optimizers: SGD (Stochastic Gradient Descent)

     

    Optimizers: SGD-M (Momentum)
     
      SGD tends to oscillate when it encounters ravines. Adding momentum accelerates SGD's
    descent in the right direction and dampens the oscillation.
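A minimal sketch of one common momentum formulation is shown below, applied to the toy objective f(w) = w^2. The function name, the learning rate, and the momentum value are assumptions chosen for illustration, not values taken from the course material.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One update: v <- momentum * v - lr * grad, then w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1.0
w, v = 1.0, 0.0
first_w, first_v = sgd_momentum_step(w, 2.0 * w, v)

for _ in range(200):
    w, v = sgd_momentum_step(w, 2.0 * w, v)
print(w)
```

The velocity accumulates past gradients, so steps in a consistent direction grow while steps that keep flipping sign partly cancel.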
     
    Optimizers: Adagrad – RMSprop – Adam
     

     

     
     
    Optimizer comparison: saddle point
     
     
     
     
     
    Optimizer comparison: CAPTCHA recognition model
     

    Code implementation:

    Analyzing the model training process
    
    Import third-party packages
    import glob
    import pickle
    
    import numpy as np
    import matplotlib.pyplot as plt
    
    Load the training history
    history_file = './pre-trained/history/optimizer/binary_ce/captcha_adam_binary_crossentropy_bs_100_epochs_100.history'
    with open(history_file, 'rb') as f:
        history = pickle.load(f)
    
    Visualize the training process
    fig = plt.figure()
    plt.subplot(2,1,1)
    plt.plot(history['acc'])
    plt.plot(history['val_acc'])
    plt.title('Model Accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='lower right')
    
    plt.subplot(2,1,2)
    plt.plot(history['loss'])
    plt.plot(history['val_loss'])
    plt.title('Model Loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper right')
    plt.tight_layout()
    
    plt.show()
    
    
    Define a helper for plotting training curves
    def plot_training(history=None, metric='acc', title='Model Accuracy', loc='lower right'):
        model_list = []
        fig = plt.figure(figsize=(10, 8))
        for key, val in history.items():
            model_list.append(key.replace(HISTORY_DIR, '').rstrip('.history'))
            plt.plot(val[metric])
    
        plt.title(title)
        plt.ylabel(metric)
        plt.xlabel('epoch')
        plt.legend(model_list, loc=loc)
        plt.show()
    
    Load the histories of the pre-trained models
    HISTORY_DIR = './pre-trained/history/optimizer/binary_ce/'
    history = {}
    for filename in glob.glob(HISTORY_DIR + '*.history'):
        with open(filename, 'rb') as f:
            history[filename] = pickle.load(f)
    
    for key, val in history.items():
        print(key.replace(HISTORY_DIR, '').rstrip('.history'), val.keys())
    '''
    ./pre-trained/history/optimizer/binary_cecaptcha_adadelta_binary_crossentropy_bs_100_epochs_100 dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])
    ./pre-trained/history/optimizer/binary_cecaptcha_adagrad_binary_crossentropy_bs_100_epochs_100 dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])
    ./pre-trained/history/optimizer/binary_cecaptcha_adam_binary_crossentropy_bs_100_epochs_100 dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])
    ./pre-trained/history/optimizer/binary_cecaptcha_rmsprop_binary_crossentropy_bs_100_epochs_100 dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])
    '''
    
    Accuracy over epochs (training set)
    plot_training(history)
    
    
    Loss over epochs (training set)
    plot_training(history, metric='loss', title='Model Loss', loc='upper right')
    
    
    Accuracy over epochs (test set)
    plot_training(history, metric='val_acc', title='Model Accuracy (val)')
    
    
    Loss over epochs (test set)
    plot_training(history, metric='val_loss', title='Model Loss (val)', loc='upper right')
    
    
     
     
     

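    • Model deployment and demo

    Data-model-service pipeline

    Building a CAPTCHA recognition service quickly with Flask

     

    Starting the CAPTCHA recognition service with Flask

    Accessing the CAPTCHA recognition service

    app.py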

    import base64
    
    import numpy as np
    import tensorflow as tf
    
    from io import BytesIO
    from flask import Flask, request, jsonify
    from keras.models import load_model
    from PIL import Image
    
    NUMBER = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
    LOWERCASE = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
                'v', 'w', 'x', 'y', 'z']
    UPPERCASE = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U',
               'V', 'W', 'X', 'Y', 'Z']
    
    CAPTCHA_CHARSET = NUMBER   # CAPTCHA character set
    CAPTCHA_LEN = 4            # CAPTCHA length (number of characters)
    CAPTCHA_HEIGHT = 60        # CAPTCHA image height
    CAPTCHA_WIDTH = 160        # CAPTCHA image width
    
    # Model trained for 10 epochs
    MODEL_FILE = './pre-trained/model/captcha_rmsprop_binary_crossentropy_bs_100_epochs_10.h5'
    
    def vec2text(vector):
        if not isinstance(vector, np.ndarray):
            vector = np.asarray(vector)
        vector = np.reshape(vector, [CAPTCHA_LEN, -1])
        text = ''
        for item in vector:
            text += CAPTCHA_CHARSET[np.argmax(item)]
        return text
    
    def rgb2gray(img):
        # Y' = 0.299 R + 0.587 G + 0.114 B 
        # https://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
        return np.dot(img[...,:3], [0.299, 0.587, 0.114])
    
    app = Flask(__name__) # create the Flask instance
    
    # Test URL
    @app.route('/ping', methods=['GET', 'POST'])
    def hello_world():
        return 'pong'
    
    # CAPTCHA recognition URL
    @app.route('/predict', methods=['POST'])
    def predict():
        response = {'success': False, 'prediction': '', 'debug': 'error'}
        received_image= False
        if request.method == 'POST':
            if request.files.get('image'): # image uploaded as a file
                image = request.files['image'].read()
                received_image = True
                response['debug'] = 'get image'
            elif request.get_json(): # base64-encoded image in a JSON body
                encoded_image = request.get_json()['image']
                image = base64.b64decode(encoded_image)
                received_image = True
                response['debug'] = 'get json'
            if received_image:
                image = np.array(Image.open(BytesIO(image)))
                image = rgb2gray(image).reshape(1, 60, 160, 1).astype('float32') / 255
                with graph.as_default():
                    pred = model.predict(image)
                response['prediction'] = response['prediction'] + vec2text(pred)
                response['success'] = True
                response['debug'] = 'predicted'
        else:
            response['debug'] = 'No Post'
        return jsonify(response)
    
    model = load_model(MODEL_FILE) # load the model
    graph = tf.get_default_graph() # get TensorFlow's default dataflow graph
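To access the service, a client can send the image as base64 inside a JSON body, which is the second branch that /predict handles. The sketch below uses only the standard library. The helper name, the URL (Flask's usual development address, assumed here), and the file path in the usage comment are assumptions.

```python
import base64
import json
import urllib.request

def build_predict_request(image_bytes, url='http://localhost:5000/predict'):
    """Build a JSON POST request carrying a base64-encoded image."""
    payload = {'image': base64.b64encode(image_bytes).decode('ascii')}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

# Usage, with the service running and a CAPTCHA image on disk (hypothetical path):
# with open('3935.png', 'rb') as f:
#     req = build_predict_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

The Content-Type header matters: Flask's request.get_json() only parses the body by default when the request is marked as JSON.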
  • Original article: https://www.cnblogs.com/LXL616/p/11253673.html