
Python captcha recognition examples (5): recognizing a simple captcha

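Today we look at how to recognize a simple captcha.

The captcha uses a standard format, with no distortion or deformation, so we simply try to recognize it with pytesseract.

Captcha URL: http://wscx.gjxfj.gov.cn/zfp/webroot/xfsxcx.html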

The captcha to be recognized is: [image: sample captcha]

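Because this captcha contains noise dots, recognizing it directly gives very poor results.

So the first step is to binarize and denoise the captcha.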

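The result looks like this: [image: captcha after binarization and denoising]

Recognition result: [image: OCR output]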

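The recognition rate is only 40%. To improve on such a low rate, the captcha can be split into individual characters and each character classified separately; this particular captcha is easy to split.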
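As a rough illustration of the splitting step, the sketch below finds character boundaries by looking for blank columns between characters (a vertical projection). It is only a sketch under assumptions: the function name `split_columns` and the list-of-rows input are made up for this example, and it uses the same convention as the binarization code in this post, where 0 is a dark pixel and 1 is background. It will not separate characters that touch each other.

```python
def split_columns(grid):
    """Return (start, end) column ranges that contain ink, end exclusive.

    grid is a list of rows; 0 is a dark (ink) pixel and 1 is background.
    """
    if not grid:
        return []
    width = len(grid[0])
    # A column "has ink" if any pixel in that column is 0.
    has_ink = [any(row[x] == 0 for row in grid) for x in range(width)]

    ranges = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # first inked column of a character
        elif not ink and start is not None:
            ranges.append((start, x))      # blank column closes the character
            start = None
    if start is not None:
        ranges.append((start, width))      # character runs to the right edge
    return ranges


# Toy 3x9 image: two "characters" separated by blank columns.
sample = [
    [1, 0, 0, 1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 1, 1, 0, 1, 1],
]
print(split_columns(sample))  # [(1, 3), (5, 8)]
```

Each returned range could then be cropped out of the denoised image and passed to a per-character classifier.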

Binarization code:

# coding:utf-8
from PIL import Image, ImageDraw

# Binarized pixel map: (x, y) -> 1 (light, background) or 0 (dark, ink)
t2val = {}


def twoValue(image, G):
    # Pixels brighter than the threshold G become 1; all others become 0
    for y in range(0, image.size[1]):
        for x in range(0, image.size[0]):
            g = image.getpixel((x, y))
            if g > G:
                t2val[(x, y)] = 1
            else:
                t2val[(x, y)] = 0


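# Compare the value of a point A with the values of its 8 neighbours. Given a value N (0 < N < 8),
# A is treated as a noise point when fewer than N of its neighbours have the same value as A.
# G: Integer, binarization threshold (the argument of twoValue)
# N: Integer, denoising rate, 0 < N < 8
# Z: Integer, number of denoising passes
# Noise points are set to 1 (background) in t2val; the function returns nothing.
def clearNoise(image, N, Z):
    for i in range(0, Z):
        t2val[(0, 0)] = 1
        t2val[(image.size[0] - 1, image.size[1] - 1)] = 1

        for x in range(1, image.size[0] - 1):
            for y in range(1, image.size[1] - 1):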
                nearDots = 0
                L = t2val[(x, y)]
                if L == t2val[(x - 1, y - 1)]:
                    nearDots += 1
                if L == t2val[(x - 1, y)]:
                    nearDots += 1
                if L == t2val[(x - 1, y + 1)]:
                    nearDots += 1
                if L == t2val[(x, y - 1)]:
                    nearDots += 1
                if L == t2val[(x, y + 1)]:
                    nearDots += 1
                if L == t2val[(x + 1, y - 1)]:
                    nearDots += 1
                if L == t2val[(x + 1, y)]:
                    nearDots += 1
                if L == t2val[(x + 1, y + 1)]:
                    nearDots += 1

                if nearDots < N:
                    t2val[(x, y)] = 1


def saveImage(filename, size):
    image = Image.new("1", size)
    draw = ImageDraw.Draw(image)

    for x in range(0, size[0]):
        for y in range(0, size[1]):
            draw.point((x, y), t2val[(x, y)])

    image.save(filename)


# Binarize and denoise 5/1.jpg through 5/10.jpg, saving each result as a PNG
for i in range(1, 11):
    path = "5/" + str(i) + ".jpg"
    image = Image.open(path).convert("L")
    twoValue(image, 222)
    clearNoise(image, 3, 6)
    path1 = "5/" + str(i) + ".png"
    saveImage(path1, image.size)
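To make the 8-neighbour rule easier to check by hand, here is a small sketch that applies it to a toy grid instead of a real image. It is not the code above: the function name `denoise` and the list-of-rows input are assumptions for this example, and unlike `clearNoise` it counts neighbours on an unmodified copy of the grid, so the result does not depend on the scan order. Border pixels are left untouched, as in the original.

```python
def denoise(grid, n):
    """Return a copy of grid where interior pixels with fewer than n
    same-valued neighbours (out of 8) are set to 1 (background)."""
    height, width = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Count the neighbours that share this pixel's value.
            same = sum(
                grid[y + dy][x + dx] == grid[y][x]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dx, dy) != (0, 0)
            )
            if same < n:
                result[y][x] = 1
    return result


# A single dark pixel surrounded by background is removed.
speck = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
# A solid 3x3 dark block survives, since every pixel has at least 3 dark neighbours.
block = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
print(denoise(speck, 3)[2][2])     # 1
print(denoise(block, 3) == block)  # True
```

With N = 3, as used in the script above, isolated dots disappear while thicker strokes are kept. A larger N removes more noise but also starts to erode thin strokes.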

Recognition code:

# coding:utf-8
from PIL import Image
import pytesseract


def recognize_captcha(img_path):
    im = Image.open(img_path)
    # Raw string, so that backslashes in the Windows path (such as "\t") are not read as escapes
    tessdata_dir_config = r'--tessdata-dir "C:\Program Files (x86)\Tesseract-OCR\tessdata"'
    num = pytesseract.image_to_string(im, config=tessdata_dir_config)
    return num


if __name__ == '__main__':
    for i in range(1, 11):
        img_path = "5/" + str(i) + ".png"
        res = recognize_captcha(img_path)
        # Keep only the first line of the OCR output and strip spaces
        strs = res.split("\n")
        print(strs[0].replace(" ", ''))
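To put a number such as the 40% recognition rate on a firmer footing, the OCR output can be compared against hand-typed answers. The sketch below shows one way to do that. The helper names `clean_ocr_text` and `recognition_rate` are made up for this example, and the predictions and labels are hypothetical values for illustration only, not results from the captchas in this post.

```python
def clean_ocr_text(raw):
    """Keep the first line of the OCR output and remove all spaces."""
    lines = raw.splitlines()
    return lines[0].replace(" ", "") if lines else ""


def recognition_rate(predictions, labels):
    """Fraction of captchas whose cleaned prediction equals the label."""
    if not labels:
        return 0.0
    correct = sum(
        clean_ocr_text(pred) == label
        for pred, label in zip(predictions, labels)
    )
    return correct / len(labels)


# Hypothetical raw OCR outputs and the matching hand-typed answers.
predictions = ["3 8 4 1\n", "72O5\n", "9 1 6 2\n", "", "5507\n"]
labels = ["3841", "7205", "9162", "4418", "5507"]
print(recognition_rate(predictions, labels))  # 0.6
```

In practice, `predictions` would be the strings returned by `recognize_captcha` for each image, and `labels` would be typed in by hand after looking at the same images.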


Original article: https://www.cnblogs.com/xuchunlin/p/11333578.html
