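Tesseract download location: https://digi.bib.uni-mannheim.de/tesseract/
Download the installer and run it.
1. When opening an image with pytesseract, I hit an error saying the tesseract executable could not be found: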
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your PATH
![](http://upload-images.jianshu.io/upload_images/12840157-801bdf11a11e6295.png?imageMogr2/auto-orient/strip|imageView2/2/w/1200/format/webp)
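Before changing anything, it can help to confirm that the error really comes from a missing executable. Below is a minimal sketch using only the standard library. The helper name `find_tesseract` and the candidate path are illustrative assumptions, not part of pytesseract:

```python
import os
import shutil

def find_tesseract(candidates=()):
    """Return the path to a tesseract executable, or None if none is found."""
    # shutil.which searches the directories listed in PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Fall back to explicit locations (assumed; adjust to your install).
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None

if __name__ == "__main__":
    print(find_tesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"]))
```

If this prints `None`, Tesseract-OCR is either not installed or not reachable from the current environment, which matches the error above.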
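2. Diagnosis and fix:
After installing Pillow (pip install pillow) and pytesseract (pip install pytesseract), open pytesseract.py. It contains tesseract_cmd = 'tesseract', which names the command but gives no file path, so the executable is only found if it is on PATH.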
![](http://upload-images.jianshu.io/upload_images/12840157-fc39f758b0514003.png?imageMogr2/auto-orient/strip|imageView2/2/w/1200/format/webp)
3. Installing tesseract-ocr directly from within PyCharm failed:
![](http://upload-images.jianshu.io/upload_images/12840157-b56f4f87e640cef3.png?imageMogr2/auto-orient/strip|imageView2/2/w/751/format/webp)
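4. Download and install the matching 'Tesseract-OCR' build from the web (pick the version that fits your system):
Address: https://github.com/tesseract-ocr/tesseract/wiki
Taking Windows as an example:
Click "Tesseract at UB Mannheim".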
![](http://upload-images.jianshu.io/upload_images/12840157-facb202bd8f4d651.png?imageMogr2/auto-orient/strip|imageView2/2/w/1062/format/webp)
Find the build that matches your computer and download it.
![](http://upload-images.jianshu.io/upload_images/12840157-9dd7e1e9325d06c3.png?imageMogr2/auto-orient/strip|imageView2/2/w/1134/format/webp)
5. Once the download finishes, install Tesseract-OCR.
![](http://upload-images.jianshu.io/upload_images/12840157-2ef34f747bc1c4ff.png?imageMogr2/auto-orient/strip|imageView2/2/w/257/format/webp)
![](http://upload-images.jianshu.io/upload_images/12840157-316d447a813c6ff7.png?imageMogr2/auto-orient/strip|imageView2/2/w/503/format/webp)
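Choose your own installation directory (it has to be added to the environment variables afterwards), then keep clicking Next until the installation finishes.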
![](http://upload-images.jianshu.io/upload_images/12840157-102087cbb06bfa3a.png?imageMogr2/auto-orient/strip|imageView2/2/w/503/format/webp)
6. Add the installation directory to the PATH system environment variable.
![](http://upload-images.jianshu.io/upload_images/12840157-a161b4f5c831c123.png?imageMogr2/auto-orient/strip|imageView2/2/w/1200/format/webp)
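7. Add a new variable named TESSDATA_PREFIX. Its value is the tessdata folder under my installation path, C:\Program Files\Tesseract-OCR\tessdata. This makes the language data folder available through the environment.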
![](http://upload-images.jianshu.io/upload_images/12840157-46a143bdf1960c1e.png?imageMogr2/auto-orient/strip|imageView2/2/w/691/format/webp)
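The two environment variables from steps 6 and 7 can also be set for a single Python process instead of system-wide. This is a hedged sketch: the helper name `use_tesseract_dir` is hypothetical, and it assumes, as in the step above, that TESSDATA_PREFIX should point at the tessdata folder itself (which folder Tesseract expects has differed between versions, so check the error text if language data is still not found):

```python
import os

def use_tesseract_dir(install_dir):
    """Expose install_dir and its tessdata folder to this process only."""
    tessdata = os.path.join(install_dir, "tessdata")
    # Prepend so this install wins over any other tesseract on PATH.
    os.environ["PATH"] = install_dir + os.pathsep + os.environ.get("PATH", "")
    # Assumption: TESSDATA_PREFIX points at the tessdata folder itself.
    os.environ["TESSDATA_PREFIX"] = tessdata
    return tessdata
```

A call such as `use_tesseract_dir(r"C:\Program Files\Tesseract-OCR")` only affects the running script and the subprocesses it starts. The system settings shown in the screenshots are left untouched.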
8. Open a terminal and run tesseract -v. The version information should be printed.
![](http://upload-images.jianshu.io/upload_images/12840157-b30d29fc359b743b.png?imageMogr2/auto-orient/strip|imageView2/2/w/1103/format/webp)
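The same check can be run from Python, which is handy when the terminal and the IDE see different environments. This is a small sketch with a hypothetical helper name, `tesseract_version`; it simply runs the `tesseract -v` command from step 8:

```python
import subprocess

def tesseract_version(cmd="tesseract"):
    """Return the first line of the version output, or None if cmd is not found."""
    try:
        result = subprocess.run(
            [cmd, "-v"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some builds print the version to stderr
            universal_newlines=True,
        )
    except FileNotFoundError:
        # The executable is not on PATH (or the given path does not exist).
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None

if __name__ == "__main__":
    print(tesseract_version())
```

A result of `None` here means the PATH change has not reached the Python process yet. Restarting the terminal or IDE after editing the environment variables is often enough.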
9. In pytesseract.py inside the pytesseract package, find tesseract_cmd = 'tesseract' and change it to tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
(the path where Tesseract-OCR was just installed).
![](http://upload-images.jianshu.io/upload_images/12840157-53e6ced50354bcae.png?imageMogr2/auto-orient/strip|imageView2/2/w/1200/format/webp)
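An edit to the installed library file is lost whenever pytesseract is upgraded or reinstalled. pytesseract reads the module-level variable `pytesseract.pytesseract.tesseract_cmd`, so the same value can be assigned from your own script. The sketch below assumes the install path from this step, and the helper name `configure_tesseract` is made up for illustration:

```python
import os

# Assumed install location taken from the step above; adjust to your machine.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

def configure_tesseract(exe_path=TESSERACT_EXE):
    """Point pytesseract at exe_path if the file exists; return True on success."""
    if not os.path.isfile(exe_path):
        return False
    # Imported lazily so the path check works even without pytesseract installed.
    import pytesseract
    pytesseract.pytesseract.tesseract_cmd = exe_path
    return True
```

Call `configure_tesseract()` once at the start of the program, before any OCR call. A return value of `False` means the path is wrong for this machine.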
10. Run the program again.
It can now recognize simple CAPTCHAs, and the error no longer appears.
![](http://upload-images.jianshu.io/upload_images/12840157-a26100fe3c4b14d1.png?imageMogr2/auto-orient/strip|imageView2/2/w/913/format/webp)
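For reference, a recognition script along these lines might look like the sketch below. The function name `read_captcha` and the file name `captcha.png` are placeholders, not taken from the screenshot. `Image.open` and `pytesseract.image_to_string` are the standard Pillow and pytesseract calls:

```python
import os

def read_captcha(image_path, lang="eng"):
    """Run Tesseract OCR on an image file and return the recognized text."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(image_path)
    # Imported here so a missing image is reported before the OCR stack loads.
    from PIL import Image
    import pytesseract
    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang).strip()

if __name__ == "__main__":
    # "captcha.png" is a hypothetical file name.
    if os.path.isfile("captcha.png"):
        print(read_captcha("captcha.png"))
```

Plain Tesseract tends to work only on clean, simple CAPTCHAs. Noisy or distorted images usually need preprocessing first.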
While using pytesseract to recognize a CAPTCHA, the following exception was raised:
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
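Install Pillow and pytesseract (pip install Pillow pytesseract). After installation, the Python folder contains Lib\site-packages\pytesseract, which holds the file pytesseract.py.
Checking the pytesseract.py source named in the error above shows this note:
# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'
Download and install the matching 'Tesseract-OCR' build from the web (pick the version that fits your system): https://github.com/tesseract-ocr/tesseract/wiki
The default installation path (Windows version used here) is: C:\Program Files (x86)\Tesseract-OCR
Then, in the source, change: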
tesseract_cmd = 'tesseract'
to:
tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'