
Python Crawler: Scraping LOL Skin Images


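## Original work. Please credit the source when reposting; everyone is welcome to get in touch!
## This script is for learning and exchange only. Please do not put traffic pressure on the target site!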

1. Modules used

1. requests # HTTP request module
2. re # regular expression module
3. json # JSON parsing module

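2. Difficulties

1. The skin images are not embedded directly in the page; they are fetched by JS and then rendered.
2. The JS code has to be analyzed to find the key URLs.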
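
To make the second point concrete: a data file served as JS usually wraps a JSON object in variable assignments, so the JSON has to be cut out with a regex before it can be parsed. The sketch below is only an illustration under stated assumptions. `SAMPLE_JS` is a made-up string (the real file's variable names and fields may differ), and `extract_json` and `build_skin_url` are hypothetical helper names. The base URL is the one used in the complete code.

```python
import json
import re

# Base URL taken from the complete code; the skin id is appended directly to "big".
SKIN_BASE_URL = 'https://ossweb-img.qq.com/images/lol/web201310/skin/big'

# Hypothetical sample imitating a JS data file: two assignments, then a JSON object.
SAMPLE_JS = (
    'var demo={};'
    'demo.hero={"data":{"skins":[{"id":"1000"},{"id":"1001"}]}};'
)


def extract_json(js_text):
    """Skip the first two '=' assignments and parse the trailing {...} as JSON."""
    match = re.search(r'(.*?=){2}(\{.*\})', js_text)
    if match is None:
        return None
    return json.loads(match.group(2))


def build_skin_url(skin_id):
    """Assemble the image URL for one skin id."""
    return f'{SKIN_BASE_URL}{skin_id}.jpg'


if __name__ == '__main__':
    for skin in extract_json(SAMPLE_JS)['data']['skins']:
        print(build_skin_url(skin['id']))
```

The regex only works when the JSON object follows the second `=` immediately, so it is worth checking against the actual response text before relying on it.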

3. Complete code

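import json
import os
import re

import requests

# save_img() writes into lol_img/, so make sure the directory exists up front
os.makedirs('lol_img', exist_ok=True)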


class LolHeroSkin:

    def __init__(self):
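        # Both URLs below were found through packet capture and JS analysis.
        # The first returns the JSON list of all heroes.
        # The second is the base URL of the skin images, derived from the JS.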
        self.hero_index_url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
        self.skinBaseUrl = 'https://ossweb-img.qq.com/images/lol/web201310/skin/big'

    @staticmethod
    def get_hero_data_url(e_name, c_name):
        """
        Build the URL of a hero's JS data file from the hero's English alias.
        """
        return f'https://lol.qq.com/biz/hero/{e_name}.js', c_name

    def get_hero_list(self):
        """
        Fetch the JSON list of all heroes and yield (alias, name) pairs.
        """
        res = requests.get(url=self.hero_index_url)
        res.encoding = 'gb2312'
        hero_list = res.json().get('hero')
        print('Found', len(hero_list), 'heroes in total!')
        for name in hero_list:
            yield name.get('alias'), name.get('name')

    def get_hero_skin_id(self, e_name, c_name):
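        """
        Fetch the hero's JS file and extract its skin list (including ids) with a regex.
        """
        url, c_name = self.get_hero_data_url(e_name, c_name)
        res = requests.get(url=url)
        if res.status_code == 200:
            try:
                # Skip the two leading assignments, then parse the trailing {...} as JSON
                hero = json.loads(re.search(r'(.*?=){2}({.*})', res.text).group(2))
                return hero.get('data').get('skins')
            except Exception as e:
                print(e)
                print(res.text)
                return None
        else:
            print('Request failed with status', res.status_code)
            return None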

    def get_hero_skin_urls(self):
        num = 0
        for e_name, c_name in self.get_hero_list():
            num += 1

            # Call through self rather than the module-level instance r
            skin_list = self.get_hero_skin_id(e_name, c_name)
            if skin_list:
                print(f'Fetching skins for hero #{num} [{c_name}]: {len(skin_list)} in total!')
                for i in skin_list:
                    yield self.skinBaseUrl + i.get('id') + '.jpg', c_name + i.get('id')
            else:
                print('Could not get the skin list (404 or parse error)!')

    @staticmethod
    def save_img(url, name):
        res = requests.get(url=url)
        # Special skins cannot be viewed on the site itself and return 404;
        # that is how the page behaves, not a fault in this script.
        if res.status_code == 200:
            with open('lol_img/' + name + '.jpg', 'wb') as f:
                f.write(res.content)

    def run(self):
        for url, name in self.get_hero_skin_urls():
            self.save_img(url, name)


if __name__ == '__main__':
    r = LolHeroSkin()
    r.run()
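
In keeping with the note at the top about not putting traffic pressure on the site, the download loop can pause between requests. The following is a minimal sketch, not part of the original script: `download_all`, `fetch`, `save`, and `delay` are hypothetical names, and the half-second default is an arbitrary assumption. The fetch and save steps are passed in as functions so the loop can be tried without touching the network.

```python
import time


def download_all(targets, fetch, save, delay=0.5):
    """Fetch each (url, name) pair in turn, pausing between requests.

    fetch(url) is expected to return bytes, or None when the request fails.
    save(name, content) stores the downloaded bytes.
    Returns the number of files saved.
    """
    saved = 0
    for url, name in targets:
        content = fetch(url)
        if content is not None:
            save(name, content)
            saved += 1
        time.sleep(delay)  # throttle so the target site is not hammered
    return saved


if __name__ == '__main__':
    # Dry run with stand-in functions: the second "request" fails.
    fake_pages = {'u1': b'img-1', 'u2': None}
    store = {}
    count = download_all(
        [('u1', 'a'), ('u2', 'b')],
        fetch=fake_pages.get,
        save=store.__setitem__,
        delay=0,
    )
    print(count, store)
```

With the script above, `targets` could be the generator returned by `get_hero_skin_urls()`, and `fetch` a small wrapper around `requests.get` that returns `res.content` on status 200 and `None` otherwise.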


Original article: https://www.cnblogs.com/bladecheng/p/14967506.html
