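# Today's goal

**Multithreading: Xiaomi App Store spider**

Crawl every app in the Social category of the Xiaomi App Store.

```
import json
import time
from queue import Empty, Queue
from threading import Lock, Thread

import requests


class XiaoAppSpider(object):
    def __init__(self):
        self.url = 'http://app.mi.com/categotyAllListApi?page={}&categoryId=2&pageSize=30'
        # User-Agent and X-Requested-With are two separate headers
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36',
            'X-Requested-With': 'XMLHttpRequest',
        }
        self.url_queue = Queue()
        self.n = 0
        # Protects self.n, which is updated from several threads
        self.lock = Lock()

    # Fill the URL queue
    def url_in(self):
        for i in range(67):
            url = self.url.format(i)
            # Put the URL into the queue
            self.url_queue.put(url)

    # Thread worker function
    def get_data(self):
        while True:
            # get_nowait() avoids the race between empty() and get():
            # another thread could take the last URL in between,
            # leaving this thread blocked forever
            try:
                url = self.url_queue.get_nowait()
            except Empty:
                break
            # Request + parse + save
            html = requests.get(url=url, headers=self.headers, timeout=10).content.decode('utf-8')
            html = json.loads(html)
            # with open('xiao.json', 'a') as f:
            # app_dict = {}
            for app in html['data']:
                app_name = app['displayName']
                app_link = 'http://app.mi.com/details?id=' + app['packageName']
                print(app_name, app_link)
                # += on a shared counter is not atomic across threads
                with self.lock:
                    self.n += 1

    # Main function
    def main(self):
        # Put the URLs into the queue
        self.url_in()
        # Create the worker threads
        t_list = []
        for i in range(5):
            t = Thread(target=self.get_data)
            t_list.append(t)
            t.start()
        for t in t_list:
            t.join()
        print('Number of apps:', self.n)


if __name__ == '__main__':
    start = time.time()
    spider = XiaoAppSpider()
    spider.main()
    end = time.time()
    print('Elapsed time: {}'.format(end - start))
```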
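
The worker pattern above (several threads pulling URLs from one shared `Queue` and updating shared state) can be tried offline, without touching the network. The following is a minimal sketch, not the spider itself: `drain_queue`, `worker`, and the doubling step that stands in for request + parse are hypothetical names chosen for illustration. It assumes each item is independent, so the order in which threads process them does not matter.

```python
from queue import Empty, Queue
from threading import Lock, Thread


def drain_queue(items, worker_count=5):
    """Process every item exactly once across worker_count threads."""
    task_queue = Queue()
    for item in items:
        task_queue.put(item)

    processed = []   # results collected from all workers
    lock = Lock()    # guards the shared list

    def worker():
        while True:
            try:
                # Non-blocking get: raises Empty instead of waiting forever
                item = task_queue.get_nowait()
            except Empty:
                break
            result = item * 2  # stand-in for request + parse
            with lock:
                processed.append(result)

    threads = [Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Sort because the completion order across threads is not deterministic
    return sorted(processed)


if __name__ == '__main__':
    print(drain_queue(range(10)))
```

Running it prints `[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]`. Every item is handled once, and all threads exit as soon as the queue is empty. Taking items with `get_nowait()` and catching `Empty` is one way to stop a worker cleanly; checking `empty()` first and then calling a blocking `get()` can hang if another thread removes the last item in between.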