  • Python: Web Crawler

    I wrote a simple web crawler:

    #coding=utf-8
    from bs4 import BeautifulSoup
    import requests
    url = "http://www.weather.com.cn/textFC/hb.shtml"
    def get_temperature(url):
        headers = {
            'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
            'Upgrade-Insecure-Requests':'1',
            'Referer':'http://www.weather.com.cn/weather1d/10129160502A.shtml',
            'Host':'www.weather.com.cn'
        }
        res = requests.get(url,headers=headers)
        # res.content is the raw response bytes (not ASCII text); decode it explicitly as UTF-8
        content = res.content.decode('utf-8')
    
        soup = BeautifulSoup(content,'lxml')
        con_mid_tab = soup.find('div',class_='conMidtab')
        con_mid_tab2_list = con_mid_tab.find_all('div',class_='conMidtab2')
        for table in con_mid_tab2_list:
            tr_list = table.find_all('tr')[2:] # all rows after the two header rows
            province = ''
            for index,tr in enumerate(tr_list):
                td_list = tr.find_all('td')
                if index == 0:
                    # the first row also carries the province cell, so its columns shift right by one
                    province = td_list[0].text.replace('\n','')
                    city = td_list[1].text.replace('\n','')
                    min_temp = td_list[7].text.replace('\n','')
                else:
                    city = td_list[0].text.replace('\n','')
                    min_temp = td_list[6].text.replace('\n','')
                print(province,city,min_temp)
            # province_list = tr_list[2]
            # td_list = province_list.find_all('td')
            # province_td = td_list[0]
            # province = province_td.text
            # #print(province.replace('\n',''))
    get_temperature(url)
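
The column-offset logic in the loop above can be tried without touching the network. The following is only a minimal sketch: `extract_min_temps` and `sample_rows` are hypothetical names that do not appear in the original post, and the rows are made-up placeholders rather than real weather data. It assumes each table row has already been reduced to a list of cell strings.

```python
def extract_min_temps(rows):
    """Yield (province, city, min_temp) tuples from one table's data rows.

    `rows` is a list of rows, each a list of cell strings. Only the first
    row contains the province cell, so its columns are shifted right by one.
    """
    province = ''
    for index, cells in enumerate(rows):
        # Strip the newlines that surround the cell text in the page markup.
        cells = [c.replace('\n', '') for c in cells]
        offset = 1 if index == 0 else 0
        if index == 0:
            province = cells[0]
        # City sits at `offset`; the minimum temperature is six cells later.
        yield province, cells[offset], cells[offset + 6]


# Made-up placeholder rows (not real data), shaped like the table rows above.
sample_rows = [
    ['\nProvinceA\n', '\nCity1\n', 'sunny', 'N 3', '5', 'clear', 'N 3', '-4', 'details'],
    ['\nCity2\n', 'cloudy', 'S 2', '7', 'cloudy', 'S 2', '-1', 'details'],
]
result = list(extract_min_temps(sample_rows))
for row in result:
    print(*row)
```

Keeping the offset in one place avoids repeating the index arithmetic in both branches of the `if`.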
  • Original article: https://www.cnblogs.com/e0yu/p/9505490.html