  • [Repost] Python Web Scraper: Example

    Scraper project: scraping news articles from Autohome (汽车之家)

     
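    # Scrape Autohome news with requests + BeautifulSoup
    
    import requests
    from bs4 import BeautifulSoup
    
    # A timeout keeps the script from hanging if the server does not respond
    response = requests.get('https://www.autohome.com.cn/news/', timeout=10)
    response.encoding = 'gbk'  # decode the response body as GBK
    
    # Save a local copy of the page for inspection
    with open('a.html', 'w', encoding='utf-8') as f:
        f.write(response.text)
    soup = BeautifulSoup(response.text, 'lxml')
    
    # Every <a> inside the article list container is a candidate news entry
    news = soup.find(id='auto-channel-lazyload-article').select('ul li a')
    
    for tag in news:
        # Skip anchors without a title instead of failing with AttributeError
        if tag.find('h3') is None:
            continue
        link = tag.attrs['href']
        imag = tag.select('.article-pic img')[0].attrs['src']
        title = tag.find('h3').get_text()
        sub_time = tag.find(class_='fn-left').get_text()
        browsing_num = tag.select('.fn-right em')[0].get_text()
        comment = tag.find('p').get_text()
        # href and src are protocol-relative (//host/path), hence the "http:" prefix
        msg = '''
        ======================================
        Link: http:%s
        Image: http:%s
        Title: %s
        Published: %s
        Views: %s
        Summary: %s
        ''' % (link, imag, title, sub_time, browsing_num, comment)
    
        print(msg)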
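
The scraper above builds full URLs by putting `http:` in front of each `href` and `src` value, which only works while those values are protocol-relative (`//host/path`). A more tolerant approach, sketched below, is to resolve each value against the page URL with the standard library's `urljoin`. The helper name `to_absolute` and the sample paths are hypothetical and not part of the original post.

```python
from urllib.parse import urljoin

# Page URL requested by the scraper; links are resolved against it.
BASE_URL = 'https://www.autohome.com.cn/news/'


def to_absolute(href, base=BASE_URL):
    """Resolve an href/src value (hypothetical helper) into an absolute URL."""
    return urljoin(base, href)


# Hypothetical attribute values, for illustration only
print(to_absolute('//www.autohome.com.cn/news/example.html'))  # protocol-relative
print(to_absolute('/news/example.html'))                       # root-relative
print(to_absolute('http://example.com/pic.jpg'))               # already absolute
```

With a helper like this, the message template could use plain `%s` placeholders for the link and image instead of `http:%s`, and the links would take their scheme from the page URL.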
     
     
     
  • Original article: https://www.cnblogs.com/hedeyong/p/7791648.html