  • Scraping campus news

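    import requests
    from bs4 import BeautifulSoup

    # Fetch the campus news list page and decode it as UTF-8.
    # (Named res rather than str, so the built-in str is not shadowed.)
    res = requests.get('http://news.gzcc.cn/html/xiaoyuanxinwen/')
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')
    # print(soup)

    # Only <li> elements that contain a .news-list-title are news entries
    d = soup.select('li')
    for news in d:
        if len(news.select('.news-list-title')) > 0:
            # Title and link of the news entry
            t = news.select('.news-list-title')[0].text
            print(t)
            a = news.select('a')[0].attrs
            print(a['href'])

            # Fetch the detail page that the entry links to
            resd = requests.get(a['href'])
            resd.encoding = 'utf-8'
            soupd = BeautifulSoup(resd.text, 'html.parser')
            cont = soupd.select('#content')  # article body (selected, not printed)
            timet = soupd.select('.show-info')
            # Print the metadata fields by slicing the info line at fixed offsets
            print(timet[0].text[0:25])
            print(timet[0].text[30:38])
            print(timet[0].text[38:45])
            print(timet[0].text[46:56])
            print(timet[0].text[62:])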
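
The fixed slice offsets above only line up when every field in the `.show-info` line has the expected width, so a longer name would shift everything after it. A minimal sketch of an alternative follows. It assumes the line is a series of `label:value` pairs separated by whitespace, which has not been verified against the site. The helper name `parse_show_info` and the sample string are hypothetical, made up for illustration.

```python
import re

# Assumed layout: "label:value" pairs separated by whitespace. A label is a run
# of letters directly before a colon (ASCII or full-width); a value runs until
# the next label or the end of the line.
_FIELD = re.compile(r"([^\W\d_]+)[:：]\s*(.*?)(?=\s+[^\W\d_]+[:：]|\s*$)", re.S)


def parse_show_info(text):
    """Hypothetical helper: turn a .show-info style line into a dict."""
    return {label: value for label, value in _FIELD.findall(text.strip())}


# Hypothetical sample, not copied from the real page
sample = "Published:2018-04-01 11:57:00    Author:Alice  Reviewer:Bob  Source:News Office"
info = parse_show_info(sample)
print(info["Published"])
print(info["Author"])
```

In the loop above this could be called as `parse_show_info(timet[0].text)`, with the keys being whatever labels the page actually uses.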
  • Original URL: https://www.cnblogs.com/Molemole/p/8709169.html