  • Scraping the news from the campus news homepage

    import requests
    from bs4 import BeautifulSoup

    # Fetch the campus news list page. The response is named `resp` rather
    # than `re` so it cannot be confused with the standard-library `re` module.
    resp = requests.get('http://news.gzcc.cn/html/xiaoyuanxinwen/')
    resp.encoding = 'utf-8'
    soup = BeautifulSoup(resp.text, 'html.parser')

    for news in soup.select('li'):
        # Skip <li> elements that are not news entries.
        if len(news.select('.news-list-title')) > 0:
            title = news.select('.news-list-title')[0].text
            description = news.select('.news-list-description')[0].text
            info = news.select('.news-list-info')[0].text
            url = news.select('a')[0].attrs['href']
            print(description, url)
            print(title, info)

            # Fetch the detail page of this news item.
            res = requests.get(url)
            res.encoding = 'utf-8'
            soupd = BeautifulSoup(res.text, 'html.parser')
            # The .show-info text is cut at fixed character offsets, which
            # only works while the page keeps the same field widths.
            show_info = soupd.select('.show-info')[0].text
            print(show_info[0:25])
            print(show_info[30:38])
            print(show_info[38:45])
            print(show_info[46:56])
            print(show_info[62:])
            # Only the first news item is processed.
            break
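
The fixed slice offsets applied to the `.show-info` text break as soon as a field changes length. A minimal sketch of a more tolerant approach follows. It assumes the text is a run of `label：value` pairs separated by whitespace, with a full-width colon after each label. The function name `parse_show_info` and the sample string are hypothetical and do not come from the original post; the actual labels on the page may differ.

```python
import re

# Assumption: every field looks like "label：value", where the label contains
# no whitespace and is followed by a full-width colon (U+FF1A).
_FIELD = re.compile(r'([^\s：]+)：\s*(.*?)(?=\s+[^\s：]+：|\s*$)')


def parse_show_info(text):
    """Split a 'label：value label：value ...' string into a dict."""
    return {label: value.strip() for label, value in _FIELD.findall(text.strip())}


# Hypothetical sample; the real page text would be passed in instead.
sample = "Published：2018-04-01 11:57:00  Author：Alice  Source：Office"
fields = parse_show_info(sample)
print(fields)
```

In the loop above, this could be called as `parse_show_info(soupd.select('.show-info')[0].text)`, and individual fields then looked up by label instead of by character position.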
  • Original article: https://www.cnblogs.com/168-hui/p/8717413.html