  • Basic web crawler exercises

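    import requests
    from bs4 import BeautifulSoup

    # Fetch the locally served practice page and decode it as UTF-8.
    url = 'http://localhost:63342/bd/gouxueyuan.html?_ijt=kn4osq2f4cqos8pf8vjvmkrah7'
    res = requests.get(url)
    res.encoding = 'utf-8'

    # Parse the HTML with Python's built-in parser.
    soup = BeautifulSoup(res.text, 'html.parser')

    # Text of the first <h1> element.
    print(soup.select('h1')[0].text)

    # href attribute of every <a> element (None when the attribute is missing).
    for link in soup.select('a'):
        print(link.get('href'))

    # Child nodes of every <li> element.
    for item in soup.select('li'):
        print(item.contents)

    # Title of the first news entry.
    print(soup.select('.news-list-title')[0].text)
    # Link inside the second <li>.
    print(soup.select('li')[1].a.attrs['href'])
    # First two child nodes of the first .news-list-info block. .contents also
    # includes text nodes, so these indexes depend on the page's exact markup.
    print(soup.select('.news-list-info')[0].contents[0].text)
    print(soup.select('.news-list-info')[0].contents[1].text)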
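
    The lookups above index the result of `select()` directly, so they raise `IndexError` when a selector matches nothing. Below is a minimal sketch of a more defensive variant using `select_one()`. It runs against an inline HTML string because the practice page itself is not shown in the post. The markup in `SAMPLE_HTML` and the helper names `first_text` and `all_hrefs` are assumptions made for illustration and are not part of the original exercise.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real gouxueyuan.html is not shown in the post.
SAMPLE_HTML = """
<h1>Sample heading</h1>
<ul>
  <li><a href="/news/1"><span class="news-list-title">First title</span></a></li>
  <li><a href="/news/2"><span class="news-list-title">Second title</span></a></li>
</ul>
"""


def first_text(soup, selector, default=None):
    """Return the stripped text of the first match, or `default` if none."""
    node = soup.select_one(selector)  # returns None when nothing matches
    return node.get_text(strip=True) if node is not None else default


def all_hrefs(soup):
    """Return the href of every <a> element that has an href attribute."""
    return [a['href'] for a in soup.select('a[href]')]


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
print(first_text(soup, 'h1'))
print(first_text(soup, '.news-list-title'))
print(first_text(soup, '.missing', default='(not found)'))
print(all_hrefs(soup))
```

    With a live page, the same helpers could be applied to the `soup` built from `res.text`. Whether the class names match depends on that page's actual markup.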
  • Original source: https://www.cnblogs.com/129lai/p/8668839.html