  • Scraping the Best Universities site (最好大学网)

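    import requests
    import bs4
    from bs4 import BeautifulSoup


    def getHTMLText(url):
        """Fetch a page and return its text, or "" if the request fails."""
        try:
            r = requests.get(url, timeout=30)
            r.raise_for_status()
            # Guess the encoding from the body so Chinese text decodes correctly.
            r.encoding = r.apparent_encoding
            return r.text
        except requests.RequestException:
            return ""


    def fillUniverList(ulist, html):
        """Append [rank, name, score] to ulist for each row of the ranking table."""
        soup = BeautifulSoup(html, "html.parser")
        tbody = soup.find("tbody")
        if tbody is None:  # empty page or changed layout: nothing to parse
            return
        for tr in tbody.children:
            # Skip the whitespace text nodes that sit between <tr> tags.
            if isinstance(tr, bs4.element.Tag):
                tds = tr("td")
                if len(tds) > 3:
                    ulist.append([tds[0].string, tds[1].string, tds[3].string])


    def printUniverList(ulist, num):
        """Print up to num rows as an aligned table."""
        # {3} is the fill character for the name column. chr(12288) is the
        # full-width space, which keeps Chinese school names aligned.
        tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"
        fill = chr(12288)
        # Header: rank, school name, total score.
        print(tplt.format("排名", "学校名称", "总分", fill))
        count = min(num, len(ulist))  # avoid IndexError on short lists
        for u in ulist[:count]:
            # str() guards against cells whose .string is None.
            print(tplt.format(str(u[0]), str(u[1]), str(u[2]), fill))
        print("Suc", count)


    def main():
        uinfo = []
        url = "http://www.zuihaodaxue.com/zuihaodaxuepaiming2019.html"
        html = getHTMLText(url)
        fillUniverList(uinfo, html)
        printUniverList(uinfo, 100)


    if __name__ == "__main__":
        main()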
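
The header line in printUniverList passes chr(12288), the full-width space (U+3000), to tplt.format as a fourth argument. Below is a minimal sketch of how that character can be used as the fill for the school-name column, so that Chinese names are padded with full-width rather than ASCII spaces. The helper name format_row and the sample row values are made up for illustration and do not come from the site.

```python
# Hypothetical helper illustrating full-width padding; not part of the original script.
FULLWIDTH_SPACE = chr(12288)  # U+3000 IDEOGRAPHIC SPACE


def format_row(rank, name, score, fill=FULLWIDTH_SPACE):
    """Center each field in 10 characters; the name column is padded with `fill`."""
    # {3} inside the format spec of field 1 is replaced by the fill character.
    template = "{0:^10}\t{1:{3}^10}\t{2:^10}"
    return template.format(rank, name, score, fill)


if __name__ == "__main__":
    print(format_row("排名", "学校名称", "总分"))
    # Sample values only, not real ranking data.
    print(format_row("1", "示例大学", "90.0"))
```

Padding with the full-width space means every character in the name column, text and padding alike, occupies the same display width in a typical CJK monospace font, which is why the columns tend to line up better than with ASCII spaces.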
  • Original post: https://www.cnblogs.com/jestin/p/12911151.html