  • A Simple Introduction to Web Crawlers: Data Fetching, Data Parsing, Data Display, Data Storage (B)

    Code 1:

    a=[3.45,4.45,5]
    b=[5,4]
    c=["aa",456,True]
    myList=[]
    myList.append(a)
    myList.append(b)
    myList.append(c)
    print(myList)
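
    Code 1 builds a list whose elements are themselves lists. A minimal sketch of how the nested values can be read back, reusing the names from Code 1; the variable first_row and the loop are illustrative additions, not part of the original lesson:

```python
# Rebuild the nested list from Code 1.
a = [3.45, 4.45, 5]
b = [5, 4]
c = ["aa", 456, True]
myList = [a, b, c]

# The first index selects an inner list, the second an element inside it.
first_row = myList[0]
print(first_row)       # the inner list a
print(myList[2][0])    # the first element of c

# The inner lists have different lengths, so iterate over each row
# instead of assuming a fixed width.
for row in myList:
    print(len(row), row)
```

    The same row-then-column indexing is what Code 3 relies on when it reads u[0], u[1] and so on from each row.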

    Code 2:

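    # coding=utf-8
    # Split the numbers 1..100 into consecutive groups of three.
    # The name "numbers" is used so the built-in list() is not shadowed.
    numbers = list(range(1, 101))

    # print(numbers)

    tempList = []
    newList = []

    for temp in numbers:
        tempList.append(temp)
        if len(tempList) == 3:
            # The group is full: store it and start a new one.
            newList.append(tempList)
            tempList = []

    # Store the leftover group (here [100]), but never an empty one.
    if tempList:
        newList.append(tempList)

    print(newList)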
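
    The grouping in Code 2 can also be written with list slicing. This is an alternative sketch, not the original author's approach; the helper name chunk and its parameters are hypothetical:

```python
def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    # range(0, len(items), size) yields the start index of every group;
    # a slice that runs past the end simply returns the shorter remainder.
    return [items[i:i + size] for i in range(0, len(items), size)]

numbers = list(range(1, 101))
newList = chunk(numbers, 3)

print(newList[0])    # first group
print(newList[-1])   # leftover group
print(len(newList))  # number of groups
```

    Because slicing never fails on a short tail, no special case is needed for the last group.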

    Code 3:

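    import requests
    from bs4 import BeautifulSoup
    # Each entry is one table row: a list of the cell texts for one university.
    allUniv = []
    def getHTMLText(url):
        """Download the page and return its text, or "" if the request fails."""
        try:
            r = requests.get(url, timeout=30)
            r.raise_for_status()
            r.encoding = 'utf-8'
            return r.text
        except requests.RequestException:
            return ""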
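    def fillUnivList(soup):
        """Collect the cell texts of every table row that contains <td> cells."""
        data = soup.find_all('tr')
        for tr in data:
            ltd = tr.find_all('td')
            # Rows without <td> cells (such as a header row) are skipped.
            if len(ltd)==0:
                continue
            singleUniv = []
            for td in ltd:
                # td.string is None when a cell has several children, which
                # would make format() fail later, so read the text instead.
                singleUniv.append(td.get_text(strip=True))
            allUniv.append(singleUniv)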
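    def printUnivList(num):
        """Print the first num universities as centered columns."""
        fmt = "{:^4}{:^10}{:^10}{:^8}{:^10}"
        print(fmt.format("Rank","University","Province","Score","Scale"))
        # Never read past the end of allUniv (it is empty if the download failed).
        for i in range(min(num, len(allUniv))):
            u=allUniv[i]
            print(fmt.format(u[0],u[1],u[2],u[3],u[6]))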
    def main():
        url = 'http://www.zuihaodaxue.cn/zuihaodaxuepaiming2016.html'
        html = getHTMLText(url)
        soup = BeautifulSoup(html, "html.parser")
        fillUnivList(soup)
        printUnivList(10)
    main()
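
    The title also lists data storage, which Code 3 does not cover. A minimal sketch of that step using the standard-library sqlite3 module, assuming rows shaped like the lists that fillUnivList builds. The function saveUnivList, the table name univ, its column names, and the sample rows are all hypothetical; the sample values are placeholders, not real ranking data:

```python
import sqlite3

# Placeholder rows shaped like the lists fillUnivList builds (one per <tr>).
# The values are made up for illustration only.
allUniv = [
    ["1", "University A", "Province X", "95.0", "-", "-", "1000"],
    ["2", "University B", "Province Y", "90.5", "-", "-", "2000"],
]

def saveUnivList(conn, rows):
    """Store the columns that printUnivList shows (indexes 0, 1, 2, 3, 6)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS univ ("
        "ranking TEXT, name TEXT, province TEXT, score TEXT, scale TEXT)"
    )
    # Placeholders (?) let sqlite3 handle quoting of the scraped text.
    conn.executemany(
        "INSERT INTO univ VALUES (?, ?, ?, ?, ?)",
        [(u[0], u[1], u[2], u[3], u[6]) for u in rows],
    )
    conn.commit()

# ":memory:" keeps the database in RAM; pass a file path to keep the data.
conn = sqlite3.connect(":memory:")
saveUnivList(conn, allUniv)

stored = conn.execute("SELECT name, score FROM univ ORDER BY ranking").fetchall()
print(stored)
```

    In the real program, saveUnivList would be called in main() after fillUnivList(soup), with the global allUniv as the rows argument.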

    Homework:

    1. Copy the code above and run it in a Python environment.

    2. Read through the code above until you understand it.

  • Original article: https://www.cnblogs.com/exesoft/p/12988105.html