  • Section 12: Douban Movies in Practice

    import requests
    from lxml import etree


    # Send browser-like headers so the request looks like it comes from a browser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Referer': 'https://pagead2.googlesyndication.com/pagead/s/cookie_push.html'
    }
    # "Now playing" listing for Fuzhou
    url = 'https://movie.douban.com/cinema/nowplaying/fuzhou/'
    resp = requests.get(url, headers=headers)
    resp.raise_for_status()  # stop early on an HTTP error status
    html = etree.HTML(resp.text)
    # Take the first <ul class="lists"> on the page, then its movie items
    ul = html.xpath('//ul[@class="lists"]')[0]
    lis = ul.xpath('./li[@class="list-item"]')
    movies = []
    for li in lis:
        # Each field is a data-* attribute on the <li>.
        # li.get() returns the value as a string (or None if missing),
        # whereas li.xpath('@data-title') would return a one-element list.
        movie = {
            "title": li.get('data-title'),
            "score": li.get('data-score'),
            "star": li.get('data-star'),
            "duration": li.get('data-duration'),
            "region": li.get('data-region'),
            "director": li.get('data-director'),
            "actors": li.get('data-actors')
        }
        movies.append(movie)
    for m in movies:
        print(m)
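
The parsing step can also be tried without touching the network. The following is only a minimal sketch: `SAMPLE_HTML` is made-up markup that imitates the `data-*` attributes read above (the real Douban page may look different), and `parse_movies` and `FIELDS` are hypothetical names introduced for this illustration.

```python
from lxml import etree

# Hypothetical sample markup imitating the attributes the scraper reads;
# the real Douban page may differ.
SAMPLE_HTML = """
<ul class="lists">
  <li class="list-item" data-title="Movie A" data-score="8.1" data-region="China"></li>
  <li class="list-item" data-title="Movie B" data-score="7.4"></li>
</ul>
"""

# Subset of the fields used in the scraper, kept short for the example
FIELDS = ("title", "score", "region")


def parse_movies(text):
    """Return one dict per <li class="list-item">, with None for missing attributes."""
    html = etree.HTML(text)
    movies = []
    for li in html.xpath('//ul[@class="lists"]/li[@class="list-item"]'):
        # li.get() returns the attribute value as a string, or None if absent
        movies.append({field: li.get("data-" + field) for field in FIELDS})
    return movies


movies = parse_movies(SAMPLE_HTML)
for m in movies:
    print(m)
```

Keeping the parsing in a function that accepts a string makes it easy to check the XPath against a saved copy of the page before running the real request.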
  • Original article: https://www.cnblogs.com/kogmaw/p/12506966.html