Python crawler: scraping Baidu Cloud (百度云盘) resources and saving the download URL, link title, and link details

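I came across a tutorial for this online, but I found it messily written (although, as it turned out, mine is not much better, haha), so I wrote my own version on the spot.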

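Mostly I was fed up with the obnoxious ads on the pan-search (盘搜) sites. Pulling the results straight into a text file like this is much easier on the eyes.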

Package the script into an EXE file with PyInstaller, and you can download things in peace.
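The packaging step is not shown in the post. A minimal sketch of it, assuming the script is saved as `yunpan_search.py` (a file name chosen here for illustration) and that PyInstaller is installed, could use PyInstaller's documented Python entry point. The helper names `build_pyinstaller_args` and `package` are hypothetical.

```python
def build_pyinstaller_args(script_path, name="yunpan_search"):
    """Return PyInstaller command-line arguments for a single-file build.

    Both the script path and the executable name are assumptions made
    for this example; the original post does not name the file.
    """
    return [
        script_path,
        "--onefile",     # bundle everything into one executable
        "--name", name,  # base name of the generated executable
    ]


def package(script_path):
    """Run PyInstaller on the given script."""
    # Imported lazily so this module still loads without PyInstaller.
    import PyInstaller.__main__
    PyInstaller.__main__.run(build_pyinstaller_args(script_path))
```

Calling `package("yunpan_search.py")` on Windows should leave the executable under the `dist` folder. The same build can also be started from the command line with the `pyinstaller` command and the same arguments.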

# Reference: http://upvup.com/html/python/2015-12-13/21.html

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import requests
from bs4 import BeautifulSoup


def yunpan_search(key):
    # Let requests URL-encode the keyword instead of concatenating it by hand.
    response = requests.get('http://www.wangpansou.cn/s.php', params={'q': key})
    soup = BeautifulSoup(response.text, 'lxml')
    # Every result has a title link and a details block.
    url_get = soup.find_all('a', {'class': 'cse-search-result_content_item_top_a'})
    info_get = soup.find_all('div', {'class': 'cse-search-result_content_item_mid'})
    with open('baidu_source.txt', 'w', encoding='utf-8') as f:
        for i, (link, detail) in enumerate(zip(url_get, info_get), start=1):
            href = link['href']
            # .strings walks every text node, so nested tags
            # (whose .string is None) no longer raise an error.
            title = ''.join(s.strip() for s in link.strings)
            information = ''.join(s.strip().replace('\n', '') for s in detail.strings)

            entry = ('Download URL--' + href + '\n'
                     + 'Link title--' + title + '\n'
                     + 'Link details--' + information + '\n\n')
            print(str(i) + '_' * 60)
            print(entry)
            f.write(str(i) + '. _____________________________________________________________________\n')
            f.write(entry)


if __name__ == '__main__':
    key = input('please input what you want to look for:')
    yunpan_search(key)
    print('finish')
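
The extraction step in the script can be tried offline, without hitting the search site. The sketch below is only an illustration: the HTML fragment and its link are made up, and only the two class names are taken from the script. The real page may be structured differently, and `parse_results` is a hypothetical helper, not part of the original code. It uses the built-in `html.parser` backend so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Made-up fragment for illustration; only the class names come from the script.
SAMPLE_HTML = """
<div>
  <a class="cse-search-result_content_item_top_a" href="http://pan.baidu.com/s/example1">
    Python <em>crawler</em> tutorial
  </a>
  <div class="cse-search-result_content_item_mid">
    Size: 1.2 GB
    Shared: 2016-01-01
  </div>
</div>
"""


def parse_results(html):
    """Return a list of dicts with the href, title and details of each result."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", {"class": "cse-search-result_content_item_top_a"})
    details = soup.find_all("div", {"class": "cse-search-result_content_item_mid"})
    results = []
    # zip() stops at the shorter list, so a missing details block
    # cannot cause an IndexError.
    for link, detail in zip(links, details):
        results.append({
            "href": link.get("href", ""),
            # get_text() includes text inside nested tags such as <em>;
            # split/join collapses runs of whitespace and newlines.
            "title": " ".join(link.get_text().split()),
            "info": " ".join(detail.get_text().split()),
        })
    return results


for item in parse_results(SAMPLE_HTML):
    print(item["href"], item["title"], item["info"], sep=" | ")
```

Keeping the parsing in its own function, separate from the network request and the file writing, makes it easy to check against a saved copy of the page when the site's markup changes.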


Original post: https://www.cnblogs.com/miranda-tang/p/5584825.html