  • How to download a PDF file with selenium in Python

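    1. I recently had to download some data that included PDF files. For various reasons I needed to use selenium, and I wanted the PDF to be saved as a file rather than opened in the browser.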

    # -*- coding: utf-8 -*-
    import time
    from selenium import webdriver
    from selenium.webdriver.support import ui
    
    option = webdriver.ChromeOptions()
    option.add_experimental_option("excludeSwitches", ['enable-automation'])
    option.add_experimental_option('prefs',  {
        # Raw string so the backslashes in the Windows path are kept literally
        "download.default_directory": r"D:\edesk\outtask\AIopt\AIOPTjiaofu\lunwen",
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        # This setting is the important one: Chrome saves PDFs to disk
        # instead of opening them in its built-in viewer
        "plugins.always_open_pdf_externally": True
        }
    )
    driver = webdriver.Chrome(options=option)
    wait = ui.WebDriverWait(driver, 20)
    
    
    driver.get("https://xxxxx1-s2.0-S2095495621000383-main.pdf")
    time.sleep(10)
    driver.set_window_size(1000, 800)
    flag = False
    if not flag:
        # Log in by hand once; later requests reuse the session cookie
        input('login:')
    print("login!!!")
    time.sleep(10)
    driver.get("https://xxxxx1-s2.0-S2095495621000383-main.pdf")
    time.sleep(20)
    print("download!!!")
    driver.quit()
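
    The fixed 20-second sleep before quitting is a guess: a large PDF may need longer, and a small one finishes much sooner. A minimal sketch of an alternative is to poll the download directory until a new file appears. It assumes Chrome keeps in-progress downloads under a ".crdownload" suffix; the helper name wait_for_new_download and its parameters are hypothetical and not part of selenium.

```python
import os
import time


def wait_for_new_download(directory, before, timeout=60.0, poll_interval=0.5):
    """Wait until `directory` contains new, fully downloaded files.

    `before` is the set of file names that existed before the download
    started. Returns the sorted names of the new finished files, or raises
    TimeoutError if nothing finished within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_files = set(os.listdir(directory)) - set(before)
        # Assumption: Chrome marks in-progress downloads with ".crdownload"
        partial = {name for name in new_files if name.endswith(".crdownload")}
        finished = sorted(new_files - partial)
        if finished and not partial:
            return finished
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no finished download in %r after %s seconds" % (directory, timeout)
            )
        time.sleep(poll_interval)
```

    To use it, record set(os.listdir(...)) for the folder given in download.default_directory just before the second driver.get call, then call wait_for_new_download with that snapshot in place of time.sleep(20).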
  • Original article: https://www.cnblogs.com/lingwang3/p/14440087.html