  • Setting a proxy in PhantomJS

    PhantomJS can be pointed at a proxy IP through Selenium as follows:

    #coding=utf-8
    import os
    import re
    import time
    import requests
    from scrapy.selector import HtmlXPathSelector
    from scrapy.http import HtmlResponse
    from selenium import webdriver
    from selenium.webdriver.common.proxy import ProxyType

    import sys
    reload(sys)
    sys.setdefaultencoding("utf-8")

    import warnings
    warnings.filterwarnings("ignore")


    if __name__ == '__main__':
        # Path to the PhantomJS executable; adjust for the local install
        PATH_PHANTOMJS = r'D:\phantomjs\bin\phantomjs.exe'
        browser = webdriver.PhantomJS(PATH_PHANTOMJS)
        # Real IP
        browser.get('http://icanhazip.com/')  # service that echoes the caller's IP
        response = HtmlResponse(url='', body=str(browser.page_source))
        hxs = HtmlXPathSelector(response)
        print 'your ip is:', ''.join(hxs.select('//text()').extract()).strip()
        # Proxy IP
        proxy = webdriver.Proxy()
        proxy.proxy_type = ProxyType.MANUAL
        proxy.http_proxy = '220.248.229.45:3128'
        # Add the proxy settings to webdriver.DesiredCapabilities.PHANTOMJS
        proxy.add_to_capabilities(webdriver.DesiredCapabilities.PHANTOMJS)
        # Start a new session so the updated capabilities take effect
        browser.start_session(webdriver.DesiredCapabilities.PHANTOMJS)
        browser.get('http://icanhazip.com/')  # same IP echo service, now via the proxy
        response = HtmlResponse(url='', body=str(browser.page_source))
        hxs = HtmlXPathSelector(response)
        print 'your proxy ip is:', ''.join(hxs.select('//text()').extract()).strip()

    Tested and confirmed to work in practice (screenshot omitted).
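
As an alternative sketch, PhantomJS also accepts its proxy on the command line (`--proxy` and `--proxy-type`), and Selenium's PhantomJS driver forwards such switches through `service_args`, so the proxy applies from the first request without starting a second session. The helper names `build_proxy_args` and `open_browser` are hypothetical, and the snippet assumes a Selenium release that still ships `webdriver.PhantomJS` (it was removed in Selenium 4).

```python
def build_proxy_args(address, proxy_type='http'):
    """Build PhantomJS command-line switches for a proxy.

    address is 'host:port'; proxy_type is one of 'http', 'socks5', 'none'.
    """
    if proxy_type not in ('http', 'socks5', 'none'):
        raise ValueError('unsupported proxy type: %s' % proxy_type)
    return ['--proxy=%s' % address, '--proxy-type=%s' % proxy_type]


def open_browser(phantomjs_path, address):
    """Start PhantomJS with the proxy applied from the first request."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    return webdriver.PhantomJS(executable_path=phantomjs_path,
                               service_args=build_proxy_args(address))
```

For example, `open_browser(PATH_PHANTOMJS, '220.248.229.45:3128')` would start a browser whose traffic goes through the same proxy used above.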

    Later on, this approach could also be applied to downloads, lowering the chance of the crawler being blocked.
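
A minimal sketch of that idea for plain downloads, assuming the `requests` library (already imported in the script above) and a hypothetical pool of proxy addresses: pick one proxy at random per request and pass it through the `proxies` argument. The pool contents and the helper names `pick_proxy` and `download` are illustrative only.

```python
import random

# Hypothetical pool; replace with proxies you are allowed to use.
PROXY_POOL = ['220.248.229.45:3128']


def pick_proxy(pool, rng=random):
    """Return a requests-style proxies dict for one randomly chosen proxy."""
    address = rng.choice(pool)
    # Only plain-HTTP traffic is routed, mirroring http_proxy in the script above.
    return {'http': 'http://%s' % address}


def download(url, pool=PROXY_POOL, timeout=10):
    """Fetch url through a randomly chosen proxy and return the body text."""
    # Imported here so pick_proxy can be used without requests installed.
    import requests
    response = requests.get(url, proxies=pick_proxy(pool), timeout=timeout)
    response.raise_for_status()
    return response.text
```

Because the dict only has an `http` entry, HTTPS URLs would bypass the proxy; an `https` key would be needed to cover them.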

  • Original post: https://www.cnblogs.com/niansi/p/6574957.html