  • Java crawler proxy

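    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.InetSocketAddress;
    import java.net.Proxy;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    // Fetches a page through an HTTP proxy and parses it with Jsoup.
    // Returns null if the request fails. Assumes a `logger` field exists in the enclosing class.
    public static Document getDocByJsoups(String href) {
        String ip = "124.47.7.38"; // proxy host
        int port = 80;             // proxy port
        Document doc = null;
        try {
            Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(ip, port));
            URL url = new URL(href);
            // Cast to HttpURLConnection (the parent of HttpsURLConnection) so that
            // plain http:// URLs do not throw a ClassCastException.
            HttpURLConnection urlcon = (HttpURLConnection) url.openConnection(proxy);
            urlcon.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; …) Gecko/20100101 Firefox/60.0");
            urlcon.setRequestProperty("Cookie", "eCM1_5408_saltkey=Z6Sdvgri; eC…-8b23-ed947885e286-1531456912");
            urlcon.connect(); // open the connection
            // try-with-resources closes the stream; the page is assumed to be UTF-8.
            try (BufferedReader buffer = new BufferedReader(
                    new InputStreamReader(urlcon.getInputStream(), StandardCharsets.UTF_8))) {
                StringBuilder bs = new StringBuilder();
                String l;
                while ((l = buffer.readLine()) != null) {
                    bs.append(l).append('\n'); // readLine() strips the line terminator
                }
                doc = Jsoup.parse(bs.toString(), href); // base URI lets Jsoup resolve relative links
            }
        } catch (Exception e) {
            logger.error(e.getMessage(), e); // log the message together with the stack trace
        }
        return doc;
    }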
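
Free proxies often hang or return error pages, so the snippet above can stall or hand Jsoup an empty body. Below is a minimal sketch of the same fetch using only the JDK, with timeouts and a status check added. The class name `ProxyFetch`, the helpers `buildProxy` and `fetch`, the timeout values, and the UTF-8 charset are all assumptions made for illustration; they are not part of the original post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not from the original post.
class ProxyFetch {

    // Describe an HTTP proxy at host:port (same construction as in the post).
    static Proxy buildProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Download href through the proxy and return the body as a string.
    static String fetch(String href, Proxy proxy) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(href).openConnection(proxy);
        con.setConnectTimeout(5_000);  // assumed value: give up on a dead proxy after 5 s
        con.setReadTimeout(10_000);    // assumed value: give up on a stalled response after 10 s
        con.setRequestProperty("User-Agent", "Mozilla/5.0");
        try {
            int status = con.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + href);
            }
            StringBuilder body = new StringBuilder();
            // Charset is assumed to be UTF-8; adjust if the target site differs.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line).append('\n');
                }
            }
            return body.toString();
        } finally {
            con.disconnect();
        }
    }
}
```

A caller would then write something like `Jsoup.parse(ProxyFetch.fetch(href, ProxyFetch.buildProxy(ip, port)), href)` and handle the `IOException` itself, which keeps proxy failures visible instead of returning `null`.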

  • Original article: https://www.cnblogs.com/lixxx/p/9407142.html