  • Hitting a 503 when accessing www.163.com with HttpURLConnection, solved by configuring a proxy

    I once used the following program to connect to NetEase, intending to fetch the HTML text of its site:

    try {
        String urlPath = "http://www.163.com/";

        URL url = new URL(urlPath);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.connect();
        int responseCode = connection.getResponseCode();
        if (responseCode == HttpURLConnection.HTTP_OK) {
            // Backslashes must be escaped inside a Java string literal.
            File dir = new File("D:\\logs\\");
            if (!dir.exists()) {
                dir.mkdirs();
            }
            File file = new File(dir, "163.txt");
            // try-with-resources closes both streams even if the copy fails.
            try (InputStream inputStream = connection.getInputStream();
                 FileOutputStream fos = new FileOutputStream(file)) {
                byte[] buf = new byte[1024 * 8];
                int len;
                while ((len = inputStream.read(buf)) != -1) {
                    fos.write(buf, 0, len);
                }
            }
        } else {
            System.out.println("download file failed because responseCode=" + responseCode);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    However, the code that does the real work never ran. Execution fell into the else branch instead, because the response code was 503.

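    503 means "Service Unavailable", that is, the server is not ready to handle the request. Yet visiting NetEase in a browser worked fine, so I considered the following possibilities: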

    1. NetEase uses an anti-crawler mechanism, so browser information has to be added to the request headers to get past it.

    2. The 503 may not have come from NetEase at all; an intermediate hop along the route could have returned it just as well.
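
    One way to tell these two hypotheses apart is to look at the headers that come back with the 503. The sketch below is only an illustration, not the procedure used in this post: the class name ErrorOriginSketch and the helper hints are hypothetical, and the choice of Server and Via as the headers worth inspecting is an assumption (an intermediary does not have to identify itself).

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ErrorOriginSketch {
    // Assumption: these headers often hint at which machine produced a response.
    private static final String[] INTERESTING = {"Server", "Via"};

    // Picks the interesting headers out of a header map such as the one
    // returned by HttpURLConnection.getHeaderFields().
    public static List<String> hints(Map<String, List<String>> headers) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            String name = entry.getKey();
            if (name == null) {
                continue; // getHeaderFields() stores the status line under a null key
            }
            for (String wanted : INTERESTING) {
                if (wanted.equalsIgnoreCase(name)) {
                    found.add(wanted + ": " + String.join(", ", entry.getValue()));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.163.com/");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        System.out.println("responseCode=" + connection.getResponseCode());
        // Header fields are available even when the status is an error such as 503.
        hints(connection.getHeaderFields()).forEach(System.out::println);
    }
}
```

    If the printed Server or Via value names a gateway or proxy rather than the target site, that would point toward the second hypothesis.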

    After testing, I found that adding browser information to the request headers made no difference.
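
    For reference, a header test of this kind might look roughly like the sketch below. The post does not show the exact headers that were tried, so the User-Agent string, the Accept-Language value, and the names BrowserHeaderSketch and openLikeBrowser are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class BrowserHeaderSketch {
    // Assumption: an example browser-style User-Agent, not the one used in the post.
    public static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36";

    // Prepares (but does not send) a GET request that carries browser-like headers.
    public static HttpURLConnection openLikeBrowser(String urlPath) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) new URL(urlPath).openConnection();
            connection.setRequestMethod("GET");
            // Request properties must be set before the connection is established.
            connection.setRequestProperty("User-Agent", USER_AGENT);
            connection.setRequestProperty("Accept-Language", "zh-CN,zh;q=0.9");
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection connection = openLikeBrowser("http://www.163.com/");
        System.out.println("responseCode=" + connection.getResponseCode());
    }
}
```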

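    Then it occurred to me that the browser has a proxy configured, so how could HttpURLConnection reach the Internet without one? I therefore added the proxy settings at the start of the code: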

                // SetProxy
                System.setProperty("http.proxyHost", "pkg.proxy.prod.jp.local");
                System.setProperty("http.proxyPort", "10080");

    After that, the test passed without any trouble.
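
    The system properties above apply to every http:// connection in the JVM (https:// URLs read the separate https.proxyHost and https.proxyPort properties). As an alternative sketch, not what this post uses, the proxy can also be passed to a single connection through java.net.Proxy. The class name PerConnectionProxySketch and the helper httpProxy are hypothetical; the proxy host and port are the ones from this post.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

public class PerConnectionProxySketch {
    // Builds an HTTP proxy descriptor; the host name is resolved only when connecting.
    public static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = httpProxy("pkg.proxy.prod.jp.local", 10080);
        URL url = new URL("http://www.163.com/");
        // Only this connection goes through the proxy; other connections are unaffected.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection(proxy);
        connection.setRequestMethod("GET");
        System.out.println("responseCode=" + connection.getResponseCode());
    }
}
```

    This keeps the proxy choice local to the code that needs it, which may matter if the same program also talks to hosts that must be reached directly.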

    The complete code is below for reference:

    package urlconn;
    
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    
    public class DownloadFileTest {
        public static void main(String[] args) {
            try {
                // SetProxy
                System.setProperty("http.proxyHost", "pkg.proxy.prod.jp.local");
                System.setProperty("http.proxyPort", "10080");
    
                String urlPath = "http://www.163.com/";
    
                URL url = new URL(urlPath);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                connection.setRequestMethod("GET");
                connection.connect();
                int responseCode = connection.getResponseCode();
                if (responseCode == HttpURLConnection.HTTP_OK) {
                    // Backslashes must be escaped inside a Java string literal.
                    File dir = new File("D:\\logs\\");
                    if (!dir.exists()) {
                        dir.mkdirs();
                    }
                    File file = new File(dir, "163.txt");
                    // try-with-resources closes both streams even if the copy fails.
                    try (InputStream inputStream = connection.getInputStream();
                         FileOutputStream fos = new FileOutputStream(file)) {
                        byte[] buf = new byte[1024 * 8];
                        int len;
                        while ((len = inputStream.read(buf)) != -1) {
                            fos.write(buf, 0, len);
                        }
                    }
                } else {
                    System.out.println("download file failed because responseCode=" + responseCode);
                }
    
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    --2020-03-03--

  • Original article: https://www.cnblogs.com/heyang78/p/12400403.html