  • Accessing Web Pages with HttpClient 4

    I. Introduction to HttpClient

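      The java.net package in the JDK already provides basic HTTP access, but it lacks the flexibility and the features that many applications need. HttpClient began as a subproject of Apache Jakarta Commons and is now maintained under Apache HttpComponents. It is an efficient, up-to-date, feature-rich client-side toolkit for HTTP, and it supports the latest versions and recommendations of the protocol.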

    II. Usage examples (all written against HttpClient 4.3)

      1. Fetching a page with GET. We first create an httpclient object and use it to execute an HTTP GET. The httpresponse holds everything the server sent back, and the httpentity is the message body of the page.

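            import java.io.BufferedReader;
            import java.io.InputStream;
            import java.io.InputStreamReader;
            import java.nio.charset.Charset;
            import java.nio.charset.StandardCharsets;
            import org.apache.http.HttpEntity;
            import org.apache.http.client.methods.CloseableHttpResponse;
            import org.apache.http.client.methods.HttpGet;
            import org.apache.http.entity.ContentType;
            import org.apache.http.impl.client.CloseableHttpClient;
            import org.apache.http.impl.client.HttpClients;

            // The statements below belong in a method that declares `throws IOException`.
            CloseableHttpClient httpclient = HttpClients.createDefault();
            try {
                // Execute the request with the GET method
                HttpGet httpGet = new HttpGet("http://localhost/");
                // Everything the server sent back
                CloseableHttpResponse responseGet = httpclient.execute(httpGet);
                try {
                    System.out.println(responseGet.getStatusLine());
                    // The response message body (HTTP headers not included)
                    HttpEntity entity = responseGet.getEntity();

                    if (entity != null) {
                        // Charset declared by the response, if any
                        ContentType contentType = ContentType.getOrDefault(entity);
                        Charset charset = contentType.getCharset();
                        if (charset == null) {
                            // No charset parameter in Content-Type; UTF-8 is an assumed fallback
                            charset = StandardCharsets.UTF_8;
                        }
                        InputStream is = entity.getContent();
                        // Wrap the stream in a buffered reader so it can be read line by line
                        BufferedReader br = new BufferedReader(
                                new InputStreamReader(is, charset));
                        String line = null;
                        while ((line = br.readLine()) != null) {
                            System.out.println(line);
                        }
                        is.close();
                    }
                } finally {
                    responseGet.close();
                }

            } finally {
                httpclient.close();
            }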
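
      The charset lookup in the GET example returns nothing when the server's Content-Type header carries no charset parameter, so a fallback is worth having. As a rough, JDK-only illustration of that step (this is not HttpClient's own parsing code), the sketch below extracts the charset from a header value and falls back to a default supplied by the caller. The class name ContentTypeCharset and the method charsetOf are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, JDK only; not part of HttpClient.
class ContentTypeCharset {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=GBK", or `fallback` when none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8"
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetOf("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

      Which fallback to pass is a judgment call: ISO-8859-1 is the traditional HTTP default for text, while UTF-8 is often the better guess for modern pages.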

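      2. Submitting a form with POST. A browser stores session information locally after login and then sends the cookie to the server with every later request. Fortunately, httpclient also handles cookies automatically, as long as the same httpclient instance is reused for the follow-up requests.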

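            import java.io.BufferedReader;
            import java.io.InputStream;
            import java.io.InputStreamReader;
            import java.nio.charset.Charset;
            import java.nio.charset.StandardCharsets;
            import java.util.ArrayList;
            import java.util.List;
            import org.apache.http.HttpEntity;
            import org.apache.http.NameValuePair;
            import org.apache.http.client.entity.UrlEncodedFormEntity;
            import org.apache.http.client.methods.CloseableHttpResponse;
            import org.apache.http.client.methods.HttpPost;
            import org.apache.http.entity.ContentType;
            import org.apache.http.impl.client.CloseableHttpClient;
            import org.apache.http.impl.client.HttpClients;
            import org.apache.http.message.BasicNameValuePair;

            // The statements below belong in a method that declares `throws IOException`.
            CloseableHttpClient httpclient = HttpClients.createDefault();
            try {
                // Send the login request with the POST method
                String urlString = "http://localhost/llogin.do";
                HttpPost httpPost = new HttpPost(urlString);
                List<NameValuePair> nvps = new ArrayList<NameValuePair>();
                nvps.add(new BasicNameValuePair("username", "admin"));
                nvps.add(new BasicNameValuePair("password", "admin"));
                // Attach the POST parameters as a URL-encoded form body
                httpPost.setEntity(new UrlEncodedFormEntity(nvps));
                CloseableHttpResponse response = httpclient.execute(httpPost);

                try {
                    // A 302 status means the server answered with a redirect, which the
                    // default client does not follow for POST, so no page body is available
                    System.out.println(response.getStatusLine());
                    // The response message body (HTTP headers not included)
                    HttpEntity entity = response.getEntity();

                    if (entity != null) {
                        // Charset declared by the response, if any
                        ContentType contentType = ContentType.getOrDefault(entity);
                        Charset charset = contentType.getCharset();
                        if (charset == null) {
                            // No charset parameter in Content-Type; UTF-8 is an assumed fallback
                            charset = StandardCharsets.UTF_8;
                        }
                        InputStream is = entity.getContent();
                        // Wrap the stream in a buffered reader so it can be read line by line
                        BufferedReader br = new BufferedReader(
                                new InputStreamReader(is, charset));
                        String line = null;
                        while ((line = br.readLine()) != null) {
                            System.out.println(line);
                        }
                        is.close();
                    }

                } finally {
                    response.close();
                }
            } finally {
                // Close the client only once all requests of the session are done;
                // the login cookies live in this client instance
                httpclient.close();
            }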
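
      To make the form parameters in the login example more concrete, here is a JDK-only sketch of what an application/x-www-form-urlencoded request body looks like once name/value pairs are encoded. It illustrates the wire format, not how UrlEncodedFormEntity is implemented; FormBodySketch and its methods are hypothetical names, and the sample password is made up. As far as I know, the one-argument UrlEncodedFormEntity constructor defaults to ISO-8859-1, so fields with non-ASCII text may need an explicit charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, JDK only; shows the form-urlencoded wire format.
class FormBodySketch {

    // Joins the pairs as name=value&name=value, percent-encoding each part.
    static String encode(Map<String, String> params, String charsetName) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(escape(entry.getKey(), charsetName))
                .append('=')
                .append(escape(entry.getValue(), charsetName));
        }
        return body.toString();
    }

    private static String escape(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order
        Map<String, String> form = new LinkedHashMap<String, String>();
        form.put("username", "admin");
        form.put("password", "p@ss word");
        System.out.println(encode(form, "UTF-8"));
    }
}
```

      Reserved characters such as @ are percent-encoded and spaces become +, which is why raw strings should never be concatenated into a form body by hand.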

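      3. Redirects. By default httpclient follows redirects automatically, but only for GET and HEAD requests. To follow a redirect issued in response to a POST, the client has to be configured with a LaxRedirectStrategy.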

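            import java.io.BufferedReader;
            import java.io.InputStream;
            import java.io.InputStreamReader;
            import java.net.URI;
            import java.nio.charset.Charset;
            import java.nio.charset.StandardCharsets;
            import java.util.ArrayList;
            import java.util.List;
            import org.apache.http.HttpEntity;
            import org.apache.http.HttpHost;
            import org.apache.http.NameValuePair;
            import org.apache.http.client.entity.UrlEncodedFormEntity;
            import org.apache.http.client.methods.CloseableHttpResponse;
            import org.apache.http.client.methods.HttpPost;
            import org.apache.http.client.protocol.HttpClientContext;
            import org.apache.http.client.utils.URIUtils;
            import org.apache.http.entity.ContentType;
            import org.apache.http.impl.client.CloseableHttpClient;
            import org.apache.http.impl.client.HttpClients;
            import org.apache.http.impl.client.LaxRedirectStrategy;
            import org.apache.http.message.BasicNameValuePair;

            // The statements below belong in a method that declares
            // `throws IOException, URISyntaxException`.
            LaxRedirectStrategy redirectStrategy = new LaxRedirectStrategy();
            CloseableHttpClient httpclient = HttpClients.custom()
                    .setRedirectStrategy(redirectStrategy)
                    .build();
            HttpClientContext context = HttpClientContext.create();
            try {
                // Send the login request with the POST method
                String urlString = "http://localhost/llogin.do";
                HttpPost httpPost = new HttpPost(urlString);
                List<NameValuePair> nvps = new ArrayList<NameValuePair>();
                nvps.add(new BasicNameValuePair("username", "admin"));
                nvps.add(new BasicNameValuePair("password", "admin"));
                // Attach the POST parameters as a URL-encoded form body
                httpPost.setEntity(new UrlEncodedFormEntity(nvps));
                CloseableHttpResponse response = httpclient.execute(httpPost, context);

                try {
                    // With redirects followed, this is the status of the final
                    // response rather than the intermediate 302
                    System.out.println(response.getStatusLine());
                    // The response message body (HTTP headers not included)
                    HttpEntity entity = response.getEntity();

                    // Print the address that was finally reached
                    HttpHost targetHost = context.getTargetHost();
                    System.out.println(targetHost);
                    List<URI> redirectLocations = context.getRedirectLocations();
                    URI location = URIUtils.resolve(httpPost.getURI(), targetHost, redirectLocations);
                    System.out.println("Final HTTP location: " + location.toASCIIString());

                    if (entity != null) {
                        // Charset declared by the response, if any
                        ContentType contentType = ContentType.getOrDefault(entity);
                        Charset charset = contentType.getCharset();
                        if (charset == null) {
                            // No charset parameter in Content-Type; UTF-8 is an assumed fallback
                            charset = StandardCharsets.UTF_8;
                        }
                        InputStream is = entity.getContent();
                        // Wrap the stream in a buffered reader so it can be read line by line
                        BufferedReader br = new BufferedReader(
                                new InputStreamReader(is, charset));
                        String line = null;
                        while ((line = br.readLine()) != null) {
                            System.out.println(line);
                        }
                        is.close();
                    }

                } finally {
                    response.close();
                }

            } finally {
                httpclient.close();
            }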
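
      The "final location" printed in the redirect example is the result of applying each redirect target to the address before it. A Location header may be absolute or relative, so every hop has to be resolved against the current URI. The JDK-only sketch below shows that idea with java.net.URI; it is a simplified illustration, not what URIUtils.resolve does internally, and RedirectChainSketch and the sample paths are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, JDK only; resolves a chain of Location header values.
class RedirectChainSketch {

    // Applies each Location value in order, starting from the request URI,
    // and returns the address reached after the last redirect.
    static URI finalLocation(URI requestUri, List<String> locationHeaders) {
        URI current = requestUri;
        for (String location : locationHeaders) {
            // resolve() keeps absolute URIs as they are and
            // interprets relative ones against `current`
            current = current.resolve(location);
        }
        return current;
    }

    public static void main(String[] args) {
        URI start = URI.create("http://localhost/login.do");
        List<String> hops = Arrays.asList("/home", "welcome.jsp");
        System.out.println(finalLocation(start, hops));
    }
}
```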

      4. With httpclient we can wrap the steps above in a method that takes an httpclient object and a URL and returns the content of the page.

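        import java.io.InputStream;
        import java.nio.charset.Charset;
        import org.apache.commons.io.IOUtils;
        import org.apache.http.HttpEntity;
        import org.apache.http.client.methods.CloseableHttpResponse;
        import org.apache.http.client.methods.HttpGet;
        import org.apache.http.entity.ContentType;
        import org.apache.http.impl.client.CloseableHttpClient;

        public static String getHtml(CloseableHttpClient httpClient, String url) {

            // Use the client passed in by the caller; the caller owns it and closes it
            try {
                // Send the request to the server with the GET method
                HttpGet httpGet = new HttpGet(url);
                // Everything the server sent back
                CloseableHttpResponse responseGet = httpClient.execute(httpGet);

                try {
                    // The response message body (HTTP headers not included)
                    HttpEntity entity = responseGet.getEntity();

                    if (entity != null) {
                        // Charset declared by the response, if any
                        ContentType contentType = ContentType.getOrDefault(entity);
                        Charset charset = contentType.getCharset();
                        InputStream is = entity.getContent();
                        // IOUtils comes from Apache Commons IO (Charset overload: 2.3 or later);
                        // a null charset makes it use the platform default
                        String htmlString = IOUtils.toString(is, charset);

                        is.close();
                        return htmlString;
                    }
                } finally {
                    responseGet.close();
                }

            } catch (Exception e) {
                e.printStackTrace();
            }

            // Reached when the request failed or the response had no body
            return null;
        }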

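      In addition, if the URL points to an image, the content can be read from the input stream into a byte array, e.g. byte[] image = IOUtils.toByteArray(is), and returned as byte[]. To download it to a local file, IOUtils can do the copy: IOUtils.copy(is, new FileOutputStream(filename)). In real code, keep a reference to the FileOutputStream and close it afterwards.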
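
      For readers without Commons IO on the classpath, the same two operations can be approximated with the JDK alone. The sketch below is an assumption-laden equivalent, not the IOUtils implementation: StreamSaverSketch, toByteArray and saveTo are hypothetical names, and IOExceptions are rethrown unchecked purely to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, JDK only; mirrors the two IOUtils calls mentioned above.
class StreamSaverSketch {

    // Reads the whole stream into memory, e.g. the bytes of an image.
    static byte[] toByteArray(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the stream to `target`, replacing an existing file, and returns
    // the number of bytes written. Files.copy opens and closes the output
    // file itself; the caller still closes the input stream.
    static long saveTo(InputStream in, Path target) {
        try {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      With an entity from any of the earlier examples, a call might look like saveTo(entity.getContent(), Paths.get("logo.png")), where the file name is only a placeholder.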

    A brief note on Apache Commons IO: it extends the JDK's io package and makes it more convenient to work with input and output streams and with files.

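    Finally, the best way to get familiar with httpclient is to read its official documentation and the examples that ship with it. Both are good and well worth reading.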

  • Original post: https://www.cnblogs.com/jianzhi/p/3362742.html