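  • Web Crawling with HttpClient

    1. HttpClient

      • A web crawler is a program that fetches resources automatically. To do so it has to access web pages over the HTTP protocol. In a Java crawler, the HTTP client library HttpClient is used to send the requests and retrieve the data from the pages.

    2. GET requests with HttpClient

    3. Code implementation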

    package cn.itcast.crawler.test;

    import org.apache.http.HttpEntity;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;

    import java.io.IOException;

    public class HttpGetTest {
        public static void main(String[] args) {
            // 1. Create the HttpClient object.
            //    try-with-resources closes it even if a request fails.
            try (CloseableHttpClient httpClient = HttpClients.createDefault()) {
                // 2. Create the HttpGet object and set the URL
                HttpGet httpGet = new HttpGet("https://www.baidu.com");
                // 3. Send the request with httpClient and obtain the response.
                //    The response is closed automatically, which avoids calling
                //    close() on a null response when execute() throws.
                try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
                    // 4. Parse the response and read the data,
                    //    but only if the status code is 200
                    if (response.getStatusLine().getStatusCode() == 200) {
                        HttpEntity httpEntity = response.getEntity();
                        String content = EntityUtils.toString(httpEntity, "UTF-8");
                        System.out.println(content.length());
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
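
The GET flow above (build the request, execute it, check for status 200, decode the body as UTF-8) can also be tried without network access. The following is a minimal sketch under stated assumptions, not the article's method: it swaps Apache HttpClient for the JDK's built-in java.net.http.HttpClient (Java 11 or later) and talks to a throwaway local server from com.sun.net.httpserver. The class name LocalGetDemo, the helpers fetch, startServer and runDemo, and the sample HTML body are all hypothetical names chosen for illustration.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: uses the JDK HTTP client, NOT Apache HttpClient.
class LocalGetDemo {

    // Sends a GET request and returns the UTF-8 body if the status is 200, else null.
    public static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        return response.statusCode() == 200 ? response.body() : null;
    }

    // Starts a local server on a free port that answers every request with the given body.
    public static HttpServer startServer(String body) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    // Fetches the sample page from the local server and returns the body length.
    public static int runDemo() throws IOException, InterruptedException {
        HttpServer server = startServer("<html>hello</html>");
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            return fetch(url).length();
        } finally {
            server.stop(0); // always shut the test server down
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo());
    }
}
```

Because the server binds to port 0, the operating system picks a free port, so the sketch should not collide with other local services. Pointing fetch at a real site works the same way, subject to that site's redirects and crawling policy.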
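    Execution result: when the status code is 200, the program prints the length (in characters) of the response body.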

  • Original article: https://www.cnblogs.com/juddy/p/13111432.html