  • Two ways to download HTML source on Android: HttpClient and URLConnection

    The two methods use HttpClient and URLConnection respectively, and both deal with the garbled text caused by character-encoding mismatches.

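    In tests on a real device, the HttpClient approach seemed the more stable of the two and generally fetched the page, whereas URLConnection often failed to return any data on an EDGE network.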

    HttpClient approach:

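    // Imports needed: java.io.IOException, java.net.URI, java.net.URISyntaxException,
    // org.apache.http.client.ResponseHandler, org.apache.http.client.methods.HttpGet,
    // org.apache.http.impl.client.BasicResponseHandler, org.apache.http.impl.client.DefaultHttpClient
    public String getHtml(String url) throws IOException, URISyntaxException {
      URI u = new URI(url);

      DefaultHttpClient httpclient = new DefaultHttpClient();
      HttpGet httpget = new HttpGet(u);

      // BasicResponseHandler returns the response body as a String.
      ResponseHandler<String> responseHandler = new BasicResponseHandler();
      String content = httpclient.execute(httpget, responseHandler);
      // Without this line the text is garbled: when the response declares no charset,
      // the body is decoded as ISO-8859-1, so recover the raw bytes and decode them as UTF-8.
      // (If the server does declare UTF-8, this step is unnecessary and would corrupt the text.)
      content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
      return content;
    }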
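The re-decoding step above can be reproduced without any network access. The following is a minimal sketch in plain Java, not part of the original post: the class name `CharsetRoundTrip` and the helper `recoverUtf8` are hypothetical names chosen for illustration. It simulates a UTF-8 body being wrongly decoded as ISO-8859-1 and then recovered the same way `getHtml` does.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    /**
     * Hypothetical helper: takes text that was wrongly decoded as ISO-8859-1
     * and re-decodes the underlying bytes as UTF-8.
     */
    static String recoverUtf8(String misdecoded) {
        // ISO-8859-1 maps each byte to exactly one char, so getBytes() restores
        // the original bytes without loss.
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587";  // two Chinese characters, written as escapes
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);      // bytes as sent by the server
        String garbled = new String(wire, StandardCharsets.ISO_8859_1); // wrong decoding
        String recovered = recoverUtf8(garbled);

        System.out.println(garbled.equals(original));    // false: text is garbled
        System.out.println(recovered.equals(original));  // true: text is restored
    }
}
```

The trick only works because the wrong decoding was ISO-8859-1, which loses no bytes. If the text had been decoded with a charset that replaces unknown bytes, the original bytes could not be recovered this way.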


    URLConnection approach:

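    // Imports needed: java.io.BufferedReader, java.io.IOException, java.io.InputStreamReader,
    // java.net.MalformedURLException, java.net.URL, java.net.URLConnection, android.util.Log
    public String getHTML(String url) {
      BufferedReader in = null;
      try {
        URL newUrl = new URL(url);
        URLConnection connect = newUrl.openConnection();
        connect.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // The target page is encoded as UTF-8.
        in = new BufferedReader(new InputStreamReader(connect.getInputStream(), "UTF-8"));

        // StringBuilder avoids re-copying the whole page on every line.
        StringBuilder html = new StringBuilder();
        String readLine;
        while ((readLine = in.readLine()) != null) {
          html.append(readLine).append('\n');  // readLine() strips the line terminator
        }
        return html.toString();
      } catch (MalformedURLException me) {
        Log.w("getHTML", "Invalid URL: " + url, me);
      } catch (IOException ioe) {
        Log.w("getHTML", "Download failed: " + url, ioe);
      } finally {
        if (in != null) {
          try { in.close(); } catch (IOException ignored) { }
        }
      }
      return null;  // null signals that the download failed
    }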
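The method above hard-codes UTF-8 and sets no timeouts. As a rough sketch of one possible refinement (not from the original post), the charset can be read from the `Content-Type` header with UTF-8 as a fallback, and explicit timeouts can be set so that a stalled connection on a slow network fails with an exception instead of hanging. The class name `PageFetcher`, the helper `charsetOf`, and the timeout values are all assumptions made for illustration; whether this helps with the EDGE failures reported above is untested.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageFetcher {
    /** Hypothetical helper: extracts the charset from a Content-Type value, else returns the fallback. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" parameter name.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback;  // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    /** Downloads a page as text; throws IOException instead of returning null on failure. */
    static String fetch(String url) throws IOException {
        URLConnection conn = new URL(url).openConnection();
        conn.setConnectTimeout(15000);  // milliseconds; assumed value, tune for the network
        conn.setReadTimeout(30000);     // milliseconds; assumed value
        Charset cs = charsetOf(conn.getContentType(), StandardCharsets.UTF_8);

        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), cs))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

A caller would invoke `PageFetcher.fetch(url)` from a background thread and handle `IOException` itself, which keeps the failure reason visible rather than collapsing it into `null`.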

  • Original article: https://www.cnblogs.com/mumue/p/2433986.html