Jsoup fails to fetch some pages: org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml.

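I was using Jsoup to pull data from several websites, and at first everything went smoothly. When I requested data from one particular site (referred to here only as 某浪), however, Jsoup threw an exception. The cause appears to be that the Content-Type of the response is not one of the types Jsoup accepts.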

The request code is as follows:

    private static void testOuGuanMatch() throws IOException {
        // "my URL" is a placeholder for the page being fetched
        Document doc = Jsoup.connect("my URL")
                .userAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.15)")
                .timeout(5000) // milliseconds
                .get();
        System.out.println(doc);
    }

As you can see, I set the User-Agent and the timeout here.

The error message is as follows:

org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/javascript, URL=....
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:472)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:424)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:178)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:167)
    at calendarSpider.SpiderTest.testOuGuanMatch(SpiderTest.java:174)
    at calendarSpider.SpiderTest.main(SpiderTest.java:39)
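
The message itself lists the content types Jsoup accepts by default, and the reported `Mimetype=application/javascript` is not among them. The following is only a rough sketch of such a check, written from the wording of the error message rather than from Jsoup's source. The class name `ContentTypeCheck`, the method `isParseable`, and the handling of a missing header are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class ContentTypeCheck {
    // The two XML types named in the error message, optionally followed by
    // parameters such as "; charset=utf-8". An approximation, not Jsoup's code.
    private static final Pattern XML_TYPE =
            Pattern.compile("application/(xml|xhtml\\+xml)(\\s*;.*)?");

    /** Returns true if the Content-Type is text/*, application/xml or application/xhtml+xml. */
    public static boolean isParseable(String contentType) {
        if (contentType == null) {
            return true; // assumption: a missing header is not rejected
        }
        String type = contentType.trim().toLowerCase(Locale.ROOT);
        return type.startsWith("text/") || XML_TYPE.matcher(type).matches();
    }

    public static void main(String[] args) {
        System.out.println(isParseable("text/html; charset=utf-8")); // true
        System.out.println(isParseable("application/xhtml+xml"));    // true
        System.out.println(isParseable("application/javascript"));   // false
    }
}
```

The last call corresponds to the failing request above: a response declared as `application/javascript` fails the check, so the request is rejected before any parsing happens.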

I found the fix through a Google search: add ignoreContentType(true).

The modified code:

    private static void testOuGuanMatch() throws IOException {
        // "my URL" is a placeholder for the page being fetched
        Document doc = Jsoup.connect("my URL")
                .ignoreContentType(true) // skip the Content-Type check
                .userAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.2.15)")
                .timeout(5000)
                .get();
        System.out.println(doc);
    }

As the name suggests, ignoreContentType(true) tells Jsoup to skip its check of the response's Content-Type.
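
With the check disabled, `get()` still tries to parse the body as an HTML document. When the response really is JavaScript or JSON, as the `Mimetype=application/javascript` in the error suggests, it may be more convenient to read the raw body instead. Here is a minimal sketch using `Connection.execute()` and `Connection.Response.body()`. The class name `RawBodyFetch`, the example URL, and the shortened User-Agent string are placeholders, not values from the original code, and the Jsoup library must be on the classpath.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class RawBodyFetch {
    public static void main(String[] args) throws IOException {
        // Hypothetical placeholder: pass the real URL as the first argument.
        String url = args.length > 0 ? args[0] : "https://example.com/data.js";

        Connection.Response response = Jsoup.connect(url)
                .ignoreContentType(true)   // accept non-HTML content types
                .userAgent("Mozilla/5.0")  // placeholder User-Agent
                .timeout(5000)             // milliseconds
                .execute();                // send the request without parsing the body

        // Content-Type header as reported by the server
        System.out.println(response.contentType());
        // Unparsed response body as a string
        System.out.println(response.body());
    }
}
```

The string returned by `body()` can then be handed to whatever parser suits the payload, rather than being wrapped in an HTML `Document`.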


Original article: https://www.cnblogs.com/parryyang/p/5587929.html