  • Java API Documentation Conventions

    1. xxxxxx API for retrieving all URLs crawled by a specified task

    API name: xxxxxx API for retrieving all URLs crawled by a specified task

    Request URL:

      http://IP:PORT/crwalTask/findUrlExceptionById?ctId=ctIdVal&time=timeVal&limit=limitVal
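
    The request URL above carries three query parameters. The following is a minimal sketch of assembling it with URL-encoded values, so that a time such as "2018-03-05 12:30:00" (which contains a space and colons) is safe in a query string. The class name CrawlUrlBuilder and the method build are hypothetical and are not part of the original interface. Whether the server accepts standard URL encoding for the time value is an assumption; the demo further down replaces spaces with "=" instead.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the query string for findUrlExceptionById.
class CrawlUrlBuilder {

    // base is the URL up to (not including) the '?', e.g.
    // "http://IP:PORT/crwalTask/findUrlExceptionById"
    static String build(String base, String ctId, String time, int limit) {
        try {
            return base
                    + "?ctId=" + URLEncoder.encode(ctId, "UTF-8")
                    + "&time=" + URLEncoder.encode(time, "UTF-8")
                    + "&limit=" + limit;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }
}
```

    A call such as CrawlUrlBuilder.build("http://IP:PORT/crwalTask/findUrlExceptionById", ctId, time, limit) returns the complete link with the placeholders already filled in.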

    Input parameter types: String, int
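
    Since the time parameter is passed as a String, a caller may want to check its format before sending the request. The sketch below accepts the two formats named in the demo code ("yyyy-MM-dd" and "yyyy-MM-dd HH:mm:ss"). The class TimeParamCheck and its method are hypothetical, and it is an assumption that the server rejects any other format.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical client-side check for the "time" parameter.
class TimeParamCheck {

    private static final DateTimeFormatter DATE_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns true if time is a valid "yyyy-MM-dd" or "yyyy-MM-dd HH:mm:ss" value.
    static boolean isValidTime(String time) {
        if (time == null) {
            return false;
        }
        try {
            if (time.length() == 10) {
                LocalDate.parse(time);                // ISO date, yyyy-MM-dd
            } else {
                LocalDateTime.parse(time, DATE_TIME); // date and time
            }
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```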

    Parameter details:

      

    Return type: JSONArray

    Return content:

      

    Usage demo:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    import net.sf.json.JSONArray;

    // The class name CrawlUrlDemo is illustrative; the original snippet did not name its enclosing class
    public class CrawlUrlDemo {

        public static void main(String[] args) throws Exception {
            // Crawler API address; ctIdVal, timeVal and limitVal are placeholders filled in by httpRequest
            String req_url = "http://192.168.1.105:8080/crwalTask/findUrlExceptionById?ctId=ctIdVal&time=timeVal&limit=limitVal";
            JSONArray jsonArray = httpRequest(req_url, "ba716af7-105c-481b-bf28-2e9231529947", SelectUtil.time, SelectUtil.number);
            System.out.println(jsonArray);
        }

        /**
         * Retrieves the information for all URLs crawled by the specified task.
         * @param req_url request URL for the specified task, containing the placeholders ctIdVal, timeVal and limitVal
         * @param ctId ID of the specified task
         * @param time time to filter by
         * @param limit maximum number of records to return
         * @return the parsed response; an ["exception", message] array if the response body is empty; null if the request fails
         */
        public static JSONArray httpRequest(String req_url, String ctId, String time, int limit) {
            req_url = req_url.replace("ctIdVal", ctId);
            req_url = req_url.replace("timeVal", time);
            req_url = req_url.replace("limitVal", String.valueOf(limit));
            StringBuilder buffer = new StringBuilder();
            JSONArray jsonArray = null;
            HttpURLConnection httpUrlConn = null;
            try {
                URL url = new URL(req_url);
                httpUrlConn = (HttpURLConnection) url.openConnection();

                httpUrlConn.setDoOutput(false);
                httpUrlConn.setDoInput(true);
                httpUrlConn.setUseCaches(false);

                httpUrlConn.setRequestMethod("POST");
                httpUrlConn.connect();

                // Convert the returned input stream into a string;
                // try-with-resources closes the reader and the underlying streams
                try (BufferedReader bufferedReader = new BufferedReader(
                        new InputStreamReader(httpUrlConn.getInputStream(), StandardCharsets.UTF_8))) {
                    String str;
                    while ((str = bufferedReader.readLine()) != null) {
                        buffer.append(str);
                    }
                }
                if (buffer.length() == 0) {
                    // An empty response body is reported as "record limit exceeded"
                    String exception = "[\"exception\",\"The number of queried records exceeds 240\"]";
                    jsonArray = JSONArray.fromObject(exception);
                } else {
                    jsonArray = JSONArray.fromObject(buffer.toString());
                }
            } catch (Exception e) {
                System.out.println(e.getMessage());
            } finally {
                // Release the connection
                if (httpUrlConn != null) {
                    httpUrlConn.disconnect();
                }
            }
            return jsonArray;
        }
    }

    class SelectUtil {
        // Time filter, format "yyyy-MM-dd" or "yyyy-MM-dd HH:mm:ss"; spaces are replaced with "=" before the value goes into the URL
        public static final String time = "2018-03-05".replaceAll(" ", "=");
        // Maximum number of records to query
        public static final int number = 162;
    }
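
    The demo's httpRequest method can come back in three ways: null when the request throws, a two-element ["exception", message] array when the response body is empty, or the actual data. A caller might tell these apart with a small check like the sketch below. CrawlResultCheck and isError are hypothetical names. The check takes a java.util.List on the assumption that json-lib's JSONArray implements List, and it assumes real data never happens to be a two-element array starting with "exception".

```java
import java.util.List;

// Hypothetical helper for interpreting the value returned by httpRequest.
class CrawlResultCheck {

    // Returns true when the result signals a failure rather than crawled data.
    static boolean isError(List<?> result) {
        if (result == null) {
            return true; // the request failed and the exception was caught
        }
        // ["exception", message] is what the demo builds for an empty response body
        return result.size() == 2 && "exception".equals(result.get(0));
    }
}
```

    With this in place, the caller can branch on CrawlResultCheck.isError(jsonArray) before iterating over the returned entries.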

    Required JAR files:

      commons-beanutils-1.9.3.jar

      commons-collections-3.2.2.jar

      commons-lang-2.6.jar

      commons-logging-1.2.jar

      ezmorph-1.0.6.jar

      json-lib-2.4-jdk15.jar

    SQL scripts:

      alter table urlpathmapper add exceptionInfo varchar(2048) comment 'URL runtime error information';

      alter table urlpathmapper add title varchar(256) comment 'Crawled title';

      alter table crawltaskmanage add checkFile varchar(8) comment 'Whether the file is verified: 01';

      alter table crawltaskmanage add SimHashValue int(8) comment 'SimHash duplication comparison value';

  • Original article: https://www.cnblogs.com/sqy-yyr/p/9364117.html