  • Implementing a piece of business logic: grouping array data by a specified time gap

    Business scenario

    Given the following data:

      id        intime       outtime
    1190771865,2019-11-26 13:27:26,2019-11-26 13:27:26
    1190771865,2019-11-26 16:42:46,2019-11-26 16:42:46
    1190771865,2019-11-26 17:23:11,2019-11-26 17:23:11
    (the raw data repeats these three rows many times)

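    Requirements:

      Regroup the data above according to the following rules:

        Sort the data by intime in ascending order, then compare the intime of each record with the intime of the record before it.

        1. If the second record's intime is more than 120 min later than the first record's, discard the first record.

        2. If the gap between a record and the previous one is less than 120 min, keep the previous record's intime and use the current record's intime as the previous record's outtime; keep going this way until the last record has been processed.

        3. If the gap between a record and the previous one is more than 120 min, treat the current record as the start of a new record and apply the rules above again.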
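
    The rules above can be sketched in plain Scala, without Spark. This is only a minimal illustration under a few assumptions: `GapGrouping`, `Record` and `regroup` are hypothetical names, times are "yyyy-MM-dd HH:mm:ss" strings, each record is compared with the record immediately before it, a gap of exactly 120 min starts a new group, and a group that ends up with a single record is dropped (a generalization of rule 1; the requirements do not say what to do with a trailing single record).

```scala
import java.time.{Duration, LocalDateTime}
import java.time.format.DateTimeFormatter

object GapGrouping {
  // One input row: (id, intime, outtime). Hypothetical alias for illustration.
  type Record = (String, String, String)

  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  private val maxGapSeconds = 120L * 60

  // Seconds elapsed from time string a to time string b.
  private def gapSeconds(a: String, b: String): Long =
    Duration.between(LocalDateTime.parse(a, fmt), LocalDateTime.parse(b, fmt)).getSeconds

  def regroup(rows: Seq[Record]): List[Record] = {
    // This fixed-width format sorts chronologically as plain strings.
    val sorted = rows.distinct.sortBy(_._2).toList

    // Build runs in which every intime is < 120 min after the previous one.
    // Each run is kept newest-first, so run.head is the latest record.
    val runs = sorted.foldLeft(List.empty[List[Record]]) {
      case (current :: done, r) if gapSeconds(current.head._2, r._2) < maxGapSeconds =>
        (r :: current) :: done
      case (acc, r) =>
        List(r) :: acc
    }

    // Assumption: single-record runs are discarded. For the others, keep the
    // first intime and use the last intime of the run as the outtime.
    runs.reverse.collect {
      case run if run.size > 1 => (run.last._1, run.last._2, run.head._2)
    }
  }
}
```

    For the three distinct sample rows, the 13:27:26 record stands alone (the next intime is more than three hours later) and is dropped, while 16:42:46 and 17:23:11 are about 40 minutes apart and merge into a single record with intime 16:42:46 and outtime 17:23:11.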

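    Code:

    1. Turn the data above into an array, i.e. (aaa, Array(id, intime, outtime))
    Note: before this step, the in and out times of every record had already been converted to timestamps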
    import scala.collection.mutable.ListBuffer

    mergedDataTmp
      // drop duplicate rows; keeping only rows with intime <= outtime is an assumption
      .map(x => (x._1, x._2.distinct.filter(r => r._2 <= r._3)))
      .mapPartitions(iter => {
        iter.map(x => {
          var count = 0    // number of records in the group currently being built
          var iterNum = 0  // 1-based position in the sorted list
          val tList = new ListBuffer[(String, (String, String, String))]()
          val vsList = x._2.sortWith((a, b) => a._2 < b._2).toList
          val vsLength = vsList.length
          var tmpV = ""    // intime of the latest record merged into the current group
          for (t <- vsList) {
            iterNum += 1
            if (count == 0) {
              tList += ((x._1, t))
              count += 1
            } else {
              // true when this intime is at least 120 min after the intime stored for the current group
              val compareTime = if (!tList.isEmpty) {
                (DateUtil.dateToTimeStamp(t._2) - DateUtil.dateToTimeStamp(tList.last._2._2)) / 1000 >= 120 * 60
              } else {
                false
              }
              if (compareTime && count == 1) {
                // gap >= 120 min and the current group holds one record: discard it and restart from t
                tList.remove(tList.length - 1)
                tList += ((x._1, t))
              } else if (compareTime && count > 1) {
                // gap >= 120 min: close the current group with tmpV as its outtime, then start a new group from t
                val lastRecord = tList.last
                tList(tList.length - 1) = (x._1, (t._1, lastRecord._2._2, tmpV))
                tList += ((x._1, t))
                count = 1
              } else {
                // gap < 120 min: t joins the current group
                count += 1
                if (iterNum == vsLength) {
                  // last record overall: close the group with this record's outtime
                  val lastRecord = tList.last
                  tList(tList.length - 1) = (x._1, (t._1, lastRecord._2._2, t._3))
                }
                tmpV = t._2
              }
            }
          }
          tList
        })
      }).flatMap(x => x)
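
    The code above calls `DateUtil.dateToTimeStamp`, a helper whose source is not shown in the post. Since its result is divided by 1000 to get seconds, it presumably returns epoch milliseconds. A possible stand-in, assuming the input format is "yyyy-MM-dd HH:mm:ss" (the `zone` parameter is an addition for illustration, not part of the original helper):

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

// Hypothetical stand-in for the author's DateUtil; the real helper may differ.
object DateUtil {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Parses a "yyyy-MM-dd HH:mm:ss" string and returns epoch milliseconds.
  // The time zone defaults to the JVM's system zone (an assumption).
  def dateToTimeStamp(s: String, zone: ZoneId = ZoneId.systemDefault()): Long =
    LocalDateTime.parse(s, fmt).atZone(zone).toInstant.toEpochMilli
}
```

    Because only differences between timestamps are compared against 120 * 60 seconds, the choice of zone matters only if a daylight-saving transition falls between two records.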

      

  • Original article: https://www.cnblogs.com/Gxiaobai/p/12076583.html