  • Deleting data from an HBase database with Scala code

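    This post simply records one straightforward way to delete data from HBase; other approaches are possible, so feel free to adapt the idea. The code is as follows: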

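      // Imports this method needs. Config, SQLLogger, log, workID, rmDay, finalSucc and
      // getBeforeOneDay come from the author's project and are not shown in this post.
      import org.apache.hadoop.hbase.client.{Delete, Result, Scan}
      import org.apache.hadoop.hbase.filter.{CompareFilter, RegexStringComparator, RowFilter}
      import org.apache.hadoop.hbase.util.Bytes
      import scala.collection.JavaConverters._

      // Delete erroneous data by date
      private def rmRazorError(table: String)(implicit args: Array[String]): Unit = {
        var isSucc = false
        var msg = ""
        val JOB_NAME = s"$table-$rmDay"
        val jobID = s"$JOB_NAME-" + workID
        // Skip the job if it has already completed successfully
        if (SQLLogger.isJobSucc(jobID)) {
          log.info(jobID + " has already been executed successfully.")
          return
        }
        SQLLogger.insJobStart(workID, jobID, JOB_NAME)
        log.info(s"$JOB_NAME start ...")
        val hTable = Config.getHBaseConn.getTable(table)
        hTable.setAutoFlushTo(false)
        try {
          // args must carry exactly two values: the start day and the end day
          val Array(startTime, endTime) = args
          // Delete the row behind a single scan result
          val delRow = (r: Result) => {
            val row = r.getRow
            log.info("Deleting row: " + Bytes.toString(row))
            hTable.delete(new Delete(row))
          }
          var tmpTime = startTime
          // Process one day per iteration.
          // NOTE: getBeforeOneDay is not shown in this post. The loop only terminates if
          // that helper eventually moves tmpTime past endTime, so check its direction.
          while (tmpTime.compare(endTime) <= 0) {
            log.info(s"Deleting rows in table $table for day $tmpTime")

            // Select every row whose rowKey contains "-<day>"
            val scan = new Scan()
            val rowFilter1 = new RowFilter(CompareFilter.CompareOp.EQUAL,
              new RegexStringComparator(".*-" + tmpTime + ".*"))
            scan.setFilter(rowFilter1)

            // Close the scanner once this day's rows have been deleted
            val scanner = hTable.getScanner(scan)
            try scanner.iterator().asScala.foreach(delRow)
            finally scanner.close()
            tmpTime = getBeforeOneDay(tmpTime)
          }
          isSucc = true
        } catch {
          case ex: Exception =>
            msg = s"$table's job failed"
            log.error(msg, ex)
            isSucc = false
            finalSucc = false
        } finally {
          hTable.flushCommits()
          hTable.close()
        }
        SQLLogger.insJobEnd(jobID, isSucc, msg)
        log.info(s"$JOB_NAME end.")
      }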

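    In the code, table is the name of the HBase table, and the method takes one implicit parameter, args, which is destructured into the two values startTime and endTime. The example deletes every row in the table whose date falls between startTime and endTime. Rows are selected by the yyyyMMdd date value embedded in the rowKey, so if your rowKey contains such a field you can delete on the same condition.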
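
    To make the date-based selection concrete, here is a minimal sketch in plain Scala (no HBase dependency) of the two ideas the method relies on: stepping through the days from startTime to endTime, and testing whether a rowKey carries a given yyyyMMdd value. The names RowKeyDayWindow, daysBetween and matchesDay, and the sample rowKey layout app1-20190701-0001, are hypothetical and do not come from the original code. Stepping forward one day at a time is an assumption about what the author's getBeforeOneDay helper, which is not shown, is meant to achieve.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.regex.Pattern

// Hypothetical helper object, for illustration only.
object RowKeyDayWindow {
  // BASIC_ISO_DATE reads and writes the yyyyMMdd layout used in the post.
  private val Fmt = DateTimeFormatter.BASIC_ISO_DATE

  // Every day from startDay to endDay, inclusive, in ascending order.
  // Returns an empty list when startDay is after endDay.
  def daysBetween(startDay: String, endDay: String): List[String] = {
    val start = LocalDate.parse(startDay, Fmt)
    val end = LocalDate.parse(endDay, Fmt)
    Iterator.iterate(start)(_.plusDays(1))
      .takeWhile(!_.isAfter(end))
      .map(_.format(Fmt))
      .toList
  }

  // Approximates the row filter in the post: the pattern has the same shape,
  // ".*-" + day + ".*", and is applied to the whole rowKey string.
  def matchesDay(rowKey: String, day: String): Boolean =
    Pattern.compile(".*-" + Pattern.quote(day) + ".*").matcher(rowKey).find()
}
```

    For instance, RowKeyDayWindow.daysBetween("20190630", "20190702") covers the three days 20190630, 20190701 and 20190702, and RowKeyDayWindow.matchesDay("app1-20190701-0001", "20190701") selects that rowKey. Because the pattern is a regular expression evaluated against every rowKey, a filter of this shape typically implies scanning the whole table once per day in the range.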

  • Original article: https://www.cnblogs.com/yarcl/p/11046769.html