  • Deleting data from an HBase database with Scala code

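    This post just records one simple way to delete data from HBase; other approaches are possible, so feel free to think beyond it. The code is as follows: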

    // Delete erroneous rows by date
      private def rmRazorError(table: String)(implicit args: Array[String]): Unit = {
        var isSucc = false
        var msg = ""
        val JOB_NAME = s"$table-$rmDay"
        val jobID =  s"$JOB_NAME-" + workID
        if (SQLLogger.isJobSucc(jobID)) {
          msg = jobID + " has already been executed successfully."
          log.info(msg)
          isSucc = true
          return
        }
        SQLLogger.insJobStart(workID, jobID, JOB_NAME)
        log.info(s"$JOB_NAME start ...")
        val hTable = Config.getHBaseConn.getTable(table)
        hTable.setAutoFlushTo(false)
        try {
          // Read the job parameters: start and end dates (yyyyMMdd)
          val Array(startTime, endTime) = args
          // Delete the row behind each scan result
          val delRow = (r: Result) => {
            val row = r.getRow
            log.info("Deleting row: " + Bytes.toString(row))
            hTable.delete(new Delete(row))
          }
          var tmpTime = startTime
          // Scan and delete the matching rows, one day per iteration
          while(tmpTime.compare(endTime) <= 0) {
            log.info(s"Deleting rows in table: $table using $tmpTime")
    
            val scan = new Scan()
            val rowFilter1 = new RowFilter(CompareFilter.CompareOp.EQUAL,
              new RegexStringComparator(".*-"+tmpTime+".*"))
            scan.setFilter(rowFilter1)
    
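            // Close the scanner after use so its resources are released
            val scanner = hTable.getScanner(scan)
            try scanner.toIterator.foreach(delRow) finally scanner.close()
            // NOTE: the loop only ends once tmpTime passes endTime, so this helper must move tmpTime toward endTime
            tmpTime = getBeforeOneDay(tmpTime)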
          }
          isSucc = true
        } catch {
          case ex: Exception =>
            // Record the cause instead of silently swallowing the exception
            msg = s"$table's job failed: ${ex.getMessage}"
            log.error(msg, ex)
            isSucc = false
            finalSucc = false
        } finally {
          hTable.flushCommits()
          hTable.close()
        }
        SQLLogger.insJobEnd(jobID, isSucc, msg)
        log.info(s"$JOB_NAME end.")
      }

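    In the code, table is the name of the table, and startTime and endTime are read from the implicit args parameter. The example deletes every row in the table whose date falls between startTime and endTime. The deletion criterion is the yyyyMMdd date value embedded in the rowKey; if your rowKey contains such a field, you can delete by this condition.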
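
    The date handling behind that loop can be tried without an HBase cluster. The following is only a minimal sketch under stated assumptions: the names RowKeyDays, daysBetween, and matchesDay are hypothetical and not part of the original code, and the sample rowKey layout (app-yyyyMMdd-sequence) is invented for illustration. It lists every day from startTime to endTime and applies the same ".*-<day>.*" pattern shape that the listing passes to RegexStringComparator.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Hypothetical helper object, not part of the original post
object RowKeyDays {
  // BASIC_ISO_DATE reads and writes dates as yyyyMMdd, e.g. 20190228
  private val fmt = DateTimeFormatter.BASIC_ISO_DATE

  // Every day from start to end (inclusive), oldest first.
  // Returns an empty list when start is after end.
  def daysBetween(start: String, end: String): List[String] = {
    val first = LocalDate.parse(start, fmt)
    val last = LocalDate.parse(end, fmt)
    Iterator.iterate(first)(_.plusDays(1))
      .takeWhile(d => !d.isAfter(last))
      .map(_.format(fmt))
      .toList
  }

  // Same pattern shape as in the listing: any text, a hyphen, the day, any text
  def matchesDay(rowKey: String, day: String): Boolean =
    rowKey.matches(".*-" + day + ".*")
}

object RowKeyDaysDemo extends App {
  // Assumed rowKey layout, for illustration only
  val keys = List("app-20190228-0001", "app-20190301-0002", "app-20190305-0003")
  for (day <- RowKeyDays.daysBetween("20190227", "20190302")) {
    val hits = keys.filter(k => RowKeyDays.matchesDay(k, day))
    println(s"$day -> $hits")
  }
}
```

    Stepping forward from startTime guarantees that the loop terminates once the day passes endTime. Note that a regex filter of this kind is evaluated against every row of the table, so it suits occasional cleanup jobs better than frequent ones.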

  • Original article: https://www.cnblogs.com/yarcl/p/11046769.html