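  • Getting anchor sizes for YOLOv3 with K-means

    I only skimmed YOLOv1 and YOLOv2 but read YOLOv3 closely. It was confusing at first, so after some digging I am writing down the key points step by step:

    YOLOv2 and YOLOv3 introduce anchors, which differ somewhat from those in Faster R-CNN. How should these anchors be understood?

    My own understanding, in plain terms:

    (1) Start with a set of annotated bboxes, each labeled with its top-left and bottom-right coordinates. Cluster the bboxes into a few groups and use the group centers as the preset anchor widths and heights. The annotations only need to be in the XML format of the VOC dataset.

    The code below extracts the width and height of each box from the annotations and normalizes them by the image width and height: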

    def load_dataset(path):
    	dataset = []
    	for xml_file in glob.glob("{}/*xml".format(path)):
    		tree = ET.parse(xml_file)
    
    		height = int(tree.findtext("./size/height"))
    		width = int(tree.findtext("./size/width"))
    
    		for obj in tree.iter("object"):
    			xmin = int(obj.findtext("bndbox/xmin")) / width
    			ymin = int(obj.findtext("bndbox/ymin")) / height
    			xmax = int(obj.findtext("bndbox/xmax")) / width
    			ymax = int(obj.findtext("bndbox/ymax")) / height
    
    			dataset.append([xmax - xmin, ymax - ymin])
    
    	return np.array(dataset)
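
To sanity-check the extraction without a dataset on disk, the same logic can be run on an in-memory annotation. This is a minimal sketch: the XML content is made up for illustration, and `normalized_sizes` is a hypothetical helper that mirrors the loop in `load_dataset`.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation, invented for this example.
SAMPLE_XML = """
<annotation>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>50</xmin><ymin>75</ymin><xmax>300</xmax><ymax>225</ymax></bndbox>
  </object>
  <object>
    <name>cat</name>
    <bndbox><xmin>100</xmin><ymin>0</ymin><xmax>200</xmax><ymax>375</ymax></bndbox>
  </object>
</annotation>
"""


def normalized_sizes(xml_text):
    """Return [width, height] of every object as fractions of the image size."""
    root = ET.fromstring(xml_text)
    height = int(root.findtext("./size/height"))
    width = int(root.findtext("./size/width"))

    sizes = []
    for obj in root.iter("object"):
        # Corner coordinates are divided by the image size first,
        # so the resulting width and height fall in the range (0, 1].
        xmin = int(obj.findtext("bndbox/xmin")) / width
        ymin = int(obj.findtext("bndbox/ymin")) / height
        xmax = int(obj.findtext("bndbox/xmax")) / width
        ymax = int(obj.findtext("bndbox/ymax")) / height
        sizes.append([xmax - xmin, ymax - ymin])
    return sizes


sizes = normalized_sizes(SAMPLE_XML)
for w, h in sizes:
    print("width={:.3f} height={:.3f}".format(w, h))
```

Because the sizes are normalized, boxes from images of different resolutions become comparable before clustering.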
    

      

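    (2) How exactly are the boxes grouped? K-means clusters all annotated bboxes by width and height. The VOC data is split into 9 clusters, and the distance measure is distance = 1 - IoU.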
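
As a small worked example of the distance = 1 - IoU measure: since only width and height matter, both boxes can be treated as sharing the same top-left corner. The sketch below uses made-up numbers, and `iou_wh` and `distance` are hypothetical names for illustration only.

```python
def iou_wh(box, centroid):
    """IoU of two (width, height) boxes whose top-left corners coincide."""
    intersection = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - intersection
    return intersection / union


def distance(box, centroid):
    """Clustering distance: small when the shapes overlap well."""
    return 1.0 - iou_wh(box, centroid)


box = (0.2, 0.4)  # normalized width and height, made up for this example
for centroid in [(0.2, 0.4), (0.3, 0.3), (0.8, 0.8)]:
    print(centroid, round(distance(box, centroid), 4))
```

The identical box gives a distance of 0, and the distance grows as the shapes diverge. Unlike Euclidean distance on raw widths and heights, this measure does not depend on the absolute size of the boxes, so large boxes do not dominate the clustering.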

    import numpy as np
    
    '''
    (1) k-means takes all N ground-truth boxes in the data, collects their widths and heights, and randomly picks 9 of them as the initial centers.
    (2) Every bbox is then compared with these 9 width/height pairs using IoU as the distance, which gives an N x 9 matrix of distances.
    (3) The minimum of each row assigns each bbox to one of the 9 clusters; the median of the bboxes in each cluster becomes the new center.
    (4) Repeat until the 9 centers stop changing. The (width, height) of these 9 centers are the 9 anchors that suit the dataset.
    '''


    def iou(box, clusters):
        """
        Calculates the Intersection over Union (IoU) between a box and k clusters.
        :param box: tuple or array, shifted to the origin (i. e. width and height)
        :param clusters: numpy array of shape (k, 2) where k is the number of clusters
        :return: numpy array of shape (k,) where k is the number of clusters
        """
        # IoU between one box and each of the k cluster centers
        # box      : a single [width, height]
        # clusters : the k centers, each [width, height]
        x = np.minimum(clusters[:, 0], box[0])
        y = np.minimum(clusters[:, 1], box[1])
        if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
            raise ValueError("Box has no area")

        intersection = x * y
        # Area of the box and of each cluster center
        box_area = box[0] * box[1]
        cluster_area = clusters[:, 0] * clusters[:, 1]

        iou_ = intersection / (box_area + cluster_area - intersection)

        return iou_


    def avg_iou(boxes, clusters):
        """
        Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
        :param boxes: numpy array of shape (r, 2), where r is the number of rows
        :param clusters: numpy array of shape (k, 2) where k is the number of clusters
        :return: average IoU as a single float
        """
        return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])


    def translate_boxes(boxes):
        """
        Translates all the boxes to the origin.
        :param boxes: numpy array of shape (r, 4)
        :return: numpy array of shape (r, 2)
        """
        new_boxes = boxes.copy()
        for row in range(new_boxes.shape[0]):
            new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
            new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
        return np.delete(new_boxes, [0, 1], axis=1)


    def kmeans(boxes, k, dist=np.median):
        """
        Calculates k-means clustering with the Intersection over Union (IoU) metric.
        :param boxes: numpy array of shape (r, 2), where r is the number of rows
        :param k: number of clusters
        :param dist: function used to compute each new cluster center (median by default)
        :return: numpy array of shape (k, 2)
        """
        rows = boxes.shape[0]

        distances = np.empty((rows, k))
        last_clusters = np.zeros((rows,))

        np.random.seed()

        # the Forgy method will fail if the whole array contains the same rows
        # Initialize the k cluster centers by picking k random boxes from the dataset
        clusters = boxes[np.random.choice(rows, k, replace=False)]

        while True:
            for row in range(rows):
                # Distance metric: d(box, centroid) = 1 - IoU(box, centroid).
                # A smaller distance to the center is better, while a larger IoU is better,
                # so 1 - IoU makes a small distance correspond to a large IoU.
                # This fills the (rows, k) matrix of distances between boxes and clusters.
                distances[row] = 1 - iou(boxes[row], clusters)
            #print(distances)

            # Assign each box to the cluster center with the smallest distance
            nearest_clusters = np.argmin(distances, axis=1)

            # Stop once the assignments no longer change, which means the centers no longer move
            if (last_clusters == nearest_clusters).all():
                break

            # Recompute each center as the median of the boxes assigned to it
            for cluster in range(k):
                # boxes[nearest_clusters == cluster] selects the boxes in this cluster
                clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

            last_clusters = nearest_clusters

        return clusters
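
Steps (2) and (3) of the procedure can also be written without the per-row loop by using NumPy broadcasting. The sketch below runs a single assignment and update pass on made-up data. `iou_matrix` is a hypothetical helper, not part of the original `kmeans` module, and the initial centers are fixed rather than random so the result is repeatable.

```python
import numpy as np


def iou_matrix(boxes, clusters):
    """IoU between every (w, h) box and every (w, h) cluster, shape (N, k)."""
    # Broadcasting (N, 1) against (k,) yields all pairwise overlaps at once.
    inter_w = np.minimum(boxes[:, 0:1], clusters[:, 0])
    inter_h = np.minimum(boxes[:, 1:2], clusters[:, 1])
    intersection = inter_w * inter_h
    box_area = (boxes[:, 0] * boxes[:, 1])[:, np.newaxis]
    cluster_area = clusters[:, 0] * clusters[:, 1]
    return intersection / (box_area + cluster_area - intersection)


# Made-up normalized (width, height) boxes: three small ones, three large ones.
boxes = np.array([
    [0.10, 0.12], [0.11, 0.10], [0.09, 0.11],
    [0.50, 0.60], [0.55, 0.58], [0.52, 0.62],
])
clusters = boxes[[0, 3]].copy()  # fixed initial centers for a repeatable run

distances = 1 - iou_matrix(boxes, clusters)   # step (2): N x k distance matrix
nearest = np.argmin(distances, axis=1)        # step (3): assign each box
new_clusters = np.array(
    [np.median(boxes[nearest == c], axis=0) for c in range(len(clusters))]
)                                             # step (3): median update

print(nearest)
print(new_clusters)
```

On this toy data the small boxes fall into the first cluster and the large boxes into the second, and each new center is the per-column median of its members.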

    Running the code:

    import glob
    import xml.etree.ElementTree as ET
    
    import numpy as np
    
    from kmeans import kmeans, avg_iou
    
    #ANNOTATIONS_PATH = "Annotations"
    CLUSTERS = 9
    
    def load_dataset(path):
    	dataset = []
    	for xml_file in glob.glob("{}/*xml".format(path)):
    		tree = ET.parse(xml_file)
    
    		height = int(tree.findtext("./size/height"))
    		width = int(tree.findtext("./size/width"))
    
    		for obj in tree.iter("object"):
    			xmin = int(obj.findtext("bndbox/xmin")) / width
    			ymin = int(obj.findtext("bndbox/ymin")) / height
    			xmax = int(obj.findtext("bndbox/xmax")) / width
    			ymax = int(obj.findtext("bndbox/ymax")) / height
    
    			dataset.append([xmax - xmin, ymax - ymin])
    
    	return np.array(dataset)
    
    ANNOTATIONS_PATH = "path to your own data"
    data = load_dataset(ANNOTATIONS_PATH)
    out = kmeans(data, k=CLUSTERS)
    print("Accuracy: {:.2f}%".format(avg_iou(data, out) * 100))
    #print("Boxes:\n {}".format(out))
    print("Boxes:\n {}-{}".format(out[:, 0]*416, out[:, 1]*416))
    ratios = np.around(out[:, 0] / out[:, 1], decimals=2).tolist()
    print("Ratios:\n {}".format(sorted(ratios)))
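
The script scales the normalized centers by 416 to obtain pixel sizes. A possible next step is to turn them into a single comma-separated anchor line. This is only a sketch under assumptions: the helper name `to_anchor_string`, the 416 input size as a default, the ascending sort by area, and the `w,h,  w,h` text layout (modeled on the `anchors =` line of Darknet cfg files) are all choices made here, not something the original post specifies. The cluster values are made up.

```python
import numpy as np


def to_anchor_string(clusters, input_size=416):
    """Scale normalized (w, h) clusters to pixels, sort by area, join as text."""
    pixels = np.rint(clusters * input_size).astype(int)
    order = np.argsort(pixels[:, 0] * pixels[:, 1])  # smallest anchor first
    return ",  ".join("{},{}".format(w, h) for w, h in pixels[order])


# Made-up normalized cluster centers (width, height).
clusters = np.array([[0.5, 0.75], [0.0625, 0.125], [0.25, 0.25]])
print("anchors = " + to_anchor_string(clusters))
```

Sorting by area keeps the small anchors first, which makes it easier to assign them to the detection scales by position.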
    

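    I ran this on the VOC2007 dataset, 9963 label files in total. The anchors differ somewhat from those given in the paper, possibly because of the difference between COCO and VOC2007.

    The results: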

    Accuracy:

    67.22%

    Boxes (format edited by hand and all values rounded, so the ratios do not match exactly):
    [347,327     40,40    76,77   184,277   89,207 162,134   14,27  44,128   23,72]

    Ratios:
    [0.32, 0.35, 0.43, 0.55, 0.67, 0.99, 1.02, 1.06, 1.21]
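
The rounding effect can be checked by recomputing the width/height ratios from the rounded boxes and comparing them with the ratios the script printed. This sketch copies both lists from the results above; pairing them after sorting is an assumption, since the printed ratios are sorted and no longer carry their box order.

```python
# Rounded anchor boxes (width, height) in pixels, copied from the results above.
boxes = [(347, 327), (40, 40), (76, 77), (184, 277), (89, 207),
         (162, 134), (14, 27), (44, 128), (23, 72)]

# Ratios printed by the script, computed from the unrounded cluster centers.
reported = [0.32, 0.35, 0.43, 0.55, 0.67, 0.99, 1.02, 1.06, 1.21]

# Recompute the ratios from the rounded pixel sizes and sort them the same way.
recomputed = sorted(round(w / h, 2) for w, h in boxes)

for new, old in zip(recomputed, reported):
    print("{:.2f} vs {:.2f} (diff {:+.2f})".format(new, old, new - old))
```

The recomputed ratios stay within a few hundredths of the reported ones, which is consistent with rounding the box sizes to whole pixels.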

  • Original article: https://www.cnblogs.com/lzq116/p/12145673.html