FLANN库全称是Fast Library for Approximate Nearest Neighbors,它是目前最完整的(近似)最近邻开源库。不但实现了一系列查找算法,还包含了一种自动选取最快算法的机制,在一个度量空间X给定一组点P=p1,p2,…,pn,这些点必须通过以下方式进行预处理,给第一个新的查询点q属于X,快速在P中找到距离q最近的点,即最近邻搜索问题。最近邻搜索的问题是在很多应用领域是一个重大问题,如图像识别、数据压缩、模式识别和分类.在高维空间中解决这个问题似乎是一个非常难以执行任务,没有算法明显优于标准的蛮力搜索。因此越来越多的人把兴趣点转向执行近似最近邻搜索的一类算法,这些方法在很多实际应用和大多数案例中被证明是足够好的近似,比精确搜索算法快很大的数量级
库的地址:www.cs.ubc.ca/research/flann/
很显然在PCL 搜索最近邻也是属于高维度近邻搜索,所以需要用到这样的库来寻找匹配的,比如我们如果想通过使用PCL点云的数据来识别出物体到底是什么,那么很明显是需要一个数据录的,这个数据库的训练该怎么来实现呢?因为我对机器学习了解的比较少,我想通过机器学习的方法来识别是最好的,也是今后可以做的工作,目前来看,我认为的方法就是通过对大量的点云进行训练并提取他们的特征点,比如VFH,然后把大量点云训练存储成八叉树,以便于很好的搜索和匹配,而且在PCL 中使用八叉树是很常见的一种存储结构,也是方便搜索的,然后通过计算我们想要识别物体的点云 的VFH,通过FLANN算法寻找到最接近的,这就是整体的思路
主要流程
- 使用SURF,SIFT特征提取关键点(在PCL中同样适用)
- 计算特征描述子(比如VFH)
- 使用FLANN匹配器进行描述子向量匹配
那么有兴趣可以去仔细看看论文啊,我等渣渣真是多年以来对论文没什么感觉
介绍几种FLANN库中的API接口,如何去使用它,然后通过具体的实例去实现这样一个匹配
(1)flann::Index这是FLANN最近邻指数类。该类用于抽象的不同类型的最近邻搜索索引。距离函数类模板用于计算两个特征空间之间的距离。
namespace flann { template<typename Distance> class Index { typedef typename Distance::ElementType ElementType; typedef typename Distance::ResultType DistanceType; public: Index(const IndexParams& params, Distance distance = Distance() ); Index(const Matrix<ElementType>& points, const IndexParams& params, Distance distance = Distance() ); ~Index(); void buildIndex(); void buildIndex(const Matrix<ElementType>& points); void addPoints(const Matrix<ElementType>& points, float rebuild_threshold = 2); void removePoint(size_t point_id); ElementType* getPoint(size_t point_id); int knnSearch(const Matrix<ElementType>& queries, Matrix<int>& indices, Matrix<DistanceType>& dists, size_t knn, const SearchParams& params); int knnSearch(const Matrix<ElementType>& queries, std::vector< std::vector<int> >& indices, std::vector<std::vector<DistanceType> >& dists, size_t knn, const SearchParams& params); int radiusSearch(const Matrix<ElementType>& queries, Matrix<int>& indices, Matrix<DistanceType>& dists, float radius, const SearchParams& params); int radiusSearch(const Matrix<ElementType>& queries, std::vector< std::vector<int> >& indices, std::vector<std::vector<DistanceType> >& dists, float radius, const SearchParams& params); void save(std::string filename); int veclen() const; int size() const; IndexParams getParameters() const; flann_algorithm_t getType() const; }; }
(2)flann::Index::knnSearch
对一组点查询点,该方法执行K最近邻搜索。该方法有两个实现,一个携带预开辟空间的flann::Matrix对象接收返回的找到邻居的索引号和到其距离;另一个是携带std::vector<std::vector>根据需要自动重新调整大小。
参数解释:
queries: 承载查询点的矩阵,矩阵大小是:查询点数*纬数;
indices: 将承载所有被找到的K最近邻的索引号( 预开辟的大小应该至少是查询点数*knn);
dists: 将承载到所有被找到的K最近邻的距离只(预开辟大小应该至少是查询点数*knn);
knn: 要找的最近邻的个数;
params: 搜索参数。承载搜索时要使用的参数的结构体,结构体类型是SearchParameters。
int Index::knnSearch(const Matrix<ElementType>& queries, Matrix<int>& indices, Matrix<DistanceType>& dists, size_t knn, const SearchParams& params); int Index::knnSearch(const Matrix<ElementType>& queries, std::vector< std::vector<int> >& indices, std::vector<std::vector<DistanceType> >& dists, size_t knn, const SearchParams& params);
在这里就不再一一翻译了
/** Sets the log level used for all flann functions (unless specified in FLANNParameters for each call Params: level = verbosity level */ FLANN_EXPORT void flann_log_verbosity(int level);
设置FLANN距离的类型
/** * Sets the distance type to use throughout FLANN. * If distance type specified is MINKOWSKI, the second argument * specifies which order the minkowski distance should have. */ FLANN_EXPORT void flann_set_distance_type(enum flann_distance_t distance_type, int order);
建立和返回索引的函数
/** Builds and returns an index. It uses autotuning if the target_precision field of index_params is between 0 and 1, or the parameters specified if it's -1. Params: dataset = pointer to a data set stored in row major order rows = number of rows (features) in the dataset cols = number of columns in the dataset (feature dimensionality) speedup = speedup over linear search, estimated if using autotuning, output parameter index_params = index related parameters flann_params = generic flann parameters Returns: the newly created index or a number <0 for error */ FLANN_EXPORT flann_index_t flann_build_index(float* dataset, int rows, int cols, float* speedup, struct FLANNParameters* flann_params);
保存索引到本地磁盘(仅仅是索引被保存)
/** * Saves the index to a file. Only the index is saved into the file, the dataset corresponding to the index is not saved. * * @param index_id The index that should be saved * @param filename The filename the index should be saved to 文件名 * @return Returns 0 on success, negative value on error. 返回0则保存成功 */ FLANN_EXPORT int flann_save_index(flann_index_t index_id, char* filename);
从磁盘中载入索引文件
/** * Loads an index from a file. * * @param filename File to load the index from. * @param dataset The dataset corresponding to the index. * @param rows Dataset tors * @param cols Dataset columns * @return */ FLANN_EXPORT flann_index_t flann_load_index(char* filename, float* dataset, int rows, int cols);
建立一个索引或者说指定一个索引来寻找最近邻域
/** Builds an index and uses it to find nearest neighbors. Params: dataset = pointer to a data set stored in row major order rows = number of rows (features) in the dataset cols = number of columns in the dataset (feature dimensionality) testset = pointer to a query set stored in row major order trows = number of rows (features) in the query dataset (same dimensionality as features in the dataset) indices = pointer to matrix for the indices of the nearest neighbors of the testset features in the dataset (must have trows number of rows and nn number of columns) nn = how many nearest neighbors to return flann_params = generic flann parameters Returns: zero or -1 for error */ FLANN_EXPORT int flann_find_nearest_neighbors(float* dataset, int rows, int cols, float* testset, int trows, int* indices, float* dists, int nn, struct FLANNParameters* flann_params);
对提供的索引来寻找最近邻
/** Searches for nearest neighbors using the index provided Params: index_id = the index (constructed previously using flann_build_index). testset = pointer to a query set stored in row major order trows = number of rows (features) in the query dataset (same dimensionality as features in the dataset) indices = pointer to matrix for the indices of the nearest neighbors of the testset features in the dataset (must have trows number of rows and nn number of columns) dists = pointer to matrix for the distances of the nearest neighbors of the testset features in the dataset (must have trows number of rows and 1 column) nn = how many nearest neighbors to return flann_params = generic flann parameters Returns: zero or a number <0 for error */ FLANN_EXPORT int flann_find_nearest_neighbors_index(flann_index_t index_id, float* testset, int trows, int* indices, float* dists, int nn, struct FLANNParameters* flann_params);
半径搜索的方法在已经建立的索引文件中寻找
/** * Performs an radius search using an already constructed index. * * In case of radius search, instead of always returning a predetermined * number of nearest neighbours (for example the 10 nearest neighbours), the * search will return all the neighbours found within a search radius * of the query point. * * The check parameter in the FLANNParameters below sets the level of approximation * for the search by only visiting "checks" number of features in the index * (the same way as for the KNN search). A lower value for checks will give * a higher search speedup at the cost of potentially not returning all the * neighbours in the specified radius. */ FLANN_EXPORT int flann_radius_search(flann_index_t index_ptr, /* the index */ float* query, /* query point */ int* indices, /* array for storing the indices found (will be modified) */ float* dists, /* similar, but for storing distances */ int max_nn, /* size of arrays indices and dists */ float radius, /* search radius (squared radius for euclidian metric) */ struct FLANNParameters* flann_params);
释放内存
/** Deletes an index and releases the memory used by it. Params: index_id = the index (constructed previously using flann_build_index). flann_params = generic flann parameters Returns: zero or a number <0 for error */ FLANN_EXPORT int flann_free_index(flann_index_t index_id, struct FLANNParameters* flann_params);
利用特征聚类
/** Clusters the features in the dataset using a hierarchical kmeans clustering approach. This is significantly faster than using a flat kmeans clustering for a large number of clusters. Params: dataset = pointer to a data set stored in row major order rows = number of rows (features) in the dataset cols = number of columns in the dataset (feature dimensionality) clusters = number of cluster to compute result = memory buffer where the output cluster centers are storred index_params = used to specify the kmeans tree parameters (branching factor, max number of iterations to use) flann_params = generic flann parameters Returns: number of clusters computed or a number <0 for error. This number can be different than the number of clusters requested, due to the way hierarchical clusters are computed. The number of clusters returned will be the highest number of the form (branch_size-1)*K+1 smaller than the number of clusters requested. */ FLANN_EXPORT int flann_compute_cluster_centers(float* dataset, int rows, int cols, int clusters, float* result, struct FLANNParameters* flann_params);
基本的介绍,可以查看相应的手册
#include <flann/flann.h>
#include <flann/io/hdf5.h>
其中还涉及到另外一种文件的数据结构的使用,一般都会包含两个头文件,第一个头文件是关于FLANN算法实现的声明的函数集,第二个头文件是关于数据结构的一种存储的方式,hdf5是用于存储和分发科学数据的一种自我描述、多对象文件格式,HDF 被设计为:
- 自述性:对于一个HDF 文件里的每一个数据对象,有关于该数据的综合信息(元数据)。在没有任何外部信息的情况下,HDF 允许应用程序解释HDF文件的结构和内容。
- 通用性:许多数据类型都可以被嵌入在一个HDF文件里。例如,通过使用合适的HDF 数据结构,符号、数字和图形数据可以同时存储在一个HDF 文件里。
- 灵活性:HDF允许用户把相关的数据对象组合在一起,放到一个分层结构中,向数据对象添加描述和标签。它还允许用户把科学数据放到多个HDF 文件里。
- 扩展性:HDF极易容纳将来新增加的数据模式,容易与其他标准格式兼容。
- 跨平台性:HDF 是一个与平台无关的文件格式。HDF 文件无需任何转换就可以在不同平台上使用
最好的办法是把HDF 文件看成为一本有表格内容的多章节书。HDF 文件是“数据书”,其中每章都包含一个不同类型的数据内容。正如书籍用一个目录表列出它的章节一样,HDF文件用“data index”(数据索引)列出其数据内容
HDF 文件结构包括一个file id(文件号)、至少一个 data descriptor (数据描述符)、没有或多个 data element(数据内容)数据内容。
以上内容是涉及到关于对PCL中物体的识别的基础知识。
有兴趣者关注微信公众号,更欢迎大家投稿分享,只要是关于点云的都可以分享