  • Pitfalls of using OpenCV GpuMat data in CUDA kernel code

    Please note that cv::cuda::GpuMat and cv::Mat use different memory allocation methods. The data of a cv::cuda::GpuMat lives in NVIDIA GPU memory, while a cv::Mat stores its data in normal host RAM.
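
    A minimal sketch of moving data between the two, assuming OpenCV was built with the CUDA module (the function name and matrix sizes here are illustrative only):

    #include <opencv2/core.hpp>
    #include <opencv2/core/cuda.hpp>

    void upload_download_example() {
        cv::Mat hostMat = cv::Mat::zeros(480, 640, CV_32FC1); // lives in host RAM
        cv::cuda::GpuMat deviceMat;
        deviceMat.upload(hostMat);    // copy host data into GPU memory
        // ... run CUDA kernels or cv::cuda:: routines on deviceMat ...
        deviceMat.download(hostMat);  // copy the result back to host RAM
    }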

    The memory allocated by cv::Mat is normally contiguous, but a cv::cuda::GpuMat may have a gap between one row's data and the next. This is because cv::cuda::GpuMat allocates with the CUDA function cudaMallocPitch, which can make the step size (row pitch) larger than the actual row width.
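
    The pitch can be inspected at run time; this short check only assumes an OpenCV build with the CUDA module:

    #include <cstdio>
    #include <opencv2/core/cuda.hpp>

    void print_pitch() {
        cv::cuda::GpuMat m(480, 640, CV_32FC1);
        // step is the row pitch in bytes and may exceed cols * elemSize();
        // step1() is the same pitch expressed in elements.
        std::printf("cols * elemSize = %zu, step = %zu, step1 = %zu\n",
                    m.cols * m.elemSize(), m.step, m.step1());
    }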

    So when passing the row data of a cv::cuda::GpuMat into a CUDA kernel, the step size should be passed in as well, so that the kernel can compute row offsets correctly. Using COLS instead of the step easily produces wrong results, and the resulting bug is a headache to debug.

    For example:

    __global__
    void kernel_select_cmp_point(
        float* dMap,
        float* dPhase,
        uint8_t* matResult,
        uint32_t step,        // row pitch in elements (GpuMat::step / element size), not COLS
        const int ROWS,
        const int COLS,
        const int span) {
        // Grid-stride loop: each thread processes every stride-th row.
        int start = blockIdx.x * blockDim.x + threadIdx.x;
        int stride = blockDim.x * gridDim.x;
    
        for (int row = start; row < ROWS; row += stride) {
            // Row offsets must be computed with the pitch (step), not COLS,
            // because cudaMallocPitch may pad the end of each row.
            int offsetOfInput = row * step;
            int offsetOfResult = row * step;
            // ... per-row work on dMap[offsetOfInput + col] / dPhase[offsetOfInput + col]
            //     and matResult[offsetOfResult + col] for col in [0, COLS) ...
        }
    }
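
    For completeness, below is a hedged sketch of how such a kernel might be launched from the host. The wrapper name, matrix names, and launch configuration are illustrative assumptions, not part of the original code; it also assumes the inputs are CV_32FC1 and that a single step value is valid for both the inputs and the result, which should be verified in a real program.

    #include <cstdint>
    #include <cuda_runtime.h>
    #include <opencv2/core/cuda.hpp>

    // Hypothetical host-side wrapper around kernel_select_cmp_point.
    void run_select_cmp_point(cv::cuda::GpuMat& matMap,     // CV_32FC1
                              cv::cuda::GpuMat& matPhase,   // CV_32FC1
                              cv::cuda::GpuMat& matResult,  // CV_8UC1
                              int span) {
        // step1() is the row pitch in elements (step in bytes / element size).
        // The kernel reuses one step for inputs and result, so this sketch
        // assumes matching pitches; a real program would usually pass the
        // result matrix's own step separately.
        const uint32_t step = static_cast<uint32_t>(matMap.step1());

        const int threads = 256;
        const int blocks  = (matMap.rows + threads - 1) / threads;
        kernel_select_cmp_point<<<blocks, threads>>>(
            matMap.ptr<float>(),
            matPhase.ptr<float>(),
            matResult.ptr<uint8_t>(),
            step,
            matMap.rows,
            matMap.cols,
            span);
        cudaDeviceSynchronize();
    }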
  • Original post: https://www.cnblogs.com/shengguang/p/10794827.html