zoukankan      html  css  js  c++  java
  • 半径无关快速高斯模糊实现(附完整C代码)

    之前,俺也发过不少快速高斯模糊算法.

    俺一般认为,只要处理一千六百万像素彩色图片,在2.2GHz的CPU上单核单线程超过1秒的算法,都是不快的.

    之前发的几个算法,在俺2.2GHz的CPU上耗时都会超过1秒.

    而众所周知,快速高斯模糊有很多实现方法:

    1.FIR (Finite impulse response)

    https://zh.wikipedia.org/wiki/%E9%AB%98%E6%96%AF%E6%A8%A1%E7%B3%8A

    2.SII (Stacked integral images)

    http://dx.doi.org/10.1109/ROBOT.2010.5509400

    http://arxiv.org/abs/1107.4958

    3.Vliet-Young-Verbeek (Recursive filter)

    http://dx.doi.org/10.1016/0165-1684(95)00020-E

    http://dx.doi.org/10.1109/ICPR.1998.711192

    4.DCT (Discrete Cosine Transform)

    http://dx.doi.org/10.1109/78.295213

    5.box (Box filter)

    http://dx.doi.org/10.1109/TPAMI.1986.4767776

    6.AM(Alvarez, Mazorra)

    http://www.jstor.org/stable/2158018

    7.Deriche (Recursive filter)

    http://hal.inria.fr/docs/00/07/47/78/PDF/RR-1893.pdf

    8.ebox (Extended Box)

    http://dx.doi.org/10.1007/978-3-642-24785-9_38

    9.IIR (Infinite Impulse Response)

    https://software.intel.com/zh-cn/articles/iir-gaussian-blur-filter-implementation-using-intel-advanced-vector-extensions

    10.FA (Fast Anisotropic)

    http://mathinfo.univ-reims.fr/IMG/pdf/Fast_Anisotropic_Gquss_Filtering_-_GeusebroekECCV02.pdf

    ......

    实现高斯模糊的方法虽然很多,但是作为算法而言,核心关键是简单高效.

    目前俺经过实测,IIR是兼顾效果以及性能的不错的方法,也是半径无关(即模糊不同强度耗时基本不变)的实现.

    英特尔官方实现的这份:

    IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions [PDF 513KB]
    source: gaussian_blur.cpp [36KB]

    采用了英特尔处理器的流(SIMD)指令,算法处理速度极其惊人.

    俺写算法追求干净整洁,高效简单,换言之就是不采用任何硬件加速方案,实现简单高效,以适应不同硬件环境.

    故基于英特尔这份代码,俺对其进行了改写以及优化.

    最终在俺2.20GHz的CPU上,单核单线程,不采用流(SIMD)指令,达到了,处理一千六百万像素的彩色照片仅需700毫秒左右.

    按照惯例,还是贴个效果图比较直观.

    之前也有网友问过这个算法的实现问题.

    想了想,还是将代码共享出来,供大家参考学习.

    完整代码:

    void CalGaussianCoeff(float sigma, float * a0, float * a1, float * a2, float * a3, float * b1, float * b2, float * cprev, float * cnext) {
        float alpha, lamma, k;
    
        if (sigma < 0.5f)
            sigma = 0.5f;
        alpha = (float)exp((0.726) * (0.726)) / sigma;
        lamma = (float)exp(-alpha);
        *b2 = (float)exp(-2 * alpha);
        k = (1 - lamma) * (1 - lamma) / (1 + 2 * alpha * lamma - (*b2));
        *a0 = k; *a1 = k * (alpha - 1) * lamma;
        *a2 = k * (alpha + 1) * lamma;
        *a3 = -k * (*b2);
        *b1 = -2 * lamma;
        *cprev = (*a0 + *a1) / (1 + *b1 + *b2);
        *cnext = (*a2 + *a3) / (1 + *b1 + *b2);
    }
    
    void gaussianHorizontal(unsigned char * bufferPerLine, unsigned char * lpRowInitial, unsigned char  * lpColumn, int width, int height, int Channels, int Nwidth, float a0a1, float a2a3, float b1b2, float  cprev, float cnext)
    {
        int HeightStep = Channels*height;
        int WidthSubOne = width - 1;
        if (Channels == 3)
        {
            float prevOut[3];
            prevOut[0] = (lpRowInitial[0] * cprev);
            prevOut[1] = (lpRowInitial[1] * cprev);
            prevOut[2] = (lpRowInitial[2] * cprev);
            for (int x = 0; x < width; ++x) {
                prevOut[0] = ((lpRowInitial[0] * (a0a1)) - (prevOut[0] * (b1b2)));
                prevOut[1] = ((lpRowInitial[1] * (a0a1)) - (prevOut[1] * (b1b2)));
                prevOut[2] = ((lpRowInitial[2] * (a0a1)) - (prevOut[2] * (b1b2)));
                bufferPerLine[0] = prevOut[0];
                bufferPerLine[1] = prevOut[1];
                bufferPerLine[2] = prevOut[2];
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            lpColumn += HeightStep * WidthSubOne;
            bufferPerLine -= Channels;
            prevOut[0] = (lpRowInitial[0] * cnext);
            prevOut[1] = (lpRowInitial[1] * cnext);
            prevOut[2] = (lpRowInitial[2] * cnext);
    
            for (int x = WidthSubOne; x >= 0; --x) {
                prevOut[0] = ((lpRowInitial[0] * (a2a3)) - (prevOut[0] * (b1b2)));
                prevOut[1] = ((lpRowInitial[1] * (a2a3)) - (prevOut[1] * (b1b2)));
                prevOut[2] = ((lpRowInitial[2] * (a2a3)) - (prevOut[2] * (b1b2)));
                bufferPerLine[0] += prevOut[0];
                bufferPerLine[1] += prevOut[1];
                bufferPerLine[2] += prevOut[2];
                lpColumn[0] = bufferPerLine[0];
                lpColumn[1] = bufferPerLine[1];
                lpColumn[2] = bufferPerLine[2];
                lpRowInitial -= Channels;
                lpColumn -= HeightStep;
                bufferPerLine -= Channels;
            }
        }
        else if (Channels == 4)
        {
            float prevOut[4];
    
            prevOut[0] = (lpRowInitial[0] * cprev);
            prevOut[1] = (lpRowInitial[1] * cprev);
            prevOut[2] = (lpRowInitial[2] * cprev);
            prevOut[3] = (lpRowInitial[3] * cprev);
            for (int x = 0; x < width; ++x) {
                prevOut[0] = ((lpRowInitial[0] * (a0a1)) - (prevOut[0] * (b1b2)));
                prevOut[1] = ((lpRowInitial[1] * (a0a1)) - (prevOut[1] * (b1b2)));
                prevOut[2] = ((lpRowInitial[2] * (a0a1)) - (prevOut[2] * (b1b2)));
                prevOut[3] = ((lpRowInitial[3] * (a0a1)) - (prevOut[3] * (b1b2)));
    
                bufferPerLine[0] = prevOut[0];
                bufferPerLine[1] = prevOut[1];
                bufferPerLine[2] = prevOut[2];
                bufferPerLine[3] = prevOut[3];
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            lpColumn += HeightStep * WidthSubOne;
            bufferPerLine -= Channels;
    
            prevOut[0] = (lpRowInitial[0] * cnext);
            prevOut[1] = (lpRowInitial[1] * cnext);
            prevOut[2] = (lpRowInitial[2] * cnext);
            prevOut[3] = (lpRowInitial[3] * cnext);
    
            for (int x = WidthSubOne; x >= 0; --x) {
                prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
                prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
                prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
                prevOut[3] = ((lpRowInitial[3] * a2a3) - (prevOut[3] * b1b2));
                bufferPerLine[0] += prevOut[0];
                bufferPerLine[1] += prevOut[1];
                bufferPerLine[2] += prevOut[2];
                bufferPerLine[3] += prevOut[3];
                lpColumn[0] = bufferPerLine[0];
                lpColumn[1] = bufferPerLine[1];
                lpColumn[2] = bufferPerLine[2];
                lpColumn[3] = bufferPerLine[3];
                lpRowInitial -= Channels;
                lpColumn -= HeightStep;
                bufferPerLine -= Channels;
            }
        }
        else if (Channels == 1)
        {
            float prevOut = (lpRowInitial[0] * cprev);
    
            for (int x = 0; x < width; ++x) {
                prevOut = ((lpRowInitial[0] * (a0a1)) - (prevOut  * (b1b2)));
                bufferPerLine[0] = prevOut;
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            lpColumn += HeightStep*WidthSubOne;
            bufferPerLine -= Channels;
    
            prevOut = (lpRowInitial[0] * cnext);
    
            for (int x = WidthSubOne; x >= 0; --x) {
                prevOut = ((lpRowInitial[0] * a2a3) - (prevOut  * b1b2));
                bufferPerLine[0] += prevOut;
                lpColumn[0] = bufferPerLine[0];
                lpRowInitial -= Channels;
                lpColumn -= HeightStep;
                bufferPerLine -= Channels;
            }
        }
    }
    
    void gaussianVertical(unsigned char * bufferPerLine, unsigned char * lpRowInitial, unsigned char * lpColInitial, int height, int width, int Channels, float a0a1, float a2a3, float b1b2, float  cprev, float  cnext) {
    
        int WidthStep = Channels*width;
        int HeightSubOne = height - 1;
        if (Channels == 3)
        {
            float prevOut[3];
            prevOut[0] = (lpRowInitial[0] * cprev);
            prevOut[1] = (lpRowInitial[1] * cprev);
            prevOut[2] = (lpRowInitial[2] * cprev);
    
            for (int y = 0; y < height; y++) {
                prevOut[0] = ((lpRowInitial[0] * a0a1) - (prevOut[0] * b1b2));
                prevOut[1] = ((lpRowInitial[1] * a0a1) - (prevOut[1] * b1b2));
                prevOut[2] = ((lpRowInitial[2] * a0a1) - (prevOut[2] * b1b2));
                bufferPerLine[0] = prevOut[0];
                bufferPerLine[1] = prevOut[1];
                bufferPerLine[2] = prevOut[2];
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            bufferPerLine -= Channels;
            lpColInitial += WidthStep * HeightSubOne;
            prevOut[0] = (lpRowInitial[0] * cnext);
            prevOut[1] = (lpRowInitial[1] * cnext);
            prevOut[2] = (lpRowInitial[2] * cnext);
            for (int y = HeightSubOne; y >= 0; y--) {
                prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
                prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
                prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
                bufferPerLine[0] += prevOut[0];
                bufferPerLine[1] += prevOut[1];
                bufferPerLine[2] += prevOut[2];
                lpColInitial[0] = bufferPerLine[0];
                lpColInitial[1] = bufferPerLine[1];
                lpColInitial[2] = bufferPerLine[2];
                lpRowInitial -= Channels;
                lpColInitial -= WidthStep;
                bufferPerLine -= Channels;
            }
        }
        else if (Channels == 4)
        {
            float prevOut[4];
    
            prevOut[0] = (lpRowInitial[0] * cprev);
            prevOut[1] = (lpRowInitial[1] * cprev);
            prevOut[2] = (lpRowInitial[2] * cprev);
            prevOut[3] = (lpRowInitial[3] * cprev);
    
            for (int y = 0; y < height; y++) {
                prevOut[0] = ((lpRowInitial[0] * a0a1) - (prevOut[0] * b1b2));
                prevOut[1] = ((lpRowInitial[1] * a0a1) - (prevOut[1] * b1b2));
                prevOut[2] = ((lpRowInitial[2] * a0a1) - (prevOut[2] * b1b2));
                prevOut[3] = ((lpRowInitial[3] * a0a1) - (prevOut[3] * b1b2));
                bufferPerLine[0] = prevOut[0];
                bufferPerLine[1] = prevOut[1];
                bufferPerLine[2] = prevOut[2];
                bufferPerLine[3] = prevOut[3];
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            bufferPerLine -= Channels;
            lpColInitial += WidthStep*HeightSubOne;
            prevOut[0] = (lpRowInitial[0] * cnext);
            prevOut[1] = (lpRowInitial[1] * cnext);
            prevOut[2] = (lpRowInitial[2] * cnext);
            prevOut[3] = (lpRowInitial[3] * cnext);
            for (int y = HeightSubOne; y >= 0; y--) {
                prevOut[0] = ((lpRowInitial[0] * a2a3) - (prevOut[0] * b1b2));
                prevOut[1] = ((lpRowInitial[1] * a2a3) - (prevOut[1] * b1b2));
                prevOut[2] = ((lpRowInitial[2] * a2a3) - (prevOut[2] * b1b2));
                prevOut[3] = ((lpRowInitial[3] * a2a3) - (prevOut[3] * b1b2));
                bufferPerLine[0] += prevOut[0];
                bufferPerLine[1] += prevOut[1];
                bufferPerLine[2] += prevOut[2];
                bufferPerLine[3] += prevOut[3];
                lpColInitial[0] = bufferPerLine[0];
                lpColInitial[1] = bufferPerLine[1];
                lpColInitial[2] = bufferPerLine[2];
                lpColInitial[3] = bufferPerLine[3];
                lpRowInitial -= Channels;
                lpColInitial -= WidthStep;
                bufferPerLine -= Channels;
            }
        }
        else if (Channels == 1)
        {
            float prevOut = 0;
            prevOut = (lpRowInitial[0] * cprev);
            for (int y = 0; y < height; y++) {
                prevOut = ((lpRowInitial[0] * a0a1) - (prevOut * b1b2));
                bufferPerLine[0] = prevOut;
                bufferPerLine += Channels;
                lpRowInitial += Channels;
            }
            lpRowInitial -= Channels;
            bufferPerLine -= Channels;
            lpColInitial += WidthStep*HeightSubOne;
            prevOut = (lpRowInitial[0] * cnext);
            for (int y = HeightSubOne; y >= 0; y--) {
                prevOut = ((lpRowInitial[0] * a2a3) - (prevOut * b1b2));
                bufferPerLine[0] += prevOut;
                lpColInitial[0] = bufferPerLine[0];
                lpRowInitial -= Channels;
                lpColInitial -= WidthStep;
                bufferPerLine -= Channels;
            }
        }
    }
    //本人博客:http://tntmonks.cnblogs.com/ 转载请注明出处.
    void  GaussianBlurFilter(unsigned char * input, unsigned char * output, int Width, int Height, int Stride, float GaussianSigma) {
    
        int Channels = Stride / Width;
        float a0, a1, a2, a3, b1, b2, cprev, cnext;
    
        CalGaussianCoeff(GaussianSigma, &a0, &a1, &a2, &a3, &b1, &b2, &cprev, &cnext);
    
        float a0a1 = (a0 + a1);
        float a2a3 = (a2 + a3);
        float b1b2 = (b1 + b2); 
    
        int bufferSizePerThread = (Width > Height ? Width : Height) * Channels;
        unsigned char * bufferPerLine = (unsigned char*)malloc(bufferSizePerThread);
        unsigned char * tempData = (unsigned char*)malloc(Height * Stride);
        if (bufferPerLine == NULL || tempData == NULL)
        {
            if (tempData)
            {
                free(tempData);
            }
            if (bufferPerLine)
            {
                free(bufferPerLine);
            }
            return;
        }
        for (int y = 0; y < Height; ++y) {
            unsigned char * lpRowInitial = input + Stride * y;
            unsigned char * lpColInitial = tempData + y * Channels;
            gaussianHorizontal(bufferPerLine, lpRowInitial, lpColInitial, Width, Height, Channels, Width, a0a1, a2a3, b1b2, cprev, cnext);
        }
        int HeightStep = Height*Channels;
        for (int x = 0; x < Width; ++x) {
            unsigned char * lpColInitial = output + x*Channels;
            unsigned char * lpRowInitial = tempData + HeightStep * x;
            gaussianVertical(bufferPerLine, lpRowInitial, lpColInitial, Height, Width, Channels, a0a1, a2a3, b1b2, cprev, cnext);
        }
    
        free(bufferPerLine);
        free(tempData);
    }

    调用方法:

      GaussianBlurFilter(输入图像数据,输出图像数据,宽度,高度,通道数,强度)

      注:支持通道数分别为 1 ,3 ,4.

    关于IIR相关知识,参阅 百度词条 "IIR数字滤波器"

    http://baike.baidu.com/view/3088994.htm

    天下武功,唯快不破。
    本文只是抛砖引玉一下,若有其他相关问题或者需求也可以邮件联系俺探讨。

    邮箱地址是:
    gaozhihan@vip.qq.com

    题外话:

    很多网友一直推崇使用opencv,opencv的确十分强大,但是若是想要有更大的发展空间以及创造力.

    还是要一步一个脚印去实现一些最基本的算法,扎实的基础才是构建上层建筑的基本条件.

    俺目前只是把opencv当资料库来看,并不认为opencv可以用于绝大多数的商业项目.

    若本文帮到您,厚颜无耻求微信扫码打个赏.

  • 相关阅读:
    基本架构思想
    tensorflow|tf.train.slice_input_producer|tf.train.Coordinator|tf.train.start_queue_runners
    tfsenflow队列|tf.train.slice_input_producer|tf.train.Coordinator|tf.train.start_queue_runners
    tensorflow队列tf.FIFOQueue | enqueue | enqueue_many | dequeue | dequeue_many
    Tensorflow-gpu1.13.1 和 Tensorflow-gpu2.0.0共存之安装教程
    pandas相关操作
    Numpy的基本运算及操作
    Numpy基础之创建与属性
    Series序列
    os.walk|图片数据集
  • 原文地址:https://www.cnblogs.com/cpuimage/p/5291660.html
Copyright © 2011-2022 走看看