zoukankan      html  css  js  c++  java
  • MATLAB消除曲线毛刺Outlier Detection and Removal [hampel]

    MATLAB消除曲线毛刺Outlier Detection and Removal [hampel]
    
    function [YY, I, Y0, LB, UB, ADX, NO] = hampel(X, Y, DX, T, varargin)
    
    % HAMPEL Hampel Filter.
    
    % HAMPEL(X,Y,DX,T,varargin) returns the Hampel filtered values of the
    
    % elements in Y. It was developed to detect outliers in a time series,
    
    % but it can also be used as an alternative to the standard median
    
    % filter.
    
    %
    
    % References
    
    % Chapters 1.4.2, 3.2.2 and 4.3.4 in Mining Imperfect Data: Dealing with
    
    % Contamination and Incomplete Records by Ronald K. Pearson.
    
    %
    
    % Acknowledgements
    
    % I would like to thank Ronald K. Pearson for the introduction to moving
    
    % window filters. Please visit his blog at:
    
    % http://exploringdatablog.blogspot.com/2012/01/moving-window-filters-and
    
    % -pracma.html
    
    %
    
    % X,Y are row or column vectors with an equal number of elements.
    
    % The elements in Y should be Gaussian distributed.
    
    %
    
    % Input DX,T,varargin must not contain NaN values!
    
    %
    
    % DX,T are optional scalar values.
    
    % DX is a scalar which defines the half width of the filter window.
    
    % It is required that DX > 0 and DX should be dimensionally equivalent to
    
    % the values in X.
    
    % T is a scalar which defines the threshold value used in the equation
    
    % |Y - Y0| > T*S0.
    
    %
    
    % Standard Parameters for DX and T:
    
    % DX = 3*median(X(2:end)-X(1:end-1));
    
    % T = 3;
    
    %
    
    % varargin covers addtional optional input. The optional input must be in
    
    % the form of 'PropertyName', PropertyValue.
    
    % Supported PropertyNames:
    
    % 'standard': Use the standard Hampel filter.
    
    % 'adaptive': Use an experimental adaptive Hampel filter. Explained under
    
    % Revision 1 details below.
    
    %
    
    % Supported PropertyValues: Scalar value which defines the tolerance of
    
    % the adaptive filter. In the case of standard Hampel filter this value
    
    % is ignored.
    
    %
    
    % Output YY,I,Y0,LB,UB,ADX are column vectors containing Hampel filtered
    
    % values of Y, a logical index of the replaced values, nominal data,
    
    % lower and upper bounds on the Hampel filter and the relative half size
    
    % of the local window, respectively.
    
    %
    
    % NO is a scalar that specifies the Number of Outliers detected.
    
    %
    
    % Examples
    
    % 1. Hampel filter removal of outliers
    
    % X = 1:1000; % Pseudo Time
    
    % Y = 5000 + randn(1000, 1); % Pseudo Data
    
    % Outliers = randi(1000, 10, 1); % Index of Outliers
    
    % Y(Outliers) = Y(Outliers) + randi(1000, 10, 1); % Pseudo Outliers
    
    % [YY,I,Y0,LB,UB] = hampel(X,Y);
    
    %
    
    % plot(X, Y, 'b.'); hold on; % Original Data
    
    % plot(X, YY, 'r'); % Hampel Filtered Data
    
    % plot(X, Y0, 'b--'); % Nominal Data
    
    % plot(X, LB, 'r--'); % Lower Bounds on Hampel Filter
    
    % plot(X, UB, 'r--'); % Upper Bounds on Hampel Filter
    
    % plot(X(I), Y(I), 'ks'); % Identified Outliers
    
    %
    
    % 2. Adaptive Hampel filter removal of outliers
    
    % DX = 1; % Window Half size
    
    % T = 3; % Threshold
    
    % Threshold = 0.1; % AdaptiveThreshold
    
    % X = 1:DX:1000; % Pseudo Time
    
    % Y = 5000 + randn(1000, 1); % Pseudo Data
    
    % Outliers = randi(1000, 10, 1); % Index of Outliers
    
    % Y(Outliers) = Y(Outliers) + randi(1000, 10, 1); % Pseudo Outliers
    
    % [YY,I,Y0,LB,UB] = hampel(X,Y,DX,T,'Adaptive',Threshold);
    
    %
    
    % plot(X, Y, 'b.'); hold on; % Original Data
    
    % plot(X, YY, 'r'); % Hampel Filtered Data
    
    % plot(X, Y0, 'b--'); % Nominal Data
    
    % plot(X, LB, 'r--'); % Lower Bounds on Hampel Filter
    
    % plot(X, UB, 'r--'); % Upper Bounds on Hampel Filter
    
    % plot(X(I), Y(I), 'ks'); % Identified Outliers
    
    %
    
    % 3. Median Filter Based on Filter Window
    
    % DX = 3; % Filter Half Size
    
    % T = 0; % Threshold
    
    % X = 1:1000; % Pseudo Time
    
    % Y = 5000 + randn(1000, 1); % Pseudo Data
    
    % [YY,I,Y0] = hampel(X,Y,DX,T);
    
    %
    
    % plot(X, Y, 'b.'); hold on; % Original Data
    
    % plot(X, Y0, 'r'); % Median Filtered Data
    
    %
    
    % Version: 1.5
    
    % Last Update: 09.02.2012
    
    %
    
    % Copyright (c) 2012:
    
    % Michael Lindholm Nielsen
    
    %
    
    % --- Revision 5 --- 09.02.2012
    
    % (1) Corrected potential error in internal median function.
    
    % (2) Removed internal "keyboard" command.
    
    % (3) Optimized internal Gauss filter.
    
    %
    
    % --- Revision 4 --- 08.02.2012
    
    % (1) The elements in X and Y are now temporarily sorted for internal
    
    % computations.
    
    % (2) Performance optimization.
    
    % (3) Added Example 3.
    
    %
    
    % --- Revision 3 --- 06.02.2012
    
    % (1) If the number of elements (X,Y) are below 2 the output YY will be a
    
    % copy of Y. No outliers will be detected. No error will be issued.
    
    %
    
    % --- Revision 2 --- 05.02.2012
    
    % (1) Changed a calculation in the adaptive Hampel filter. The threshold
    
    % parameter is now compared to the percentage difference between the
    
    % j'th and the j-1 value. Also notice the change from Threshold = 1.1
    
    % to Threshold = 0.1 in example 2 above.
    
    % (2) Checks if DX,T or varargin contains NaN values.
    
    % (3) Now capable of ignoring NaN values in X and Y.
    
    % (4) Added output Y0 - Nominal Data.
    
    %
    
    % --- Revision 1 --- 28.01.2012
    
    % (1) Replaced output S (Local Scaled Median Absolute Deviation) with
    
    % lower (LB) and upper (UB) bounds on the Hampel filter.
    
    % (2) Added option to use an experimental adaptive Hampel filter.
    
    % The Principle behind this filter is described below.
    
    % a) The filter changes the local window size until the change in the
    
    % local scaled median absolute deviation is below a threshold value
    
    % set by the user. In the above example (2) this parameter is set to
    
    % 0.1 corresponding to a maximum acceptable change of 10% in the
    
    % local scaled median absolute deviation. This process leads to three
    
    % locally optimized parameters Y0 (Local Nominal Data Reference
    
    % value), S0 (Local Scale of Natural Variation), ADX (Local Adapted
    
    % Window half size relative to DX).
    
    % b) The optimized parameters are then smoothed by a Gaussian filter with
    
    % a standard deviation of DX=2*median(XSort(2:end) - XSort(1:end-1)).
    
    % This means that local values are weighted highest, but nearby data
    
    % (which should be Gaussian distributed) is also used in refining
    
    % ADX, Y0, S0.
    
    %
    
    % --- Revision 0 --- 26.01.2012
    
    % (1) Release of first edition.
    
     
    
    %% Error Checking
    
    % Check for correct number of input arguments
    
    if nargin < 2
    
    error('Not enough input arguments.');
    
    end
    
     
    
    % Check that the number of elements in X match those of Y.
    
    if ~isequal(numel(X), numel(Y))
    
    error('Inputs X and Y must have the same number of elements.');
    
    end
    
     
    
    % Check that X is either a row or column vector
    
    if size(X, 1) == 1
    
    X = X'; % Change to column vector
    
    elseif size(X, 2) == 1
    
    else
    
    error('Input X must be either a row or column vector.')
    
    end
    
     
    
    % Check that Y is either a row or column vector
    
    if size(Y, 1) == 1
    
    Y = Y'; % Change to column vector
    
    elseif size(Y, 2) == 1
    
    else
    
    error('Input Y must be either a row or column vector.')
    
    end
    
     
    
    % Sort X
    
    SortX = sort(X);
    
     
    
    % Check that DX is of type scalar
    
    if exist('DX', 'var')
    
    if ~isscalar(DX)
    
    error('DX must be a scalar.');
    
    elseif DX < 0
    
    error('DX must be larger than zero.');
    
    end
    
    else
    
    DX = 3*median(SortX(2:end) - SortX(1:end-1));
    
    end
    
     
    
    % Check that T is of type scalar
    
    if exist('T', 'var')
    
    if ~isscalar(T)
    
    error('T must be a scalar.');
    
    end
    
    else
    
    T = 3;
    
    end
    
     
    
    % Check optional input
    
    if isempty(varargin)
    
    Option = 'standard';
    
    elseif numel(varargin) < 2
    
    error('Optional input must also contain threshold value.');
    
    else
    
    % varargin{1}
    
    if ischar(varargin{1})
    
    Option = varargin{1};
    
    else
    
    error('PropertyName must be of type char.');
    
    end
    
    % varargin{2}
    
    if isscalar(varargin{2})
    
    Threshold = varargin{2};
    
    else
    
    error('PropertyValue value must be a scalar.');
    
    end
    
    end
    
     
    
    % Check that DX,T does not contain NaN values
    
    if any(isnan(DX) | isnan(T))
    
    error('Inputs DX and T must not contain NaN values.');
    
    end
    
     
    
    % Check that varargin does not contain NaN values
    
    CheckNaN = cellfun(@isnan, varargin, 'UniformOutput', 0);
    
    if any(cellfun(@any, CheckNaN))
    
    error('Optional inputs must not contain NaN values.');
    
    end
    
     
    
    % Detect/Ignore NaN values in X and Y
    
    IdxNaN = isnan(X) | isnan(Y);
    
    X = X(~IdxNaN);
    
    Y = Y(~IdxNaN);
    
     
    
    %% Calculation
    
    % Preallocation
    
    YY = Y;
    
    I = false(size(Y));
    
    S0 = NaN(size(YY));
    
    Y0 = S0;
    
    ADX = repmat(DX, size(Y));
    
     
    
    if numel(X) > 1
    
    switch lower(Option)
    
    case 'standard'
    
    for i = 1:numel(Y)
    
    % Calculate Local Nominal Data Reference value
    
    % and Local Scale of Natural Variation
    
    [Y0(i), S0(i)] = localwindow(X, Y, DX, i);
    
    end
    
    case 'adaptive'
    
    % Preallocate
    
    Y0Tmp = S0;
    
    S0Tmp = S0;
    
    DXTmp = (1:numel(S0))'*DX; % Integer variation of Window Half Size
    
     
    
    % Calculate Initial Guess of Optimal Parameters Y0, S0, ADX
    
    for i = 1:numel(Y)
    
    % Setup/Reset temporary counter etc.
    
    j = 1;
    
    S0Rel = inf;
    
    while S0Rel > Threshold
    
    % Calculate Local Nominal Data Reference value
    
    % and Local Scale of Natural Variation using DXTmp window
    
    [Y0Tmp(j), S0Tmp(j)] = localwindow(X, Y, DXTmp(j), i);
    
     
    
    % Calculate percent difference relative to previous value
    
    if j > 1
    
    S0Rel = abs((S0Tmp(j-1) - S0Tmp(j))/(S0Tmp(j-1) + S0Tmp(j))/2);
    
    end
    
     
    
    % Iterate counter
    
    j = j + 1;
    
    end
    
    Y0(i) = Y0Tmp(j - 2); % Local Nominal Data Reference value
    
    S0(i) = S0Tmp(j - 2); % Local Scale of Natural Variation
    
    ADX(i) = DXTmp(j - 2)/DX; % Local Adapted Window size relative to DX
    
    end
    
     
    
    % Gaussian smoothing of relevant parameters
    
    DX = 2*median(SortX(2:end) - SortX(1:end-1));
    
    ADX = smgauss(X, ADX, DX);
    
    S0 = smgauss(X, S0, DX);
    
    Y0 = smgauss(X, Y0, DX);
    
    otherwise
    
    error('Unknown option ''%s''.', varargin{1});
    
    end
    
    end
    
     
    
    %% Prepare Output
    
    UB = Y0 + T*S0; % Save information about local scale
    
    LB = Y0 - T*S0; % Save information about local scale
    
    Idx = abs(Y - Y0) > T*S0; % Index of possible outlier
    
    YY(Idx) = Y0(Idx); % Replace outliers with local median value
    
    I(Idx) = true; % Set Outlier detection
    
    NO = sum(I); % Output number of detected outliers
    
     
    
    % Reinsert NaN values detected at error checking stage
    
    if any(IdxNaN)
    
    [YY, I, Y0, LB, UB, ADX] = rescale(IdxNaN, YY, I, Y0, LB, UB, ADX);
    
    end
    
     
    
    %% Built-in functions
    
    function [Y0, S0] = localwindow(X, Y, DX, i)
    
    % Index relevant to Local Window
    
    Idx = X(i) - DX <= X & X <= X(i) + DX;
    
     
    
    % Calculate Local Nominal Data Reference Value
    
    Y0 = median(Y(Idx));
    
     
    
    % Calculate Local Scale of Natural Variation
    
    S0 = 1.4826*median(abs(Y(Idx) - Y0));
    
    end
    
     
    
    function M = median(YM)
    
    % Isolate relevant values in Y
    
    YM = sort(YM);
    
    NYM = numel(YM);
    
     
    
    % Calculate median
    
    if mod(NYM,2) % Uneven
    
    M = YM((NYM + 1)/2);
    
    else % Even
    
    M = (YM(NYM/2)+YM(NYM/2+1))/2;
    
    end
    
    end
    
     
    
    function G = smgauss(X, V, DX)
    
    % Prepare Xj and Xk
    
    Xj = repmat(X', numel(X), 1);
    
    Xk = repmat(X, 1, numel(X));
    
     
    
    % Calculate Gaussian weight
    
    Wjk = exp(-((Xj - Xk)/(2*DX)).^2);
    
     
    
    % Calculate Gaussian Filter
    
    G = Wjk*V./sum(Wjk,1)';
    
    end
    
     
    
    function varargout = rescale(IdxNaN, varargin)
    
    % Output Rescaled Elements
    
    varargout = cell(nargout, 1);
    
    for k = 1:nargout
    
    Element = varargin{k};
    
     
    
    if islogical(Element)
    
    ScaledElement = false(size(IdxNaN));
    
    elseif isnumeric(Element)
    
    ScaledElement = NaN(size(IdxNaN));
    
    end
    
     
    
    ScaledElement(~IdxNaN) = Element;
    
    varargout(k) = {ScaledElement};
    
    end
    
    end
    
    end
    
     
    View Code

    hampel.m

  • 相关阅读:
    QFramework 使用指南 2020(二):下载与版本介绍
    QFramework 使用指南 2020 (一): 概述
    Unity 游戏框架搭建 2018 (二) 单例的模板与最佳实践
    Unity 游戏框架搭建 2018 (一) 架构、框架与 QFramework 简介
    Unity 游戏框架搭建 2017 (二十三) 重构小工具 Platform
    Unity 游戏框架搭建 2017 (二十二) 简易引用计数器
    Unity 游戏框架搭建 2017 (二十一) 使用对象池时的一些细节
    你确定你会写 Dockerfile 吗?
    小白学 Python 爬虫(8):网页基础
    老司机大型车祸现场
  • 原文地址:https://www.cnblogs.com/gisalameda/p/8330145.html
Copyright © 2011-2022 走看看