  • Understanding sklearn.preprocessing.MinMaxScaler

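    Formula

    MinMaxScaler is a very handy tool that rescales each feature of a dataset to a fixed range.

    Let's start with the simple case, scaling to [0, 1]. The formula is

    \( X_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}} \)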

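    MinMaxScaler can scale to an arbitrary range [MIN, MAX], so the more general formula is

    \( X_{std} = \frac{x - x_{min}}{x_{max} - x_{min}} \)
    \( X_{scaled} = X_{std} \cdot (MAX - MIN) + MIN \)

    When \(MIN\) and \(MAX\) are 0 and 1, this reduces to the [0, 1] scaling above.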
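
    To make min-max scaling to [MIN, MAX] concrete, here is a minimal pure-Python sketch for a single feature column. The function name min_max_scale and the sample values are made up for illustration and are not part of sklearn. The sketch simply rejects a constant column (where x_max equals x_min) instead of trying to imitate the library's behavior in that case.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Scale one feature column to feature_range (illustrative helper, not sklearn API)."""
    lo, hi = feature_range                    # MIN and MAX in the text
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumption of this sketch: refuse constant features outright.
        raise ValueError("constant feature: x_max == x_min")
    scaled = []
    for x in values:
        x_std = (x - x_min) / span            # step 1: map to [0, 1]
        scaled.append(x_std * (hi - lo) + lo) # step 2: stretch and shift to [MIN, MAX]
    return scaled


column = [1.0, 2.0, 3.0, 5.0]
unit = min_max_scale(column)                  # default [0, 1] range
wide = min_max_scale(column, (-1.0, 1.0))     # custom [-1, 1] range
print(unit)
print(wide)
```

    With the column [1, 2, 3, 5], the default range gives [0.0, 0.25, 0.5, 1.0] and the range (-1, 1) gives [-1.0, -0.5, 0.0, 1.0].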

    Code

    Now let's look at the source code.

        def transform(self, X):
            """Scale features of X according to feature_range.
            Parameters
            ----------
            X : array-like of shape (n_samples, n_features)
                Input data that will be transformed.
            Returns
            -------
            Xt : array-like of shape (n_samples, n_features)
                Transformed data.
            """
            check_is_fitted(self)
    
            X = check_array(X, copy=self.copy, dtype=FLOAT_DTYPES,
                            force_all_finite="allow-nan")
    
            X *= self.scale_
            X += self.min_
            return X
    
    """
        min_ : ndarray of shape (n_features,)
            Per feature adjustment for minimum. Equivalent to
            ``min - X.min(axis=0) * self.scale_``
        scale_ : ndarray of shape (n_features,)
            Per feature relative scaling of the data. Equivalent to
            ``(max - min) / (X.max(axis=0) - X.min(axis=0))``
            .. versionadded:: 0.17
               *scale_* attribute.
        data_min_ : ndarray of shape (n_features,)
            Per feature minimum seen in the data
            .. versionadded:: 0.17
               *data_min_*
        data_max_ : ndarray of shape (n_features,)
            Per feature maximum seen in the data
            .. versionadded:: 0.17
               *data_max_*
    """
    

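    Here scale_ corresponds to \( \frac{MAX - MIN}{x_{max} - x_{min}} \), so min_ corresponds to \( MIN - x_{min} \cdot \frac{MAX - MIN}{x_{max} - x_{min}} \). In other words, transform computes \( x \cdot scale\_ + min\_ = \frac{x - x_{min}}{x_{max} - x_{min}} \cdot (MAX - MIN) + MIN \). Precomputing these two parameters mainly makes the following inverse transform convenient: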

        def inverse_transform(self, X):
            """Undo the scaling of X according to feature_range.
            Parameters
            ----------
            X : array-like of shape (n_samples, n_features)
                Input data that will be transformed. It cannot be sparse.
            Returns
            -------
            Xt : array-like of shape (n_samples, n_features)
                Transformed data.
            """
            check_is_fitted(self)
    
            X = check_array(X, copy=self.copy, dtype=FLOAT_DTYPES,
                            force_all_finite="allow-nan")
    
            X -= self.min_
            X /= self.scale_
            return X
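
    The multiply-then-add in transform and the subtract-then-divide in inverse_transform can be checked by hand with NumPy. The sketch below is not the library code: it recomputes scale and min_offset from the expressions quoted in the attribute docstring, and the variable names and sample matrix are assumptions chosen for illustration. It assumes no feature is constant, so the denominator is never zero.

```python
import numpy as np

feature_range = (-1.0, 1.0)                 # (MIN, MAX), chosen for illustration
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [5.0, 50.0]])                 # 3 samples, 2 features

lo, hi = feature_range
data_min = X.min(axis=0)                    # per-feature minimum
data_max = X.max(axis=0)                    # per-feature maximum

# Same expressions as in the attribute docstring quoted above.
scale = (hi - lo) / (data_max - data_min)
min_offset = lo - data_min * scale

# Forward transform: one multiply and one add per element.
X_scaled = X * scale + min_offset

# The two-step formula, written out for comparison.
X_two_step = (X - data_min) / (data_max - data_min) * (hi - lo) + lo

# Inverse transform: undo the add, then undo the multiply.
X_restored = (X_scaled - min_offset) / scale

print(np.allclose(X_scaled, X_two_step))
print(np.allclose(X_restored, X))
```

    Both checks should print True up to floating-point tolerance: the affine form agrees with the two-step formula, and the inverse recovers the original data.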
    
  • Original URL: https://www.cnblogs.com/yaos/p/14083355.html