  • Machine Learning with sklearn (10): Data Processing (5) Custom Transformers

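    In machine learning you often want to turn an existing Python function into a transformer that helps with data cleaning or processing. FunctionTransformer builds a transformer from an arbitrary function. For example, a transformer that applies a log transformation, suitable for use in a pipeline, can be built like this: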

    >>> import numpy as np
    >>> from sklearn.preprocessing import FunctionTransformer
    >>> transformer = FunctionTransformer(np.log1p, validate=True)
    >>> X = np.array([[0, 1], [2, 3]])
    >>> transformer.transform(X)
    array([[0.        , 0.69314718],
           [1.09861229, 1.38629436]])
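
    The snippet above calls the transformer on its own. As a minimal sketch of the pipeline use mentioned earlier, the transformer can be chained with another step; the choice of StandardScaler as the second step and the sample data are assumptions made for illustration only.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Log-transform first, then standardize each column.
pipeline = make_pipeline(
    FunctionTransformer(np.log1p, validate=True),
    StandardScaler(),
)

X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
X_scaled = pipeline.fit_transform(X)

# After StandardScaler, every column has (approximately) zero mean.
print(X_scaled.mean(axis=0))
```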

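    By setting check_inverse=True and calling fit before transform, you can verify that func and inverse_func are inverses of each other. Note that a mismatch raises a warning rather than an exception; the warning can be turned into an error with filterwarnings.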
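
    A minimal sketch of this check, assuming the mismatch is reported as a UserWarning. The deliberately wrong pairing of np.log1p with np.exp is an illustrative choice, not something taken from the original post.

```python
import warnings

import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

# expm1 really is the inverse of log1p, so fit() stays silent.
good = FunctionTransformer(func=np.log1p, inverse_func=np.expm1, check_inverse=True)
good.fit(X)
round_trip = good.inverse_transform(good.transform(X))

# exp is NOT the inverse of log1p, so fit() should emit a warning.
bad = FunctionTransformer(func=np.log1p, inverse_func=np.exp, check_inverse=True)

mismatch_detected = False
with warnings.catch_warnings():
    # Promote UserWarning to an exception inside this block only.
    warnings.filterwarnings("error", category=UserWarning)
    try:
        bad.fit(X)
    except UserWarning:
        mismatch_detected = True

print(mismatch_detected)
```

    Restricting the filter to a catch_warnings block keeps the stricter behaviour local, so unrelated warnings elsewhere in a program are not turned into errors.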

    For an example of using the FunctionTransformer class for custom feature selection, see Using FunctionTransformer to select columns.
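
    As a rough sketch of what column selection with FunctionTransformer can look like: the helper drop_first_column below is hypothetical and is not copied from the referenced example.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer


def drop_first_column(X):
    """Hypothetical helper: keep every column except the first."""
    return X[:, 1:]


selector = FunctionTransformer(drop_first_column, validate=True)

X = np.array([[0, 1, 2], [3, 4, 5]])
X_selected = selector.transform(X)
print(X_selected)
```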

    class sklearn.preprocessing.FunctionTransformer(func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, kw_args=None, inv_kw_args=None)

    Constructs a transformer from an arbitrary callable.

    A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc.

    Note: If a lambda is used as the function, then the resulting transformer will not be pickleable.
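
    The pickling limitation can be seen in a short sketch. The exact exception type raised for the lambda may vary between Python versions, so several are caught here as an assumption.

```python
import pickle

import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])

# A function that can be looked up by name (here a NumPy ufunc) pickles fine.
named = FunctionTransformer(np.log1p)
restored = pickle.loads(pickle.dumps(named))
restored_output = restored.transform(X)

# A lambda has no importable name, so pickling the transformer fails.
anonymous = FunctionTransformer(lambda X: np.log1p(X))
try:
    pickle.dumps(anonymous)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_pickled = False

print(lambda_pickled)
```

    If a transformer has to be saved to disk, wrapping the logic in a named, module-level function instead of a lambda avoids the problem.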

    New in version 0.17.

    Read more in the User Guide.

    Parameters
    func : callable, default=None

    The callable to use for the transformation. This will be passed the same arguments as transform, with args and kwargs forwarded. If func is None, then func will be the identity function.

    inverse_func : callable, default=None

    The callable to use for the inverse transformation. This will be passed the same arguments as inverse_transform, with args and kwargs forwarded. If inverse_func is None, then inverse_func will be the identity function.

    validate : bool, default=False

    Indicate that the input X array should be checked before calling func. The possibilities are:

    • If False, there is no input validation.

    • If True, then X will be converted to a 2-dimensional NumPy array or sparse matrix. If the conversion is not possible an exception is raised.

    Changed in version 0.22: The default of validate changed from True to False.
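
    The two validate settings can be compared in a small sketch. The 1-D input is an illustrative case of data that cannot be treated as a 2-dimensional array; the assumption is that the rejection surfaces as a ValueError.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X_1d = np.array([0.0, 1.0, 2.0])

# validate=False (the default): X is handed to func unchanged.
lenient = FunctionTransformer(np.log1p, validate=False)
out_1d = lenient.transform(X_1d)

# validate=True: a 1-D input is not a valid 2-D array, so it is rejected.
strict = FunctionTransformer(np.log1p, validate=True)
try:
    strict.transform(X_1d)
    rejected = False
except ValueError:
    rejected = True

# validate=True also converts list-of-lists input to a NumPy array first.
out_2d = strict.transform([[0, 1], [2, 3]])
print(rejected, type(out_2d).__name__)
```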

    accept_sparse : bool, default=False

    Indicate that func accepts a sparse matrix as input. If validate is False, this has no effect. Otherwise, if accept_sparse is False, sparse matrix inputs will cause an exception to be raised.
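
    A sketch of how accept_sparse interacts with validate=True, using a SciPy CSR matrix. The helper sparse_log1p is hypothetical, and catching TypeError for the rejected case is an assumption about how the validation error is reported.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import FunctionTransformer


def sparse_log1p(X):
    """Hypothetical helper: element-wise log1p that keeps X sparse."""
    return X.log1p()


X_dense = np.array([[0.0, 1.0], [2.0, 0.0]])
X_sparse = sparse.csr_matrix(X_dense)

# validate=True with accept_sparse=True: the sparse input reaches func.
sparse_ok = FunctionTransformer(sparse_log1p, validate=True, accept_sparse=True)
out = sparse_ok.transform(X_sparse)

# validate=True with accept_sparse=False: the sparse input is rejected.
dense_only = FunctionTransformer(np.log1p, validate=True, accept_sparse=False)
try:
    dense_only.transform(X_sparse)
    rejected = False
except TypeError:
    rejected = True

print(sparse.issparse(out), rejected)
```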

    check_inverse : bool, default=True

    Whether to check that func followed by inverse_func leads to the original inputs. It can be used for a sanity check, raising a warning when the condition is not fulfilled.

    New in version 0.20.

    kw_args : dict, default=None

    Dictionary of additional keyword arguments to pass to func.

    New in version 0.18.

    inv_kw_args : dict, default=None

    Dictionary of additional keyword arguments to pass to inverse_func.

    New in version 0.18.
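
    How kw_args and inv_kw_args are forwarded can be sketched as follows. The functions scale and unscale and their factor parameter are hypothetical names introduced for illustration.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer


def scale(X, factor):
    """Hypothetical forward function: multiply by a constant."""
    return X * factor


def unscale(X, factor):
    """Hypothetical inverse function: divide by the same constant."""
    return X / factor


transformer = FunctionTransformer(
    func=scale,
    inverse_func=unscale,
    kw_args={"factor": 10.0},      # forwarded as scale(X, factor=10.0)
    inv_kw_args={"factor": 10.0},  # forwarded as unscale(X, factor=10.0)
)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_scaled = transformer.fit_transform(X)
X_back = transformer.inverse_transform(X_scaled)
print(X_scaled)
```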

    Methods

    fit(X[, y])

    Fit transformer by checking X.

    fit_transform(X[, y])

    Fit to data, then transform it.

    get_params([deep])

    Get parameters for this estimator.

    inverse_transform(X)

    Transform X using the inverse function.

    set_params(**params)

    Set the parameters of this estimator.

    transform(X)

    Transform X using the forward function.
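
    The methods listed above can be exercised together in a brief sketch. Pairing np.log1p with np.expm1 is an illustrative choice.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

X = np.array([[0.0, 1.0], [2.0, 3.0]])
transformer = FunctionTransformer(func=np.log1p, inverse_func=np.expm1)

X_log = transformer.fit_transform(X)           # fit, then apply log1p
X_back = transformer.inverse_transform(X_log)  # apply expm1

# get_params / set_params read and update the constructor arguments.
validate_before = transformer.get_params()["validate"]
transformer.set_params(validate=True)
validate_after = transformer.get_params()["validate"]
print(validate_before, validate_after)
```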

    Examples

    >>> import numpy as np
    >>> from sklearn.preprocessing import FunctionTransformer
    >>> transformer = FunctionTransformer(np.log1p)
    >>> X = np.array([[0, 1], [2, 3]])
    >>> transformer.transform(X)
    array([[0.       , 0.6931...],
           [1.0986..., 1.3862...]])
  • Original post: https://www.cnblogs.com/qiu-hua/p/14903451.html