  • Encoding labels as integers with sklearn's LabelEncoder

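    import joblib
    from sklearn.preprocessing import LabelEncoder

    def gen_label_encoder():
        labels = ['BB', 'CC']
        le = LabelEncoder()
        le.fit(labels)  # learn the sorted set of unique labels
        print('le.classes_', le.classes_)
        # show the integer code assigned to each label
        for label in le.classes_:
            print(label, le.transform([label])[0])
        # persist the fitted encoder; the data/ directory must already exist
        # (despite the .h5 extension, joblib writes a pickle, not HDF5)
        joblib.dump(le, 'data/label_encoder.h5')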
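    A saved encoder is only useful if it can be loaded again and produces the same mapping. The following is a minimal sketch of that round trip, assuming the standalone joblib package is installed; the helper name save_and_reload_encoder and the temporary file path are made up for illustration and are not part of the original post.

```python
import os
import tempfile

import joblib
from sklearn.preprocessing import LabelEncoder


def save_and_reload_encoder(labels, path):
    """Fit an encoder, dump it to `path`, and return the reloaded copy.

    Hypothetical helper, named here for illustration only.
    """
    le = LabelEncoder().fit(labels)
    joblib.dump(le, path)
    return joblib.load(path)


# Use a temporary directory so the sketch does not depend on data/ existing.
tmp_dir = tempfile.mkdtemp()
encoder_path = os.path.join(tmp_dir, 'label_encoder.joblib')

restored = save_and_reload_encoder(['BB', 'CC'], encoder_path)

# Encode labels to integers, then map the integers back to labels.
codes = restored.transform(['CC', 'BB', 'CC'])
decoded = restored.inverse_transform(codes)

print(list(codes))
print(list(decoded))
```

    The reloaded encoder keeps its classes_ array, so inverse_transform recovers the original labels from model predictions at inference time.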

    Documentation for LabelEncoder (the class docstring from the sklearn source):

    class LabelEncoder(BaseEstimator, TransformerMixin):
        """Encode labels with value between 0 and n_classes-1.

        Read more in the :ref:`User Guide <preprocessing_targets>`.

        Attributes
        ----------
        classes_ : array of shape (n_class,)
            Holds the label for each class.

        Examples
        --------
        `LabelEncoder` can be used to normalize labels.

        >>> from sklearn import preprocessing
        >>> le = preprocessing.LabelEncoder()
        >>> le.fit([1, 2, 2, 6])
        LabelEncoder()
        >>> le.classes_
        array([1, 2, 6])
        >>> le.transform([1, 1, 2, 6]) #doctest: +ELLIPSIS
        array([0, 0, 1, 2]...)
        >>> le.inverse_transform([0, 0, 1, 2])
        array([1, 1, 2, 6])

        It can also be used to transform non-numerical labels (as long as they are
        hashable and comparable) to numerical labels.

        >>> le = preprocessing.LabelEncoder()
        >>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
        LabelEncoder()
        >>> list(le.classes_)
        ['amsterdam', 'paris', 'tokyo']
        >>> le.transform(["tokyo", "tokyo", "paris"]) #doctest: +ELLIPSIS
        array([2, 2, 1]...)
        >>> list(le.inverse_transform([2, 2, 1]))
        ['tokyo', 'tokyo', 'paris']

        See also
        --------
        sklearn.preprocessing.OneHotEncoder : encode categorical integer features
            using a one-hot aka one-of-K scheme.
        """
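    The docstring examples show that codes follow the sorted order of the unique labels. They do not show what happens with a label that was absent during fit: transform raises ValueError in that case. The sketch below illustrates both points. The safe_transform helper and its unknown=-1 fallback are assumptions added for illustration and are not part of sklearn.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(['tokyo', 'paris', 'amsterdam', 'paris'])

# Codes are positions in the sorted array of unique labels (le.classes_).
mapping = {str(label): code for code, label in enumerate(le.classes_)}
print(mapping)

# A label that was not seen during fit makes transform raise ValueError.
try:
    le.transform(['berlin'])
    raised = False
except ValueError:
    raised = True
print('unseen label raised ValueError:', raised)


def safe_transform(encoder, labels, unknown=-1):
    """Hypothetical helper: map unseen labels to `unknown` instead of raising."""
    known = {str(label): idx for idx, label in enumerate(encoder.classes_)}
    return [known.get(label, unknown) for label in labels]


safe_codes = safe_transform(le, ['tokyo', 'berlin', 'amsterdam'])
print(safe_codes)
```

    Whether a sentinel such as -1 is appropriate depends on the downstream model; refitting the encoder on the full label set is often the cleaner option.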
  • Original article: https://www.cnblogs.com/bymo/p/7404541.html