  • np.nan is an invalid document, expected byte or unicode string.

    ValueError                                Traceback (most recent call last)
    <ipython-input-12-1dc462ae8893> in <module>()
         15     print('cv prepared!')
         16     return df_x.astype(np.float64)
    ---> 17 df_test = get_feature(test_data,all_table,ready_cols,vec_col)
         18 df_train = get_feature(train_data,all_table,ready_cols,vec_col)
    
    <ipython-input-12-1dc462ae8893> in get_feature(df, all_data, cols, vec_col)
          9     cv=CountVectorizer()
         10     for feature in vec_col:
    ---> 11         cv.fit(all_data[feature])
         12         df_a = cv.transform(df[feature])
         13         df_x = sparse.hstack((df_x, df_a))
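
The same error can be reproduced without the rest of the pipeline. The snippet below is a minimal sketch, not the author's code: the `docs` list is made up for illustration, and it assumes only that `CountVectorizer` is fit on an iterable that contains `np.nan`.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical documents; the NaN entry stands in for a missing cell
# in a text column such as all_data[feature].
docs = ["red apple", np.nan, "green apple"]

try:
    CountVectorizer().fit(docs)
    error_message = None
except ValueError as exc:
    # CountVectorizer rejects NaN because it expects every document
    # to be a byte or unicode string.
    error_message = str(exc)

print(error_message)
```

If the message printed here matches the one in the traceback, the problem lies in the data passed to `cv.fit`, not in the vectorizer settings.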
    

     

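    import numpy as np
    from scipy import sparse
    from sklearn.feature_extraction.text import CountVectorizer

    def get_feature(df, all_data, cols, vec_col):
      # Start from the ready-made numeric columns.
      df_x = np.int64(df[cols])
      cv = CountVectorizer()
      for feature in vec_col:
        # Fit the vocabulary on the full table, then encode only the rows in df.
        # cv.fit raises the ValueError above if all_data[feature] contains NaN.
        cv.fit(all_data[feature])
        df_a = cv.transform(df[feature])
        df_x = sparse.hstack((df_x, df_a))
        print('Done Feature ' + str(feature))
      print('cv prepared!')
      return df_x.astype(np.float64)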

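    Cause analysis: all_data contained NaN values. When reading the data I had called all_table.fillna(-1), assuming it would fill the missing values, but the values in all_table that were originally NaN stayed unchanged. After changing the call to all_table.fillna(-1), the code ran. (Both calls appear identically in the original post. Since fillna returns a new DataFrame by default and leaves the caller untouched, the working version presumably kept the result, for example all_table = all_table.fillna(-1).)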
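
The difference between discarding and keeping the result of `fillna` can be checked on a toy table. This is a sketch under stated assumptions: the column name `tags` and its values are hypothetical, and the `astype(str)` cast is an addition of this example (not something the original post mentions) so that the filled `-1` is a string by the time it reaches `CountVectorizer`.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for all_table; the column name "tags" is illustrative.
all_table = pd.DataFrame(
    {"tags": pd.Series(["red apple", np.nan, "green apple"], dtype=object)}
)

# fillna returns a new DataFrame by default, so discarding the result
# leaves the NaN in place.
all_table.fillna(-1)
nan_left_without_assignment = int(all_table["tags"].isna().sum())

# Assigning the result back makes the fill take effect.
all_table = all_table.fillna(-1)
nan_left_with_assignment = int(all_table["tags"].isna().sum())

# Assumption: the filled column is cast to str, because CountVectorizer
# lowercases each document and an integer -1 has no .lower() method.
cv = CountVectorizer()
cv.fit(all_table["tags"].astype(str))
vocabulary = sorted(cv.vocabulary_)

print(nan_left_without_assignment, nan_left_with_assignment, vocabulary)
```

Passing `inplace=True` to `fillna` is another way to modify the table directly; assigning the result back is shown here because it makes the data flow explicit.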

  • Original article: https://www.cnblogs.com/smartwhite/p/9749168.html