zoukankan      html  css  js  c++  java
  • 《Python数据科学手册》异常校正


    由于一些模块的变迁,导致复现《python数据科学手册》代码(尤其第5章-机器学习)时,经常报错。

    以下是我个人的一些校证。

    如果诸位在学习《python数据科学手册》的过程中,遇到什么疑难,欢迎留言。
    1. scikit-learn.cross_validation 模块变迁
    自 `scikit-learn 0.20 `版起,已经用`model_selection`模块代替`cross_validation`模块。因此,复现代码时,`from sklearn.cross_validation import xxx` 时,会报出`ModuleNotFoundError: No module named 'sklearn.cross_validation'`的错误。P307:

    
    In[15]:from sklearn.cross_validation import train_test_split # Error
    In[15]:from sklearn.model_selection import train_test_split # Amend
    
    In[20]:
    from sklearn.mixture import GMM # Error
    from sklearn.mixture import GaussianMixture # Amend
    
    In[5]: from sklearn.cross_validation import train_test_split # Error
    In[5]: from sklearn.model_selection import train_test_split # Amend
    
    # 用 model_selection 替换 cross_validation In[7]: from sklearn.cross_validation import cross_val_scroe # Error In[7]: from sklearn.model_selection import cross_val_scroe # Amend
    In[8]: from sklearn.cross_validation import cross_val_scroe # Error scores = cross_val_score(model, X, y, cv=LeaveOneOut(len(X)) # Error In[8]: from sklearn.model_selection import cross_val_scroe # Amend scores = cross_val_score(model, X, y, cv=LeaveOneOut() # Amend,去掉 len(X)

    2. scikit-learn.learning_curve 模块变迁
    自 `scikit-learn 0.20 `版起,已经用`model_selection`模块代替`learning_curve`模块。因此,复现代码时,`from sklearn.learning_curve import xxx` 时,会报出`ModuleNotFoundError: No module named 'sklearn.learning_curve'`的错误。

    P321:
    In[13]:
    from sklearn.learning_curve import validation_curve  # Error
    from sklearn.model_selection import validation_curve  # Amend

    P325: In[17]: from sklearn.learning_curve import learning_curve # Error from sklearn.model_selection import learning_curve # Amend

    3. scikit-learn.grid_search 模块变迁

    自 `scikit-learn 0.20 `版起,已经用`model_selection`模块代替`grid_search`模块。因此,复现代码时,`from sklearn.grid_search import xxx` 时,会报出`ModuleNotFoundError: No module named 'sklearn.grid_search'`的错误。

    P326
    
    In[18]:from sklearn.grid_search import GridSearchCV # Error
    In[18]:from sklearn.model_selection import GridSearchCV # Amend
    
    In[21]: plt.plot(X_test.ravel(), y_test, hold=True); # Error
    In[21]: plt.plot(X_test.ravel(), y_test); # Amend, 去掉 hold=True

    4. 其他错误

    P248:
    In[3]:
    ax = plt.axes(axisbg='#E6E6E6') # Error
    ax = plt.axes(facecolor='#E6E6E6') # Amend, axisbg -> facecolor
    
    
    P275:
    In[6]:
    plt.hist(data[col], normed=True, alpha=0.5) # Error
    plt.hist(data[col], density=True, alpha=0.5) # Amend, normed -> density
    
    
    P248:
    In[3]:
    ax = plt.axes(axisbg='#E6E6E6') # Error
    ax = plt.axes(facecolor='#E6E6E6') # Amend, axisbg -> facecolor
    
    
    P275:
    In[6]:
    plt.hist(data[col], normed=True, alpha=0.5) # Error
    plt.hist(data[col], density=True, alpha=0.5) # Amend, normed -> density
    
    
    P279:
    In[13]:
    sns.pairplot(iris, hue='species', size=2.5) # Error
    sns.pairplot(iris, hue='species', height=2.5) # Amend, size -> height
    
    
    P301:
    In[2]:
    sns.parirplot(iris, hue='species', size=1.5); # Error
    sns.parirplot(iris, hue='species', height=1.5); # Amend, size 改为 height
    
    
    P349:
    In[14]:
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Error
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Amend, 599021.csv -> BicycleWeather.csv
    
    In[15]: daily = counts.resample('d', how='sum') # Error
    In[15]: daily = counts.resample('d').sum() # Amend
    
    
    P361:
    In[14]: clf = SVC(kernel='rbf', C=1E6) # Error
    In[14]: clf = SVC(kernel='rbf', C=1E6, gamma='auto') # Amend, add gamma='auto'
    
    
    P363
    In[20]:
    from sklearn.decomposition import RandomizedPCA # Error
    pac = RandomizedPCA(n_components=150, whiten=True, random_state=42) # Error
    
    from sklearn.decomposition import PCA # Amend, RandomizedPCA -> PCA
    pac = PCA(n_components=150, whiten=True, random_state=42) # Amend, RandomizedPCA -> PCA
    
    
    P364:
    In[21]: from sklearn.cross_validation import train_test_split # Error
    In[21]: from sklearn.model_selection import train_test_split # Amend
    
    In[22]: from sklearn.grid_search import GridSearchCV # Error
    grid = GridSearchCV(model, param_grid) # Error
    In[22]: from sklearn.model_selection import GridSearchCV # Amend
    grid = GridSearchCV(model, param_grid, cv=3) # Amend, add cv=3
    
    
    P279:
    In[13]:
    sns.pairplot(iris, hue='species', size=2.5) # Error
    sns.pairplot(iris, hue='species', height=2.5) # Amend, size -> height
    
    
    P301:
    In[2]:
    sns.parirplot(iris, hue='species', size=1.5); # Error
    sns.parirplot(iris, hue='species', height=1.5); # Amend, size 改为 height
    
    
    P349:
    In[14]:
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Error
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Amend, 599021.csv -> BicycleWeather.csv
    
    In[15]: daily = counts.resample('d', how='sum') # Error
    In[15]: daily = counts.resample('d').sum() # Amend
    
    
    P361:
    In[14]: clf = SVC(kernel='rbf', C=1E6) # Error
    In[14]: clf = SVC(kernel='rbf', C=1E6, gamma='auto') # Amend, add gamma='auto'
    
    
    P248:
    In[3]:
    ax = plt.axes(axisbg='#E6E6E6') # Error
    ax = plt.axes(facecolor='#E6E6E6') # Amend, axisbg -> facecolor
    
    
    P275:
    In[6]:
    plt.hist(data[col], normed=True, alpha=0.5) # Error
    plt.hist(data[col], density=True, alpha=0.5) # Amend, normed -> density
    
    
    P279:
    In[13]:
    sns.pairplot(iris, hue='species', size=2.5) # Error
    sns.pairplot(iris, hue='species', height=2.5) # Amend, size -> height
    
    
    P301:
    In[2]:
    sns.parirplot(iris, hue='species', size=1.5); # Error
    sns.parirplot(iris, hue='species', height=1.5); # Amend, size 改为 height
    
    
    P349:
    In[14]:
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Error
    weather = pd.read_csv('599021.csv', index_col='DATE', parse_dates=True) # Amend, 599021.csv -> BicycleWeather.csv
    
    In[15]: daily = counts.resample('d', how='sum') # Error
    In[15]: daily = counts.resample('d').sum() # Amend
    
    
    P361:
    In[14]: clf = SVC(kernel='rbf', C=1E6) # Error
    In[14]: clf = SVC(kernel='rbf', C=1E6, gamma='auto') # Amend, add gamma='auto'
    
    
    P363:
    In[20]:
    from sklearn.decomposition import RandomizedPCA # Error
    pac = RandomizedPCA(n_components=150, whiten=True, random_state=42) # Error
    
    from sklearn.decomposition import PCA # Amend, RandomizedPCA -> PCA
    pac = PCA(n_components=150, whiten=True, random_state=42) # Amend, RandomizedPCA -> PCA
    
    
    P364:
    In[21]: from sklearn.cross_validation import train_test_split # Error
    In[21]: from sklearn.model_selection import train_test_split # Amend
    
    In[22]: from sklearn.grid_search import GridSearchCV # Error
    grid = GridSearchCV(model, param_grid) # Error
    In[22]: from sklearn.model_selection import GridSearchCV # Amend
    grid = GridSearchCV(model, param_grid, cv=3) # Amend, add cv=3
    
    P363
    In[20]:
    from sklearn.decomposition import RandomizedPCA # Error
    pac = RandomizedPCA(n_components=150, whiten=True, random_state=42) # Error
    
    from sklearn.decomposition import PCA # Amend, RandomizedPCA -> PCA
    pac = PCA(n_components=150, whiten=True, random_state=42) # Amend, RandomizedPCA -> PCA
    
    P364:
    In[21]: from sklearn.cross_validation import train_test_split # Error
    In[21]: from sklearn.model_selection import train_test_split # Amend
    
    In[22]: from sklearn.grid_search import GridSearchCV # Error
    grid = GridSearchCV(model, param_grid) # Error
    In[22]: from sklearn.model_selection import GridSearchCV # Amend
    grid = GridSearchCV(model, param_grid, cv=3) # Amend, add cv=3
  • 相关阅读:
    Individual Reading Assignment
    Individual P1: Summary
    Individual P1: Preparation
    M1m2分析报告
    第二次阅读作业--12061161 赵梓皓
    代码互审报告
    结对编程————电梯整理报告
    读书问题之《编程之美》 -----12061161 赵梓皓
    SE Class's Individual Project--12061161 赵梓皓
    博客测试
  • 原文地址:https://www.cnblogs.com/xiangsui/p/11665914.html
Copyright © 2011-2022 走看看