  • day08 - Logistic Regression

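    • Logistic regression is mainly used for binary classification problems.
    • It builds on linear regression by passing the linear output through the sigmoid function.
    • Its main advantage is that it predicts the probability of each outcome (yes or no) in a binary problem. For ad click-through rate, for example, it gives the probability that an ad is clicked and the probability that it is not.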
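The points above can be sketched in a few lines of plain Python: a linear score is computed first, then squashed into a probability by the sigmoid. The weights, bias, and feature values here are made up for illustration, and the helper `predict_proba` is a hypothetical name, not the scikit-learn method.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Return (P(no), P(yes)) for one sample, assuming a linear score w.x + b."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p_yes = sigmoid(z)
    return 1.0 - p_yes, p_yes

# Hypothetical numbers, for illustration only.
weights = [0.8, -0.5]
bias = 0.1
features = [1.0, 2.0]

p_no, p_yes = predict_proba(weights, bias, features)
print("P(no) = %.3f, P(yes) = %.3f" % (p_no, p_yes))
```

The two probabilities always sum to 1, so a classifier only needs to compute one of them and compare it with a threshold (0.5 is a common default).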
    
    # coding=utf-8
    from sklearn.metrics import classification_report
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import train_test_split
    import pandas as pd
    import numpy as np
    
    def ljhg():
        """Train and evaluate logistic regression on the Wisconsin breast cancer data."""
    
        # Load the data; the file has no header row, so the column names are supplied here
        columns = ['Sample code number','Clump Thickness', 'Uniformity of Cell Size','Uniformity of Cell Shape','Marginal Adhesion','Single Epithelial Cell Size','Bare Nuclei','Bland Chromatin','Normal Nucleoli','Mitoses','Class']
        data = pd.read_csv("../data/breast-cancer-wisconsin.data",names=columns)
    
        # Handle missing values: the file marks them with "?", so convert to NaN and drop those rows
        data = data.replace(to_replace="?",value=np.nan)
        data = data.dropna()
    
        # Split the data: columns 1-9 are the features, column 10 (Class) is the target
        x_train,x_test,y_train,y_test = train_test_split(data[columns[1:10]],data[columns[10]],test_size=0.25)
    
        # Standardize the features; fit the scaler on the training set only, then reuse it on the test set
        std = StandardScaler()
        x_train = std.fit_transform(x_train)
        x_test = std.transform(x_test)
    
        # Logistic regression
        lg = LogisticRegression()
    
        lg.fit(x_train,y_train)
    
        print("Weights:",lg.coef_)
        print("Accuracy (not meaningful in this scenario):",lg.score(x_test,y_test))
        # Class labels in this dataset: 2 = benign, 4 = malignant
        print("Recall report:",classification_report(y_test,lg.predict(x_test),labels=[2,4],target_names=["benign","malignant"]))
    
        return None
    
    
    if __name__ == '__main__':
        ljhg()
    
    

    Output (the numbers vary between runs, because train_test_split is called without random_state):

    
    Weights: [[0.93399495 0.49202418 0.68596862 0.85106892 0.2800929  1.21018499
      0.98731303 0.75633944 0.93642056]]
    Accuracy (not meaningful in this scenario): 0.9590643274853801
    Recall report:               precision    recall  f1-score   support
    
          benign       0.97      0.97      0.97       115
       malignant       0.93      0.95      0.94        56
    
        accuracy                           0.96       171
       macro avg       0.95      0.96      0.95       171
    weighted avg       0.96      0.96      0.96       171
    
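The report above gives precision and recall per class. As a rough sketch of what those two columns mean, the following computes them by hand for the malignant class. The label lists are made up for illustration and are not the test split from the run above; `precision_recall` is a hypothetical helper, not a scikit-learn function. The label coding (2 = benign, 4 = malignant) follows the `labels=[2,4]` argument in the script.

```python
def precision_recall(y_true, y_pred, positive):
    """Compute precision and recall for one class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels, for illustration only: 2 = benign, 4 = malignant.
y_true = [2, 2, 2, 2, 4, 4, 4, 4]
y_pred = [2, 2, 2, 4, 4, 4, 4, 2]

precision, recall = precision_recall(y_true, y_pred, positive=4)
print("malignant precision = %.2f, recall = %.2f" % (precision, recall))
```

Recall for the malignant class is the share of truly malignant samples that the model catches, which is presumably why the script treats it as more informative than overall accuracy for this task.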
    
    
    
    
  • Original post: https://www.cnblogs.com/wuren-best/p/14283523.html