  • Python for Data Science

    Chapter 6 - Other Popular Machine Learning Methods

    Segment 5 - Naive Bayes Classifiers

    Naive Bayes Classifiers

    Naive Bayes is a machine learning method you can use to predict the likelihood that an event will occur given evidence that's present in your data.

    Conditional Probability

    \[ P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} \]
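
    To make the formula concrete, here is a small worked example. The counts are invented purely for illustration (they are not taken from any dataset): A is "the email contains the word free" and B is "the email is spam".

```python
# Hypothetical counts, invented purely for illustration.
total_emails = 100
contains_free = 40      # event A: the email contains the word "free"
free_and_spam = 30      # event A and B: contains "free" and is spam

p_a = contains_free / total_emails
p_a_and_b = free_and_spam / total_emails

# P(B|A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a
print(round(p_b_given_a, 2))  # 0.75
```

    Under these assumed counts, three out of every four emails containing "free" are spam.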

    Three Types of Naive Bayes Model

    • Multinomial
    • Bernoulli
    • Gaussian
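
    As a rough guide, each model suits a different kind of feature: Multinomial for counts, Bernoulli for binary present/absent flags, and Gaussian for continuous values. The sketch below fits each one on toy data generated on the spot. The data, the variable names and the random seed are all assumptions made for this example, and the scores are training accuracy only.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)  # toy labels for two classes

# Count features (like word counts): the classes differ in which column is large.
lam = np.where(y[:, None] == 1, [1.0, 1.0, 5.0], [5.0, 1.0, 1.0])
X_counts = rng.poisson(lam=lam)

# Binary features: 1 if the count is above 2, else 0.
X_binary = (X_counts > 2).astype(int)

# Continuous features: class 0 is centred on 0, class 1 on 3.
X_cont = rng.normal(loc=y[:, None] * 3.0, scale=1.0, size=(100, 3))

# Fit each model on the feature type it is designed for.
scores = {
    "Multinomial": MultinomialNB().fit(X_counts, y).score(X_counts, y),
    "Bernoulli": BernoulliNB().fit(X_binary, y).score(X_binary, y),
    "Gaussian": GaussianNB().fit(X_cont, y).score(X_cont, y),
}
for name, score in scores.items():
    print(name, round(score, 3))
```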

    Naive Bayes Use Cases

    • Spam Detection
    • Customer Classification
    • Credit Risk Prediction
    • Health Risk Prediction

    Naive Bayes Assumptions

    Predictors are independent of each other.
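
    The independence assumption is what makes the model "naive": the score for a class is the class prior multiplied by one likelihood per predictor. A minimal sketch of that calculation follows. The probabilities and the names `priors`, `likelihoods` and `posterior` are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical numbers, invented for illustration only.
priors = {"spam": 0.4, "ham": 0.6}

# P(word present | class), one entry per word (predictor).
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}


def posterior(words_present):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word, p in likelihoods[label].items():
            # Independence lets us multiply per-word terms:
            # use p if the word is present, (1 - p) if it is absent.
            score *= p if word in words_present else (1.0 - p)
        scores[label] = score
    total = sum(scores.values())
    # Normalise so the class probabilities sum to 1.
    return {label: score / total for label, score in scores.items()}


result = posterior({"free"})
print(result)
```

    For a message containing "free" but not "meeting", the spam score is 0.4 × 0.5 × 0.9 = 0.18 and the ham score is 0.6 × 0.1 × 0.6 = 0.036, so spam wins with probability of about 0.83.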

    A priori assumption: the assumption that past conditions still hold true. When we make predictions from historical values, we will get incorrect results if present circumstances have changed.

    • All regression models rely on the a priori assumption as well.

    import numpy as np
    import pandas as pd
    import urllib
    import sklearn
    
    from sklearn.model_selection import train_test_split
    from sklearn import metrics
    from sklearn.metrics import accuracy_score
    
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.naive_bayes import GaussianNB
    from sklearn.naive_bayes import MultinomialNB
    

    Naive Bayes

    Using Naive Bayes to predict spam

    url = "https://archive.ics.uci.edu/ml/machine-learning-databases/spambase/spambase.data"
    
    import urllib.request
    
    raw_data = urllib.request.urlopen(url)
    dataset = np.loadtxt(raw_data, delimiter=',')
    print(dataset[0])
    
    [  0.      0.64    0.64    0.      0.32    0.      0.      0.      0.
       0.      0.      0.64    0.      0.      0.      0.32    0.      1.29
       1.93    0.      0.96    0.      0.      0.      0.      0.      0.
       0.      0.      0.      0.      0.      0.      0.      0.      0.
       0.      0.      0.      0.      0.      0.      0.      0.      0.
       0.      0.      0.      0.      0.      0.      0.778   0.      0.
       3.756  61.    278.      1.   ]
    
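    # The first 48 columns hold the word-frequency features
    X = dataset[:,0:48]
    
    # The last column is the label: 1 = spam, 0 = not spam
    y = dataset[:,-1]
    
    # Hold out 20% of the rows for testing
    X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=.2, random_state=17)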
    
    # binarize is a numeric threshold; True is treated as 1, so only values above 1 become 1
    BernNB = BernoulliNB(binarize=True)
    BernNB.fit(X_train, y_train)
    print(BernNB)
    
    y_expect = y_test
    y_pred = BernNB.predict(X_test)
    
    print(accuracy_score(y_expect, y_pred))
    
    BernoulliNB(binarize=True)
    0.8577633007600435
    
    MultiNB = MultinomialNB()
    MultiNB.fit(X_train, y_train)
    print(MultiNB)
    
    y_pred = MultiNB.predict(X_test)
    
    print(accuracy_score(y_expect, y_pred))
    
    MultinomialNB()
    0.8816503800217155
    
    GausNB = GaussianNB()
    GausNB.fit(X_train, y_train)
    print(GausNB)
    
    y_pred = GausNB.predict(X_test)
    
    print(accuracy_score(y_expect, y_pred))
    
    GaussianNB()
    0.8197611292073833
    
    BernNB = BernoulliNB(binarize=0.1)
    BernNB.fit(X_train, y_train)
    print(BernNB)
    
    y_expect = y_test
    y_pred = BernNB.predict(X_test)
    
    print(accuracy_score(y_expect, y_pred))
    
    BernoulliNB(binarize=0.1)
    0.9109663409337676
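
    A likely reason the lower threshold helps: the word-frequency features are percentages, and most of them are below 1, so a threshold of 1 turns nearly every feature into 0. The sketch below applies both thresholds to the first five values of dataset[0] printed earlier. The helper `binarize_row` is a hypothetical stand-in that mimics the "strictly greater than the threshold" rule, and it assumes `binarize=True` behaves as a threshold of 1.

```python
# First five values of dataset[0] as printed above (word frequencies, in percent).
row = [0.0, 0.64, 0.64, 0.0, 0.32]


def binarize_row(values, threshold):
    """Map each value to 1 if it is strictly greater than threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


print(binarize_row(row, 1.0))  # [0, 0, 0, 0, 0]
print(binarize_row(row, 0.1))  # [0, 1, 1, 0, 1]
```

    With a threshold of 0.1 the model can see which words appear at all, which is the kind of present/absent signal Bernoulli Naive Bayes is built for.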
  • Original source: https://www.cnblogs.com/keepmoving1113/p/14349367.html