  • Python for Data Science

    Chapter 5 - Basic Math and Statistics

    Segment 5 - Starting with parametric methods in pandas and scipy

    import pandas as pd
    import numpy as np
    
    import matplotlib.pyplot as plt
    import seaborn as sb
    from pylab import rcParams
    
    import scipy
    from scipy.stats import pearsonr
    
    %matplotlib inline
    rcParams['figure.figsize'] = 8,4
    # On matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
    plt.style.use('seaborn-whitegrid')
    

    The Pearson Correlation

    address = '~/Data/mtcars.csv'
    
    cars = pd.read_csv(address)
    cars.columns = ['car_names','mpg','cyl','disp','hp','drat','wt','qsec','vs','am','gear','carb']
    
    sb.pairplot(cars)
    
    <seaborn.axisgrid.PairGrid at 0x7ff9164e46d8>
    

    [Figure: pair plot of every column in cars]

    X = cars[['mpg','hp','qsec','wt']]
    sb.pairplot(X)
    
    <seaborn.axisgrid.PairGrid at 0x7ff91133c438>
    

    [Figure: pair plot of mpg, hp, qsec, and wt]

    Using scipy to calculate the Pearson correlation coefficient

    mpg = cars['mpg']
    hp = cars['hp']
    qsec = cars['qsec']
    wt = cars['wt']
    
    pearsonr_coefficient, p_value = pearsonr(mpg, hp)
    print('PearsonR Correlation Coefficient %0.3f' % pearsonr_coefficient)
    
    PearsonR Correlation Coefficient -0.776
    
    pearsonr_coefficient, p_value = pearsonr(mpg, qsec)
    print('PearsonR Correlation Coefficient %0.3f' % pearsonr_coefficient)
    
    PearsonR Correlation Coefficient 0.419
    
    pearsonr_coefficient, p_value = pearsonr(mpg, wt)
    print('PearsonR Correlation Coefficient %0.3f' % pearsonr_coefficient)
    
    PearsonR Correlation Coefficient -0.868
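
For reference, the coefficient that pearsonr reports can be reproduced by hand: it is the sum of the products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations. The sketch below is only an illustration of that formula in plain Python. The helper name pearson_r and the toy numbers are assumptions made for this example and are not taken from mtcars.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Hypothetical helper for illustration; it does not compute a p-value.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations
    # (a constant sequence makes one of these zero, so r is undefined)
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Toy data (made up): heavier cars paired with lower mileage
weights = [2.0, 2.5, 3.0, 3.5, 4.0]
mileage = [30.0, 27.0, 22.0, 20.0, 15.0]
print('PearsonR Correlation Coefficient %0.3f' % pearson_r(weights, mileage))
```

A value near -1 or +1 indicates a strong linear relationship, and a value near 0 indicates a weak one, which matches how the mpg/wt and mpg/qsec results above are usually read.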
    

    Using pandas to calculate the Pearson correlation coefficient

    corr = X.corr()
    corr
    
               mpg        hp      qsec        wt
    mpg   1.000000 -0.776168  0.418684 -0.867659
    hp   -0.776168  1.000000 -0.708223  0.658748
    qsec  0.418684 -0.708223  1.000000 -0.174716
    wt   -0.867659  0.658748 -0.174716  1.000000
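
The matrix is symmetric, and each off-diagonal entry is the same coefficient that scipy returns for that pair of columns. A minimal sketch of that cross-check follows, assuming a small made-up DataFrame in place of the mtcars file; the column names are reused only for readability.

```python
import pandas as pd
from scipy.stats import pearsonr

# Toy values (made up), standing in for three of the mtcars columns
df = pd.DataFrame({
    'mpg': [20.0, 25.0, 18.0, 15.0, 30.0, 22.0],
    'hp':  [120, 90, 160, 230, 60, 105],
    'wt':  [2.8, 2.3, 3.4, 3.8, 1.9, 3.0],
})

# 'pearson' is the default method of DataFrame.corr; it is spelled out here for clarity
corr = df.corr(method='pearson')

# Read one pairwise coefficient from the matrix by row and column label
r_pandas = corr.loc['mpg', 'hp']

# Compute the same pair with scipy, which also returns a p-value
r_scipy, p_value = pearsonr(df['mpg'], df['hp'])

print('pandas %0.3f, scipy %0.3f, p-value %0.4f' % (r_pandas, r_scipy, p_value))
```

The pandas route is convenient for scanning many pairs at once, while scipy is the one to reach for when the p-value of a particular pair matters.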

    Using Seaborn to visualize the Pearson correlation coefficient

    sb.heatmap(corr, xticklabels=corr.columns.values, yticklabels=corr.columns.values)
    
    <matplotlib.axes._subplots.AxesSubplot at 0x7ff90c978358>
    

    [Figure: heatmap of the correlation matrix for mpg, hp, qsec, and wt]
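
A heatmap of coefficients is often easier to read when the colour scale is pinned to the full range of -1 to 1 and each cell is annotated with its value. The following is one possible variation and is not part of the original notebook. It assumes a small made-up DataFrame and uses the vmin, vmax, cmap, annot, and fmt options of seaborn's heatmap.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sb

# Toy values (made up), standing in for three of the mtcars columns
df = pd.DataFrame({
    'mpg': [20.0, 25.0, 18.0, 15.0, 30.0, 22.0],
    'hp':  [120, 90, 160, 230, 60, 105],
    'wt':  [2.8, 2.3, 3.4, 3.8, 1.9, 3.0],
})
corr = df.corr()

fig, ax = plt.subplots(figsize=(6, 4))
sb.heatmap(
    corr,
    vmin=-1, vmax=1,    # fix the colour scale to the full range of r
    cmap='coolwarm',    # diverging palette: one hue for negative r, another for positive
    annot=True,         # write the coefficient inside each cell
    fmt='.2f',          # two decimal places for the annotations
    ax=ax,
)
ax.set_title('Pearson correlation (toy data)')
```

Fixing the scale keeps colours comparable between heatmaps drawn from different subsets of columns.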

  • Original article: https://www.cnblogs.com/keepmoving1113/p/14285185.html