  • Kaggle Data Visualization: Scatter Plots

    Step 0: Import and configure the Python libraries

    import pandas as pd
    pd.plotting.register_matplotlib_converters()
    import matplotlib.pyplot as plt
    %matplotlib inline
    import seaborn as sns

    Set up code checking

    import os
    if not os.path.exists("../input/candy.csv"):
        os.symlink("../input/data-for-datavis/candy.csv", "../input/candy.csv") 
    from learntools.core import binder
    binder.bind(globals())
    from learntools.data_viz_to_coder.ex4 import *

    Step 1: Load the data

    # Path of the file to read
    candy_filepath = "../input/candy.csv"
    
    # Fill in the line below to read the file into a variable candy_data
    candy_data = pd.read_csv(candy_filepath,index_col="id")
    
    # Run the line below with no changes to check that you've loaded the data correctly
    step_1.check()

    Step 2: Review the data

    Print the first five rows of the data

    candy_data.head()
    id  competitorname  chocolate  fruity  caramel  peanutyalmondy  nougat  crispedricewafer  hard  bar  pluribus  sugarpercent  pricepercent  winpercent
    0   100 Grand       Yes        No      Yes      No              No      Yes               No    Yes  No        0.732         0.860         66.971725
    1   3 Musketeers    Yes        No      No       No              Yes     No                No    Yes  No        0.604         0.511         67.602936
    2   Air Heads       No         Yes     No       No              No      No                No    No   No        0.906         0.511         52.341465
    3   Almond Joy      Yes        No      No       Yes             No      No                No    Yes  No        0.465         0.767         50.347546
    4   Baby Ruth       Yes        No      Yes      Yes             Yes     No                No    Yes  No        0.604         0.767         56.914547

    # Fill in the line below: Which candy was more popular with survey respondents:
    # '3 Musketeers' or 'Almond Joy'?  (Please enclose your answer in single quotes.)
    more_popular = '3 Musketeers'
    
    # Fill in the line below: Which candy has higher sugar content: 'Air Heads'
    # or 'Baby Ruth'? (Please enclose your answer in single quotes.)
    more_sugar = 'Air Heads'
    
    # Check your answers
    step_2.check()
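
Both answers can be read off the table by eye, but the comparison can also be done in code. The sketch below is only an illustration: it retypes the five rows printed above instead of reading candy.csv, and the helper name `higher` is hypothetical, not part of the exercise.

```python
import pandas as pd

# The five rows printed by candy_data.head(), retyped by hand
# (only the columns needed for the two questions).
sample = pd.DataFrame(
    {
        "competitorname": ["100 Grand", "3 Musketeers", "Air Heads", "Almond Joy", "Baby Ruth"],
        "sugarpercent": [0.732, 0.604, 0.906, 0.465, 0.604],
        "winpercent": [66.971725, 67.602936, 52.341465, 50.347546, 56.914547],
    }
).set_index("competitorname")


def higher(df, column, first, second):
    """Return whichever of the two candy names has the larger value in `column`."""
    return df.loc[[first, second], column].idxmax()


more_popular = higher(sample, "winpercent", "3 Musketeers", "Almond Joy")
more_sugar = higher(sample, "sugarpercent", "Air Heads", "Baby Ruth")
print(more_popular, more_sugar)
```

In the notebook itself the same helper should work on `candy_data.set_index("competitorname")`, assuming candy names are unique in the full file.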

    Step 3: The role of sugar

    Draw a scatter plot of sugarpercent against winpercent

    # Scatter plot showing the relationship between 'sugarpercent' and 'winpercent'
    plt.figure(figsize=(12,6))
    sns.scatterplot(x=candy_data["sugarpercent"],y=candy_data["winpercent"])
    
    # Check your answer
    step_3.a.check()

    Step 4: Draw a regression line

    Use sns.regplot() to draw the scatter plot together with a fitted regression line.

    # Scatter plot w/ regression line showing the relationship between 'sugarpercent' and 'winpercent'
    plt.figure(figsize=(12,6))
    sns.regplot(x=candy_data["sugarpercent"],y=candy_data["winpercent"])
    
    # Check your answer
    step_4.a.check()
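
sns.regplot() draws the fitted line but does not print its slope or intercept. With default settings the line is an ordinary least-squares fit, so the numbers can be recovered separately. The sketch below is a rough illustration using only the five rows shown in Step 2, so its coefficients will not match the line fitted to the full dataset. The function name `fit_line` is hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# sugarpercent and winpercent for the five rows shown by candy_data.head()
sugar = [0.732, 0.604, 0.906, 0.465, 0.604]
win = [66.971725, 67.602936, 52.341465, 50.347546, 56.914547]

slope, intercept = fit_line(sugar, win)
print(f"winpercent = {slope:.2f} * sugarpercent + {intercept:.2f}")
```

To apply it to the full data in the notebook, pass `list(candy_data["sugarpercent"])` and `list(candy_data["winpercent"])` instead of the hand-typed lists.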

    Step 5: Chocolate

    # Scatter plot showing the relationship between 'pricepercent', 'winpercent', and 'chocolate'
    sns.scatterplot(x=candy_data["pricepercent"],y=candy_data["winpercent"],hue=candy_data["chocolate"])
    
    # Check your answer
    step_5.check()

    Step 6: Investigate chocolate

    Create a scatter plot with two regression lines

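    # Color-coded scatter plot w/ regression lines (one line per value of 'chocolate')
    sns.lmplot(x="pricepercent",y="winpercent",hue="chocolate",data=candy_data)
    # For comparison: a single regression line fitted to all candies together
    sns.lmplot(x="pricepercent",y="winpercent",data=candy_data)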
    # Check your answer
    step_6.a.check()


    Step 7: Everybody loves chocolate

    Create a categorical scatter plot to highlight the relationship between chocolate and winpercent. Put chocolate on the horizontal axis and winpercent on the vertical axis.

    # Scatter plot showing the relationship between 'chocolate' and 'winpercent'
    sns.swarmplot(x=candy_data["chocolate"],y=candy_data["winpercent"])
    
    # Check your answer
    step_7.a.check()
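
The swarm plot shows the two groups visually. A numeric summary per group can complement it. The sketch below is an illustration that uses only the five rows shown in Step 2, so the values describe that small sample and not the full survey.

```python
import pandas as pd

# chocolate and winpercent for the five rows shown by candy_data.head()
sample = pd.DataFrame(
    {
        "chocolate": ["Yes", "Yes", "No", "Yes", "Yes"],
        "winpercent": [66.971725, 67.602936, 52.341465, 50.347546, 56.914547],
    }
)

# Mean, median and count of winpercent within each chocolate category
summary = sample.groupby("chocolate")["winpercent"].agg(["mean", "median", "count"])
print(summary)
```

In the notebook, replacing `sample` with `candy_data` should give the same summary for every candy in the file.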

  • Original article: https://www.cnblogs.com/yuyukun/p/12885106.html