  • S&P_01_Analyzing one categorical variable

    1. Analyzing categorical data

    1.1 Identifying individuals, variables and categorical variables in a data set

    Two types of variables are used in statistics: quantitative and categorical (also called qualitative). Quantitative variables are numerical variables: counts, percents, or numbers. Categorical variables are descriptions of groups or things, like “breeds of dog” or “voting preference”.

    Quantitative variables take numerical values, like the numbers on a deck of cards: 2, 3, 4, 5, 6, and so on are all quantitative.

    General rule: if you can add it, it’s quantitative. For example, a G.P.A. of 3.3 and a G.P.A. of 4.0 can be added together (3.3 + 4.0 = 7.3), so G.P.A. is quantitative.

    A deck of cards also has qualitative values. The qualitative values are descriptions: the suits spades, clubs, diamonds, and hearts.

    As a general rule, if you can’t add something, then it’s categorical. For example, you can’t add cat + dog, or Republican + Democrat.
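
    The deck-of-cards example can be sketched in Python. This is only an illustration: the four cards in `hand` are made up, and the names `ranks`, `suits`, `total_rank`, `average_rank` and `suit_counts` are not from the original notes. Numbers can be added and averaged; suits can only be counted per category.

```python
from collections import Counter
from statistics import mean

# A few cards as (rank, suit) pairs; this hand is made up for illustration.
hand = [(2, "spades"), (5, "hearts"), (6, "hearts"), (3, "clubs")]

ranks = [rank for rank, _ in hand]  # quantitative: numerical values
suits = [suit for _, suit in hand]  # categorical: group labels

# Quantitative values support arithmetic such as sums and means.
total_rank = sum(ranks)
average_rank = mean(ranks)

# Categorical values cannot be added in a meaningful way,
# but we can count how many cards fall into each category.
suit_counts = Counter(suits)

print(total_rank, average_rank, dict(suit_counts))
```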

    1.2 Distributions in 2-way tables

    We create buckets for the amount of time spent studying, and we also create buckets for the percent correct. Then we figure out what percentage of the entire student population falls into each combination of categories. For example, 2% of our students studied 21 to 40 minutes and got between 80 and 100% on the exam. This is a 2-way table, and it describes a joint distribution of two variables: the time studied and the percent correct.
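
    A 2-way table like this can be sketched as a small Python structure. The original table is not reproduced in these notes, so the numbers here are hypothetical: only the 2% cell (21 to 40 minutes, 80 to 100% correct) comes from the text, and the other cells, the bucket boundaries, and the helper `cell` are assumptions made for illustration.

```python
# Hypothetical joint distribution, in percent of all students.
# Only the 2% cell (21-40 minutes, 80-100% correct) comes from the text;
# every other number and the bucket boundaries are invented.
time_buckets = ["0-20", "21-40", "41-60", ">60"]  # minutes studied (columns)
joint = {  # rows: percent correct on the exam
    "80-100": [0, 2, 4, 14],
    "60-79": [2, 10, 14, 4],
    "40-59": [6, 16, 6, 2],
    "0-39": [12, 6, 2, 0],
}


def cell(score_bucket, time_bucket):
    """Percent of all students in one (score, time) cell of the table."""
    return joint[score_bucket][time_buckets.index(time_bucket)]


# Every student falls into exactly one cell, so all cells add up to 100%.
grand_total = sum(sum(row) for row in joint.values())

print(cell("80-100", "21-40"), grand_total)
```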

     

    All we did is total each row and write the total in the margin; the row totals together add up to 100%. These totals are a marginal distribution: they describe the distribution of the scores in the class. For example, 20% of the students got 80 to 100% correct on that test. The marginal distribution does not tell you the breakdown by how much they actually studied.

    There is another marginal distribution: the distribution of the amount of time people in the class studied. To get it, we total each column; the column totals are the marginal distribution of the time studied.
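
    Both marginal distributions can be computed by summing rows and columns. The sketch below uses a hypothetical table: it is chosen so that the 2% cell and the 20% total for the 80 to 100% row match the text, but all other numbers, and the names `score_marginal` and `time_marginal`, are assumptions.

```python
# Hypothetical joint distribution, in percent of all students.
# The 2% cell and the 20% row total match the text; the rest is invented.
time_buckets = ["0-20", "21-40", "41-60", ">60"]  # minutes studied (columns)
joint = {  # rows: percent correct on the exam
    "80-100": [0, 2, 4, 14],
    "60-79": [2, 10, 14, 4],
    "40-59": [6, 16, 6, 2],
    "0-39": [12, 6, 2, 0],
}

# Marginal distribution of the scores: total each row.
score_marginal = {score: sum(row) for score, row in joint.items()}

# Marginal distribution of the time studied: total each column.
time_marginal = {
    bucket: sum(row[i] for row in joint.values())
    for i, bucket in enumerate(time_buckets)
}

print(score_marginal)
print(time_marginal)
```

    Each marginal distribution adds up to 100% on its own, because it accounts for every student exactly once.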

    A conditional distribution is the distribution of one variable given that the other variable falls into a particular bucket, for example the distribution of scores among only the students who studied 21 to 40 minutes. It is called conditional because the distribution is conditioned on a value of the other variable.
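
    A conditional distribution can be sketched by taking one column of the table and rescaling it so that it adds up to 100%. The table below is hypothetical (only the 2% cell and the 20% row total for 80 to 100% correct come from the text), and the function name `scores_given_time` is an assumption.

```python
# Hypothetical joint distribution, in percent of all students.
# The 2% cell and the 20% row total match the text; the rest is invented.
time_buckets = ["0-20", "21-40", "41-60", ">60"]  # minutes studied (columns)
joint = {  # rows: percent correct on the exam
    "80-100": [0, 2, 4, 14],
    "60-79": [2, 10, 14, 4],
    "40-59": [6, 16, 6, 2],
    "0-39": [12, 6, 2, 0],
}


def scores_given_time(time_bucket):
    """Distribution of scores among students in one time bucket, in percent."""
    i = time_buckets.index(time_bucket)
    column = {score: row[i] for score, row in joint.items()}
    column_total = sum(column.values())  # share of all students in this bucket
    # Divide each cell by the column total so the column adds up to 100%.
    return {score: 100 * value / column_total for score, value in column.items()}


conditional = scores_given_time(">60")
print(conditional)
```

    With these made-up numbers, the students who studied more than 60 minutes are 20% of the class, so each cell in that column is divided by 20 and expressed as a percent of that group.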

  • Original article: https://www.cnblogs.com/tlfox2006/p/9394253.html