  • Chapter 07 - Basic statistics (Part 4: t-tests and nonparametric tests of group differences)

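    I. t-tests

    This part uses the UScrime dataset from the MASS package. It contains data on the effect of punishment regimes on crime rates in 47 US states in 1960.

    Prob: the probability of imprisonment;

    U1: the unemployment rate of urban males aged 14 to 24;

    U2: the unemployment rate of urban males aged 35 to 39;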

    So:an indicator variable for Southern states

    1. Independent t-test

    t.test(y~x,data)

    t.test(y1,y2)

    Example 01:

    > library(MASS)
    > t.test(Prob~So,data=UScrime)
    
    	Welch Two Sample t-test
    
    data:  Prob by So
    t = -3.8954, df = 24.925, p-value = 0.0006506
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.03852569 -0.01187439
    sample estimates:
    mean in group 0 mean in group 1 
         0.03851265      0.06371269 

    Note: we can reject the hypothesis that Southern and non-Southern states have equal probabilities of imprisonment, since p < 0.001.
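
    The two calling conventions listed above can be compared with a minimal sketch. It assumes the MASS package is installed; the names prob_nonsouth, prob_south, and fit_* are illustrative and do not come from the original text. It also shows the var.equal argument, which switches from the default Welch test to the pooled-variance test.

    ```r
    library(MASS)  # provides the UScrime data frame

    # Formula interface: numeric outcome ~ two-level grouping variable
    fit_formula <- t.test(Prob ~ So, data = UScrime)

    # Two-vector interface: split the outcome by group first
    prob_nonsouth <- UScrime$Prob[UScrime$So == 0]
    prob_south    <- UScrime$Prob[UScrime$So == 1]
    fit_vectors   <- t.test(prob_nonsouth, prob_south)

    # Both calls run the same Welch test, so the t statistics agree
    same_t <- all.equal(unname(fit_formula$statistic),
                        unname(fit_vectors$statistic))
    print(same_t)

    # var.equal = TRUE requests the classical pooled-variance test,
    # whose degrees of freedom are n1 + n2 - 2
    fit_pooled <- t.test(Prob ~ So, data = UScrime, var.equal = TRUE)
    print(fit_pooled$parameter)
    ```

    By default t.test() does not assume equal variances, which is why the output above reports a fractional df.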

    2. Dependent t-test

    t.test(y1,y2,paired=TRUE)

    · y1 and y2 are numeric vectors for the two dependent groups

    Example 02:


    > library(MASS)
    > sapply(UScrime[c("U1","U2")],function(x)(c(mean=mean(x),sd=sd(x))))
               U1       U2
    mean 95.46809 33.97872
    sd   18.02878  8.44545
    > with(UScrime,t.test(U1,U2,paired=TRUE))
    
    	Paired t-test
    
    data:  U1 and U2
    t = 32.4066, df = 46, p-value < 2.2e-16
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     57.67003 65.30870
    sample estimates:
    mean of the differences 
                   61.48936 
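
    One way to see what paired=TRUE does: a paired t-test is the same as a one-sample t-test on the within-pair differences. The following is a small sketch of that equivalence on the same data; the object names paired_fit and diff_fit are illustrative.

    ```r
    library(MASS)  # provides the UScrime data frame

    # Paired test on the two unemployment rates
    paired_fit <- with(UScrime, t.test(U1, U2, paired = TRUE))

    # One-sample test that the mean of the per-state differences is 0
    diff_fit <- with(UScrime, t.test(U1 - U2, mu = 0))

    # The statistic, degrees of freedom, and p-value all coincide
    print(all.equal(unname(paired_fit$statistic), unname(diff_fit$statistic)))
    print(all.equal(unname(paired_fit$parameter), unname(diff_fit$parameter)))
    print(all.equal(paired_fit$p.value, diff_fit$p.value))
    ```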

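    II. Nonparametric tests of group differences

    1. Comparing two groups

    If the two groups are independent, use the Wilcoxon rank sum test (also known as the Mann-Whitney U test) to assess whether the observations in the two groups are sampled from the same probability distribution.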

    wilcox.test(y~x,data)

    wilcox.test(y1,y2)

    Example 03:

    > with(UScrime,by(Prob,So,median))
    So: 0
    [1] 0.038201
    -------------------------------------------------------- 
    So: 1
    [1] 0.055552
    > wilcox.test(Prob~So,data=UScrime)
    
    	Wilcoxon rank sum test
    
    data:  Prob by So
    W = 81, p-value = 8.488e-05
    alternative hypothesis: true location shift is not equal to 0
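
    To make the W statistic less opaque, it can be recomputed by hand. In R, W is the sum of the ranks of the first group in the pooled sample, minus n1(n1+1)/2. The sketch below is illustrative only; x, y, and w_manual are names introduced here.

    ```r
    library(MASS)  # provides the UScrime data frame

    x <- UScrime$Prob[UScrime$So == 0]  # first group (non-Southern states)
    y <- UScrime$Prob[UScrime$So == 1]  # second group (Southern states)

    # Rank all observations together, then sum the ranks of the first group
    pooled_ranks <- rank(c(x, y))
    n1 <- length(x)
    w_manual <- sum(pooled_ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

    # Compare with the statistic reported by wilcox.test()
    w_builtin <- unname(wilcox.test(Prob ~ So, data = UScrime)$statistic)
    print(c(manual = w_manual, builtin = w_builtin))
    ```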

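    Example 04 (for dependent groups, add paired=TRUE to run the Wilcoxon signed rank test, the nonparametric counterpart of the dependent t-test):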

    > sapply(UScrime[c("U1","U2")],median)
    U1 U2 
    92 34 
    > with(UScrime,wilcox.test(U1,U2,paired=TRUE))
    
    	Wilcoxon signed rank test with continuity correction
    
    data:  U1 and U2
    V = 1128, p-value = 2.464e-09
    alternative hypothesis: true location shift is not equal to 0
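
    The V statistic can likewise be reproduced by hand: it is the sum of the ranks of the absolute differences, taken over the pairs whose difference is positive. A minimal sketch, with d and v_manual as illustrative names:

    ```r
    library(MASS)  # provides the UScrime data frame

    # Per-state differences; zero differences are dropped by the test
    d <- with(UScrime, U1 - U2)
    d <- d[d != 0]

    # Rank the absolute differences and sum the ranks of the positive ones
    v_manual <- sum(rank(abs(d))[d > 0])

    # Tied values may trigger a warning that an exact p-value
    # cannot be computed, so warnings are suppressed here
    v_builtin <- suppressWarnings(
      unname(with(UScrime, wilcox.test(U1, U2, paired = TRUE))$statistic)
    )
    print(c(manual = v_manual, builtin = v_builtin))
    ```

    With 47 pairs the largest possible value of V is 47 × 48 / 2 = 1128, so the reported V = 1128 means U1 exceeds U2 in every state.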
    

    2. Comparing more than two groups

    Kruskal-Wallis test:

    kruskal.test(y~A,data)

    · A: a grouping variable with two or more levels; with exactly two levels the test is equivalent to the Mann-Whitney U test;

    · y: a numeric outcome variable;
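
    A short sketch of kruskal.test() follows. The original gives no example here, so the choice of data is an assumption: the first call uses the built-in airquality data (Ozone by Month, five groups), and the second reuses UScrime to illustrate the two-level equivalence with the Mann-Whitney test. The equivalence is with the normal approximation without continuity correction, hence exact = FALSE and correct = FALSE.

    ```r
    library(MASS)  # provides the UScrime data frame

    # More than two groups: ozone readings compared across five months
    # (rows with missing Ozone values are dropped by the formula method)
    kw_multi <- kruskal.test(Ozone ~ Month, data = airquality)
    print(kw_multi$parameter)  # degrees of freedom = number of groups - 1

    # Two groups: Kruskal-Wallis versus Mann-Whitney on the same data
    kw_two <- kruskal.test(Prob ~ So, data = UScrime)
    mw_two <- wilcox.test(Prob ~ So, data = UScrime,
                          exact = FALSE, correct = FALSE)

    # The p-values agree under the normal approximation
    print(all.equal(kw_two$p.value, mw_two$p.value))
    ```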

    Friedman test:

    friedman.test(y~A|B,data)

    ·B: a blocking variable that identifies matched observations.
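
    As a sketch of the y~A|B formula, the example below uses the built-in warpbreaks data, which is an assumption of this illustration and not part of the original text. friedman.test() needs exactly one observation per group and block combination, so the breaks are first averaged within each wool and tension cell; wb, w, and t are illustrative names.

    ```r
    # One mean per wool type (group) and tension level (block)
    wb <- aggregate(warpbreaks$breaks,
                    by = list(w = warpbreaks$wool,
                              t = warpbreaks$tension),
                    FUN = mean)

    # Formula interface: outcome ~ groups | blocks
    fr_formula <- friedman.test(x ~ w | t, data = wb)

    # Equivalent call with separate vectors: outcome, groups, blocks
    fr_vectors <- friedman.test(wb$x, wb$w, wb$t)

    print(fr_formula$parameter)  # degrees of freedom = number of groups - 1
    print(all.equal(fr_formula$p.value, fr_vectors$p.value))
    ```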

    The npmc() function in the npmc package expects a data frame with two columns named var (the dependent variable) and class (the grouping variable).

  • Original article: https://www.cnblogs.com/wangshenwen/p/3278731.html