  • [R Statistics] Principal Component Analysis 2: Principal Component Regression

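    Exercise:

    A survey was carried out on the sales volume Y of a consumer good in a certain region. Y is related to the following four variables: x1, residents' disposable income; x2, the average price index of this type of consumer good; x3, the stock of this consumer good held in society; and x4, the average price index of other consumer goods. The historical data are given in the table below. Use principal component regression to build a regression equation for the sales volume Y on the four variables x1, x2, x3 and x4.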

    Data file data.txt:

    	x1	x2	x3	x4	y
    1	82.9	92	17.1	94	8.4
    2	88.0	93	21.3	96	9.6
    3	99.9	96	25.1	97	10.4
    4	105.3	94	29.0	97	11.4
    5	117.7	100	34.0	100	12.2
    6	131.0	101	40.0	101	14.2
    7	148.2	105	44.0	104	15.8
    8	161.8	112	49.0	109	17.9
    9	174.2	112	51.0	111	19.6
    10	184.7	112	53.0	111	20.8
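
Because the table is small, the ten observations can also be typed directly into R, which avoids depending on the tab-separated layout of data.txt. The following is a minimal sketch: the data frame name `conomy` is chosen to match the script, and the values are copied from the table above.

```r
# Enter the ten observations from the table directly as a data frame.
conomy <- data.frame(
  x1 = c(82.9, 88.0, 99.9, 105.3, 117.7, 131.0, 148.2, 161.8, 174.2, 184.7),
  x2 = c(92, 93, 96, 94, 100, 101, 105, 112, 112, 112),
  x3 = c(17.1, 21.3, 25.1, 29.0, 34.0, 40.0, 44.0, 49.0, 51.0, 53.0),
  x4 = c(94, 96, 97, 97, 100, 101, 104, 109, 111, 111),
  y  = c(8.4, 9.6, 10.4, 11.4, 12.2, 14.2, 15.8, 17.9, 19.6, 20.8)
)

# Quick sanity checks: structure of the data and pairwise correlations
# among the four predictors.
str(conomy)
round(cor(conomy[, c("x1", "x2", "x3", "x4")]), 3)
```

Inspecting the correlation matrix before fitting is a cheap way to see how strongly the predictors move together, which is the situation principal component regression is meant to handle.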

    Script

    #270
    #230
    
    conomy <- read.table("data.txt", header = TRUE, sep = "\t");
    
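    #### Fit the ordinary linear regression (note: this script uses only x1, x2 and x3; x4 from the exercise is not included)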
    lm.sol<-lm(y~x1+x2+x3, data=conomy);
    summary(lm.sol);
    # Call:
    # lm(formula = y ~ x1 + x2 + x3, data = conomy)
    # Residuals:
    #      Min       1Q   Median       3Q      Max 
    # -0.44365 -0.20719  0.04925  0.18879  0.47673 
    
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)  0.23574    5.39534   0.044  0.96657   
    # x1           0.14167    0.02587   5.477  0.00155 **
    # x2          -0.02763    0.07265  -0.380  0.71685   
    # x3          -0.04743    0.05903  -0.803  0.45235   
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    # Residual standard error: 0.349 on 6 degrees of freedom
    # Multiple R-squared:  0.9957,    Adjusted R-squared:  0.9935 
    # F-statistic: 462.5 on 3 and 6 DF,  p-value: 1.744e-07
    
    
    #### Principal component analysis (on the correlation matrix)
    conomy.pr<-princomp(~x1+x2+x3, data=conomy, cor=TRUE);
    summary(conomy.pr, loadings=TRUE);
    # Importance of components:
    #                          Comp.1     Comp.2      Comp.3
    # Standard deviation     1.720206 0.17628306 0.099081994
    # Proportion of Variance 0.986369 0.01035857 0.003272414
    # Cumulative Proportion  0.986369 0.99672759 1.000000000
    
    # Loadings:
    #    Comp.1 Comp.2 Comp.3
    # x1  0.579  0.180  0.795
    # x2  0.576 -0.781 -0.243
    # x3  0.577  0.598 -0.556
    
    #### Compute the principal component scores of the samples, then fit the principal component regression
    pre<-predict(conomy.pr);
    conomy$z1<-pre[,1];
    conomy$z2<-pre[,2];
    lm.sol<-lm(y~z1+z2, data=conomy);
    summary(lm.sol);
    # Call:
    # lm(formula = y ~ z1 + z2, data = conomy)
    # Residuals:
    #      Min       1Q   Median       3Q      Max 
    # -0.79867 -0.45194  0.06536  0.36712  0.83831 
    
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)  14.0300     0.1897  73.972 2.17e-11 ***
    # z1            2.3763     0.1103  21.552 1.17e-07 ***
    # z2            0.6977     1.0759   0.648    0.537    
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    # Residual standard error: 0.5998 on 7 degrees of freedom
    # Multiple R-squared:  0.9852,    Adjusted R-squared:  0.9809 
    # F-statistic: 232.5 on 2 and 7 DF,  p-value: 3.975e-07
    
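    #### Transform back to obtain the regression equation in the original variables
    beta<-coef(lm.sol); A<-loadings(conomy.pr);        # coefficients on z1, z2 and the loading matrix
    x.bar<-conomy.pr$center; x.sd<-conomy.pr$scale;    # centering and scaling used by princomp
    # z_j = sum_i A[i,j]*(x_i - x.bar_i)/x.sd_i, so the coefficient of x_i is (beta[2]*A[i,1] + beta[3]*A[i,2])/x.sd_i
    coef.x<-(beta[2]*A[,1]+ beta[3]*A[,2])/x.sd;
    beta0 <- beta[1]- sum(x.bar * coef.x);
    c(beta0, coef.x);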
    # (Intercept)          x1          x2          x3 
    # -7.75109994  0.04347167  0.10678004  0.14573976 
    
    ### Conclusion: y = -7.75109994 + 0.04347167*x1 + 0.10678004*x2 + 0.14573976*x3
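
The back-transformation step can be wrapped in a small function and checked numerically. The sketch below is not from the textbook: `pcr_original_coef` and its arguments are hypothetical names introduced for illustration, and the data frame is re-entered so the block runs on its own. It relies on the fact that each score is z_j = sum_i A[i,j] * (x_i - center_i) / scale_i, so the slope of x_i in the original coordinates is sum_j beta_j * A[i,j] / scale_i.

```r
# Data from the table in the exercise, re-entered so the block is self-contained.
conomy <- data.frame(
  x1 = c(82.9, 88.0, 99.9, 105.3, 117.7, 131.0, 148.2, 161.8, 174.2, 184.7),
  x2 = c(92, 93, 96, 94, 100, 101, 105, 112, 112, 112),
  x3 = c(17.1, 21.3, 25.1, 29.0, 34.0, 40.0, 44.0, 49.0, 51.0, 53.0),
  x4 = c(94, 96, 97, 97, 100, 101, 104, 109, 111, 111),
  y  = c(8.4, 9.6, 10.4, 11.4, 12.2, 14.2, 15.8, 17.9, 19.6, 20.8)
)

# Hypothetical helper (illustration only): regress the response on the first
# k principal component scores and express the fit in the original variables.
pcr_original_coef <- function(data, response, predictors, k) {
  X  <- data[, predictors, drop = FALSE]
  pr <- princomp(X, cor = TRUE)                      # PCA on the correlation matrix
  Z  <- predict(pr)[, seq_len(k), drop = FALSE]      # scores of the first k components
  fit  <- lm(data[[response]] ~ Z)                   # regression on the scores
  beta <- coef(fit)                                  # beta[1]: intercept, beta[-1]: score slopes
  A    <- unclass(loadings(pr))[, seq_len(k), drop = FALSE]
  slopes    <- drop(A %*% beta[-1]) / pr$scale       # slopes for the original variables
  intercept <- beta[1] - sum(pr$center * slopes)     # intercept in the original coordinates
  list(coef = c(intercept, slopes), fit = fit)
}

# Same setting as the script: x1, x2, x3 with two components.
res3 <- pcr_original_coef(conomy, "y", c("x1", "x2", "x3"), k = 2)
res3$coef

# Consistency check: predictions from the original-coordinate equation should
# agree with the fitted values of the regression on the scores.
X1 <- cbind(1, as.matrix(conomy[, c("x1", "x2", "x3")]))
max(abs(drop(X1 %*% res3$coef) - fitted(res3$fit)))  # expected to be numerically zero

# The exercise lists four predictors; k = 2 here is only an illustrative choice.
res4 <- pcr_original_coef(conomy, "y", c("x1", "x2", "x3", "x4"), k = 2)
res4$coef
```

The result does not depend on the sign convention of the loadings: if a component's loadings flip sign, its scores and its regression slope flip too, and the product is unchanged.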

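    The source code and the exercise in this post both come from the textbook 《统计建模与R软件》 (Statistical Modeling and R Software; ISBN: 9787302143666; author: Xue Yi, 薛毅).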

  • Original article: https://www.cnblogs.com/liulele/p/9083273.html