  • Imputing a large number of missing values in R

    Using the mice package

    install.packages('mice')  # only needed once
    library('mice')
    # Fraction of missing values in a vector
    pMiss <- function(x){sum(is.na(x))/length(x)}

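    # Fraction of missing values per row (MARGIN = 1) and per column (MARGIN = 2)
    apply(AFE_psi[,c(12:70)], 1, pMiss)
    apply(AFE_psi[,c(12:70)], 2, pMiss)
    # Tabulate the missing-data patterns
    md.pattern(AFE_psi[,c(12:70)])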
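    To make the row-wise and column-wise calls concrete, here is a minimal sketch on a small made-up data frame. The data frame toy and its columns s1 to s3 are assumptions for illustration only and are not part of AFE_psi.

    ```r
    # Same helper as above: fraction of missing values in a vector
    pMiss <- function(x) sum(is.na(x)) / length(x)

    # Hypothetical toy data: 4 rows, 3 sample columns
    toy <- data.frame(
      s1 = c(1, NA, 3, 4),
      s2 = c(NA, NA, 6, 8),
      s3 = c(9, 10, 11, 12)
    )

    row_miss <- apply(toy, 1, pMiss)  # one value per row
    col_miss <- apply(toy, 2, pMiss)  # one value per column: s1 = 0.25, s2 = 0.50, s3 = 0.00

    print(row_miss)
    print(col_miss)
    ```

    A common rule of thumb is to look closely at any row or column with a large missing fraction before imputing it; the threshold is a judgment call for your own data.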
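    install.packages('VIM')  # only needed once
    library('VIM')
    # Plot the proportion of missing values per variable and the missingness patterns (first 20 rows)
    aggr_plot <- aggr(AFE_psi[c(1:20),c(12:70)], col=c('navyblue','red'),
                      numbers=TRUE,
                      sortVars=TRUE,
                      labels=names(AFE_psi[c(1:20),c(12:70)]),
                      cex.axis=.9,
                      gap=3,
                      ylab=c("AFE_data_missing_value_histogram","modes"))
    # Scatter plot of columns 16 and 17 with box plots in the margins
    marginplot(AFE_psi[,c(16,17)])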
    # Multiple imputation with predictive mean matching: 5 imputed datasets, 6 iterations
    tempData <- mice(AFE_psi[,c(12:70)],m=5,maxit=6,method='pmm',seed=600)

    summary(tempData)

    This shows how many missing values each of my 59 samples has.

    Next, fill the imputed values back into the dataset, using different imputation results:

    completedData <- complete(tempData,1)    # fill the original data frame using the 1st of the 5 imputed datasets
    completedData_2 <- complete(tempData,2)  # fill the original data frame using the 2nd imputed dataset
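    As a hedged illustration of what the second argument of complete() selects, the sketch below runs the same kind of call on the nhanes example data that ships with mice. Using nhanes instead of AFE_psi is an assumption made so that the example is self-contained; imp, c1 and c2 are illustrative names.

    ```r
    library(mice)

    # nhanes is a small example dataset bundled with mice (25 rows, some NAs)
    imp <- mice(nhanes, m = 5, maxit = 6, method = "pmm", seed = 600,
                printFlag = FALSE)

    c1 <- complete(imp, 1)  # data with NAs filled from the 1st imputed dataset
    c2 <- complete(imp, 2)  # data with NAs filled from the 2nd imputed dataset

    observed <- !is.na(nhanes)          # logical matrix of originally observed cells
    print(anyNA(c1))                    # are any NAs left after filling?
    print(all(c1[observed] == c2[observed]))  # observed cells are untouched in both
    ```

    Only the cells that were originally missing can differ between c1 and c2, which is why comparing several completed datasets gives a feel for how uncertain the imputed values are.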
    Check the distributions of the original and the imputed data.

    Comparing the two distributions helps judge the fidelity of the imputed data.
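    library(lattice)
    # Scatter plots of observed and imputed values for the first four samples
    xyplot(tempData,C000WYB3 ~ C00184B5+C00192B6+C001KHSR,pch=18,cex=1)
    # Density of observed and imputed values for each imputed variable
    densityplot(tempData)
    # Strip plots of observed and imputed values, per variable and imputation
    stripplot(tempData, pch = 20, cex = 1.2)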

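    Box plots for the first five samples, with and without missing values: the more similar the distributions, the less the data are affected by the missing values.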

    How the first four samples fit before and after imputation: magenta marks the imputed data and blue marks the directly observed data.

    Finally, the original data frame filled in using the second imputed dataset is as follows:

    This post mainly uses predictive mean matching (PMM) as the imputation method, via the mice package in R.
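    To give a rough idea of what predictive mean matching does, here is a deliberately simplified single-variable sketch in base R. It is not the implementation used by mice (which, among other things, also draws the regression parameters at random); pmm_sketch, its donors argument, and the toy x and y vectors are all hypothetical names introduced for illustration.

    ```r
    # Simplified PMM sketch (hypothetical helper, NOT the mice implementation):
    # 1. regress y on x using the observed cases,
    # 2. predict y for every case,
    # 3. for each missing case, find the observed cases with the closest
    #    predictions and copy the observed value of one randomly chosen donor.
    pmm_sketch <- function(y, x, donors = 5) {
      obs  <- !is.na(y)
      fit  <- lm(y ~ x, subset = obs)
      pred <- predict(fit, newdata = data.frame(x = x))

      y_imp <- y
      for (i in which(!obs)) {
        d       <- abs(pred[obs] - pred[i])
        nearest <- order(d)[seq_len(min(donors, sum(obs)))]
        donor   <- nearest[sample.int(length(nearest), 1)]
        y_imp[i] <- y[obs][donor]
      }
      y_imp
    }

    # Toy data (made up): y depends linearly on x, three values are missing
    set.seed(600)
    x <- 1:20
    y <- 2 * x + rnorm(20)
    y[c(3, 11, 17)] <- NA

    y_filled <- pmm_sketch(y, x)
    print(y_filled[c(3, 11, 17)])
    ```

    Because every imputed value is copied from an observed case, the filled-in values always stay within the range of the observed data, which is one reason PMM is a popular default for numeric variables.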

  • Original post: https://www.cnblogs.com/beckygogogo/p/9244617.html