  • R Language Study Notes (17): Using the melt and dcast functions from the data.table package

    The melt function converts wide-format data into long-format data.

    The dcast function converts long-format data back into wide-format data.

    > DT = fread("melt_default.csv")
    > DT
       family_id age_mother dob_child1 dob_child2 dob_child3
    1:         1         30 1998-11-26 2000-01-29         NA
    2:         2         27 1996-06-22         NA         NA
    3:         3         26 2002-07-11 2004-04-05 2007-09-02
    4:         4         32 2004-10-10 2009-08-27 2012-07-21
    5:         5         29 2000-12-05 2005-02-28         NA
    > DT.m1 <- melt(DT, measure.vars = c("dob_child1", "dob_child2", "dob_child3"),
    +               variable.name = "child", value.name = "dob")
    > DT.m1
        family_id age_mother      child        dob
     1:         1         30 dob_child1 1998-11-26
     2:         2         27 dob_child1 1996-06-22
     3:         3         26 dob_child1 2002-07-11
     4:         4         32 dob_child1 2004-10-10
     5:         5         29 dob_child1 2000-12-05
     6:         1         30 dob_child2 2000-01-29
     7:         2         27 dob_child2         NA
     8:         3         26 dob_child2 2004-04-05
     9:         4         32 dob_child2 2009-08-27
    10:         5         29 dob_child2 2005-02-28
    11:         1         30 dob_child3         NA
    12:         2         27 dob_child3         NA
    13:         3         26 dob_child3 2007-09-02
    14:         4         32 dob_child3 2012-07-21
    15:         5         29 dob_child3         NA
    > dcast(DT.m1, family_id + age_mother ~ child, value.var = "dob")
       family_id age_mother dob_child1 dob_child2 dob_child3
    1:         1         30 1998-11-26 2000-01-29         NA
    2:         2         27 1996-06-22         NA         NA
    3:         3         26 2002-07-11 2004-04-05 2007-09-02
    4:         4         32 2004-10-10 2009-08-27 2012-07-21
    5:         5         29 2000-12-05 2005-02-28         NA
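
    The transcript above reads melt_default.csv, which is not included with this post. As a minimal sketch, the same round trip can be reproduced with an inline table. The values are copied from the printed output above, and storing the dates as character strings is an assumption (fread may parse them as a date class, depending on the data.table version). The name DT.c1 is introduced here only for illustration.

    ```r
    library(data.table)

    # Inline stand-in for melt_default.csv (assumption: dates kept as character).
    DT <- data.table(
      family_id  = 1:5,
      age_mother = c(30L, 27L, 26L, 32L, 29L),
      dob_child1 = c("1998-11-26", "1996-06-22", "2002-07-11", "2004-10-10", "2000-12-05"),
      dob_child2 = c("2000-01-29", NA, "2004-04-05", "2009-08-27", "2005-02-28"),
      dob_child3 = c(NA, NA, "2007-09-02", "2012-07-21", NA)
    )

    # Wide -> long: every measured column becomes a set of rows;
    # columns not listed in measure.vars are treated as id columns.
    DT.m1 <- melt(DT,
                  measure.vars  = c("dob_child1", "dob_child2", "dob_child3"),
                  variable.name = "child",
                  value.name    = "dob")

    # Long -> wide: id columns go on the left of the formula,
    # the column whose values become new column names goes on the right.
    DT.c1 <- dcast(DT.m1, family_id + age_mother ~ child, value.var = "dob")

    # Compare the result with the starting table (the key set by dcast is ignored).
    print(all.equal(DT, DT.c1, check.attributes = FALSE))
    ```

    In the formula passed to dcast, value.var names the column that fills the cells of the reshaped table. Combinations with no matching row are filled with NA by default.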
    

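    For more complex data, where several groups of columns (here the dob and gender columns) have to be melted at the same time, you can proceed as follows: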

    > DT <- fread("melt_enhanced.csv")
    > DT
       family_id age_mother dob_child1 dob_child2 dob_child3 gender_child1 gender_child2 gender_child3
    1:         1         30 1998-11-26 2000-01-29         NA             1             2            NA
    2:         2         27 1996-06-22         NA         NA             2            NA            NA
    3:         3         26 2002-07-11 2004-04-05 2007-09-02             2             2             1
    4:         4         32 2004-10-10 2009-08-27 2012-07-21             1             1             1
    5:         5         29 2000-12-05 2005-02-28         NA             2             1            NA
    > DT.m2 <- melt(DT, measure = patterns("^dob","^gender"), value.name = c("dob", "gender"))
    > DT.m2
        family_id age_mother variable        dob gender
     1:         1         30        1 1998-11-26      1
     2:         2         27        1 1996-06-22      2
     3:         3         26        1 2002-07-11      2
     4:         4         32        1 2004-10-10      1
     5:         5         29        1 2000-12-05      2
     6:         1         30        2 2000-01-29      2
     7:         2         27        2         NA     NA
     8:         3         26        2 2004-04-05      2
     9:         4         32        2 2009-08-27      1
    10:         5         29        2 2005-02-28      1
    11:         1         30        3         NA     NA
    12:         2         27        3         NA     NA
    13:         3         26        3 2007-09-02      1
    14:         4         32        3 2012-07-21      1
    15:         5         29        3         NA     NA
    > DT.c2 <- dcast(DT.m2, family_id + age_mother ~ variable, value.var = c("dob","gender"))
    > DT.c2
       family_id age_mother      dob_1      dob_2      dob_3 gender_1 gender_2 gender_3
    1:         1         30 1998-11-26 2000-01-29         NA        1        2       NA
    2:         2         27 1996-06-22         NA         NA        2       NA       NA
    3:         3         26 2002-07-11 2004-04-05 2007-09-02        2        2        1
    4:         4         32 2004-10-10 2009-08-27 2012-07-21        1        1        1
    5:         5         29 2000-12-05 2005-02-28         NA        2        1       NA
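
    The same steps can be sketched without melt_enhanced.csv by building the table inline. The values are copied from the printed output above, and storing the dates as character strings is an assumption. Note that with patterns(), the variable column holds the position of each column within its group (1, 2, 3) rather than the original column name, which is why dcast produces names such as dob_1 and gender_1.

    ```r
    library(data.table)

    # Inline stand-in for melt_enhanced.csv (assumption: dates kept as character).
    DT <- data.table(
      family_id     = 1:5,
      age_mother    = c(30L, 27L, 26L, 32L, 29L),
      dob_child1    = c("1998-11-26", "1996-06-22", "2002-07-11", "2004-10-10", "2000-12-05"),
      dob_child2    = c("2000-01-29", NA, "2004-04-05", "2009-08-27", "2005-02-28"),
      dob_child3    = c(NA, NA, "2007-09-02", "2012-07-21", NA),
      gender_child1 = c(1L, 2L, 2L, 1L, 2L),
      gender_child2 = c(2L, NA, 2L, 1L, 1L),
      gender_child3 = c(NA, NA, 1L, 1L, NA)
    )

    # Each regular expression in patterns() selects one group of columns;
    # each group is melted into its own value column (dob and gender).
    DT.m2 <- melt(DT,
                  measure.vars = patterns("^dob", "^gender"),
                  value.name   = c("dob", "gender"))

    # Cast both value columns back to wide format in a single call.
    DT.c2 <- dcast(DT.m2, family_id + age_mother ~ variable,
                   value.var = c("dob", "gender"))

    print(names(DT.c2))
    ```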
    
  • Original article: https://www.cnblogs.com/xihehe/p/8304673.html