  • R Programming week 3-Loop functions

    Looping on the Command Line

    Writing for and while loops is useful when programming, but not particularly easy when working interactively on the command line. R provides several functions that implement looping to make life easier:

    lapply: Loop over a list and evaluate a function on each element

    sapply: Same as lapply but try to simplify the result

    apply: Apply a function over the margins of an array

    tapply: Apply a function over subsets of a vector

    mapply: Multivariate version of lapply

    An auxiliary function, split, is also useful, particularly in conjunction with lapply.

    lapply

    lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.

    ## function (X, FUN, ...)

    ## {

    ## FUN <- match.fun(FUN)

    ## if (!is.vector(X) || is.object(X))

    ## X <- as.list(X)

    ## .Internal(lapply(X, FUN))

    ## }

    ## <bytecode: 0x7ff7a1951c00>

    ## <environment: namespace:base>

    The actual looping is done internally in C code.

    lapply always returns a list, regardless of the class of the input.
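As a minimal sketch of the coercion and return-type rules described above, the snippet below calls lapply on an integer vector and on a data frame. The object names (res, df, col_means) are made up for illustration.

```r
# A plain integer vector is coerced with as.list() before looping
res <- lapply(1:3, function(i) i^2)
print(class(res))    # the result is a list, not a numeric vector
print(unlist(res))   # flatten it back to a vector when needed

# A data frame is a list of columns, so lapply() visits each column in turn
df <- data.frame(u = c(1, 2, 3), v = c(10, 20, 30))
col_means <- lapply(df, mean)
print(col_means)
```

Calling unlist on the result is one common way to get back to an atomic vector when every element has length 1.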

    x <- list(a = 1:5, b = rnorm(10))

    lapply(x, mean)

    x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

    lapply(x, mean)

    > x <- 1:4

    > lapply(x, runif)
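The third argument of lapply, `...`, forwards extra arguments to FUN. A short sketch, assuming we want uniform draws between 0 and 10 instead of runif's default range (the seed is arbitrary and only makes the run repeatable):

```r
set.seed(1)   # arbitrary seed, only so the run is repeatable
x <- 1:4

# min and max are not consumed by lapply(); they are forwarded to runif()
draws <- lapply(x, runif, min = 0, max = 10)

print(lengths(draws))   # element i holds i uniform draws
print(draws[[4]])
```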

    lapply and friends make heavy use of anonymous functions.

    > x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))

    > x

    $a

    [,1] [,2]

    [1,] 1 3

    [2,] 2 4

    $b

    [,1] [,2]

    [1,] 1 4

    [2,] 2 5

    [3,] 3 6

    An anonymous function for extracting the first column of each matrix.

    > lapply(x, function(elt) elt[,1])

    $a

    [1] 1 2

    $b

    [1] 1 2 3

    sapply

    > x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

    > lapply(x, mean)

    > sapply(x, mean)
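How sapply simplifies depends on the shape of the individual results. The sketch below uses small made-up inputs to show the three usual outcomes: a vector, a matrix, and a fallback to a list.

```r
x <- list(a = 1:4, b = 5:8)

means  <- sapply(x, mean)        # every result has length 1: named numeric vector
ranges <- sapply(x, range)       # every result has length 2: a 2 x 2 matrix
seqs   <- sapply(1:3, seq_len)   # result lengths differ: falls back to a list

print(means)
print(ranges)
print(class(seqs))
```

Because the return type depends on the data, code that needs a predictable type may prefer lapply, or vapply with an explicit template.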

    apply

    apply is used to evaluate a function (often an anonymous one) over the margins of an array.

    It is most often used to apply a function to the rows or columns of a matrix.

    It can be used with general arrays, e.g. taking the average of an array of matrices.

    It is not really faster than writing a loop, but it works in one line!

    > str(apply)

    function (X, MARGIN, FUN, ...)

    X is an array

    MARGIN is an integer vector indicating which margins should be “retained”.

    FUN is a function to be applied

    ... is for other arguments to be passed to FUN

    > x <- matrix(rnorm(200), 20, 10)

    > apply(x, 2, mean)

    [1] 0.04868268 0.35743615 -0.09104379

    [4] -0.05381370 -0.16552070 -0.18192493

    [7] 0.10285727 0.36519270 0.14898850

    [10] 0.26767260

    col/row sums and means

    For sums and means of matrix dimensions, we have some shortcuts.

    rowSums = apply(x, 1, sum)

    rowMeans = apply(x, 1, mean)

    colSums = apply(x, 2, sum)

    colMeans = apply(x, 2, mean)

    The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
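A quick way to convince yourself that the shortcuts compute the same values as the apply calls is to compare them on a random matrix. This is only a sketch; the seed and the variable names are arbitrary.

```r
set.seed(42)   # arbitrary seed so the run is repeatable
x <- matrix(rnorm(200), 20, 10)

# all.equal() compares numeric values with a small tolerance
row_sums_ok  <- isTRUE(all.equal(rowSums(x),  apply(x, 1, sum)))
row_means_ok <- isTRUE(all.equal(rowMeans(x), apply(x, 1, mean)))
col_sums_ok  <- isTRUE(all.equal(colSums(x),  apply(x, 2, sum)))
col_means_ok <- isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))

print(c(row_sums_ok, row_means_ok, col_sums_ok, col_means_ok))
```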

    Other Ways to Apply

    Quantiles of the rows of a matrix.

    > x <- matrix(rnorm(200), 20, 10)

    > apply(x, 1, quantile, probs = c(0.25, 0.75))
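Two details are worth checking by hand: the shape of the result when FUN returns more than one value, and the use of apply on a general array mentioned earlier. The sketch below uses a small made-up 2 x 3 x 4 array; the names a, avg, and avg2 are illustrative.

```r
set.seed(1)   # arbitrary seed so the run is repeatable
x <- matrix(rnorm(200), 20, 10)

# FUN returns two values per row, so the result has one column per row of x
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
print(dim(q))

# Four 2 x 3 matrices stacked along the third dimension
a <- array(1:24, dim = c(2, 3, 4))

# Retain margins 1 and 2, collapse the third: the element-wise average matrix
avg <- apply(a, c(1, 2), mean)
print(avg)

# rowMeans() with dims = 2 treats the first two dimensions as the "rows"
avg2 <- rowMeans(a, dims = 2)
```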

    mapply

    mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

    > str(mapply)

    function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

    FUN is a function to apply

    ... contains arguments to apply over

    MoreArgs is a list of other arguments to FUN.

    SIMPLIFY indicates whether the result should be simplified
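To make the roles of MoreArgs and SIMPLIFY concrete, here is a small sketch with made-up anonymous functions. Arguments in `...` are walked element by element, while everything in MoreArgs is passed unchanged to every call.

```r
# x and y are taken element by element; scale is the same for every call
sums <- mapply(function(x, y, scale) (x + y) * scale,
               1:3, 4:6,
               MoreArgs = list(scale = 10))
print(sums)        # simplified to a numeric vector

# SIMPLIFY = FALSE keeps one list element per call
sums_list <- mapply(function(x, y) x + y, 1:3, 4:6, SIMPLIFY = FALSE)
print(sums_list)
```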

    The following is tedious to type

    list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

    Instead we can do

    > mapply(rep, 1:4, 4:1)

    Vectorizing a Function

    > noise <- function(n, mean, sd) {

    + rnorm(n, mean, sd)

    + }

    > noise(5, 1, 2)

    [1] 2.4831198 2.4790100 0.4855190 -1.2117759

    [5] -0.2743532

    > noise(1:5, 1:5, 2)

    [1] -4.2128648 -0.3989266 4.2507057 1.1572738

    [5] 3.7413584

    Instant Vectorization

    > mapply(noise, 1:5, 1:5, 2)

    Which is the same as

    list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
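The difference between the two calls can be checked directly. When n is a vector, rnorm only uses its length, so noise(1:5, 1:5, 2) returns five values, one per mean. mapply instead makes five separate calls. A sketch, with an arbitrary seed and a deliberately tiny sd in the second check so the values sit next to their means:

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

set.seed(10)   # arbitrary seed so the run is repeatable

flat <- noise(1:5, 1:5, 2)            # one call: five draws in total
out  <- mapply(noise, 1:5, 1:5, 2)    # five calls: sd = 2 is recycled

print(length(flat))
print(lengths(out))

# With a tiny sd, call i returns i values that are all very close to i
tight <- mapply(noise, 1:5, 1:5, 1e-8)
print(round(unlist(tight)))
```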

    tapply

    tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.

    > str(tapply)

    function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

    X is a vector

    INDEX is a factor or a list of factors (or else they are coerced to factors)

    FUN is a function to be applied

    ... contains other arguments to be passed to FUN

    simplify, should we simplify the result?

    Take group means.

    > x <- c(rnorm(10), runif(10), rnorm(10, 1))

    > f <- gl(3, 10)

    > f

    [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3

    [24] 3 3 3 3 3 3 3

    Levels: 1 2 3

    > tapply(x, f, mean)

    1 2 3

    0.1144464 0.5163468 1.2463678

    Take group means without simplification.

    > tapply(x, f, mean, simplify = FALSE)

    $`1`

    [1] 0.1144464

    $`2`

    [1] 0.5163468

    $`3`

    [1] 1.246368

    Find group ranges.

    > tapply(x, f, range)

    $`1`

    [1] -1.097309 2.694970

    $`2`

    [1] 0.09479023 0.79107293

    $`3`

    [1] 0.4717443 2.5887025
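The outputs above depend on random data, so here is a small deterministic sketch of the same behaviour. The vector and the group labels ("low", "high") are made up for illustration.

```r
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3, labels = c("low", "high"))

m <- tapply(x, f, mean)    # one value per group: simplified to a named array
r <- tapply(x, f, range)   # two values per group: kept as a list

print(m)
print(r)
```

Simplification only happens when FUN returns a single value for every group, which is why the range call comes back as a list.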

    split

    split takes a vector or other object and splits it into groups determined by a factor or list of factors.

    > str(split)

    function (x, f, drop = FALSE, ...)

    x is a vector (or list) or data frame

    f is a factor (or coerced to one) or a list of factors

    drop indicates whether empty factor levels should be dropped

    A common idiom is split followed by an lapply.

    > lapply(split(x, f), mean)
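For a single summary value per group, split followed by sapply gives the same numbers as tapply. A sketch comparing the two on the same kind of data used above (the seed is arbitrary):

```r
set.seed(1)   # arbitrary seed so the run is repeatable
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

groups     <- split(x, f)          # a list with one numeric vector per level
via_split  <- sapply(groups, mean)
via_tapply <- tapply(x, f, mean)

print(lengths(groups))
print(via_split)
```

The split form is more flexible, because the pieces can be reused with any function, including ones that return whole vectors or data frames.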

    Splitting a Data Frame

    > library(datasets)

    > head(airquality)

    > s <- split(airquality, airquality$Month)

    > lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
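The reason for the final na.rm = TRUE call is that some columns of airquality contain missing values, and colMeans returns NA for any column with a missing entry. The sketch below checks this for May and cross-checks the Ozone row against a tapply call; the object names with_na, no_na, and ozone_by_month are illustrative.

```r
library(datasets)

s <- split(airquality, airquality$Month)
cols <- c("Ozone", "Solar.R", "Wind")

# Without na.rm, any month with a missing Ozone reading gets NA
with_na <- sapply(s, function(d) colMeans(d[, cols]))

# With na.rm = TRUE, missing readings are skipped
no_na <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))
print(no_na)

# Same monthly Ozone means computed directly on the vector
ozone_by_month <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)
```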

    Splitting on More than One Level

    > x <- rnorm(10)

    > f1 <- gl(2, 5)

    > f2 <- gl(5, 2)

    Interactions can create empty levels.

    > str(split(x, list(f1, f2)))

    Empty levels can be dropped.

    > str(split(x, list(f1, f2), drop = TRUE))

    List of 6

    $ 1.1: num [1:2] -0.378 0.445

    $ 1.2: num [1:2] 1.4066 0.0166

    $ 1.3: num -0.355

    $ 2.3: num 0.315

    $ 2.4: num [1:2] -0.907 0.723

    $ 2.5: num [1:2] 0.732 0.360
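To see where the empty groups come from, it helps to count them. With these two factors there are 2 x 5 = 10 possible combinations, but only 6 of them occur in the data. A sketch with an arbitrary seed:

```r
set.seed(1)   # arbitrary seed so the run is repeatable
x  <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

# A list of factors is combined into one factor, as interaction(f1, f2) would do
all_groups <- split(x, list(f1, f2))
kept       <- split(x, list(f1, f2), drop = TRUE)

print(lengths(all_groups))   # combinations that never occur have length 0
print(names(kept))
```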

  • Original article: https://www.cnblogs.com/jpld/p/4446804.html