  • R Programming week 3-Loop functions

    Looping on the Command Line

    Writing for and while loops is useful when programming, but not particularly easy when working interactively on the command line. R provides several functions that implement looping to make life easier.

    lapply: Loop over a list and evaluate a function on each element

    sapply: Same as lapply but try to simplify the result

    apply: Apply a function over the margins of an array

    tapply: Apply a function over subsets of a vector

    mapply: Multivariate version of lapply

    An auxiliary function, split, is also useful, particularly in conjunction with lapply.

    lapply

    lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.

    ## function (X, FUN, ...)

    ## {

    ## FUN <- match.fun(FUN)

    ## if (!is.vector(X) || is.object(X))

    ## X <- as.list(X)

    ## .Internal(lapply(X, FUN))

    ## }

    ## <bytecode: 0x7ff7a1951c00>

    ## <environment: namespace:base>

    The actual looping is done internally in C code.

    lapply always returns a list, regardless of the class of the input.

    x <- list(a = 1:5, b = rnorm(10))

    lapply(x, mean)

    x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

    lapply(x, mean)

    > x <- 1:4

    > lapply(x, runif)
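
    The third argument described earlier, ..., can be illustrated with the same runif call. The sketch below assumes we want draws between 0 and 10; the seed and the name draws are only there for illustration.

```r
# Extra arguments after FUN are forwarded to FUN through lapply's ... argument.
set.seed(1)  # only so the sketch is reproducible
x <- 1:4

# Each element of x becomes the first argument of runif (n);
# min and max are passed along unchanged on every call.
draws <- lapply(x, runif, min = 0, max = 10)

lengths(draws)        # one vector per element of x, with 1, 2, 3 and 4 draws
range(unlist(draws))  # every draw falls between 0 and 10
```

    Without min and max, runif falls back to its defaults and draws from the interval between 0 and 1.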

    lapply and friends make heavy use of anonymous functions.

    > x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))

    > x

    $a

    [,1] [,2]

    [1,] 1 3

    [2,] 2 4

    $b

    [,1] [,2]

    [1,] 1 4

    [2,] 2 5

    [3,] 3 6

    An anonymous function for extracting the first column of each matrix.

    > lapply(x, function(elt) elt[,1])

    $a

    [1] 1 2

    $b

    [1] 1 2 3

    sapply

    > x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

    > lapply(x, mean)
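
    The call above still uses lapply, so it returns a list. As a rough sketch of the difference, sapply on the same input simplifies the result: when every element of the result has length 1 it returns a vector, when every element has the same length greater than 1 it returns a matrix, and otherwise it falls back to a list. The seed and the names as_list and as_vec are illustrative only.

```r
set.seed(1)  # only so the sketch is reproducible
x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

as_list <- lapply(x, mean)  # a list of four length-1 numeric vectors
as_vec  <- sapply(x, mean)  # the same means, simplified to a named numeric vector

class(as_list)  # "list"
class(as_vec)   # "numeric"
names(as_vec)   # "a" "b" "c" "d"
```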

    apply

    apply is used to evaluate a function (often an anonymous one) over the margins of an array.

    It is most often used to apply a function to the rows or columns of a matrix.

    It can be used with general arrays, e.g. taking the average of an array of matrices.

    It is not really faster than writing a loop, but it works in one line!

    > str(apply)

    function (X, MARGIN, FUN, ...)

    X is an array

    MARGIN is an integer vector indicating which margins should be “retained”.

    FUN is a function to be applied

    ... is for other arguments to be passed to FUN

    > x <- matrix(rnorm(200), 20, 10)

    > apply(x, 2, mean)

    [1] 0.04868268 0.35743615 -0.09104379

    [4] -0.05381370 -0.16552070 -0.18192493

    [7] 0.10285727 0.36519270 0.14898850

    [10] 0.26767260

    col/row sums and means

    For sums and means of matrix dimensions, we have some shortcuts.

    rowSums = apply(x, 1, sum)

    rowMeans = apply(x, 1, mean)

    colSums = apply(x, 2, sum)

    colMeans = apply(x, 2, mean)

    The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
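
    A quick way to convince yourself that the shortcuts and the apply calls agree is to compare them on a random matrix. This is only a sanity check under an arbitrary seed, not a benchmark.

```r
set.seed(1)  # only so the sketch is reproducible
x <- matrix(rnorm(200), 20, 10)

# Each shortcut should match the corresponding apply call up to rounding error.
all.equal(rowSums(x),  apply(x, 1, sum))
all.equal(rowMeans(x), apply(x, 1, mean))
all.equal(colSums(x),  apply(x, 2, sum))
all.equal(colMeans(x), apply(x, 2, mean))
```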

    Other Ways to Apply

    Quantiles of the rows of a matrix.

    > x <- matrix(rnorm(200), 20, 10)

    > apply(x, 1, quantile, probs = c(0.25, 0.75))
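
    When FUN returns more than one value, apply stacks the results as columns, so the call above gives a 2 x 20 matrix with one column per row of x. The sketch below checks that shape and then shows the general-array case mentioned earlier, averaging an array of matrices; the array a and the names q and avg are made up for illustration.

```r
set.seed(1)  # only so the sketch is reproducible
x <- matrix(rnorm(200), 20, 10)

# quantile returns two values per row, so the result has 2 rows and 20 columns.
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)
rownames(q)  # "25%" "75%"

# An array holding ten 2 x 2 matrices; keep margins 1 and 2, average over the third.
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))
avg <- apply(a, c(1, 2), mean)
dim(avg)

# rowMeans with dims = 2 computes the same average.
all.equal(avg, rowMeans(a, dims = 2))
```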

    mapply

    mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

    > str(mapply)

    function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

    FUN is a function to apply

    ... contains arguments to apply over

    MoreArgs is a list of other arguments to FUN.

    SIMPLIFY indicates whether the result should be simplified

    The following is tedious to type

    list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

    Instead we can do
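
    One way to write this more compactly, sketched here with mapply walking over two vectors in parallel (the names short and long are illustrative):

```r
# mapply calls rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1) in turn.
short <- mapply(rep, 1:4, 4:1)

# The hand-written version from above, for comparison.
long <- list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

str(short)              # a list of four vectors
all.equal(short, long)  # same values (integer versus double storage aside)
```

    Because the four results have different lengths, mapply cannot simplify them and returns a list.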

    Vectorizing a Function

    > noise <- function(n, mean, sd) {

    + rnorm(n, mean, sd)

    + }

    > noise(5, 1, 2)

    [1] 2.4831198 2.4790100 0.4855190 -1.2117759

    [5] -0.2743532

    > noise(1:5, 1:5, 2)

    [1] -4.2128648 -0.3989266 4.2507057 1.1572738

    [5] 3.7413584

    Instant Vectorization

    > mapply(noise, 1:5, 1:5, 2)

    Which is the same as

    list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
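
    The plain call noise(1:5, 1:5, 2) returns only five numbers because rnorm, given a vector for n, uses the length of that vector as the number of draws. The mapply version instead yields one vector per (n, mean) pair. A small sketch of that difference follows; the seed and the names draws and draws2 are illustrative.

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

set.seed(1)  # only so the sketch is reproducible
draws <- mapply(noise, 1:5, 1:5, 2)
lengths(draws)  # 1 2 3 4 5: the i-th element holds i draws with mean i

# The constant sd can also be supplied through MoreArgs instead of being recycled.
set.seed(1)
draws2 <- mapply(noise, 1:5, 1:5, MoreArgs = list(sd = 2))
all.equal(draws, draws2)
```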

    tapply

    tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.

    > str(tapply)

    function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

    X is a vector

    INDEX is a factor or a list of factors (or else they are coerced to factors)

    FUN is a function to be applied

    ... contains other arguments to be passed to FUN

    simplify indicates whether the result should be simplified

    Take group means.

    > x <- c(rnorm(10), runif(10), rnorm(10, 1))

    > f <- gl(3, 10)

    > f

    [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3

    [24] 3 3 3 3 3 3 3

    Levels: 1 2 3

    > tapply(x, f, mean)

    1 2 3

    0.1144464 0.5163468 1.2463678

    Take group means without simplification.

    > tapply(x, f, mean, simplify = FALSE)

    $`1`

    [1] 0.1144464

    $`2`

    [1] 0.5163468

    $`3`

    [1] 1.246368

    Find group ranges.

    > tapply(x, f, range)

    $`1`

    [1] -1.097309 2.694970

    $`2`

    [1] 0.09479023 0.79107293

    $`3`

    [1] 0.4717443 2.5887025

    split

    split takes a vector or other objects and splits it into groups determined by a factor or list of factors.

    > str(split)

    function (x, f, drop = FALSE, ...)

    x is a vector (or list) or data frame

    f is a factor (or coerced to one) or a list of factors

    drop indicates whether empty factor levels should be dropped

    A common idiom is split followed by an lapply.

    > lapply(split(x, f), mean)
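
    For a simple summary like the mean, the split-then-lapply idiom gives the same numbers as tapply; the difference is only in the shape of the result. A sketch of the comparison, reusing the x and f from the tapply section with an arbitrary seed (by_split and by_tapply are illustrative names):

```r
set.seed(1)  # only so the sketch is reproducible
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

groups <- split(x, f)             # a list with one numeric vector per level of f
by_split  <- lapply(groups, mean) # a list of three group means
by_tapply <- tapply(x, f, mean)   # the same means as a named one-dimensional array

names(by_split)  # "1" "2" "3"
all.equal(unname(unlist(by_split)), as.vector(by_tapply))
```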

    Splitting a Data Frame

    > library(datasets)

    > head(airquality)

    > s <- split(airquality, airquality$Month)

    > lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

    > sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
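
    The reason for the last variant is that airquality contains missing values in Ozone and Solar.R, so colMeans returns NA for those columns unless na.rm = TRUE is passed. A short sketch that makes the difference visible (cols, with_na and without_na are illustrative names):

```r
library(datasets)

s <- split(airquality, airquality$Month)  # one data frame per month
cols <- c("Ozone", "Solar.R", "Wind")

with_na    <- sapply(s, function(x) colMeans(x[, cols]))
without_na <- sapply(s, function(x) colMeans(x[, cols], na.rm = TRUE))

dim(without_na)    # 3 rows (variables) by 5 columns (months 5 to 9)
anyNA(with_na)     # TRUE: missing observations propagate into the means
anyNA(without_na)  # FALSE: missing observations are dropped before averaging
```

    sapply can simplify here because every call returns a vector of the same length, so the result is a matrix rather than a list.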

    Splitting on More than One Level

    > x <- rnorm(10)

    > f1 <- gl(2, 5)

    > f2 <- gl(5, 2)

    Interactions can create empty levels.

    > str(split(x, list(f1, f2)))

    Empty levels can be dropped.

    > str(split(x, list(f1, f2), drop = TRUE))

    List of 6

    $ 1.1: num [1:2] -0.378 0.445

    $ 1.2: num [1:2] 1.4066 0.0166

    $ 1.3: num -0.355

    $ 2.3: num 0.315

    $ 2.4: num [1:2] -0.907 0.723

    $ 2.5: num [1:2] 0.732 0.360
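
    To see where the empty levels come from, it helps to count the groups with and without drop. With 2 levels in f1 and 5 in f2 there are 10 possible combinations, but only 6 of them occur in the data. A sketch with an arbitrary seed (all_groups and kept are illustrative names):

```r
set.seed(1)  # only so the sketch is reproducible
x  <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

all_groups <- split(x, list(f1, f2))               # every f1 x f2 combination
kept       <- split(x, list(f1, f2), drop = TRUE)  # only combinations that occur

length(all_groups)            # 10
sum(lengths(all_groups) == 0) # 4 combinations have no observations
names(kept)                   # "1.1" "1.2" "1.3" "2.3" "2.4" "2.5"
```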

  • Original article: https://www.cnblogs.com/jpld/p/4446804.html