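R Reading Notes, Part 1: Variables, Vectors, Arrays, Matrices, Data Frames, File I/O, Control Flow
1. Creating vectors and matrices
Functions: c( ), length( ), mode( ), rbind( ), cbind( )
1) Create a vector, then get its length and type.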
> x1=c(2,4,6,8,0)
> x2=c(1,3,5,7,9)
> length(x1)
[1] 5
> mode(x1)
[1] "numeric"
> x1
[1] 2 4 6 8 0
> x1[3]
[1] 6
> a1=c(1:100)
> length(a1)
[1] 100
2) Create a matrix by combining vectors.
> rbind(x1,x2)
   [,1] [,2] [,3] [,4] [,5]
x1    2    4    6    8    0
x2    1    3    5    7    9
> cbind(x1,x2)
     x1 x2
[1,]  2  1
[2,]  4  3
[3,]  6  5
[4,]  8  7
[5,]  0  9
> m1=rbind(x1,x2)
> m1
   [,1] [,2] [,3] [,4] [,5]
x1    2    4    6    8    0
x2    1    3    5    7    9
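To make the difference between rbind( ) and cbind( ) concrete, here is a small sketch that checks the shape of each result with dim( ). The names m_rows and m_cols are illustrative and not part of the notes above.

```r
x1 <- c(2, 4, 6, 8, 0)
x2 <- c(1, 3, 5, 7, 9)

m_rows <- rbind(x1, x2)  # each vector becomes a row: 2 rows, 5 columns
m_cols <- cbind(x1, x2)  # each vector becomes a column: 5 rows, 2 columns

dim(m_rows)              # 2 5
dim(m_cols)              # 5 2
m_rows["x2", 3]          # row named "x2", column 3: 5

# Transposing one layout gives the values of the other
all(t(m_rows) == m_cols)
```

Indexing by row name works here because rbind( ) keeps the vector names as row names.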
2. Mean, sum, product, extremes, variance, standard deviation
Functions: mean( ), sum( ), min( ), max( ), var( ), sd( ), prod( )
> x=c(1:100)
> mean(x)
[1] 50.5
> sum(x)
[1] 5050
> max(x)
[1] 100
> min(x)
[1] 1
> var(X)
Error in is.data.frame(x) : object 'X' not found
> var(x)
[1] 841.6667
> prod(x)
[1] 9.332622e+157
> sd(x)
[1] 29.01149
The var(X) call fails because R is case-sensitive: X and x are different names.
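As a check on what var( ) and sd( ) compute, the sketch below rebuilds the sample variance by hand with the n - 1 denominator and compares it to the built-in result. The name manual_var is illustrative only.

```r
x <- 1:100
n <- length(x)

# Sample variance by hand: squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)

all.equal(manual_var, var(x))       # TRUE
all.equal(sqrt(var(x)), sd(x))      # TRUE: sd() is the square root of var()
all.equal(prod(1:5), factorial(5))  # TRUE: prod() over 1..k is k!
```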
3. Getting help
> help(matrix)
> help(mode)
4. Some useful functions
1) The which( ) function
> a=c(2,3,4,2,5,1,6,3,2,5,8,5,7,3)
> which.max(a)
[1] 11
> which.min(a)
[1] 6
> a[which.max(a)]
[1] 8
> which(a==2)
[1] 1 4 9
> a[which(a==2)]
[1] 2 2 2
> which(a>5)
[1]  7 11 13
> a[which(a>5)]
[1] 6 8 7
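which( ) turns a logical condition into integer positions, but the logical vector can also be used as an index directly. The sketch below compares the two; idx and mask are illustrative names.

```r
a <- c(2, 3, 4, 2, 5, 1, 6, 3, 2, 5, 8, 5, 7, 3)

idx  <- which(a > 5)  # integer positions: 7 11 13
mask <- a > 5         # logical vector, one TRUE/FALSE per element

identical(a[idx], a[mask])  # TRUE: both select 6 8 7
sum(mask)                   # how many elements exceed 5: 3

# which.max() reports only the first maximum; this form reports every tie
which(a == max(a))
```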
2) The seq( ) function
> seq(5,20)
 [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
> seq(5,121,by=2)
 [1]   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35  37  39  41  43  45  47
[23]  49  51  53  55  57  59  61  63  65  67  69  71  73  75  77  79  81  83  85  87  89  91
[45]  93  95  97  99 101 103 105 107 109 111 113 115 117 119 121
> seq(5,121,by=2,length=10)
Error in seq.default(5, 121, by = 2, length = 10) : too many arguments
> seq(5,121,length=10)
 [1]   5.00000  17.88889  30.77778  43.66667  56.55556  69.44444  82.33333  95.22222
 [9] 108.11111 121.00000
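The "too many arguments" error arises because the start, end, step, and length together over-determine the sequence. A short sketch of the valid combinations follows. It spells the argument out as length.out, which the abbreviated length= above matches partially; by_step and by_count are illustrative names.

```r
by_step  <- seq(5, 121, by = 2)            # step fixed, length follows
by_count <- seq(5, 121, length.out = 10)   # length fixed, step follows

length(by_step)     # 59
diff(by_count)[1]   # implied step: (121 - 5) / 9

# Start, step, and length are also a valid trio; the end point is derived
seq(5, by = 2, length.out = 10)
```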
3) The built-in letter sequence letters
> letters[1:30]
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v"
[23] "w" "x" "y" "z" NA  NA  NA  NA
5. Building matrices
> a1=c(1:12)
> matrix(a1,nrow=3,ncol=4)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> matrix(a1,nrow=4,ncol=3)
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
> matrix(a1,nrow=4,ncol=3,byrow=T)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
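matrix( ) fills column by column unless byrow is set. The sketch below looks up the same position in both layouts to show the effect; by_col and by_row are illustrative names, and TRUE is written out in place of the abbreviation T.

```r
a1 <- 1:12

by_col <- matrix(a1, nrow = 3, ncol = 4)                # default: fill down each column
by_row <- matrix(a1, nrow = 4, ncol = 3, byrow = TRUE)  # fill across each row

by_col[2, 3]   # 8
by_row[2, 3]   # 6

# For this input, the row-filled 4 x 3 matrix is the transpose
# of the column-filled 3 x 4 matrix
all(t(by_col) == by_row)
```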
6. Data frames
> x1=c(10,13,45,26,23,12,24,78,23,43,31,56)
> x2=c(20,65,32,32,27,87,60,13,42,51,77,35)
> x=data.frame(x1,x2)
> x
   x1 x2
1  10 20
2  13 65
3  45 32
4  26 32
5  23 27
6  12 87
7  24 60
8  78 13
9  23 42
10 43 51
11 31 77
12 56 35
> x=data.frame('重量'=x1,'运费'=x2)   # 重量 = weight, 运费 = shipping cost
> x
   重量 运费
1    10   20
2    13   65
3    45   32
4    26   32
5    23   27
6    12   87
7    24   60
8    78   13
9    23   42
10   43   51
11   31   77
12   56   35
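Once the data frame exists, its columns and rows can be pulled out in a few common ways. The sketch below uses English column names (weight and cost) and the name df as stand-ins; these are assumptions for illustration, not names from the notes.

```r
x1 <- c(10, 13, 45, 26, 23, 12, 24, 78, 23, 43, 31, 56)
x2 <- c(20, 65, 32, 32, 27, 87, 60, 13, 42, 51, 77, 35)
df <- data.frame(weight = x1, cost = x2)  # hypothetical column names

str(df)            # compact summary: 12 observations of 2 numeric variables
df$weight[3]       # third value of the weight column: 45
df[["cost"]][3]    # same idea using a column name as a string: 32

heavy <- df[df$weight > 40, ]  # keep only rows whose weight exceeds 40
nrow(heavy)                    # 4
```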
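7. A worked example
Simulate a roster of statistics majors (identified by student number), record each student's scores in three courses (mathematical analysis, linear algebra, and probability and statistics), and then run some summary statistics.
Generate the student numbers: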
> num=seq(10378001,10378100)
> num
  [1] 10378001 10378002 10378003 10378004 10378005 10378006 10378007 10378008 10378009
 [10] 10378010 10378011 10378012 10378013 10378014 10378015 10378016 10378017 10378018
 [19] 10378019 10378020 10378021 10378022 10378023 10378024 10378025 10378026 10378027
 [28] 10378028 10378029 10378030 10378031 10378032 10378033 10378034 10378035 10378036
 [37] 10378037 10378038 10378039 10378040 10378041 10378042 10378043 10378044 10378045
 [46] 10378046 10378047 10378048 10378049 10378050 10378051 10378052 10378053 10378054
 [55] 10378055 10378056 10378057 10378058 10378059 10378060 10378061 10378062 10378063
 [64] 10378064 10378065 10378066 10378067 10378068 10378069 10378070 10378071 10378072
 [73] 10378073 10378074 10378075 10378076 10378077 10378078 10378079 10378080 10378081
 [82] 10378082 10378083 10378084 10378085 10378086 10378087 10378088 10378089 10378090
 [91] 10378091 10378092 10378093 10378094 10378095 10378096 10378097 10378098 10378099
[100] 10378100
Generate scores for the three courses:
> x1=round(runif(100,min=80,max=100))
> x1
 [1]  98  83  95  81  86  96  89  94  86  98  94  96  85  98  96  82  94  91  98  82  98  99
[23]  86  91  88  81  89  95  89  83  86  99  80  87  99  82  86  84  97  98  93  82  82  91
[45]  83  97  81  90  84  99  82  80  99  83  96  83 100  91  81  83  84  95  94  90  93  84
[67]  95  97  95  90  96  87  90  88  84  92  84  84  92  93  97  83  96  89  80  90  92  98
[89]  87  82  83  87  82  98  86  93  86  94  98  86
> x2=round(rnorm(100,mean=80,sd=7))
> x2
 [1] 92 78 79 85 87 80 83 79 87 86 83 85 79 82 85 90 79 78 78 76 79 86 81 85 86 90 75 87 76
[30] 90 84 72 83 78 73 87 81 85 71 73 80 69 71 73 78 85 69 85 91 89 76 79 80 85 79 89 78 79
[59] 79 70 76 81 82 94 79 75 91 79 80 82 85 73 86 73 83 78 80 91 91 85 89 91 77 74 69 81 72
[88] 80 74 73 75 82 80 74 87 84 83 70 80 86
> x3=round(rnorm(100,mean=83,sd=18))
> x3
 [1]  92  49  74  61  80  88  67  75  86  95  84  64  74  88  61  97  59  54  57  77  36 100
[23]  69 100  93  84  68 100  94  92  86 100  83 100 100  87  96  92  86  81  74  82  80  81
[45]  81 100  88  92  95  67  91  70  61  84  83  80  93 100  78  92  90  54  72  50  92  91
[67]  68 100 100 100  62  76  89  94 100 100  83  92  88  80  51  91  93 100  99  98  49 100
[89]  65  75  66  59  74  96  56  77  91  81  92  89
> x3[which(x3>100)]=100
> x3
 [1]  92  49  74  61  80  88  67  75  86  95  84  64  74  88  61  97  59  54  57  77  36 100
[23]  69 100  93  84  68 100  94  92  86 100  83 100 100  87  96  92  86  81  74  82  80  81
[45]  81 100  88  92  95  67  91  70  61  84  83  80  93 100  78  92  90  54  72  50  92  91
[67]  68 100 100 100  62  76  89  94 100 100  83  92  88  80  51  91  93 100  99  98  49 100
[89]  65  75  66  59  74  96  56  77  91  81  92  89
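Because runif( ) and rnorm( ) draw fresh random numbers on every call, rerunning the commands above gives different scores each time. The sketch below makes a draw repeatable with set.seed( ) and caps the scores on both ends with pmin( ) and pmax( ). The seed value 42 and the lower bound of 0 are assumptions chosen for illustration; the notes above cap only the upper end at 100.

```r
set.seed(42)  # arbitrary seed, used only so the draw can be repeated
first <- round(rnorm(100, mean = 83, sd = 18))

set.seed(42)  # same seed again, so the same numbers come back
second <- round(rnorm(100, mean = 83, sd = 18))

identical(first, second)  # TRUE

# Clamp every score into [0, 100] in one vectorized step
clamped <- pmin(pmax(first, 0), 100)
range(clamped)
```

The pmin( ) call has the same effect on the upper end as x3[which(x3>100)]=100.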
Combine everything into a data frame and save it to disk:
> x=data.frame(num,x1,x2,x3)
> x
         num  x1 x2  x3
1   10378001  98 92  92
2   10378002  83 78  49
3   10378003  95 79  74
4   10378004  81 85  61
5   10378005  86 87  80
6   10378006  96 80  88
7   10378007  89 83  67
8   10378008  94 79  75
9   10378009  86 87  86
10  10378010  98 86  95
11  10378011  94 83  84
12  10378012  96 85  64
13  10378013  85 79  74
14  10378014  98 82  88
15  10378015  96 85  61
16  10378016  82 90  97
17  10378017  94 79  59
18  10378018  91 78  54
19  10378019  98 78  57
20  10378020  82 76  77
21  10378021  98 79  36
22  10378022  99 86 100
23  10378023  86 81  69
24  10378024  91 85 100
25  10378025  88 86  93
26  10378026  81 90  84
27  10378027  89 75  68
28  10378028  95 87 100
29  10378029  89 76  94
30  10378030  83 90  92
31  10378031  86 84  86
32  10378032  99 72 100
33  10378033  80 83  83
34  10378034  87 78 100
35  10378035  99 73 100
36  10378036  82 87  87
37  10378037  86 81  96
38  10378038  84 85  92
39  10378039  97 71  86
40  10378040  98 73  81
41  10378041  93 80  74
42  10378042  82 69  82
43  10378043  82 71  80
44  10378044  91 73  81
45  10378045  83 78  81
46  10378046  97 85 100
47  10378047  81 69  88
48  10378048  90 85  92
49  10378049  84 91  95
50  10378050  99 89  67
51  10378051  82 76  91
52  10378052  80 79  70
53  10378053  99 80  61
54  10378054  83 85  84
55  10378055  96 79  83
56  10378056  83 89  80
57  10378057 100 78  93
58  10378058  91 79 100
59  10378059  81 79  78
60  10378060  83 70  92
61  10378061  84 76  90
62  10378062  95 81  54
63  10378063  94 82  72
64  10378064  90 94  50
65  10378065  93 79  92
66  10378066  84 75  91
67  10378067  95 91  68
68  10378068  97 79 100
69  10378069  95 80 100
70  10378070  90 82 100
71  10378071  96 85  62
72  10378072  87 73  76
73  10378073  90 86  89
74  10378074  88 73  94
75  10378075  84 83 100
76  10378076  92 78 100
77  10378077  84 80  83
78  10378078  84 91  92
79  10378079  92 91  88
80  10378080  93 85  80
81  10378081  97 89  51
82  10378082  83 91  91
83  10378083  96 77  93
84  10378084  89 74 100
85  10378085  80 69  99
86  10378086  90 81  98
87  10378087  92 72  49
88  10378088  98 80 100
89  10378089  87 74  65
90  10378090  82 73  75
91  10378091  83 75  66
92  10378092  87 82  59
93  10378093  82 80  74
94  10378094  98 74  96
95  10378095  86 87  56
96  10378096  93 84  77
97  10378097  86 83  91
98  10378098  94 70  81
99  10378099  98 80  92
100 10378100  86 86  89
> write.table(x,file="e:/mark.txt",col.names=F,row.names=F,sep=" ")   # forward slash: "\m" is not a valid escape in an R string
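To confirm that a file written this way can be loaded again, here is a minimal round-trip sketch. It uses tempfile( ) in place of a fixed drive path so that it runs on any system, and it uses only the first three rows of the table above. The names scores, path, and back are illustrative.

```r
scores <- data.frame(
  num = 10378001:10378003,
  x1  = c(98, 83, 95),
  x2  = c(92, 78, 79),
  x3  = c(92, 49, 74)
)

path <- tempfile(fileext = ".txt")  # temporary file instead of a fixed path
write.table(scores, file = path, col.names = FALSE, row.names = FALSE, sep = " ")

# The file has no header row, so the column names are supplied on read
back <- read.table(path, header = FALSE, col.names = c("num", "x1", "x2", "x3"))

dim(back)                    # 3 4
all(back$x3 == scores$x3)    # TRUE
unlink(path)                 # remove the temporary file
```

Writing with col.names = FALSE, as in the notes, means whoever reads the file must already know what each column holds.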
Compute the mean score for each course:
> colMeans(x)
        num          x1          x2          x3
10378050.50       89.64       79.48       82.39
> colMeans(x)[c("x1","x2","x3")]
   x1    x2    x3
89.64 79.48 82.39
> apply(x,2,mean)
        num          x1          x2          x3
10378050.50       89.64       79.48       82.39
Find the highest and lowest score in each course, and each student's total:
> apply(x,2,max)
     num       x1       x2       x3
10378100      100       95      100
> apply(x,2,min)
     num       x1       x2       x3
10378001       80       62       40
> apply(x[c("x1","x2","x3")],1,sum)
 [1] 245 258 275 256 262 276 255 235 255 255 255 229 232 246 275 215 254 265 247 274 243 256
[23] 246 255 243 241 275 242 262 217 243 278 277 249 230 222 238 267 208 229 238 236 262 257
[45] 251 228 250 256 276 272 245 241 270 261 237 272 226 231 247 258 268 281 260 276 250 236
[67] 247 274 257 224 241 211 278 234 278 246 252 264 270 256 285 239 273 230 264 221 224 262
[89] 257 252 243 239 271 243 261 222 273 264 261 265
Note that these outputs do not match the table printed earlier (row 1 there sums to 98 + 92 + 92 = 282, not 245), so they appear to come from a different random draw.
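Row totals can also be computed with rowSums( ), which is purpose-built for this job. The sketch below compares it with the apply( ) form, using the first five rows of the table printed earlier. The names scores, subjects, total_fast, and total_apply are illustrative.

```r
scores <- data.frame(
  num = 10378001:10378005,
  x1  = c(98, 83, 95, 81, 86),
  x2  = c(92, 78, 79, 85, 87),
  x3  = c(92, 49, 74, 61, 80)
)
subjects <- c("x1", "x2", "x3")  # leave out num so student numbers are not summed

total_fast  <- rowSums(scores[subjects])
total_apply <- apply(scores[subjects], 1, sum)

all.equal(unname(total_fast), unname(total_apply))  # TRUE

# Lowest and highest score per course in one call
sapply(scores[subjects], range)
```

Selecting the subject columns first also keeps the student number out of summaries such as the maximum and minimum.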
Student with the highest total:
> which.max(apply(x[c("x1","x2","x3")],1,sum))
[1] 43
> x$num[which.max(apply(x[c("x1","x2","x3")],1,sum))]
[1] 10378043
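which.max( ) returns only the first position of the maximum, so a tie for first place would go unnoticed. The sketch below stores the totals once, lists every student tied for the top, and ranks the roster with order( ). It uses the first five rows of the table printed earlier; scores, total, best, and ranked are illustrative names.

```r
scores <- data.frame(
  num = 10378001:10378005,
  x1  = c(98, 83, 95, 81, 86),
  x2  = c(92, 78, 79, 85, 87),
  x3  = c(92, 49, 74, 61, 80)
)

# Compute the totals once and keep them as a column
scores$total <- rowSums(scores[c("x1", "x2", "x3")])

best <- which(scores$total == max(scores$total))  # every row tied for the top
scores$num[best]                                  # 10378001

# Full ranking, highest total first
ranked <- scores[order(scores$total, decreasing = TRUE), ]
head(ranked, 3)
```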