"R in Action": A Beginner's Study Notes (Part 2)

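This part covers R's data types and data input.

Factors

Variables can be classified as nominal, ordinal, or continuous.

Nominal: categories with no inherent order, such as the weather (cloudy, sunny).

Ordinal: values that have an order but not a quantity, such as mood (good, moderate, bad).

Continuous: values that have both a quantity and an order. "Continuous" here is not meant in the mathematical sense; it also covers discrete numeric data.

In R, nominal and ordinal variables are called factors.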

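The factor() function is introduced below.
diabetes <- c("Type1", "Type2", "Type1", "Type1")
diabetes <- factor(diabetes)
# factor() stores this vector as (1, 2, 1, 1) and internally associates 1 = Type1, 2 = Type2.
For an ordinal variable, specify ordered=TRUE in the factor() call.
status <- c("Poor", "Improved", "Excellent", "Poor")
status <- factor(status, ordered=TRUE)
# The vector is encoded as (3, 2, 1, 3), because levels are sorted alphabetically by default: 1 = Excellent, 2 = Improved, 3 = Poor.
How can we make sure that 1 = Poor, 2 = Improved, 3 = Excellent? Pass the levels explicitly:
status <- factor(status, ordered=TRUE, levels = c("Poor", "Improved", "Excellent"))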
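The effect of the levels argument can be checked by looking at the integer codes behind the factor. The following is a minimal sketch; the variable names default_status and explicit_status are made up for illustration and do not come from the book.

```r
status <- c("Poor", "Improved", "Excellent", "Poor")

# Default: levels are sorted alphabetically (Excellent, Improved, Poor)
default_status <- factor(status, ordered = TRUE)
levels(default_status)
as.integer(default_status)    # 3 2 1 3

# Explicit levels: Poor < Improved < Excellent
explicit_status <- factor(status, ordered = TRUE,
                          levels = c("Poor", "Improved", "Excellent"))
levels(explicit_status)
as.integer(explicit_status)   # 1 2 3 1
```

as.integer() exposes the internal codes, and levels() shows which label each code maps to.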
What is the difference between an ordered factor and an ordinary factor? Consider the following program:
patientID <- c(1,2,3,4)
age <- c(25,34,28,52)
diabetes <- c("Type1", "Type2", "Type1", "Type1")
status <- c("Poor", "Improved", "excellent", "Poor")
diabetes <- factor(diabetes)
status <- factor(status, ordered = TRUE)
patientdata <- data.frame(patientID, age, diabetes, status)
str(patientdata)
# Output follows

'data.frame':   4 obs. of  4 variables:
 $ patientID: num  1 2 3 4
 $ age      : num  25 34 28 52
 $ diabetes : Factor w/ 2 levels "Type1","Type2": 1 2 1 1
 $ status   : Ord.factor w/ 3 levels "excellent"<"Improved"<..: 3 2 1 3

summary(patientdata)
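# Output follows

   patientID         age         diabetes       status
 Min.   :1.00   Min.   :25.00   Type1:3   excellent:1
 1st Qu.:1.75   1st Qu.:27.25   Type2:1   Improved :1
 Median :2.50   Median :31.00             Poor     :2
 Mean   :2.50   Mean   :34.75
 3rd Qu.:3.25   3rd Qu.:38.50
 Max.   :4.00   Max.   :52.00
For the factors diabetes and status, summary() reports frequency counts rather than numeric summaries. str() shows diabetes as a plain Factor and status as an Ord.factor whose levels carry an order.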
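Beyond what str() prints, the practical difference is that order comparisons are defined only for ordered factors. Here is a small sketch of that; the names lv, nominal and ordinal are chosen for illustration only.

```r
status <- c("Poor", "Improved", "Excellent", "Poor")
lv <- c("Poor", "Improved", "Excellent")

nominal <- factor(status, levels = lv)                  # ordinary factor
ordinal <- factor(status, levels = lv, ordered = TRUE)  # ordered factor

is.ordered(nominal)   # FALSE
is.ordered(ordinal)   # TRUE

# Comparisons and max() follow the declared level order
below_top <- ordinal < "Excellent"   # TRUE TRUE FALSE TRUE
best <- max(ordinal)                 # Excellent
```

Applying < to the ordinary factor nominal would instead give NA values with a warning, since its levels have no order.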

Lists

Don't underestimate lists: a list in R can contain vectors, matrices, data frames, and even other lists.

mylist <- list(object1, ...)

mylist <- list(name1 = object1, name2 = object2, ...)

An example:
g <- "My First List"
h <- c(25, 26, 18, 39)
j <- matrix(1:10, nrow = 5)
k <- c("one", "two", "three")
mylist <- list(title = g, ages = h, j, k)
mylist
# Output follows

$title
[1] "My First List"

$ages
[1] 25 26 18 39

[[3]]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

[[4]]
[1] "one" "two" "three"
The elements are, in order: a string, a numeric vector, a matrix, and a character vector.
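Components of such a list can be retrieved by position or by name. The sketch below rebuilds the same list and shows the usual access forms; it only uses base R indexing.

```r
g <- "My First List"
h <- c(25, 26, 18, 39)
j <- matrix(1:10, nrow = 5)
k <- c("one", "two", "three")
mylist <- list(title = g, ages = h, j, k)

mylist[[2]]         # second component, by position
mylist[["ages"]]    # the same component, by name
mylist$title        # shorthand for named components
mylist[[3]][2, 1]   # row 2, column 1 of the matrix: 2
names(mylist)       # unnamed components get "" as their name

# Single brackets return a sub-list, not the component itself
class(mylist[2])    # "list"
```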

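Tips:

1. R has no scalars; a single value is a vector of length one.

2. Indices in R start at 1.

3. Variables cannot be declared; they come into existence on first assignment.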
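These three points can be checked at the console. The following is a minimal sketch; the variable names x, v and notes_demo_var are arbitrary.

```r
# 1. A single number is a vector of length one
x <- 5
is.vector(x)   # TRUE
length(x)      # 1

# 2. Indexing starts at 1; index 0 selects nothing
v <- c(10, 20, 30)
v[1]           # 10
v[0]           # numeric(0)

# 3. No declaration step: assignment creates the variable
notes_demo_var <- "created by assignment"
exists("notes_demo_var")   # TRUE
```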

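Data input

1. Keyboard input
mydata <- data.frame(age=numeric(0), gender=character(0), weight=numeric(0))
mydata <- edit(mydata)  # or fix(mydata)
2. Delimited text files
mydataframe <- read.table(file, header=logical_value, sep="delimiter", row.names="name")
Here file is a delimited ASCII text file, header is a logical value indicating whether the first row contains variable names, sep specifies the delimiter that separates the data values, and row.names is an optional parameter naming one or more variables that serve as row identifiers.

Example:
grade <- read.table("studentgrades.csv", header=TRUE, sep=",", row.names="STUDENTID")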
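To try read.table() without having studentgrades.csv on disk, one option is to write a small file first. The sketch below does that with a temporary file; the column names Math and Science and all the values are made up for illustration and are not the book's data.

```r
# Write a tiny comma-separated file (hypothetical data) to a temp location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("STUDENTID,Math,Science",
             "S001,90,85",
             "S002,72,88"),
           csv_path)

# First row holds variable names; STUDENTID becomes the row identifier
grades <- read.table(csv_path, header = TRUE, sep = ",",
                     row.names = "STUDENTID")

rownames(grades)          # "S001" "S002"
grades["S002", "Math"]    # 72
str(grades)

unlink(csv_path)          # remove the temporary file
```

Because STUDENTID is used for row.names, it no longer appears as a regular column, and rows can be selected by student ID.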

Original source: https://www.cnblogs.com/shyustc/p/4004014.html
