  • R Language Study Notes (Part 4)

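    R supports a wide range of plots, some of them quite uncommon (or perhaps they only seem uncommon to me because data analysis is not my day job). Below is a summary of the plot types I have learned so far.

    Bar plot

    Bar plots are common and supported by most statistics software. They are a good way to present summarized data and convey the meaning behind the numbers concisely.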

    > library(vcd)
    > counts<-table(Arthritis$Improved)
    > counts

      None   Some Marked 
        42     14     28 
    > barplot(counts,main="Simple Bar Plot",xlab="Improvement",ylab="Frequency")
    > barplot(counts,main="Horizontal Bar Plot",xlab="Frequency",ylab="Improvement",horiz=TRUE)
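    Raw counts are hard to compare across tables of different sizes, so it sometimes helps to plot proportions instead. A minimal sketch, assuming the built-in mtcars data as a stand-in for Arthritis (which needs the vcd package); the variable names are illustrative:

    ```r
    # Tally the cars in each cylinder class (mtcars ships with base R)
    counts <- table(mtcars$cyl)

    # Convert counts to proportions that sum to 1
    props <- prop.table(counts)

    # Same kind of call as before; only the bar heights and axis label change
    barplot(props, main = "Share of Cars by Cylinder Count",
            xlab = "Cylinders", ylab = "Proportion")
    ```

    Because prop.table() keeps the names of the table, the bars are labeled exactly as they would be for the raw counts.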
    

      

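    Stacked bar plot

    This is a more capable version of the bar plot. Where a simple bar plot shows two dimensions (a category and its frequency), a stacked bar plot shows three: each bar is additionally split by a second categorical variable.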

    > library(vcd)
    > counts<-table(Arthritis$Improved,Arthritis$Treatment)
    > counts

             Placebo Treated
      None        29      13
      Some         7       7
      Marked       7      21
    > barplot(counts,main="Stacked Bar Plot",xlab="Treatment",ylab="Frequency",col=c("red","yellow","green"),legend=rownames(counts))
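    In a stacked bar plot each column of the table becomes one bar and each row becomes one colored segment, so the total height of a bar equals its column sum. A small sketch of that relationship, assuming mtcars in place of Arthritis so that it runs without vcd:

    ```r
    # Rows: number of cylinders; columns: transmission (0 = automatic, 1 = manual)
    counts <- table(mtcars$cyl, mtcars$am)
    colnames(counts) <- c("automatic", "manual")

    # Each bar's total height is the sum of its column
    bar_heights <- colSums(counts)

    barplot(counts, main = "Stacked Bar Plot",
            xlab = "Transmission", ylab = "Frequency",
            col = c("red", "yellow", "green"),
            legend.text = rownames(counts),
            ylim = c(0, max(bar_heights) * 1.1))
    ```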
    

      



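    Grouped bar plot

    This shows the same data as the stacked bar plot above; only the presentation differs. With beside=TRUE the segments are drawn side by side instead of on top of one another.

    > barplot(counts,main="Grouped Bar Plot",xlab="Treatment",ylab="Frequency",col=c("red","yellow","green"),legend=rownames(counts),beside=TRUE)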
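    With beside = TRUE, barplot() returns a matrix of bar midpoints with one entry per bar, which can be used to annotate the bars. A sketch, again assuming mtcars as the data:

    ```r
    counts <- table(mtcars$cyl, mtcars$am)
    colnames(counts) <- c("automatic", "manual")

    # mids has the same shape as counts: one x position per bar
    mids <- barplot(counts, beside = TRUE, main = "Grouped Bar Plot",
                    xlab = "Transmission", ylab = "Frequency",
                    col = c("red", "yellow", "green"),
                    legend.text = rownames(counts),
                    ylim = c(0, max(counts) * 1.2))

    # Print each count just above its bar
    text(x = as.vector(mids), y = as.vector(counts),
         labels = as.vector(counts), pos = 3)
    ```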
    

      

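    Bar plot of means

    To my eye this is the same kind of plot as the bar plot; graphically there is no real difference. The bars show a summary statistic (here the mean of each group) instead of counts.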

    > states<-data.frame(state.region,state.x77)
    > means<-aggregate(states$Illiteracy,by=list(state.region),FUN=mean)
    > means
            Group.1        x
    1     Northeast 1.000000
    2         South 1.737500
    3 North Central 0.700000
    4          West 1.023077
    > means<-means[order(means$x),]
    > means
            Group.1        x
    3 North Central 0.700000
    1     Northeast 1.000000
    4          West 1.023077
    2         South 1.737500
    > barplot(means$x,names.arg = means$Group.1)
    > title("Mean Illiteracy Rate")
    > # Widen the left margin and rotate the axis labels so the long bar names fit
    > par(mar=c(5,8,4,2))
    > par(las=2)
    > counts<-table(Arthritis$Improved)
    > barplot(counts,main="Treatment Outcome", horiz=TRUE, cex.names=0.8, names.arg = c("No Improvement","Some Improvement", "Marked Improvement"))
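    The same group means can also be computed with tapply(), which returns a named result that barplot() labels directly. This is a sketch of an alternative, not the approach used above:

    ```r
    # state.x77 is a matrix and state.region a factor; both are built into R
    illiteracy <- state.x77[, "Illiteracy"]

    # Mean illiteracy rate per region
    means <- tapply(illiteracy, state.region, mean)

    # Sort ascending; the region names travel with the values
    means <- means[order(means)]

    # The names become the bar labels, so names.arg is not needed
    barplot(means, main = "Mean Illiteracy Rate")
    ```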
    

      

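    Spinogram

    Similar to a stacked bar plot, except that every bar is rescaled to the same height, so the only thing that varies is the size of the colored segments within each group. This makes it well suited to comparing proportions within groups.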

    > library(vcd)
    > counts<-with(Arthritis,table(Treatment,Improved))
    > spine(counts,main="Spinogram Example")
    > counts
             Improved
    Treatment None Some Marked
      Placebo   29    7      7
      Treated   13    7     21
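    What a spinogram displays is the share of each outcome within a group. Those shares can be computed directly with prop.table(), and base R's spineplot() draws a comparable plot without vcd. A sketch that re-enters the counts printed above by hand:

    ```r
    # Counts copied from the table above: rows are treatments, columns are outcomes
    counts <- as.table(matrix(c(29, 13, 7, 7, 7, 21), nrow = 2,
                              dimnames = list(Treatment = c("Placebo", "Treated"),
                                              Improved = c("None", "Some", "Marked"))))

    # Share of each outcome within a treatment group; every row sums to 1
    props <- prop.table(counts, margin = 1)
    print(round(props, 3))

    # Base-graphics counterpart of vcd::spine() for a two-way table
    spineplot(counts, main = "Spinogram Example")
    ```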
    

      

    Pie chart

    The most familiar chart of all, so it needs little explanation.

    > library(plotrix)
    > par(mfrow=c(2,2))
    > slices<-c(10,12,4,16,8)
    > lbls<-c("US","UK","Australia","Germany","France")
    > pie(slices,labels=lbls,main="Simple Pie Chart")
    > 
    > pct<-round(slices/sum(slices)*100)
    > lbls2<-paste(lbls," ",pct,"%",sep="")
    > lbls2
    [1] "US 20%"       "UK 24%"       "Australia 8%" "Germany 32%"  "France 16%"  
    > pie(slices,labels=lbls2,col=rainbow(length(lbls2)),main="Pie Chart with Percentages")
    > pie3D(slices,labels=lbls,explode=0.1,main="3D Pie Chart")
    > mytable<-table(state.region)
    > lbls3<-paste(names(mytable),"\n",mytable,sep="") # assumed definition: region name plus its count
    > pie(mytable,labels=lbls3,main="Pie Chart from a Table\n(with sample sizes)")
    

      

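    Fan plot

    Similar to a pie chart, though this plot is seen far less often. The sectors are overlaid on one another, which makes their relative sizes easier to compare.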

    > library(plotrix)
    > slices<-c(10,12,4,16,8)
    > lbls<-c("US","UK","Australia","Germany","France")
    > fan.plot(slices,labels=lbls,main="Fan Plot")
    

      

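    Histogram

    A column chart and one of the most common plots. It looks like the bar plot described earlier, but instead of counting categories it divides a continuous variable into bins and shows how many observations fall into each.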

    > par(mfrow=c(2,2))
    > hist(mtcars$mpg)
    > hist(mtcars$mpg,breaks=12,col="red",xlab="Miles Per Gallon",main="Colored histogram with 12 bins")
    > hist(mtcars$mpg,freq=FALSE,col="red",xlab="Miles Per Gallon",main="Histogram, rug plot, density curve")
    > rug(jitter(mtcars$mpg)) # rug plot
    > lines(density(mtcars$mpg),col="blue",lwd=2) # density curve

    > x<-mtcars$mpg
    > h<-hist(x,breaks=12,col="red",xlab="Miles Per Gallon",main="Histogram with normal curve and box")
    > xfit<-seq(min(x),max(x),length=40)
    > yfit<-dnorm(xfit,mean=mean(x),sd=sd(x))
    > yfit<-yfit*diff(h$mids[1:2])*length(x)
    > lines(xfit,yfit,col="blue",lwd=2)
    > box()
    > mtcars$mpg
     [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4
    [17] 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
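    The line yfit*diff(h$mids[1:2])*length(x) rescales the normal density to the histogram's count axis: a density multiplied by the bin width gives the expected share of observations in a bin, and multiplying by the sample size turns that share into a count. The sketch below checks the same identity on the histogram's own bins (plot = FALSE computes the bins without drawing):

    ```r
    x <- mtcars$mpg

    # Compute the bins without drawing anything
    h <- hist(x, breaks = 12, plot = FALSE)

    # Bins are equally wide, so the distance between two midpoints is the bin width
    bin_width <- diff(h$mids[1:2])

    # density * bin width * n recovers the count in every bin
    rebuilt_counts <- h$density * bin_width * length(x)
    print(all.equal(rebuilt_counts, as.numeric(h$counts)))
    ```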
    

      

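    Kernel density plot

    This plot is less common. It is a little like a primitive version of a heat map: a smooth curve shows where the values of a continuous variable are concentrated.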

    > library(sm)
    > par(mfrow=c(2,1))
    > d<-density(mtcars$mpg)
    > plot(d)
    > plot(d,main="Kernel Density of Miles Per Gallon")
    > polygon(d,col="red",border="blue")
    > cyl.f<-factor(mtcars$cyl,levels=c(4,6,8),labels=c("4 cylinder","6 cylinder","8 cylinder"))
    > sm.density.compare(mtcars$mpg,mtcars$cyl,xlab="Miles Per Gallon")
    > title(main="MPG Distribution by Car Cylinders")
    > colfill<-c(2:(1+length(levels(cyl.f)))) # fill colors 2, 3, 4 for the legend; the assignment itself draws nothing
    > legend(locator(1),levels(cyl.f),fill=colfill) # click on the plot to place the legend
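    If the sm package is not available, group-wise densities can be overlaid with base graphics, and the legend can be placed at a fixed corner instead of waiting for a mouse click through locator(). A rough sketch; it uses density()'s default bandwidth for each group, so the curves will not match sm.density.compare() exactly:

    ```r
    # One kernel density estimate per cylinder class
    groups <- split(mtcars$mpg, mtcars$cyl)
    dens <- lapply(groups, density)

    # Colors 2, 3, 4, one per group
    colfill <- seq_along(dens) + 1

    # Axis limits wide enough for every curve
    xlim <- range(sapply(dens, function(d) range(d$x)))
    ylim <- c(0, max(sapply(dens, function(d) max(d$y))))

    # Empty plot first, then one line per group
    plot(NA, xlim = xlim, ylim = ylim, xlab = "Miles Per Gallon",
         ylab = "Density", main = "MPG Distribution by Car Cylinders")
    for (i in seq_along(dens)) lines(dens[[i]], col = colfill[i], lwd = 2)

    legend("topright", legend = paste(names(dens), "cylinder"), fill = colfill)
    ```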
    

      

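    Box plot

    This plot is an interesting one. It summarizes a set of observations with five numbers: the minimum, the lower quartile (25th percentile), the median, the upper quartile (75th percentile), and the maximum. It was the first time I had come across this way of looking at data, but these five statistics are used all the time in everyday analysis, which makes the box plot a very practical graphic.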
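    The five numbers can be printed directly with fivenum(), and boxplot.stats() returns the values that boxplot() actually draws. By default the whiskers stop at the most extreme observation within 1.5 box lengths of the box, so they coincide with the minimum and maximum only when there are no outliers, which happens to be the case for mtcars$mpg. A short sketch:

    ```r
    x <- mtcars$mpg

    # Tukey's five-number summary: min, lower hinge, median, upper hinge, max
    five <- fivenum(x)
    print(five)

    # Whisker ends, box edges and median as drawn by boxplot()
    drawn <- boxplot.stats(x)$stats
    print(drawn)

    # The center line is the median, which need not equal the mean
    print(c(median = median(x), mean = mean(x)))
    ```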

    > boxplot(mtcars$mpg,main="Box plot",ylab="Miles per Gallon")
    > boxplot(mpg~cyl,data=mtcars,main="Car Mileage Data", xlab="Number of Cylinders",ylab="Miles Per Gallon")
    > # Notched box plot: notches that do not overlap suggest the medians differ; varwidth=TRUE makes box width proportional to the square root of the group size
    > boxplot(mpg~cyl,data=mtcars,notch=TRUE,varwidth=TRUE,col="red",main="Car Mileage Data",xlab="Number of Cylinders",ylab="Miles Per Gallon")
    > # Box plots for two crossed factors
    > mtcars$cyl.f<-factor(mtcars$cyl,levels=c(4,6,8),labels=c("4","6","8"))
    > mtcars$am.f<-factor(mtcars$am,levels=c(0,1),labels=c("auto","standard"))
    > mtcars$am.f
     [1] standard standard standard auto auto auto auto auto auto 
    [10] auto auto auto auto auto auto auto auto standard
    [19] standard standard auto auto auto auto auto standard standard
    [28] standard standard standard standard standard
    Levels: auto standard
    > boxplot(mpg~am.f*cyl.f,data=mtcars,varwidth=TRUE,col=c("gold","darkgreen"),main="MPG Distribution by Auto Type",xlab="Auto Type",ylab="Miles Per Gallon")
    

      

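    Violin plot

    Used for the same kind of analysis as the box plot, but it shows the distribution of the variable more explicitly: a kernel density curve is mirrored around each box plot.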

    > library(vioplot)
    > x1<-mtcars$mpg[mtcars$cyl==4]
    > x2<-mtcars$mpg[mtcars$cyl==6]
    > x3<-mtcars$mpg[mtcars$cyl==8]
    > vioplot(x1,x2,x3,names=c("4 cyl","6 cyl","8 cyl"),col="gold")
    > title("Violin Plots of Miles Per Gallon",ylab="Miles Per Gallon",xlab="Number of Cylinders")
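    The outline of a violin is a kernel density estimate mirrored around a vertical axis. The following base-graphics sketch draws one such outline by hand to show the idea; it is an illustration only and not how vioplot is implemented, and the 0.4 half-width is an arbitrary choice:

    ```r
    x <- mtcars$mpg[mtcars$cyl == 4]
    d <- density(x)

    # Scale the density so the violin is at most 0.4 wide on each side
    half_width <- d$y / max(d$y) * 0.4

    # Right edge going up, then the mirrored left edge coming back down
    outline_x <- c(1 + half_width, rev(1 - half_width))
    outline_y <- c(d$x, rev(d$x))

    plot(NA, xlim = c(0.5, 1.5), ylim = range(d$x), xaxt = "n",
         xlab = "4 cyl", ylab = "Miles Per Gallon",
         main = "Hand-drawn Violin Outline")
    polygon(outline_x, outline_y, col = "gold")

    # Mark the median, as the box plot inside a violin would
    points(1, median(x), pch = 19)
    ```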
    

      

    Dot plot

    Another fairly common plot; the scatter plot can be seen as its more developed relative.

    > dotchart(mtcars$mpg, labels=row.names(mtcars),cex=.7,main="Gas Mileage for Car Models",xlab="Miles Per Gallon")
    > # Grouped dot plot
    > x<-mtcars[order(mtcars$mpg),]
    > x$cyl<-factor(x$cyl)
    > x$color[x$cyl==4] <- "red"
    > x$color[x$cyl==6] <- "blue"
    > x$color[x$cyl==8] <- "darkgreen"
    > dotchart(x$mpg,labels=row.names(x),cex=.7,groups=x$cyl,gcolor="black",color=x$color,pch=19,main="Gas Mileage for Car Models\ngrouped by cylinder", xlab="Miles Per Gallon")
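    The three separate color assignments can be collapsed into a single lookup against a named vector. A sketch of that variant (cyl_colors is an illustrative name):

    ```r
    x <- mtcars[order(mtcars$mpg), ]
    x$cyl <- factor(x$cyl)

    # Map each cylinder level to a color through a named lookup vector
    cyl_colors <- c("4" = "red", "6" = "blue", "8" = "darkgreen")
    x$color <- unname(cyl_colors[as.character(x$cyl)])

    dotchart(x$mpg, labels = row.names(x), cex = 0.7, groups = x$cyl,
             gcolor = "black", color = x$color, pch = 19,
             main = "Gas Mileage for Car Models\ngrouped by cylinder",
             xlab = "Miles Per Gallon")
    ```

    Adding a fourth group then only requires one more entry in cyl_colors.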
    

      

  • Original article: https://www.cnblogs.com/GhostBear/p/7592318.html