  • Association analysis and visualization of market basket data in R

    Data format (one transaction id and one item per line):

    1001,Choclates
    1001,Pencil
    1001,Marker
    1002,Pencil
    1002,Choclates
    1003,Pencil
    1003,Coke
    1003,Eraser
    1004,Pencil
    1004,Choclates
    1004,Cookies
    1005,Marker
    1006,Pencil
    1006,Marker
    1007,Pencil
    1007,Choclates
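
    Each row above is a (transaction id, item) pair. As a minimal sketch of how such rows group into baskets, here is a base R version that needs no extra packages. The names tid, item and baskets are illustrative only and do not come from the original post; the item labels are copied verbatim from the sample data.

    ```r
    # Sample rows copied from the data above, one (transaction id, item) pair per position.
    tid  <- c(1001, 1001, 1001, 1002, 1002, 1003, 1003, 1003,
              1004, 1004, 1004, 1005, 1006, 1006, 1007, 1007)
    item <- c("Choclates", "Pencil", "Marker", "Pencil", "Choclates",
              "Pencil", "Coke", "Eraser", "Pencil", "Choclates", "Cookies",
              "Marker", "Pencil", "Marker", "Pencil", "Choclates")

    # Group the item labels by transaction id: one character vector per basket.
    baskets <- split(item, tid)

    # Number of baskets, and the contents of one of them.
    print(length(baskets))
    print(baskets[["1003"]])
    ```

    Grouping by transaction id like this gives one basket per transaction, which is the unit that association rule mining counts over.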

    R Source Code:

    # Install the arules package (needed only once)
    install.packages("arules");
    # Load the arules package
    library("arules");
    # Read the transaction file as an object of class "transactions"
    # file - path to the csv/txt file
    # format - "single" or "basket". For "basket" format, each line in the transaction data file represents a transaction
    #           where the items (item labels) are separated by the characters specified by sep. For "single" format,
    #           each line corresponds to a single item, containing at least ids for the transaction and the item.
    # rm.duplicates - TRUE/FALSE; whether duplicate items within a transaction are removed
    # cols - For the "single" format, cols is a numeric vector of length two giving the numbers of the columns (fields)
    #           with the transaction and item ids, respectively. For the "basket" format, cols can be a numeric scalar
    #           giving the number of the column (field) with the transaction ids. If cols = NULL (the default),
    #           the data are assumed to contain no transaction ids.
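    # sep - "," for csv, "\t" for tab delimited
    # Use forward slashes (or doubled backslashes) in Windows paths: a single backslash starts an escape sequence in R strings
    txn <- read.transactions(file="D:/Transactions_sample.csv", rm.duplicates=FALSE, format="single", sep=",", cols=c(1,2));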
    # Run the apriori algorithm
    basket_rules <- apriori(txn, parameter = list(support = 0.5, confidence = 0.9, target = "rules"));
    # Check the generated rules using inspect
    inspect(basket_rules);
    # If a large number of rules is generated, inspect specific rules by index
    inspect(basket_rules[1]);
    
    # Plot the frequency of each item in txn
    itemFrequencyPlot(txn);
    # See how the transaction file was read into the txn variable
    inspect(txn);
    
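    # Install arulesViz first if needed: install.packages("arulesViz")
    library(arulesViz)
    # arulesViz offers many plot types; here are a few good-looking ones. Each one takes the rules object as input.
    plot(basket_rules, shading="order", control=list(main = "Two-key plot"))
    plot(basket_rules, method="grouped")
    plot(basket_rules, method="graph")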
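
    To make the support and confidence thresholds (0.5 and 0.9) passed to apriori concrete, the sketch below recomputes both measures by hand for one candidate rule, {Choclates} => {Pencil}, using base R only. It is an illustration under stated assumptions, not how arules computes them internally: the helper name support and the variables tid, item and baskets are hypothetical and not part of the original post.

    ```r
    # Sample data from the post, one (transaction id, item) pair per position.
    tid  <- c(1001, 1001, 1001, 1002, 1002, 1003, 1003, 1003,
              1004, 1004, 1004, 1005, 1006, 1006, 1007, 1007)
    item <- c("Choclates", "Pencil", "Marker", "Pencil", "Choclates",
              "Pencil", "Coke", "Eraser", "Pencil", "Choclates", "Cookies",
              "Marker", "Pencil", "Marker", "Pencil", "Choclates")
    baskets <- split(item, tid)

    # Support of an itemset: the fraction of baskets containing all of its items.
    # (Hypothetical helper, written for this illustration only.)
    support <- function(items) {
      mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
    }

    # Rule {Choclates} => {Pencil}
    supp_rule <- support(c("Choclates", "Pencil"))   # baskets with both items / all baskets
    conf_rule <- supp_rule / support("Choclates")    # of the baskets with Choclates, share that also has Pencil
    lift_rule <- conf_rule / support("Pencil")       # confidence relative to the baseline frequency of Pencil

    print(c(support = supp_rule, confidence = conf_rule, lift = lift_rule))

    # The reverse rule {Pencil} => {Choclates} has lower confidence.
    conf_reverse <- supp_rule / support("Pencil")
    print(conf_reverse)
    ```

    On this sample, {Choclates} => {Pencil} has support 4/7 and confidence 1, so it clears both thresholds, while the reverse rule has confidence 4/6 and falls below 0.9.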

    References:

    [1] http://prdeepakbabu.wordpress.com/2010/11/13/market-basket-analysisassociation-rule-mining-using-r-package-arules/

    [2] http://www.maenchi.com/?p=172

  • Original article: https://www.cnblogs.com/kodyan/p/3738392.html