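  • Collaborative Filtering Recommender System

      II. Application Design of ALS

      1. Input data

      (1) Ratings file (rating.dat)

      This file has four fields in the format UserID::MovieID::Rating::Timestamp: the user ID, the movie ID, the rating, and the timestamp of the rating.

      User IDs range from 1 to 6040 and movie IDs from 1 to 3952. Ratings are whole stars from 1 to 5, and timestamps are in seconds. Each user has at least 20 ratings.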

    1::720::3::978300760
    1::1270::5::978300055
    1::527::5::978824195
    1::2340::3::978300103
    1::48::5::978824351
    1::1097::4::978301953
    1::1721::4::978300055
    1::1545::4::978824139
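
      The record layout above can be read with a few lines of plain Python. This is only a minimal sketch: the names `Rating` and `parse_rating` are made up for illustration, and the timestamp is assumed to count seconds since the Unix epoch.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: float
    timestamp: int  # assumed: seconds since the Unix epoch


def parse_rating(line: str) -> Rating:
    """Split one 'UserID::MovieID::Rating::Timestamp' record into typed fields."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), float(rating), int(timestamp))


# Three of the sample records shown above.
sample_lines = [
    "1::720::3::978300760",
    "1::1270::5::978300055",
    "1::527::5::978824195",
]
ratings = [parse_rating(line) for line in sample_lines]

# Convert the first timestamp to a readable UTC date.
first_rated_at = datetime.fromtimestamp(ratings[0].timestamp, tz=timezone.utc)
print(ratings[0], first_rated_at.isoformat())
```

      Keeping the rating as a float is a convenience for later numeric work; the file itself only contains whole numbers.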

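      (2) User information file (users.dat)

      This file has five fields in the format UserID::Gender::Age::Occupation::Zip-code: the user ID, gender, age, occupation, and zip code.

      User IDs range from 1 to 6040. Gender is M (male) or F (female). Age is a code for an age bracket rather than an exact age (for example, 1 means under 18 and 18 means 18-24). Occupation is one of 21 categories, coded 0 to 20. Zip-code is the user's postal code.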

    12::M::25::12::32793
    13::M::45::1::93304
    14::M::35::0::60126
    15::M::25::7::22903
    16::F::35::0::20670
    17::M::50::1::95350
    18::F::18::3::95825
    19::M::1::10::48073
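
      A user record can be parsed the same way. The sketch below is an illustration only: `User`, `parse_user`, and `AGE_BRACKETS` are hypothetical names, and the bracket table is taken from the MovieLens 1M README on the assumption that this file follows it (the sample value `1` in the Age field suggests bracket codes rather than ages in years).

```python
from typing import NamedTuple

# Age codes as listed in the MovieLens 1M README (assumed to apply to this file).
AGE_BRACKETS = {
    1: "Under 18",
    18: "18-24",
    25: "25-34",
    35: "35-44",
    45: "45-49",
    50: "50-55",
    56: "56+",
}


class User(NamedTuple):
    user_id: int
    gender: str
    age_code: int
    occupation: int
    zip_code: str  # kept as text so that leading zeros are not lost


def parse_user(line: str) -> User:
    """Split one 'UserID::Gender::Age::Occupation::Zip-code' record."""
    user_id, gender, age, occupation, zip_code = line.strip().split("::")
    return User(int(user_id), gender, int(age), int(occupation), zip_code)


user = parse_user("19::M::1::10::48073")
print(user, AGE_BRACKETS[user.age_code])
```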

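      (3) Movie information (movies.dat)

      This file has three fields in the format MovieID::Title::Genres: the movie ID, the title, and the genres.

      Movie IDs range from 1 to 3952. Titles are the standard titles provided by IMDB and include the release year. Genres are given by their actual names.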

    305::Ready to Wear (Pret-A-Porter) (1994)::Comedy
    306::Three Colors: Red (1994)::Drama
    307::Three Colors: Blue (1993)::Drama
    308::Three Colors: White (1994)::Drama
    309::Red Firecracker, Green Firecracker (1994)::Drama
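
      Movie titles may themselves contain a single colon (for example "Three Colors: Red"), so the record has to be split on the two-character delimiter `::`. A minimal sketch follows; `Movie` and `parse_movie` are illustrative names, and splitting genres on `|` is an assumption based on the MovieLens 1M README, since every sample above lists one genre only.

```python
import re
from typing import List, NamedTuple, Optional

# Matches a four-digit year in parentheses at the very end of the title.
YEAR_SUFFIX = re.compile(r"\((\d{4})\)\s*$")


class Movie(NamedTuple):
    movie_id: int
    title: str
    year: Optional[int]
    genres: List[str]


def parse_movie(line: str) -> Movie:
    """Split one 'MovieID::Title::Genres' record and extract the release year."""
    movie_id, title, genres = line.rstrip("\n").split("::")
    match = YEAR_SUFFIX.search(title)
    year = int(match.group(1)) if match else None
    return Movie(int(movie_id), title, year, genres.split("|"))


movie = parse_movie("306::Three Colors: Red (1994)::Drama")
print(movie)
```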

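      2. Running the program

      (1) Start IntelliJ IDEA and create a new Scala project -- configure the Project SDK and Scala SDK -- create a package -- import the Spark dependency jars (File--+Java--select all files in the jars folder under the Spark installation directory) -- create a Scala Class -- paste the code into the editor -- Edit Configuration -- Application (Name, Main Class, Program arguments (the directory containing the input data files)) -- Run movieALS. (As before, this program produces very long logs; use the method described earlier to show only Error-level logs.)

      (2) The console prompts for input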

    Got 1000209 ratings from 6040 users on 3706 movies.
    Please rate the following movie (1-5 (best), or 0 if not seen):
    Raiders of the Lost Ark (1981): 2
    Fargo (1996): 1
    Sixth Sense, The (1999): 5
    Princess Bride, The (1987): 4
    Terminator, The (1984): 3
    Toy Story (1995): 1
    Gladiator (2000): 0
    Blade Runner (1982): 5
    Who Framed Roger Rabbit? (1988): 2
    One Flew Over the Cuckoo's Nest (1975): 2
    Abyss, The (1989): 3
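
      The prompt says that 0 means "not seen", so those answers presumably carry no rating information and are left out before training. The sketch below replays the session above under that assumption; `parse_answer` and `my_ratings` are illustrative names, not taken from the original program.

```python
# The answers typed in the console session above, one "Title: score" per line.
console_lines = [
    "Raiders of the Lost Ark (1981): 2",
    "Fargo (1996): 1",
    "Sixth Sense, The (1999): 5",
    "Princess Bride, The (1987): 4",
    "Terminator, The (1984): 3",
    "Toy Story (1995): 1",
    "Gladiator (2000): 0",
    "Blade Runner (1982): 5",
    "Who Framed Roger Rabbit? (1988): 2",
    "One Flew Over the Cuckoo's Nest (1975): 2",
    "Abyss, The (1989): 3",
]


def parse_answer(line):
    """Split on the last ': ' so that colons inside a title would not break parsing."""
    title, score = line.rsplit(": ", 1)
    return title, int(score)


answers = [parse_answer(line) for line in console_lines]

# Assumption: a score of 0 ("not seen") is discarded rather than treated as a rating.
my_ratings = [(title, score) for title, score in answers if score > 0]
print(len(answers), "answers,", len(my_ratings), "usable ratings")
```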

      (3) Final results

    Training: 602251, validation: 198919, test: 199049
    RMSE (validation) = 0.8800459390646345 for the model trained with rank = 8, lambda = 0.1, and numIter = 10.
    RMSE (validation) = 0.8721775968513282 for the model trained with rank = 8, lambda = 0.1, and numIter = 20.
    RMSE (validation) = 3.7558695311242833 for the model trained with rank = 8, lambda = 10.0, and numIter = 10.
    RMSE (validation) = 3.7558695311242833 for the model trained with rank = 8, lambda = 10.0, and numIter = 20.
    RMSE (validation) = 0.8775399600881826 for the model trained with rank = 12, lambda = 0.1, and numIter = 10.
    RMSE (validation) = 0.8712666782228532 for the model trained with rank = 12, lambda = 0.1, and numIter = 20.
    RMSE (validation) = 3.7558695311242833 for the model trained with rank = 12, lambda = 10.0, and numIter = 10.
    RMSE (validation) = 3.7558695311242833 for the model trained with rank = 12, lambda = 10.0, and numIter = 20.
    The best model was trained with rank = 12 and lambda = 0.1, and numIter = 20, and its RMSE on the test set is 0.8688556104046699.
    The best model improves the baseline by 21.97%.
    Movies recommended for you:
     1: Anatomy (Anatomie) (2000)
     2: Bandits (1997)
     3: Welcome to Woop-Woop (1997)
     4: Across the Sea of Time (1995)
     5: Down to You (2000)
     6: Window to Paris (1994)
     7: In the Mouth of Madness (1995)
     8: Fall (1997)
     9: Zachariah (1971)
    10: Six-String Samurai (1998)
    11: If Lucy Fell (1996)
    12: Fifth Element, The (1997)
    13: Faraway, So Close (In Weiter Ferne, So Nah!) (1993)
    14: Steal Big, Steal Little (1995)
    15: What Happened Was... (1994)
    16: Rosencrantz and Guildenstern Are Dead (1990)
    17: Ayn Rand: A Sense of Life (1997)
    18: Guantanamera (1994)
    19: Big Blue, The (Le Grand Bleu) (1988)
    20: Coldblooded (1995)
    21: Eighth Day, The (Le Huitième jour ) (1996)
    22: Mother Night (1996)
    23: Matrix, The (1999)
    24: Loss of Sexual Innocence, The (1999)
    25: Chambermaid on the Titanic, The (1998)
    26: Wisdom (1986)
    27: Beautiful Thing (1996)
    28: Fight Club (1999)
    29: I Am Cuba (Soy Cuba/Ya Kuba) (1964)
    30: Once Upon a Time... When We Were Colored (1995)
    31: Dune (1984)
    32: Julien Donkey-Boy (1999)
    33: Postman, The (1997)
    34: Total Eclipse (1995)
    35: Gladiator (2000)
    36: Leather Jacket Love Story (1997)
    37: Lost Highway (1997)
    38: Bewegte Mann, Der (1994)
    39: Splendor (1999)
    40: Babyfever (1994)
    41: Love Serenade (1996)
    42: Hamlet (1996)
    43: Ghost in the Shell (Kokaku kidotai) (1995)
    44: After Life (1998)
    45: But I'm a Cheerleader (1999)
    46: Committed (2000)
    47: Blue in the Face (1995)
    48: Taffin (1988)
    49: Perfect Blue (1997)
    50: Cross of Iron (1977)
    
    Process finished with exit code 0
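
      The numbers in this log can be cross-checked with a little arithmetic. The sketch below uses only values printed above. It assumes that the best model is the one with the lowest validation RMSE and that the improvement is computed as (baseline - test) / baseline; the log does not state either rule, so treat both as assumptions.

```python
# Validation RMSE per (rank, lambda, numIter), copied from the log above.
validation_rmse = {
    (8, 0.1, 10): 0.8800459390646345,
    (8, 0.1, 20): 0.8721775968513282,
    (8, 10.0, 10): 3.7558695311242833,
    (8, 10.0, 20): 3.7558695311242833,
    (12, 0.1, 10): 0.8775399600881826,
    (12, 0.1, 20): 0.8712666782228532,
    (12, 10.0, 10): 3.7558695311242833,
    (12, 10.0, 20): 3.7558695311242833,
}

# Assumed selection rule: keep the combination with the lowest validation RMSE.
best_params = min(validation_rmse, key=validation_rmse.get)

# Assumed formula: improvement = (baseline - test) / baseline.
test_rmse = 0.8688556104046699
improvement = 0.2197
implied_baseline_rmse = test_rmse / (1.0 - improvement)

# The three splits add up to slightly more than the ratings read from the file.
split_total = 602251 + 198919 + 199049
extra_ratings = split_total - 1000209

print(best_params, round(implied_baseline_rmse, 4), extra_ratings)
```

      Under these assumptions the selected combination matches the one reported in the log (rank = 12, lambda = 0.1, numIter = 20), and the baseline RMSE would be roughly 1.11. The splits contain 10 more ratings than the 1000209 loaded from the file, which equals the number of non-zero ratings entered at the console, so those were apparently added to one of the splits. Every run with lambda = 10.0 yields the same RMSE of about 3.756 regardless of rank and iteration count, which suggests that regularization this strong pushes the predictions toward zero.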

      3. Code analysis

  • Original article: https://www.cnblogs.com/BigJunOba/p/9362969.html