  • Experiments on the NYC dataset (updated 7 Aug)

    These are my experiments on the NYC dataset.

    Here is the dataset link: https://sites.google.com/site/yangdingqi/home/foursquare-dataset

    Forgive me for being lazy and uploading a photo of my handwritten notes on the data preprocessing:

    The code is available on GitHub; here is the link:
    Binary Tests

    Taking each user's number of check-ins into account

    And this is the result of running the code on the cluster:

    unique user&venue checkin combination in test 18205
    unique user&venue checkin combination in test 72819
    max num in matrix 1.0
    max num in train 1.0
    I am beginning to model
    model has been fitted
    this is the binary model
    Time used: 4.789567
    Train_auc is 0.999504
    Test_aus is 0.654491
    /home/s2013258/.local/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
      "This module will be removed in 0.20.", DeprecationWarning)
    
    
    
    
    unique user&venue checkin combination in test 18205
    unique user&venue checkin combination in test 72819
    max num in matrix 257
    max num in train 205
    I am beginning to model
    model has been fitted
    this is the model that consider the checkin times
    Time used: 4.782983
    Train_auc is 0.999508
    Test_aus is 0.655189
    /home/s2013258/.local/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
      "This module will be removed in 0.20.", DeprecationWarning)

    As for the hybrid model, I have not tried it yet. TBC...
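    Judging by the "max num in matrix" lines in the logs, the two runs above differ only in how the user-venue matrix is filled: 1.0 for any visited venue in the binary test, and the number of check-ins in the other. A minimal sketch of that difference, assuming the check-ins are available as (user, venue) pairs; the function name and the toy IDs are hypothetical and not taken from the actual code:

```python
from collections import Counter


def build_interactions(checkins, binary=True):
    """Return {(user, venue): weight} from an iterable of (user, venue) check-ins."""
    # Count how many times each user checked in at each venue.
    counts = Counter(checkins)
    if binary:
        # Binary variant: every visited (user, venue) pair gets weight 1.0.
        return {pair: 1.0 for pair in counts}
    # Count variant: the weight is the number of check-ins.
    return {pair: float(n) for pair, n in counts.items()}


# Toy data with made-up IDs (not from the NYC dataset).
checkins = [("u1", "v1"), ("u1", "v1"), ("u1", "v2"), ("u2", "v1")]
binary = build_interactions(checkins, binary=True)
weighted = build_interactions(checkins, binary=False)

print(max(binary.values()))    # 1.0
print(max(weighted.values()))  # 2.0
```

    With the toy data, both variants contain the same three (user, venue) pairs; only the stored weights differ.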

    ## Hybrid Model

    Got some unexpected results!

    The GitHub link is the same; I have already updated it.

    Here is the result of running it on the cluster:

    unique user&venue checkin combination in test 18205
    unique user&venue checkin combination in test 72819
    max num in matrix 170
    max num in train 257
    I am beginning to model
    model has been fitted
    this is the model that consider the checkin times
    Time used: 4.2123550000000005
    Train_auc is 0.999521
    Test_aus is 0.653367
    Collabrative Filtering testAUC is: 0.682076
    Hybrid train auc is 0.518521
    Hybrig test auc is 0.514115
    /home/s2013258/.local/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
      "This module will be removed in 0.20.", DeprecationWarning)

    The hybrid model's train AUC (0.518521) and test AUC (0.514115) are both far lower than the ordinary CF test AUC of 0.682076, and only slightly better than random ranking (0.5).
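    For reading these numbers: AUC is the probability that a venue the user visited is scored above one the user did not visit, so 0.5 corresponds to random ranking. The post does not show how the code computes AUC (for example, whether it is averaged per user), so the following is only a sketch of the plain pairwise definition, with a hypothetical function name:

```python
def auc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Perfect ranking: every positive is scored above every negative.
print(auc([0.9, 0.8], [0.1, 0.2]))  # 1.0
# Half of the positive/negative pairs are ordered correctly.
print(auc([0.9, 0.1], [0.8, 0.2]))  # 0.5
```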

    In such a non-cold-start problem, maybe the item feature labels are unnecessary?

    But setting the model biases to zero does improve the AUC significantly.
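    One possible way to think about these two remarks, sketched under an assumption: that the hybrid model scores a (user, venue) pair as a dot product plus bias terms, and represents a venue as the sum of the embeddings of its features. The post does not show the model, so this formulation, the function names, and all numbers below are hypothetical. Under that assumption, venues described only by shared features become indistinguishable unless each venue also carries its own identity feature:

```python
def item_vector(features, feature_embeddings):
    """Sum the embeddings of an item's features (assumed hybrid representation)."""
    dim = len(next(iter(feature_embeddings.values())))
    vec = [0.0] * dim
    for f in features:
        for k, x in enumerate(feature_embeddings[f]):
            vec[k] += x
    return vec


def score(user_vec, item_vec, user_bias=0.0, item_bias=0.0):
    """Dot product plus optional bias terms; biases default to zero."""
    return sum(u * v for u, v in zip(user_vec, item_vec)) + user_bias + item_bias


# Toy embeddings with made-up numbers, for illustration only.
feature_embeddings = {
    "cat:cafe": [0.5, 0.1],
    "id:v1": [0.3, -0.2],
    "id:v2": [-0.4, 0.6],
}
user = [1.0, 2.0]

# Category feature only: two cafes get exactly the same score.
v1_cat = item_vector(["cat:cafe"], feature_embeddings)
v2_cat = item_vector(["cat:cafe"], feature_embeddings)
print(score(user, v1_cat) == score(user, v2_cat))  # True

# Category plus a per-venue identity feature: the venues can differ again.
v1_full = item_vector(["cat:cafe", "id:v1"], feature_embeddings)
v2_full = item_vector(["cat:cafe", "id:v2"], feature_embeddings)
print(score(user, v1_full) != score(user, v2_full))  # True
```

    Whether this explains the low hybrid AUC here would need to be checked against the actual feature matrix passed to the model.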

  • Original post: https://www.cnblogs.com/fassy/p/7281663.html