Outline of the research (updated 8 Aug)


1. Test on Foursquare data

a) Binary matrix

b) Count matrix (number of times each user visited each place)

c) Hybrid model

d) Consider the context (for example: time, location)
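As a minimal sketch of variants a) and b), assuming the check-ins are available as (user, place) pairs with one pair per visit, the two matrices could be built as follows. The function name `build_matrices` and the toy data are illustrative only and not part of the actual dataset.

```python
from collections import Counter

def build_matrices(checkins):
    """Build binary and count user-by-place matrices.

    checkins: iterable of (user, place) pairs, one pair per visit
    (an assumed input format, chosen for illustration).
    """
    checkins = list(checkins)
    users = sorted({u for u, _ in checkins})
    places = sorted({p for _, p in checkins})
    visits = Counter(checkins)  # (user, place) -> number of visits
    # Count matrix: how many times each user visited each place.
    count = [[visits[(u, p)] for p in places] for u in users]
    # Binary matrix: 1 if the user visited the place at least once.
    binary = [[1 if c > 0 else 0 for c in row] for row in count]
    return users, places, binary, count

# Hypothetical toy data.
toy_checkins = [("u1", "cafe"), ("u1", "cafe"), ("u1", "gym"), ("u2", "gym")]
users, places, binary, count = build_matrices(toy_checkins)
```

The binary matrix discards visit frequency, while the count matrix keeps it; a hybrid model could draw on both.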

2. Similarity metrics:

Since we are going to use friends' check-ins to make predictions, it is important to choose the right friends. We also want to see whether there are biases in existing similarity metrics.

We plan to use artificial matrices to evaluate the metrics.
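To illustrate how two common metrics can disagree on an artificial example, here is a small sketch comparing cosine similarity on visit counts with Jaccard similarity on the sets of visited places. The choice of these two metrics and the toy vectors are assumptions made for illustration; the notes do not say which metrics will be evaluated.

```python
import math

def cosine(a, b):
    """Cosine similarity between two visit-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def jaccard(a, b):
    """Jaccard similarity between the sets of visited places."""
    set_a = {i for i, x in enumerate(a) if x > 0}
    set_b = {i for i, y in enumerate(b) if y > 0}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

# Artificial users: same two places, opposite visit frequencies.
user_a = [5, 0, 1, 0]
user_b = [1, 0, 5, 0]

sim_cosine = cosine(user_a, user_b)    # 10 / 26, roughly 0.38
sim_jaccard = jaccard(user_a, user_b)  # 1.0, frequencies are ignored
```

Here Jaccard treats the two users as identical, while cosine sees them as only moderately similar; an artificial matrix can be designed to expose this kind of bias.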

3. Comparison between old-fashioned CF (collaborative filtering) and MF (matrix factorization):

Adjust the parameters and measure the gap between the best results of old-fashioned CF and of MF.

Updated 8 Aug

After reading some papers, I have come to believe that CF is unlikely to outperform MF. So the point of this experiment may instead be to estimate the probability that a user returns to a place he has visited before. How much do his similar friends influence that decision?

But still, what is the point of doing this? Is there anything we could dig into here?

And of course, if we are going to test the probability that a person returns to a place he has visited before, then the construction of the train and test matrices will differ a little from what we did before. We will not use the leave-k-out method; instead we will separate the data along a timeline.

Here is something on splitting data by time:
http://scikit-learn.org/stable/modules/cross_validation.html
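A minimal sketch of the timeline split, assuming each check-in is a (user, place, timestamp) triple and that a single cutoff separates train from test. It also computes the fraction of test check-ins that are revisits, which is the quantity discussed above. The helper names and the toy data are hypothetical; scikit-learn's `TimeSeriesSplit` (in `sklearn.model_selection`) offers a cross-validated variant of the same idea.

```python
def split_by_time(checkins, cutoff):
    """Check-ins before the cutoff go to train, the rest to test."""
    train = [c for c in checkins if c[2] < cutoff]
    test = [c for c in checkins if c[2] >= cutoff]
    return train, test

def revisit_rate(train, test):
    """Fraction of test check-ins at a place the same user
    already visited during the training period."""
    seen = {(user, place) for user, place, _ in train}
    if not test:
        return 0.0
    hits = sum((user, place) in seen for user, place, _ in test)
    return hits / len(test)

# Hypothetical toy data: (user, place, timestamp).
toy = [
    ("u1", "cafe", 1),
    ("u1", "gym", 2),
    ("u2", "gym", 3),
    ("u1", "cafe", 4),  # revisit: u1 was at the cafe at time 1
    ("u2", "bar", 5),   # new place for u2
]
train, test = split_by_time(toy, cutoff=4)
rate = revisit_rate(train, test)
```

Unlike leave-k-out, this split never lets the model see check-ins that happen after the ones it is asked to predict.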

But I do not see much worth looking into in this direction, so I will put it on hold and move on to the next step: looking at domain-specific biases.

4. Domain-specific biases:

We preprocess the training data taking geographical factors into account. There is already some work on this; some authors build a multi-center joint model that considers user interest and geographical influence separately and multiplies the two probabilities in the final step.
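In symbols, the final fusion step of such a joint model can be sketched as below. The notation is my own and is only an assumption about the general shape of the product; it is not taken from any specific paper.

```latex
% u: user, l: location (assumed notation)
P(l \mid u) \;\propto\; P_{\text{interest}}(l \mid u) \cdot P_{\text{geo}}(l \mid u)
```

Here $P_{\text{interest}}$ stands for the preference estimated from the check-in matrix and $P_{\text{geo}}$ for the probability of visiting $l$ given the user's activity centers.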

What we are trying to do here: filter out some negative examples and, one step further, strengthen some positive examples.

For example:

That a user in Enschede does not go to a Chinese restaurant in Amsterdam does not mean that he dislikes Chinese restaurants; that is to say, it should not be considered a negative example.

Besides, there should be some positive examples that deserve to be strengthened. If a user travels far from his activity center (say, to Rotterdam) and goes to a Chinese restaurant there, it probably means that this user has a strong preference for Chinese restaurants.

So does this come down to making use of data far from the activity center?
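The filtering and strengthening idea can be sketched as a labeling rule based on distance from the user's activity center. Everything specific here is an assumption for illustration: the 30 km radius, the boost factor of 2.0, the approximate city coordinates, and the function names.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def label_example(center, place, visited, radius_km=30.0, boost=2.0):
    """Return a training weight, or None to drop the example.

    radius_km and boost are hypothetical parameters.
    """
    far = haversine_km(center[0], center[1], place[0], place[1]) > radius_km
    if visited:
        # A visit far from the activity center is a stronger signal.
        return boost if far else 1.0
    # A missing visit far away is not evidence of dislike: drop it.
    return None if far else 0.0

# Approximate (latitude, longitude) values, for illustration only.
enschede = (52.22, 6.89)
amsterdam = (52.37, 4.90)
rotterdam = (51.92, 4.48)

unvisited_far = label_example(enschede, amsterdam, visited=False)
visited_far = label_example(enschede, rotterdam, visited=True)
unvisited_near = label_example(enschede, enschede, visited=False)
```

With these settings the unvisited restaurant in Amsterdam is dropped instead of being counted as a negative, and the visit in Rotterdam gets double weight.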

Note: the text in blue is the most recently written content.

Original article: https://www.cnblogs.com/fassy/p/7276864.html