Intro to Online machine learning

Online vs. offline

online
- Input processed piece by piece in a serial fashion
- Each new piece of information generates an event
- Not necessarily low latency
offline
- Input processed in batches
- Not necessarily high latency
NOTE: Online doesn't mean fast, and online doesn't mean streaming; online only means that information is processed as soon as it is received.

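Summary: online learning mainly means learning from data as soon as it is received, rather than waiting until all the data is available and then processing it (updating the model parameters) in batches.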
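
As a minimal sketch of the distinction above, the example below estimates a mean in both modes. The names `offline_mean` and `OnlineMean` are hypothetical and chosen only for illustration; they do not come from the talk.

```python
from typing import List


def offline_mean(batch: List[float]) -> float:
    """Offline: the whole dataset must be available before computing."""
    return sum(batch) / len(batch)


class OnlineMean:
    """Online: the estimate is updated as soon as each value is received."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, x: float) -> float:
        self.count += 1
        # Move the current estimate toward x by 1/count of the gap.
        self.mean += (x - self.mean) / self.count
        return self.mean


data = [2.0, 4.0, 6.0, 8.0]

online = OnlineMean()
# One estimate per record, available immediately after that record arrives.
history = [online.update(x) for x in data]

print(history)             # estimate after each record
print(offline_mean(data))  # single estimate after seeing everything
```

Both modes end at the same value here; the difference is that the online version has a usable estimate after every record, while the offline version has nothing until the full batch exists.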

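vs. Incremental Learning: In the paper "Incremental Learning from Scratch for Task-Oriented Dialogue Systems", incremental learning means that the model is also updated at test time, i.e. the trained model parameters are not frozen. In other words, whenever new data arrives, incremental learning does not rebuild the whole knowledge base; it starts from the existing knowledge base (the model already learned) and applies only the updates caused by the new data. However, incremental learning may process data one by one (or batch by batch).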
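
To illustrate "update only what the new data changes", here is a small sketch of a least-squares line kept as running sums and updated batch by batch. This is an illustrative stand-in, not the method of the cited paper; the class name `IncrementalLine` and the toy data are assumptions.

```python
from typing import Iterable, Tuple


class IncrementalLine:
    """Least-squares fit of y = a*x + b, stored as running sums.

    Old examples are never revisited: each new batch only adds to the sums.
    """

    def __init__(self) -> None:
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, batch: Iterable[Tuple[float, float]]) -> None:
        for x, y in batch:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self) -> Tuple[float, float]:
        # Closed-form simple linear regression from the accumulated sums.
        denom = self.n * self.sxx - self.sx ** 2
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a, b


model = IncrementalLine()

# Toy data generated from y = 2x + 1, arriving in two batches.
model.update([(0.0, 1.0), (1.0, 3.0)])
first_fit = model.coefficients()

model.update([(2.0, 5.0), (3.0, 7.0)])
second_fit = model.coefficients()

print(first_fit, second_fit)
```

The second call to `update` touches only the new batch, yet the resulting fit is the same one a full refit on all four points would give.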

Lambda vs. Kappa (Machine Learning)

Lambda
- Learning happens offline
- Model used by streaming engine to make decision online
Kappa
- Learning happens online
- Online decision model updates for each new record seen
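Summary: Lambda means training offline and then testing on examples one at a time (online); Kappa means training online (the model parameters are updated for every instance received), and the parameters are also updated at test time (incremental learning?).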
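
The two styles can be sketched with a deliberately trivial model that predicts the mean of the values it has learned from. This follows the machine-learning reading given in these notes, not the full Lambda and Kappa data architectures; `MeanPredictor`, `lambda_style`, and `kappa_style` are hypothetical names.

```python
from typing import List


class MeanPredictor:
    """Toy model: predicts the mean of every value it has learned from."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def predict(self) -> float:
        return self.mean

    def learn_one(self, y: float) -> None:
        self.n += 1
        self.mean += (y - self.mean) / self.n


def lambda_style(history: List[float], stream: List[float]) -> List[float]:
    """Learn offline on history, then score the stream with a frozen model."""
    model = MeanPredictor()
    for y in history:
        model.learn_one(y)
    return [model.predict() for _ in stream]


def kappa_style(stream: List[float]) -> List[float]:
    """Predict each record, then immediately update the model with it."""
    model = MeanPredictor()
    predictions = []
    for y in stream:
        predictions.append(model.predict())
        model.learn_one(y)
    return predictions


history = [1.0, 3.0]
stream = [10.0, 10.0, 10.0]  # the level has shifted since training

lambda_preds = lambda_style(history, stream)
kappa_preds = kappa_style(stream)
print(lambda_preds)
print(kappa_preds)
```

On this shifted stream the frozen model keeps predicting the old level, while the per-record model catches up after its first observation.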

Statistical vs. Adversarial

Traditional
- Common statistical methods: supervised and unsupervised
- Graded by statistical fitness tests and out of core testing e.g. MSE, MAPE, R2
Adversarial
- Algorithm versus environment e.g. vs Spammers, vs Hackers, vs Nature
- Graded directionally by some tests, but really by A/B testing: adversaries may get smarter over time
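Summary: adversarial learning refers to interaction with an environment (between two models), where learning happens by way of the other model.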
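
For reference, the three statistical metrics named in the Traditional list (MSE, MAPE, R2) can be computed as in the sketch below. The function names and the toy values are assumptions made for illustration; MAPE is reported here as a percentage and is undefined when a true value is zero.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.0, 5.0, 6.0, 7.0]

print(mse(y_true, y_pred), mape(y_true, y_pred), r2(y_true, y_pred))
```

These scores assume the data distribution holds still, which is why the adversarial setting leans on A/B testing instead.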

real-time
- Subjective
- A good buzzword for something that:
  - Doesn't fall into any of the above categories cleanly
  - Doesn't fall into the above category you want it to fall into
  - You're not really sure which buzzword to use for, so you need a 'safe' word that no one can call you on

Summary: "real-time" may mean days or weeks, depending on when the data is received.

(from https://www.youtube.com/watch?v=O3gd6elZOlA)

Further reading and notes:

1.

We can distinguish two learning modes: offline learning and online learning. In offline learning, the whole training data must be available at the time of model training. Only when training is completed can the model be used for predicting. In contrast, online algorithms process data sequentially. They produce a model and put it in operation without having the complete training dataset available at the beginning. The model is continuously updated during operation as more training data arrives.

Less restrictive than online algorithms are incremental algorithms that process input examples one by one (or batch by batch) and update the decision model after receiving each example. Incremental algorithms may have random access to previous examples or representative/selected examples. In such a case, these algorithms are called incremental algorithms with partial memory. Typically, in incremental algorithms, for any new presentation of data, the update operation of the model is based on the previous one. Streaming algorithms are online algorithms for processing high-speed continuous flows of data. In streaming, examples are processed sequentially as well and can be examined in only a few passes (typically just one). These algorithms use limited memory and limited processing time per item.

(from Gama, João, et al. "A survey on concept drift adaptation." ACM Computing Surveys (CSUR) 46.4 (2014): 44.)
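
The "partial memory" idea in the excerpt above can be sketched as a model that keeps only the most recent examples. This is one simple way to bound memory, not the survey's specific algorithm; `WindowedMean` and `window_size` are hypothetical names.

```python
from collections import deque
from typing import Deque


class WindowedMean:
    """Incremental model with partial memory: remembers the last few examples."""

    def __init__(self, window_size: int) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self.window: Deque[float] = deque(maxlen=window_size)

    def update(self, x: float) -> float:
        self.window.append(x)
        return sum(self.window) / len(self.window)


stream = [1.0, 2.0, 3.0, 10.0, 10.0, 10.0]  # the level changes mid-stream

model = WindowedMean(window_size=3)
# Single pass over the stream, constant memory per item.
estimates = [model.update(x) for x in stream]
print(estimates)
```

Because old examples fall out of the window, the estimate fully reflects the new level three records after the change, at the cost of forgetting everything earlier.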

2.

https://www.zhihu.com/question/38713098

3.

https://blog.csdn.net/zyazky/article/details/51942135

Original article: https://www.cnblogs.com/tristatl/p/13098013.html
