MapReduce patterns
Having modified and run a job in the last post, we can now look at the patterns we encounter most often in MapReduce programming.
Although there are many of them, I think the most important ones are:
- Summarization
- Filtering
- Structural
Let's examine them in detail.

Summarization
By summarization we mean all the jobs that compute an aggregate over a set of data, usually grouped by key, like:
- indexing
- mean (or other statistical functions) computation
- min/max computation
- count (we've seen the WordCount example)
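
To make the idea concrete, here is a minimal sketch of a summarization job in plain Python. It is not Hadoop code: the `run_mapreduce` helper only simulates the map, shuffle and reduce phases in memory, and the input records (user, response time in milliseconds) are made up for illustration. The mapper emits a partial summary per record, and the reducer merges the partials into min, max, mean and count for each key.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate map, shuffle (group by key) and reduce in memory."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


# Hypothetical input: (user, response_time_ms) pairs.
records = [("alice", 120), ("bob", 300), ("alice", 80), ("bob", 100), ("alice", 100)]


def mapper(record):
    user, millis = record
    # Emit a partial summary (min, max, sum, count) so partials can be merged.
    yield user, (millis, millis, millis, 1)


def reducer(key, values):
    lowest = min(v[0] for v in values)
    highest = max(v[1] for v in values)
    total = sum(v[2] for v in values)
    count = sum(v[3] for v in values)
    return {"min": lowest, "max": highest, "mean": total / count, "count": count}


summary = run_mapreduce(records, mapper, reducer)
for user, stats in summary.items():
    print(user, stats)
```

Emitting (min, max, sum, count) instead of the raw value keeps the mean correct even if partial results are merged before the final reduce, since the mean is computed only at the end from the total sum and count.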
Filtering
Filtering is the act of retrieving only a subset of a bigger dataset. The most common use cases are retrieving all the data belonging to a single user, or the top-N elements (by some criterion) of the dataset. Another frequent use of filtering is sampling: when we're dealing with a lot of data, it is usually a good idea to take a random subset of the original data and use it to verify the behaviour of our job.
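
The three filtering cases above can be sketched in plain Python as follows. Again this is only an in-memory illustration, not Hadoop code: the records, the function names and the way the input is divided into two splits are all assumptions made for the example. The top-N function mimics the usual approach where each map task keeps its local top N and a single reduce step merges them.

```python
import heapq
import random

# Hypothetical input: (user, score) records.
records = [("alice", 7), ("bob", 3), ("alice", 9), ("carol", 5), ("bob", 8), ("carol", 1)]


def filter_by_user(records, user):
    """Map-only filter: keep a record only if it belongs to `user`."""
    return [r for r in records if r[0] == user]


def top_n(splits, n):
    """Each map task keeps its local top n; one reduce step merges them."""
    local_tops = [heapq.nlargest(n, split, key=lambda r: r[1]) for split in splits]
    merged = [r for top in local_tops for r in top]
    return heapq.nlargest(n, merged, key=lambda r: r[1])


def sample(records, fraction, seed=0):
    """Map-only random sample: keep each record with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [r for r in records if rng.random() < fraction]


# Pretend the input was divided into two splits, one per map task.
splits = [records[:3], records[3:]]

alice_only = filter_by_user(records, "alice")
best_two = top_n(splits, 2)
subset = sample(records, 0.5)

print(alice_only)
print(best_two)
print(subset)
```

Filtering by user and sampling need no reduce phase at all, because each record can be kept or dropped on its own. Top-N does need one, but every map task forwards at most N records, so very little data has to travel to the reducer.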

Structural
Structural patterns apply when you need to operate on the structure of the data; the most common case is a join across different datasets, like the ones we're used to in an RDBMS.
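
As an illustration, here is a minimal sketch of a reduce-side join, one common way of joining two datasets in MapReduce. It is plain Python that simulates the phases in memory, and the two datasets (users and orders sharing a user id) are hypothetical. Each mapper tags its values with the dataset they come from, so that the reducer can tell them apart after the shuffle has grouped everything by key.

```python
from collections import defaultdict

# Hypothetical datasets sharing a user_id key.
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (2, "pen"), (1, "lamp")]


def map_users(record):
    user_id, name = record
    yield user_id, ("U", name)  # the tag identifies the source dataset


def map_orders(record):
    user_id, item = record
    yield user_id, ("O", item)


def reduce_join(user_id, tagged_values):
    names = [value for tag, value in tagged_values if tag == "U"]
    items = [value for tag, value in tagged_values if tag == "O"]
    # Inner join: one output row per (name, item) pair for this key.
    return [(user_id, name, item) for name in names for item in items]


def reduce_side_join(users, orders):
    # Shuffle: group the tagged values of both datasets by join key.
    groups = defaultdict(list)
    for dataset, mapper in ((users, map_users), (orders, map_orders)):
        for record in dataset:
            for key, value in mapper(record):
                groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined


joined = reduce_side_join(users, orders)
for row in joined:
    print(row)
```

This behaves like an inner join: carol has no orders, so she does not appear in the result. Emitting a row for users with an empty `items` list would turn it into a left outer join.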

In the next posts, we'll see in more detail how to deal with these patterns.

from: http://andreaiacono.blogspot.com/2014/03/mapreduce-patterns.html

Original post address: https://www.cnblogs.com/GarfieldEr007/p/5281211.html
