MapReduce Patterns - Counting with Counters - yolaiyoqu - ChinaUnix Blog


MapReduce Patterns - Counting with Counters

Category: Hadoop

2013-03-28 14:28:08

Pattern Name	Counting with Counters
Category	Summarization Patterns
Description	This pattern uses the MapReduce framework's counters utility to calculate a global sum entirely on the map side, without producing any output.
Intent	An efficient means of retrieving count summarizations of large data sets.
Motivation	This pattern describes how to use custom counters to gather count or summation metrics from your data sets. The major benefit of using counters is that all the counting can be done during the map phase.
Applicability	Counting with counters should be used when: • You want to gather counts or summations over large data sets. • The number of counters you are going to create is small (in the double digits).
Structure	• The Mapper processes one input record at a time and increments counters based on certain criteria. These counters are aggregated by the TaskTrackers running the tasks and incrementally reported to the JobTracker, which performs the overall aggregation when the job succeeds. Counters from any failed tasks are disregarded by the JobTracker in the final summation. • As this job is map-only, no combiner, partitioner, or reducer is required.
Consequences	The final output is a set of counters grabbed from the job framework. There is no actual output from the analytic itself. However, the job requires an output directory to execute. This directory will exist and contain a number of empty part files equivalent to the number of map tasks. This directory should be deleted on job completion.
Known uses	• Counting the number of records • Counting a small number of unique instances • Summations
Resemblances
Performance analysis	Using counters is very fast, as data is simply read in through the mapper and no output is written. Performance depends largely on the number of map tasks being executed and how much time it takes to process each record.
Examples	Number of users per state
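
The "number of users per state" example can be sketched in plain Java to show the mechanics described under Structure. This is a simulation under stated assumptions, not Hadoop code: the class and method names (CountingWithCountersSketch, TaskResult, runMapTask, aggregate) and the "user,state" record layout are hypothetical, and the aggregation step only stands in for what the JobTracker does. In a real job the mapper would typically call context.getCounter(group, name).increment(1) and write no output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the framework; none of these names are Hadoop APIs.
final class CountingWithCountersSketch {

    /** Counters collected by one simulated map task, plus its outcome. */
    static final class TaskResult {
        final Map<String, Long> counters;
        final boolean succeeded;

        TaskResult(Map<String, Long> counters, boolean succeeded) {
            this.counters = counters;
            this.succeeded = succeeded;
        }
    }

    /**
     * Simulates a map task: reads one "user,state" record at a time and
     * increments a counter named after the state. Nothing is emitted.
     */
    static TaskResult runMapTask(List<String> records, boolean succeeded) {
        Map<String, Long> counters = new TreeMap<>();
        for (String record : records) {
            String[] fields = record.split(",");
            String state = fields.length == 2 ? fields[1].trim() : "";
            // Records without a usable state go to a single catch-all counter,
            // which keeps the total number of counters small.
            String counterName = state.isEmpty() ? "UNKNOWN" : state;
            counters.merge(counterName, 1L, Long::sum);
        }
        return new TaskResult(counters, succeeded);
    }

    /** Sums counters across tasks, disregarding tasks that failed. */
    static Map<String, Long> aggregate(List<TaskResult> results) {
        Map<String, Long> totals = new TreeMap<>();
        for (TaskResult result : results) {
            if (!result.succeeded) {
                continue; // counters from failed tasks are not counted
            }
            result.counters.forEach((name, value) -> totals.merge(name, value, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> split1 = Arrays.asList("alice,CA", "bob,NY", "carol,CA");
        List<String> split2 = Arrays.asList("dave,TX", "erin,NY");

        List<TaskResult> results = new ArrayList<>();
        results.add(runMapTask(split1, true));
        results.add(runMapTask(split2, false)); // failed attempt, ignored
        results.add(runMapTask(split2, true));  // successful retry of the same split

        System.out.println(aggregate(results)); // prints {CA=2, NY=2, TX=1}
    }
}
```

Because the failed attempt over the second split is skipped, its records are counted only once, by the retry. Keying counters by state also keeps the counter count bounded, in line with the Applicability row.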
