Category: Big Data

2013-08-28 09:32:52

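I hadn't touched Flume in a long while. When I opened the official site just now, I found that Flume is no longer the Flume I used to know. I had actually heard the term "NG" in a Flume technical chat group a while back but never paid it much attention. Today I compared the two and found that Flume has indeed been rebuilt from the ground up. First, let me introduce the two terms, Flume OG and Flume NG.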

Flume OG: Flume original generation, i.e. the Flume 0.9.x releases

Flume NG: Flume next generation, i.e. the Flume 1.x releases


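Flume OG can be described as a distributed log collection system. It has the concept of a Master and depends on ZooKeeper. Its architecture is shown below:

[Figure: Flume OG architecture]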


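Flume NG, by contrast, drops the Master and ZooKeeper. The collector is gone, and so is the web configuration console. Only sources, sinks, and channels remain, and an Agent now consists of a source, a channel, and a sink. Flume has gone from being a distributed system to being a data transport tool. Data transfer between machines no longer goes agent -> collector as in OG; instead, the sink of one Agent flows into the source of another Agent. The new architecture is shown below:

[Figure: Flume NG architecture]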
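As a rough illustration of the sink-to-source hop between agents, here is a minimal configuration sketch for two Flume 1.x agents. The agent names (`edge`, `central`), the host `central.example.com`, and the port numbers are hypothetical placeholders chosen for this example; the component types (`netcat`, `memory`, `avro`, `logger`) are standard ones shipped with Flume NG.

```properties
# --- Agent "edge": runs on the machine that produces the data ---
edge.sources = r1
edge.channels = c1
edge.sinks = k1

# Source: reads lines from a local TCP port (hypothetical port)
edge.sources.r1.type = netcat
edge.sources.r1.bind = localhost
edge.sources.r1.port = 44444
edge.sources.r1.channels = c1

# Channel: buffers events between the source and the sink
edge.channels.c1.type = memory

# Sink: forwards events over Avro RPC to the next agent (hypothetical host/port)
edge.sinks.k1.type = avro
edge.sinks.k1.hostname = central.example.com
edge.sinks.k1.port = 4545
edge.sinks.k1.channel = c1

# --- Agent "central": runs on the receiving machine ---
central.sources = r1
central.channels = c1
central.sinks = k1

# Source: Avro source listening where the edge agent's sink sends
central.sources.r1.type = avro
central.sources.r1.bind = 0.0.0.0
central.sources.r1.port = 4545
central.sources.r1.channels = c1

central.channels.c1.type = memory

# Sink: simply logs the events; a real deployment would likely use another sink
central.sinks.k1.type = logger
central.sinks.k1.channel = c1
```

Note that a source lists its channels with the plural `channels` key, while a sink is bound to exactly one channel with the singular `channel` key. Each agent reads only the properties prefixed with its own name, so both halves can live in one file or in separate files on each machine.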



Flume NG is a huge departure from Flume OG (original generation, or "original gangsta," if you prefer) in its implementation although many of the original concepts are the same. If you're already familiar with Flume, here's what you need to know.

  • You still have sources and sinks and they still do the same thing. They are now connected by channels.
  • Channels are pluggable and dictate durability. Flume NG ships with an in-memory channel for fast, but non-durable event delivery and a JDBC-based channel for durable event delivery. We have recently added a file-based durable channel too.
  • There's no more logical or physical nodes. We call all physical nodes agents and agents can run zero or more sources and sinks.
  • There's no master and no ZooKeeper dependency anymore. At this time, Flume runs with a simple file-based configuration system.
  • Just about everything is a plugin, some end user facing, some for tool and system developers. (Specifically, sources, sinks, channels, configuration providers, lifecycle management policies, input and output formats, compression, source and sink channel adapters, and the kitchen sink.)
  • Tons of things are not yet implemented. Please file JIRAs and / or vote for features you deem important.
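The point about channels dictating durability can be sketched in the same file-based configuration: switching between non-durable and durable delivery is largely a matter of changing the channel's `type`. The agent name `a1`, the channel name `c1`, the capacity value, and the directory paths below are hypothetical; this is an illustrative fragment, not a complete agent definition.

```properties
a1.channels = c1

# Option 1: in-memory channel. Fast, but buffered events are lost
# if the agent process stops.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Option 2: file channel. Events are persisted to local disk, so they
# survive an agent restart. Uncomment these lines (and remove Option 1)
# to use it.
# a1.channels.c1.type = file
# a1.channels.c1.checkpointDir = /var/flume/checkpoint
# a1.channels.c1.dataDirs = /var/flume/data

# Option 3: JDBC-based channel, the durable channel mentioned above.
# a1.channels.c1.type = jdbc
```

Sources and sinks attached to `c1` need no changes when the channel type is swapped, which is what makes the channel the natural place to trade speed against durability.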
