Chinaunix首页 | 论坛 | 博客
  • 博客访问: 418677
  • 博文数量: 22
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 1712
  • 用 户 组: 普通用户
  • 注册时间: 2013-09-09 10:51
文章分类

全部博文(22)

文章存档

2016年(3)

2015年(6)

2014年(1)

2013年(12)

我的朋友

分类: 大数据

2016-03-14 20:58:22

    本文只会涉及flume和kafka的结合,kafka和storm的结合可以参考其他博客
1. flume安装使用

    下载flume安装包
    解压$ tar -xzvf apache-flume-1.5.2-bin.tar.gz -C /opt/flume
    flume配置文件放在conf文件目录下,执行文件放在bin文件目录下。
    1)配置flume
    进入conf目录将flume-conf.properties.template拷贝一份,并命名为自己需要的名字
    $ cp flume-conf.properties.template flume.conf
     修改flume.conf的内容,我们使用file sink来接收channel中的数据,channel采用memory channel,source采用exec source,配置文件如下:

点击(此处)折叠或打开

  1. agent.sources = seqGenSrc
  2. agent.channels = memoryChannel
  3. agent.sinks = loggerSink
  4. # For each one of the sources, the type is defined
  5. agent.sources.seqGenSrc.type = exec
  6. agent.sources.seqGenSrc.command = tail -F /data/mongodata/mongo.log
  7. #agent.sources.seqGenSrc.bind = 172.168.49.130
  8. # The channel can be defined as follows.
  9. agent.sources.seqGenSrc.channels = memoryChannel
  10. # Each sink's type must be defined
  11. agent.sinks.loggerSink.type = file_roll
  12. agent.sinks.loggerSink.sink.directory = /data/flume
  13. #Specify the channel the sink should use
  14. agent.sinks.loggerSink.channel = memoryChannel
  15. # Each channel's type is defined.
  16. agent.channels.memoryChannel.type = memory
  17. # Other config values specific to each type of channel(sink or source)
  18. # can be defined as well
  19. # In this case, it specifies the capacity of the memory channel
  20. agent.channels.memoryChannel.capacity = 1000
  21. agent.channels.memory4log.transactionCapacity = 100
    2)运行flume agent
    切换到bin目录下,运行一下命令:
    $ ./flume-ng agent --conf ../conf -f ../conf/flume.conf --n agent -Dflume.root.logger=INFO,console
    在/data/flume目录下可以看到生成的日志文件。

2. 结合kafka
    由于flume1.5.2没有kafka sink,所以需要自己开发kafka sink
    可以参考flume 1.6里面的kafka sink,但是要注意使用的kafka版本,由于有些kafka api不兼容的
    这里只提供核心代码,process()内容。

点击(此处)折叠或打开

  1. Sink.Status status = Status.READY;

  2.     Channel ch = getChannel();
  3.     Transaction transaction = null;
  4.     Event event = null;
  5.     String eventTopic = null;
  6.     String eventKey = null;
  7.     
  8.     try {
  9.         transaction = ch.getTransaction();
  10.         transaction.begin();
  11.         messageList.clear();
  12.         
  13.         if (type.equals("sync")) {
  14.             event = ch.take();

  15.      if (event != null) {
  16.          byte[] tempBody = event.getBody();
  17.          String eventBody = new String(tempBody,"UTF-8");
  18.          Map<String, String> headers = event.getHeaders();

  19.          if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
  20.          eventTopic = topic;
  21.          }

  22.          eventKey = headers.get(KEY_HDR);

  23.          if (logger.isDebugEnabled()) {
  24.          logger.debug("{Event} " + eventTopic + " : " + eventKey + " : "
  25.          + eventBody);
  26.          }
  27.         
  28.          ProducerData<String, Message> data = new ProducerData<String, Message>
  29.          (eventTopic, new Message(tempBody));
  30.         
  31.          long startTime = System.nanoTime();
  32.          logger.debug(eventTopic+"++++"+eventBody);
  33.          producer.send(data);
  34.          long endTime = System.nanoTime();
  35.      }
  36.         } else {
  37.             long processedEvents = 0;
  38.             for (; processedEvents < batchSize; processedEvents += 1) {
  39.                 event = ch.take();

  40.          if (event == null) {
  41.          break;
  42.          }

  43.          byte[] tempBody = event.getBody();
  44.          String eventBody = new String(tempBody,"UTF-8");
  45.          Map<String, String> headers = event.getHeaders();

  46.          if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
  47.          eventTopic = topic;
  48.          }

  49.          eventKey = headers.get(KEY_HDR);

  50.          if (logger.isDebugEnabled()) {
  51.          logger.debug("{Event} " + eventTopic + " : " + eventKey + " : "
  52.          + eventBody);
  53.          logger.debug("event #{}", processedEvents);
  54.          }

  55.          // create a message and add to buffer
  56.          ProducerData<String, String> data = new ProducerData<String, String>
  57.          (eventTopic, eventBody);
  58.          messageList.add(data);
  59.             }
  60.             
  61.             // publish batch and commit.
  62.      if (processedEvents > 0) {
  63.      long startTime = System.nanoTime();
  64.      long endTime = System.nanoTime();
  65.      }
  66.         }
  67.         
  68.         transaction.commit();
  69.     } catch (Exception ex) {
  70.         String errorMsg = "Failed to publish events";
  71.         logger.error("Failed to publish events", ex);
  72.         status = Status.BACKOFF;
  73.         if (transaction != null) {
  74.           try {
  75.             transaction.rollback();
  76.           } catch (Exception e) {
  77.             logger.error("Transaction rollback failed", e);
  78.             throw Throwables.propagate(e);
  79.           }
  80.         }
  81.         throw new EventDeliveryException(errorMsg, ex);
  82.       } finally {
  83.         if (transaction != null) {
  84.           transaction.close();
  85.         }
  86.       }
  87.     
  88.     return status;
    下一步,修改flume配置文件,将其中sink部分的配置改成kafka sink,如:
    

点击(此处)折叠或打开

  1. producer.sinks.r.type = org.apache.flume.sink.kafka.KafkaSink
  2. producer.sinks.r.brokerList = bigdata-node00:9092
  3. producer.sinks.r.requiredAcks = 1
  4. producer.sinks.r.batchSize = 100
  5. #producer.sinks.r.kafka.producer.type=async
  6. #producer.sinks.r.kafka.customer.encoding=UTF-8
  7. producer.sinks.r.topic = testFlume1
    type指向kafkasink所在的完整路径
    下面的参数都是kafka的一系列参数,最重要的是brokerList和topic参数

现在重新启动flume,就可以在kafka的对应topic下查看到对应的日志
阅读(22777) | 评论(0) | 转发(1) |
给主人留下些什么吧!~~