Some tips for configuring Flume properties

The Tips

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. I have installed many Flume systems to collect streaming log data, and I ran into some problems while using them. I wrote this post to record those problems and their solutions so that others can avoid them.

The running environment

  • CDH version 5.8.0+
  • Flume 1.6.0+
  • Java 1.7.0+
  • Linux 2.6.32-573.el6.x86_64
  • CentOS 6.6+

Tip one: file rotation does not take effect

With the following configuration, Flume uploads and rotates files to Hadoop ...

How to limit the virtual memory usage of the Flume process in CentOS 6.x

The Problem

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. I have installed many Flume systems to collect streaming log data. But I found one problem: the Flume process was occupying a large amount of virtual memory and comparatively little physical memory. Why does this happen? This post explains the cause and provides ways to solve the problem.
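
As a rough way to see the gap between virtual and physical memory for a process, one can read the VmSize and VmRSS fields that Linux exposes in /proc/<pid>/status. The sketch below is only an illustration: the helper names parse_vm_fields and read_vm_fields are hypothetical, and the sample numbers are made up for this example rather than taken from a real Flume process.

```python
import re


def parse_vm_fields(status_text):
    """Extract VmSize (virtual) and VmRSS (resident) in kB from status text."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_vm_fields(pid):
    """Read /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vm_fields(handle.read())


# Made-up sample text in the format of /proc/<pid>/status, for illustration.
SAMPLE = "Name:\tjava\nVmSize:\t 8123456 kB\nVmRSS:\t  234567 kB\n"

if __name__ == "__main__":
    usage = parse_vm_fields(SAMPLE)
    print("virtual kB:", usage["VmSize"], "resident kB:", usage["VmRSS"])
```

On a real host, read_vm_fields would be called with the PID shown in the first column of the top output.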

The running environment

  • CDH version 5.8.0+
  • Flume 1.6.0+
  • Java 1.7.0+
  • Linux 2.6.32-573.el6.x86_64
  • CentOS 6.6+

The top command result for the Flume process

26963 root ...

Flume must use the Hadoop native libraries when uploading gz files

The Problem

Recently, I had a requirement in my project to upload real-time log records into a Hadoop cluster. I chose the open source software Flume. After installing Flume, the log records were transferred to the Hadoop cluster successfully as files with a gz suffix. But I found that the gz file was larger than the decompressed one.

-rw-r--r-- 1 root   root        942 Dec 27 17:28 ngaancache-access.log.2016122321.1482498035352
-rw-r--r-- 1 root   root       6571 Dec 27 17:32 ngaancache-access.log.2016122321.1482498035352.gz

When I used the gzip command to decompress this file, the warning "trailing garbage ignored" was reported, as follows:

#gzip ...
