Some tips for configuring Flume properties

The Tips

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. I have installed many Flume systems to collect streaming log data, but I ran into some problems while using Flume. I am writing this post to record the problems and their solutions so that others can avoid them.

The running environment

  • CDH version 5.8.0+
  • Flume 1.6.0+
  • Java 1.7.0+
  • Linux 2.6.32-573.el6.x86_64
  • CentOS 6.6+

Tip one: file rotation does not work

With the following configuration, Flume uploads files to Hadoop and rotates them ...


How to upgrade the Tomcat version used by the CDH HttpFS service

The Problem

Tomcat released a patch that fixes CVE-2016-8745. In my CDH cluster, the HttpFS service provides web HTTP access and runs on Tomcat 6.0.44. We must upgrade Tomcat from 6.0.44 to 6.0.50 or later to avoid security attacks.

The CDH environment

  • CDH version 5.7.0+
  • Java 1.7.0+
  • Linux 2.6.32-573.el6.x86_64
  • CentOS 6.6+

The upgrade steps

Download the newest version of Tomcat

Tomcat 6.0.53 is available for download. Extract the gz package:

tar xvfz apache-tomcat-6.0 ...


How to limit the virtual memory usage of the Flume process in CentOS 6.x

The Problem

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. I have installed many Flume systems to collect streaming log data, but I found that the Flume process was occupying a large amount of virtual memory and far less physical memory. Why does this happen? This post explains the cause and provides methods to solve the problem.
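One way to see the gap between virtual and physical memory for a single process is to read the VmSize and VmRSS fields from /proc/<pid>/status. The sketch below is only an illustration and is not taken from the original setup: the function names are hypothetical, and it assumes a Linux system with the /proc filesystem mounted.

```python
import re


def parse_vm_fields(status_text):
    """Extract VmSize (virtual) and VmRSS (resident) in kB from the
    text of /proc/<pid>/status. Fields that are absent are omitted."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_vm_fields(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vm_fields(handle.read())
```

Calling read_vm_fields with the PID of the Flume process returns both values in kB, which roughly correspond to the VIRT and RES columns reported by top.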

The running environment

  • CDH version 5.8.0+
  • Flume 1.6.0+
  • Java 1.7.0+
  • Linux 2.6.32-573.el6.x86_64
  • CentOS 6.6+

The top command output for the Flume process

26963 root ...


How to configure and query the Impala SQL interface of CDH with the Kerberos mechanism

The Problem

Recently, I spent several days researching the Impala SQL interface with security mechanisms. There are two methods to query Impala data: one is the Kerberos mechanism, and the other is the LDAP method, which authenticates with a user name and password. The first is difficult to set up and is usually used internally within the Hadoop cluster, so I chose the LDAP method for external applications such as the JDBC interface. This post provides the configuration steps and a query demo for accessing Impala databases with LDAP.

The test environment

  • CDH version 5.8.0+
  • Kerberos software
  • LDAP service
  • Linux 2.6.32-573.el6 ...
