hadoop - Filtering log files in Flume using interceptors


I have an HTTP server writing log files, which I load into HDFS using Flume. First, I want to filter the data according to what is in the event header or body. I read that this can be done using an interceptor with a regex. Can someone explain what I need to do? Do I need to write Java code that overrides Flume's code?

I also want to take the data and, according to a header, send it to different sinks (i.e. source=1 goes to sink1 and source=2 goes to sink2). How is that done?

Thank you,

Shimon

You don't need to write any Java code to filter events. Use the regex filtering interceptor to filter out events whose body text matches a regular expression:

agent.sources.logs_source.interceptors = regex_filter_interceptor
agent.sources.logs_source.interceptors.regex_filter_interceptor.type = regex_filter
agent.sources.logs_source.interceptors.regex_filter_interceptor.regex = <your regex>
agent.sources.logs_source.interceptors.regex_filter_interceptor.excludeEvents = true
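
As a minimal sketch, assuming you want to drop all DEBUG lines (the regex here is purely illustrative, not from the original question):

# drop every event whose body matches .*DEBUG.* (illustrative pattern)
agent.sources.logs_source.interceptors = regex_filter_interceptor
agent.sources.logs_source.interceptors.regex_filter_interceptor.type = regex_filter
agent.sources.logs_source.interceptors.regex_filter_interceptor.regex = .*DEBUG.*
agent.sources.logs_source.interceptors.regex_filter_interceptor.excludeEvents = true

With excludeEvents = true, matching events are dropped; set it to false to keep only the matching events instead.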

To route events based on headers, use a multiplexing channel selector:

a1.sources = r1
a1.channels = c1 c2 c3 c4
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = state
a1.sources.r1.selector.mapping.cz = c1
a1.sources.r1.selector.mapping.us = c2 c3
a1.sources.r1.selector.default = c4

Here, events with header "state"="cz" go to channel "c1", events with "state"="us" go to both "c2" and "c3", and all other events go to "c4".
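
Applied to your case (source=1 to sink1, source=2 to sink2), here is a minimal sketch of the full wiring; the "source" header name, the HTTP source, the port, and all agent/channel/sink names are assumptions for illustration, and logger sinks stand in for whatever sinks you actually use:

a1.sources = r1
a1.channels = c1 c2
a1.sinks = sink1 sink2

# an HTTP source as an example event source (assumed port)
a1.sources.r1.type = http
a1.sources.r1.port = 8080
a1.sources.r1.channels = c1 c2

# route on the value of the "source" header
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = source
a1.sources.r1.selector.mapping.1 = c1
a1.sources.r1.selector.mapping.2 = c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# each sink drains its own channel
a1.sinks.sink1.type = logger
a1.sinks.sink1.channel = c1
a1.sinks.sink2.type = logger
a1.sinks.sink2.channel = c2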

This way you can also filter events by header: route the header value you want to discard to a channel whose sink is a null sink.
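
Flume ships with a null sink that simply discards everything it receives, so the unwanted channel can be drained into it. A sketch, assuming the default channel "c4" from the example above holds the events to discard:

a1.sinks.null_sink.type = null
a1.sinks.null_sink.channel = c4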

