Filtering log files in Flume using interceptors
I have an HTTP server writing log files, which I then load into HDFS using Flume. First, I want to filter the data according to data I have in the header or body. I read that I can do this using an interceptor with a regex. Can someone explain what I need to do? Do I need to write Java code that overrides Flume code?
I would also like to take the data and, according to a header, send it to different sinks (i.e. source=1 goes to sink1 and source=2 goes to sink2). How is that done?
Thank you,
Shimon
You don't need to write Java code to filter events. Use the Regex Filtering Interceptor to filter events whose body text matches a regular expression:
    agent.sources.logs_source.interceptors = regex_filter_interceptor
    agent.sources.logs_source.interceptors.regex_filter_interceptor.type = regex_filter
    agent.sources.logs_source.interceptors.regex_filter_interceptor.regex = <your regex>
    agent.sources.logs_source.interceptors.regex_filter_interceptor.excludeEvents = true
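For example, here is a minimal sketch (the source name logs_source is carried over from above; the pattern itself is an assumption) that drops every event whose body contains the word DEBUG:

    # Drop any event whose body matches the pattern (excludeEvents = true)
    agent.sources.logs_source.interceptors = regex_filter_interceptor
    agent.sources.logs_source.interceptors.regex_filter_interceptor.type = regex_filter
    agent.sources.logs_source.interceptors.regex_filter_interceptor.regex = .*DEBUG.*
    agent.sources.logs_source.interceptors.regex_filter_interceptor.excludeEvents = true

Setting excludeEvents = false instead keeps only the events that match the pattern and discards everything else.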
To route events to different sinks based on a header, use a multiplexing channel selector:
    a1.sources = r1
    a1.channels = c1 c2 c3 c4
    a1.sources.r1.selector.type = multiplexing
    a1.sources.r1.selector.header = state
    a1.sources.r1.selector.mapping.cz = c1
    a1.sources.r1.selector.mapping.us = c2 c3
    a1.sources.r1.selector.default = c4
Here, events with the header "state"="cz" go to channel "c1", events with "state"="us" go to both "c2" and "c3", and all other events go to "c4".
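To get the source=1 to sink1, source=2 to sink2 behaviour from your question, attach one sink per channel. A minimal sketch of the sink side, assuming HDFS sinks and illustrative paths:

    # Bind one sink per channel so routed events end up in different places
    a1.sinks = k1 k2
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hdfs.path = /flume/events/cz
    a1.sinks.k2.type = hdfs
    a1.sinks.k2.channel = c2
    a1.sinks.k2.hdfs.path = /flume/events/us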
This way you can also filter events by header: route a specific header value to a channel whose sink is a null sink, which simply discards everything it receives.
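For example, a sketch (the sink names are assumptions) that discards everything landing on the default channel c4 from the selector config above by draining it with Flume's null sink:

    # Drain the default channel with a null sink, discarding all events routed there
    a1.sinks = k1 k2 knull
    a1.sinks.knull.type = null
    a1.sinks.knull.channel = c4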