Apache Kafka - suggestions for a Hadoop project
I am thinking about building a big-data project. Ideally the flow would be:
take a .csv file, feed it into Flume and on to Kafka, perform an ETL step, put the result back into Kafka, then have Flume pull from Kafka and write into HDFS. Once the data is in HDFS, run MapReduce jobs or Hive queries and chart whatever I want.
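For the second hop (Kafka back into HDFS), I picture a second Flume agent with a Kafka source and an HDFS sink, roughly like this (an untested sketch; the topic name, ZooKeeper address, and HDFS path are just placeholders):

a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Kafka source: reads events from the ETL output topic (placeholder names)
a2.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a2.sources.r1.zookeeperConnect = localhost:2181
a2.sources.r1.topic = mytopic-etl
a2.sources.r1.groupId = flume
a2.sources.r1.batchSize = 100

# HDFS sink: writes events as plain text files into a landing directory
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://localhost:8020/user/xyz/landing
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.hdfs.writeFormat = Text

# Memory channel connecting source and sink
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1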
For now, though: how can I put a .csv file into Flume and save it to Kafka? I have a piece of code, but I am not sure whether it works:
myagent.sources = r1
myagent.sinks = k1
myagent.channels = c1

# Spooling-directory source: watches /home/xyz/source for new files
myagent.sources.r1.type = spooldir
myagent.sources.r1.spoolDir = /home/xyz/source
myagent.sources.r1.fileHeader = true

# Kafka sink
myagent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink

# Memory channel connecting source and sink
myagent.channels.c1.type = memory
myagent.channels.c1.capacity = 1000
myagent.channels.c1.transactionCapacity = 100

myagent.sources.r1.channels = c1
myagent.sinks.k1.channel = c1
Any ideas or suggestions? And if this piece of code is correct, how do I move on?
Thanks everyone!!
Your sink configuration is incomplete: at a minimum the Kafka sink needs a topic and a broker list. Try:
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = mytopic
a1.sinks.k1.brokerList = localhost:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1
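One more thing: the property prefix must match the agent name, so if your agent is called myagent, write myagent.sinks.k1.topic and so on rather than a1.sinks.k1.topic. Once that is in place (assuming a broker on localhost:9092 and the topic already created), you can start the agent with something like:

flume-ng agent --conf ./conf --conf-file myagent.conf --name myagent -Dflume.root.logger=INFO,console

Files dropped into /home/xyz/source should then show up as events on mytopic, which you can verify with the kafka-console-consumer tool that ships with Kafka.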