I have a toy setup sending log4j messages to HDFS using Flume. I'm not able to configure the HDFS sink to avoid many small files. I thought I could configure the hdfs sink to
This may be happening because of the memory channel and its capacity. My guess is that it's dumping data to HDFS as soon as the channel fills up. Did you try using a file channel instead of a memory channel?
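For reference, here is a minimal sketch of what that could look like in the agent's properties file. The agent and component names (a1, c1, k1), the local directories, and the HDFS path are all made up for illustration, and the source definition is left out. I've also included the sink's roll settings, since as far as I know they default to rolling a new file every 30 seconds, 1024 bytes, or 10 events, which on its own can produce many small files. The values below are only examples, not tuned recommendations.

```properties
# Hypothetical names and paths throughout; adjust to your own setup.
a1.channels = c1
a1.sinks = k1

# File channel in place of the memory channel
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# HDFS sink reading from that channel
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/logs

# Roll a new file every 10 minutes or at about 128 MB, whichever comes first
a1.sinks.k1.hdfs.rollInterval = 600
a1.sinks.k1.hdfs.rollSize = 134217728
# 0 turns off rolling by event count
a1.sinks.k1.hdfs.rollCount = 0
```

If the small files persist after changing the channel, the roll settings are probably the next thing to look at.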