Using the storm-hdfs connector to write data into HDFS

Submitted by ☆樱花仙子☆ on 2019-12-12 05:47:53

Question


The source code for the "storm-hdfs" connector, which can be used to write data into HDFS, is on GitHub: https://github.com/ptgoetz/storm-hdfs. It includes a sample topology, "HdfsFileTopology", that writes '|'-delimited data into HDFS: https://github.com/ptgoetz/storm-hdfs/blob/master/src/test/java/org/apache/storm/hdfs/bolt/HdfsFileTopology.java

I have questions about the part of the code:

// Parse the YAML file given as the second command-line argument into a Map
Yaml yaml = new Yaml();
InputStream in = new FileInputStream(args[1]);
Map<String, Object> yamlConf = (Map<String, Object>) yaml.load(in);
in.close();
// Store the parsed map in the topology config under the key "hdfs.config"
config.put("hdfs.config", yamlConf);

HdfsBolt bolt = new HdfsBolt()
        .withConfigKey("hdfs.config")   // tells the bolt which config key holds the HDFS settings
        .withFsUrl(args[0])             // HDFS filesystem URL from the first command-line argument
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy)
        .addRotationAction(new MoveFileAction().toDestination("/dest2/"));

What does this part of the code do, especially the YAML part?


回答1:


I think the code is quite clear. For HdfsBolt to be able to write into HDFS, it needs information about the HDFS cluster itself, and that is what you supply when you create that YAML file.
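For illustration, such a YAML file would hold Hadoop configuration key/value pairs that the bolt passes through to its HDFS client. The specific keys below are standard Hadoop properties chosen as an example, not taken from the topology's own documentation, so treat this as a sketch:

```yaml
# Hypothetical hdfs-config.yaml passed as args[1] to HdfsFileTopology.
# Each entry is a standard Hadoop configuration property (assumed keys).
hdfs.kerberos.principal: ""          # leave empty on an unsecured cluster
dfs.replication: 1                   # replication factor for written files
dfs.client.use.datanode.hostname: true
```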

And to run that topology, you provide the path of that YAML file as a command line argument.

Usage: HdfsFileTopology [topology name] [yaml config file]

The author of the library made a good description here: Storm-HDFS Usage.

If you read the source code, you will find that the contents of the YAML file are used to configure the HDFS client. Presumably the keys are standard Hadoop configuration properties, something like the HDFS defaults, but I can't be sure.

It is probably better to ask the author of the library.



Source: https://stackoverflow.com/questions/26801495/using-the-storm-hdfs-connector-to-write-data-into-hdfs
