I need to write data into Hadoop (HDFS) from external sources such as a Windows box. Right now I have been copying the data onto the namenode and using HDFS's put command to
There is a Java API, which you can use by including the Hadoop code in your project. The JavaDoc is quite helpful in general, but of course you have to know what you are looking for *g*: http://hadoop.apache.org/common/docs/
For your particular problem, have a look at: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html (this applies to the latest release; consult the JavaDocs for other versions!)
A typical call would be:
FileSystem.get(new JobConf()).create(new Path("however.file"));
This returns a stream (an FSDataOutputStream) that you can handle with regular Java I/O.
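To illustrate the "regular Java I/O" part, here is a minimal sketch of a copy loop that works against any OutputStream. The class name StreamCopy and the copy helper are made up for this example and are not part of Hadoop. To keep it runnable without a cluster it writes to an in-memory stream; with Hadoop on the classpath you would presumably pass the stream returned by create() instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopy {
    // Hypothetical helper: copies everything from 'in' to 'out' in 8 KB chunks
    // and returns the number of bytes written.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in a real job, 'out' would be the stream returned by
        // FileSystem.get(conf).create(new Path("however.file")), and 'in'
        // would be a FileInputStream for the local file.
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

Since the helper only depends on the OutputStream interface, the same loop should work unchanged whether the target is a local file, a byte array, or an HDFS stream.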
Install Cygwin and install Hadoop locally (you only need the binaries and the configs that point at your NameNode; there is no need to actually run the services), then run: hadoop fs -copyFromLocal /path/to/localfile /hdfs/path/
You can also use the new Cloudera desktop to upload a file via the web UI, though that might not be a good option for giant files.
There's also a WebDAV overlay for HDFS but I don't know how stable/reliable that is.