Run StormCrawler in local mode or install Apache Storm?

Posted by 最后都变了 on 2019-12-08 09:13:06

Question


So I'm trying to figure out how to install and set up Storm/StormCrawler with ES and Kibana, as described here.

I never installed Storm on my local machine, because I've worked with Nutch before and never had to install Hadoop locally... I thought it might be the same with Storm (maybe not?).

I'd like to start crawling with StormCrawler instead of Nutch now.

It seems that if I just download a release and add its /bin directory to my PATH, I can only talk to a remote cluster.

It seems like I need to set up a development environment according to this, which would let me develop different topologies over time and then talk to the remote cluster from my local machine when I'm ready to deploy them. Is that right?

So it seems like all I need to do is add Storm as a dependency to my StormCrawler project when I build it with Maven?
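For reference, a project generated from the StormCrawler archetype already declares both dependencies in its pom.xml, roughly like the sketch below. The version numbers here are placeholders, not authoritative; match them to the Storm and StormCrawler releases you actually use:

```xml
<!-- Sketch of the relevant pom.xml section; versions are placeholders -->
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-client</artifactId>
    <version>${storm.version}</version>
    <!-- "provided" because a real cluster supplies Storm at runtime;
         local-mode runs pull it onto the classpath via Maven anyway -->
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>com.digitalpebble.stormcrawler</groupId>
    <artifactId>storm-crawler-core</artifactId>
    <version>${stormcrawler.version}</version>
</dependency>
```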


Answer 1:


See Getting Started page and the tutorials on Youtube.

You don't need to install Storm, as you can run the topology in local mode, just as you'd do with Nutch and Hadoop. Just generate a topology from the archetype, modify it to your needs (e.g. add the ES components) and run it with -local. See the README generated by the archetype.
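As a rough sketch of that workflow (the archetype coordinates and the exact run command are assumptions from memory; the README that the archetype generates is the authoritative source, and the run step only needs the Storm release's CLI on your PATH, not a running cluster):

```shell
# Generate a project skeleton from the StormCrawler archetype
mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler \
                       -DarchetypeArtifactId=storm-crawler-archetype

# Build it (use whatever artifactId you chose during generation)
cd mycrawler
mvn clean package

# Run the topology in local mode -- no cluster required; check the
# generated README for the exact jar name, Flux file, and flags
storm jar target/mycrawler-1.0-SNAPSHOT.jar \
      org.apache.storm.flux.Flux --local crawler.flux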

Later on, you'd install Storm to benefit from the UI and possibly run it on multiple nodes, but as a starting point, running locally is a good way of exploring the capabilities of StormCrawler.



Source: https://stackoverflow.com/questions/51994601/run-stormcrawler-in-local-mode-or-install-apache-storm
