Trouble importing boilerpipe in Python

心不动则不痛 submitted on 2019-12-06 04:54:35

This worked for me on Mac OS X 10.8.5 with Python 2.7.9:

pip install JPype1    # to install https://pypi.python.org/pypi/JPype1
pip install charade
git clone https://github.com/misja/python-boilerpipe.git
cd python-boilerpipe
sudo python setup.py install

Then you should be able to run the following in the Python console:

>>> from boilerpipe.extract import Extractor
>>> extractor = Extractor(extractor='ArticleExtractor', url="http://en.wikipedia.org/wiki/Main_Page")
>>> print extractor.getText()

You are missing the boilerpipe Java packages. You can find them here: http://code.google.com/p/boilerpipe/downloads/list

You have only installed the Python boilerpipe wrapper.

The following worked best for me:

git clone https://github.com/misja/python-boilerpipe.git
cd python-boilerpipe
sudo python setup.py install

You may have to:

  • install JPype (sudo apt-get install python-jpype on Ubuntu)
  • install charade (sudo pip install charade)

But you won't have to install the boilerpipe Java JARs yourself, since setup.py loads them for you.
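
To see which of these Python dependencies is still missing, you can simply try importing them. The following is only a minimal sketch and not part of python-boilerpipe: the helper name missing_modules is hypothetical, and it assumes JPype and charade are importable under the module names jpype and charade.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed (or not on sys.path).
            missing.append(name)
    return missing


if __name__ == "__main__":
    # jpype and charade are the import names of the JPype1 and charade packages.
    not_found = missing_modules(["jpype", "charade"])
    if not_found:
        print("Missing modules: " + ", ".join(not_found))
    else:
        print("JPype and charade are importable.")
```

If either name is reported as missing, install it as described above before running setup.py again.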

I tried installing the Python boilerpipe package with pip, but had no luck. I was able to run the boilerpipe Java code successfully, but kept getting this same error.

The class HTMLHighlighter wasn't found. Did you set your JAVA_HOME? The documentation states:

Be sure to have set JAVA_HOME properly since jpype depends on this setting.
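
One way to check this from Python before importing boilerpipe is sketched below. It is an illustration under stated assumptions rather than anything from the boilerpipe documentation: the helper name check_java_home is hypothetical, and it only verifies that JAVA_HOME is set and points to an existing directory, not that the directory contains a working Java installation.

```python
import os


def check_java_home(environ=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable."""
    if environ is None:
        environ = os.environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        return False, "JAVA_HOME does not point to a directory: " + java_home
    return True, "JAVA_HOME is set to " + java_home


if __name__ == "__main__":
    ok, message = check_java_home()
    print(message)
```

If the check fails, set JAVA_HOME in the shell that starts Python and try the import again.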

I had the same issue. I saw the setup details provided by the author of Mining the web. Here is the link to his GitHub page for boilerpipe:

https://github.com/misja/python-boilerpipe/blob/master/setup.py
