mecab

Can I use MeCab on IBM Data Science Experience

僤鯓⒐⒋嵵緔 submitted on 2019-12-25 00:37:09
Question: I want to use MeCab on IBM Data Science Experience. https://pypi.python.org/pypi/mecab-python3 Is it possible? Answer 1: I'm afraid not, or at least not easily. That Python package requires the native MeCab libraries, which are not installed in the environment where DSX notebooks run. Users also do not have permission to install them with a package manager (yum). If you're willing to spend the effort, you can try to put the libraries from a mecab rpm into the user file system. Then extend the

Subprocess, repeatedly write to STDIN while reading from STDOUT (Windows)

喜夏-厌秋 submitted on 2019-12-21 05:48:05
Question: I want to call an external process from Python. The process I'm calling reads an input string, gives a tokenized result, and waits for another input (the binary is the MeCab tokenizer, if that helps). I need to tokenize thousands of lines by calling this process. The problem is that Popen.communicate() works but waits for the process to die before giving out the STDOUT result. I don't want to keep closing and opening new subprocesses thousands of times. (And I don't want to send the whole text,

Trying to get libmecab.dll (MeCab) to work with C#

若如初见. submitted on 2019-12-08 02:41:28
Question: I'm trying to use the Japanese morphological analyzer MeCab in a C# program (Visual Studio 2010 Express, Windows 7), and something's going wrong with the encoding. If my input (pasted into a textbox) is this: 一方、広義の「ネコ」は、ネコ類(ネコ科動物)の一部、あるいはその全ての獣を指す包括的名称を指す。 Then my output (in another textbox) looks like this: ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž
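The garbled `åè©ž` in the output above is consistent with the UTF-8 bytes of 名詞 ("noun", a common MeCab part-of-speech label) being decoded as Windows-1252, with the two bytes that have no Windows-1252 mapping dropped. This is only one plausible reading of the excerpt, not a confirmed diagnosis of the asker's program. The Python sketch below reproduces the effect. The helper names `misdecode` and `decode_utf8` are hypothetical and are not part of MeCab or its bindings.

```python
def misdecode(text, wrong_codec="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec.

    Bytes that the wrong codec cannot map are silently dropped, which is
    an assumption about how the garbled output in the question arose.
    """
    return text.encode("utf-8").decode(wrong_codec, errors="ignore")


def decode_utf8(raw):
    """Decode raw bytes as UTF-8, the encoding assumed for the dictionary."""
    return raw.decode("utf-8")


if __name__ == "__main__":
    raw = "名詞".encode("utf-8")      # bytes E5 90 8D E8 A9 9E
    print(misdecode("名詞"))          # the mojibake form
    print(decode_utf8(raw))           # the intended text
```

If this reading is right, the remedy in any language is the same idea: treat the analyzer's output as raw bytes and decode them explicitly as UTF-8 (or whatever encoding the installed dictionary actually uses) instead of relying on the platform's default code page.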

Trying to get libmecab.dll (MeCab) to work with C#

丶灬走出姿态 submitted on 2019-12-06 15:48:17
I'm trying to use the Japanese morphological analyzer MeCab in a C# program (Visual Studio 2010 Express, Windows 7), and something's going wrong with the encoding. If my input (pasted into a textbox) is this: 一方、広義の「ネコ」は、ネコ類(ネコ科動物)の一部、あるいはその全ての獣を指す包括的名称を指す。 Then my output (in another textbox) looks like this: ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続,*,*,*,*,* ? åè©ž,サ変接続

Subprocess, repeatedly write to STDIN while reading from STDOUT (Windows)

感情迁移 submitted on 2019-12-03 17:06:39
I want to call an external process from Python. The process I'm calling reads an input string, gives a tokenized result, and waits for another input (the binary is the MeCab tokenizer, if that helps). I need to tokenize thousands of lines by calling this process. The problem is that Popen.communicate() works but waits for the process to die before giving out the STDOUT result. I don't want to keep closing and opening new subprocesses thousands of times. (And I don't want to send the whole text; it may easily grow to tens of thousands of long lines in the future.) from subprocess import PIPE,
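One common way to approach the situation described above is to keep a single child process alive, write one line to its stdin, flush, and read the reply from its stdout before sending the next line. The sketch below is a minimal illustration under stated assumptions, not the asker's code and not a MeCab-specific solution: the child is a stand-in Python one-liner that answers every input line with exactly one output line, and the names `CHILD`, `start_worker`, `ask`, and `stop_worker` are hypothetical.

```python
import subprocess
import sys

# Stand-in for the tokenizer (an assumption, NOT MeCab): a child Python
# process that answers each input line with its whitespace tokens joined
# by "|", flushing after every reply.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('|'.join(line.split()), flush=True)\n"
)


def start_worker():
    """Start one long-lived child with pipes on stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on the parent side
    )


def ask(proc, line):
    """Send one line and block until the child returns one line."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()  # without the flush the child may never see the line
    return proc.stdout.readline().rstrip("\n")


def stop_worker(proc):
    """Close stdin so the child sees EOF, then wait for it to exit."""
    proc.stdin.close()
    proc.wait()
    proc.stdout.close()


if __name__ == "__main__":
    worker = start_worker()
    for text in ["a b c", "hello world"]:
        print(ask(worker, text))
    stop_worker(worker)
```

A real tokenizer may return several lines per input. MeCab's default output format, for example, ends each sentence with an `EOS` line, so `ask` would need to keep calling `readline()` until it sees that terminator. The child must also flush its output after each reply; if it buffers when writing to a pipe, the parent's `readline()` will block no matter how the parent is written.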