Reading POS tag models in Android

Submitted by 北慕城南 on 2019-12-10 17:09:25

Question


I have done POS tagging with the OpenNLP POS models in a regular Java application, and now I would like to do the same on Android. I am not sure what Android's requirements or restrictions are, because I cannot read the model (a binary file) and run the POS tagging properly.

I tried reading the .bin file from external storage, and also putting it in an external library, but neither worked. This is my code:

    InputStream modelIn = null;
    POSModel model = null;

    String path = Environment.getExternalStorageDirectory().getPath() + "/TextSumIt/en-pos-maxent.bin";

    modelIn = new BufferedInputStream(new FileInputStream(path));
    model = new POSModel(modelIn);

The error I got:

11-15 06:39:35.072: W/System.err(565): opennlp.tools.util.InvalidFormatException: The profile data stream has an invalid format!
11-15 06:39:35.177: W/System.err(565):  at opennlp.tools.dictionary.serializer.DictionarySerializer.create(DictionarySerializer.java:224)
11-15 06:39:35.177: W/System.err(565):  at opennlp.tools.postag.POSDictionary.create(POSDictionary.java:282)
11-15 06:39:35.182: W/System.err(565):  at opennlp.tools.postag.POSModel$POSDictionarySerializer.create(POSModel.java:48)
11-15 06:39:35.182: W/System.err(565):  at opennlp.tools.postag.POSModel$POSDictionarySerializer.create(POSModel.java:44)
11-15 06:39:35.182: W/System.err(565):  at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:135)
11-15 06:39:35.197: W/System.err(565):  at opennlp.tools.postag.POSModel.<init>(POSModel.java:93)
11-15 06:39:35.197: W/System.err(565):  at com.main.textsumit.SummarizationActivity.postagWords(SummarizationActivity.java:676)
11-15 06:39:35.205: W/System.err(565):  at com.main.textsumit.SummarizationActivity.generateSummary(SummarizationActivity.java:252)
11-15 06:39:35.205: W/System.err(565):  at com.main.textsumit.SummarizationActivity.onCreate(SummarizationActivity.java:127)

What prevents the model from being read properly, and how should I resolve this? Please help.

Thank you.


Answer 1:


For what it's worth, if this is still an issue: I had a similar problem when using the POS model in a different (non-Android) context. In my case the failure appeared to come from extracting the contents of the .bin file, not from the model itself. The problem seems to be local to the tags.tagdict file in the archive (as suggested here: http://sharpnlp.codeplex.com/discussions/263620), so if you don't currently need that file (I didn't for my simple scenarios), try removing it from the archive. Keep the archive itself intact, though, since the model is expected to arrive in zipped form.
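
One way to drop tags.tagdict while keeping the model zipped is to copy the archive entry by entry and skip that one file. The following is only a minimal sketch using java.util.zip from the standard library. The class name ModelArchiveTrimmer and the method copyWithoutEntry are made up for illustration and are not part of OpenNLP, and whether the trimmed model then loads depends on your OpenNLP version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of OpenNLP.
public class ModelArchiveTrimmer {

    /**
     * Copies the zip archive at source to target, skipping the entry whose
     * name equals entryToSkip. Returns the number of entries copied.
     */
    public static int copyWithoutEntry(Path source, Path target, String entryToSkip)
            throws IOException {
        int copied = 0;
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(source));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals(entryToSkip)) {
                    continue; // leave this entry out, e.g. tags.tagdict
                }
                // Write a fresh entry so size and CRC are recomputed.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }
}
```

A call such as ModelArchiveTrimmer.copyWithoutEntry(Paths.get("en-pos-maxent.bin"), Paths.get("en-pos-maxent-trimmed.bin"), "tags.tagdict") would write the trimmed copy next to the original; both paths are placeholders.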




Answer 2:


Try this; it worked for me:

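    // Point SAX's XMLReaderFactory at Android's XmlPull-based driver before loading the model.
    System.setProperty("org.xml.sax.driver", "org.xmlpull.v1.sax2.Driver");
    try {
        // openFd() only works for assets that are stored uncompressed in the APK.
        AssetFileDescriptor fileDescriptor =
                context.getAssets().openFd("en_pos_maxent.bin");
        FileInputStream inputStream = fileDescriptor.createInputStream();
        POSModel posModel = new POSModel(inputStream);
        posTaggerME = new POSTaggerME(posModel);
        inputStream.close();
    } catch (Exception e) {
        // Log the failure (android.util.Log) rather than swallowing it silently.
        Log.e("POSTagger", "Could not load the POS model", e);
    }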


Source: https://stackoverflow.com/questions/13392791/reading-pos-tag-models-in-android
