Question
I am trying to use OpenNLP in a project I am working on, and I am very new to it. I tried out Named Entity Recognition using the models available at http://opennlp.sourceforge.net/models-1.5/. However, I want to see the training data that was used, i.e. to actually open the .bin file and see its content in English. Can someone please point me in the right direction? I have tried to use UltraISO to read the .bin file, but I was not successful. Please help! Thanks :)
Answer 1:
Use the Unix file command to find the file type, e.g. file en-token.bin. For most OpenNLP .bin files, it will tell you that these are just ZIP files.
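If the file command is not available, the same check can be sketched in Python with the standard zipfile module. This is only a minimal sketch: the helper name list_archive_entries is made up for illustration, and which entries you see depends on the particular model file.

```python
import zipfile


def list_archive_entries(path):
    """Return (name, size_in_bytes) pairs if `path` is a ZIP archive, else None.

    `list_archive_entries` is a hypothetical helper, not part of OpenNLP.
    """
    # is_zipfile checks for the ZIP signature, much like `file` does.
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        # file_size is the uncompressed size of each entry.
        return [(info.filename, info.file_size) for info in archive.infolist()]
```

Calling list_archive_entries("en-token.bin") on a downloaded model should return the entries inside the archive, or None if that particular file turns out not to be a ZIP file.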
Answer 2:
The .bin file is actually the bytes of a serialized Java object: the name finder model (TokenNameFinderModel), not the name finder itself. The name finder is a TokenNameFinder implementation called NameFinderME, which is constructed from that model (ME stands for maximum entropy, the multinomial-logistic-regression-like algorithm that is the main one used in OpenNLP). The model holds learned parameters rather than the text it was trained on, so you will not be able to see the training data by doing anything to this file.
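What you can inspect is the model's metadata. The sketch below assumes the .bin file is a ZIP archive and that it contains one or more text entries ending in .properties (a manifest is typical, but treat that as an assumption for your file). The helper name read_properties_entries is hypothetical, and the parser is deliberately simplified: it only handles key=value lines, not the full Java properties syntax.

```python
import zipfile


def read_properties_entries(path):
    """Parse every *.properties entry of a ZIP archive into {entry: {key: value}}.

    Hypothetical helper for illustration; not part of OpenNLP.
    """
    result = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if not name.endswith(".properties"):
                continue  # binary model data is skipped, it is not readable text
            text = archive.read(name).decode("utf-8", errors="replace")
            props = {}
            for line in text.splitlines():
                line = line.strip()
                # Skip blank lines and comments ('#' or '!' in Java properties).
                if not line or line.startswith(("#", "!")):
                    continue
                key, sep, value = line.partition("=")
                if sep:
                    props[key.strip()] = value.strip()
            result[name] = props
    return result
```

The result describes the model (for example its language or component, if the file records them), not the sentences it was trained on.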
Source: https://stackoverflow.com/questions/26140492/how-can-i-view-the-content-of-a-bin-file-in-opennlp