Question
I have the following R code for extracting people and locations from text:
library(pdftools)  # provides pdf_text()
library(NLP)
library(openNLP)
page = pdf_text("C:/Users/u214738/Documents/NER_Data.pdf")
text = as.String(page)
sent_annot = Maxent_Sent_Token_Annotator()
word_annot = Maxent_Word_Token_Annotator()
install.packages("openNLPmodels", repos = "http://datacube.wu.ac.at/src/contrib/", type = "source")
# install.packages() has no `kind` argument; one install provides the English models
install.packages("openNLPmodels.en", repos = "http://datacube.wu.ac.at/", type = "source")
install.packages("openNLPmodels.de", repos = "http://datacube.wu.ac.at/", type = "source")
library(openNLPmodels.de)
library(openNLPmodels.en)
loc_annot = Maxent_Entity_Annotator(kind = "location") #annotate location
people_annot = Maxent_Entity_Annotator(kind = "person") #annotate person
annot.l1 = NLP::annotate(text, list(sent_annot,word_annot))
k <- sapply(annot.l1$features,`[[`,"kind")
Locations = text[annot.l1[k=="location"]]
People = text[annot.l1[k == "person"]]
unique(Locations)
print(Locations)
unique(People)
print(People)
But the results I get are as follows:
unique(Locations)
character(0)
print(Locations)
character(0)
unique(People)
character(0)
print(People)
character(0)
NER_Data.pdf contains text with names of people and locations, such as information about Bill Gates and Warren Buffett.
Any guidance on this would be appreciated.
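One likely reason for the empty results, offered as a sketch rather than a confirmed diagnosis: the call to NLP::annotate() in the code above only receives sent_annot and word_annot, so loc_annot and people_annot are created but never run, and no annotation carries a "kind" feature. The version below passes all four annotators. It assumes openNLPmodels.en is already installed and that rJava is working; the sample sentence is a hypothetical stand-in for the PDF contents.

```r
library(NLP)
library(openNLP)
library(openNLPmodels.en)  # assumed to be installed already

# Hypothetical sample text standing in for the PDF contents
text <- as.String("Bill Gates was born in Seattle. Warren Buffett lives in Omaha.")

sent_annot   <- Maxent_Sent_Token_Annotator()
word_annot   <- Maxent_Word_Token_Annotator()
loc_annot    <- Maxent_Entity_Annotator(kind = "location")
people_annot <- Maxent_Entity_Annotator(kind = "person")

# Run the entity annotators in the same pipeline, after sentences and words
annot <- NLP::annotate(text, list(sent_annot, word_annot, loc_annot, people_annot))

# Only entity annotations carry a "kind" feature; use NA for all others
k <- vapply(annot$features,
            function(f) if (is.null(f$kind)) NA_character_ else f$kind,
            character(1))

# %in% treats NA as FALSE, so non-entity annotations are dropped safely
Locations <- unique(text[annot[k %in% "location"]])
People    <- unique(text[annot[k %in% "person"]])

print(Locations)
print(People)
```

Which names the models actually pick up depends on the input text, so the printed vectors may differ from what you expect. If they are still empty with the entity annotators in the pipeline, the text extracted from the PDF is the next thing to inspect.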
Source: https://stackoverflow.com/questions/58169707/efficient-named-entity-recognition-in-r