nlp

Sparse Efficiency Warning while changing the column

微笑、不失礼 Submitted on 2021-01-27 02:57:12

Question:

```python
def tdm_modify(feature_names, tdm):
    non_useful_words = ['kill', 'stampede', 'trigger', 'cause', 'death',
                        'hospital', 'minister', 'said', 'told', 'say',
                        'injury', 'victim', 'report']
    indexes = [feature_names.index(word) for word in non_useful_words]
    for index in indexes:
        tdm[:, index] = 0
    return tdm
```

I want to manually set zero weights for some terms in the tdm matrix. With the code above I get the warning below, and I don't understand why. Is there a better way to do this?

```
C:\Anaconda\lib\site-packages\scipy\sparse
```
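The warning is SciPy's `SparseEfficiencyWarning`: CSR matrices store data row by row, so column assignment changes the sparsity structure and is expensive. A common workaround (a sketch, not the poster's exact code) is to convert to LIL format, which supports cheap item assignment, and convert back afterwards:

```python
import numpy as np
from scipy.sparse import csr_matrix

def zero_columns(feature_names, tdm, words):
    """Zero out the columns for `words`; `tdm` is any SciPy sparse matrix."""
    tdm = tdm.tolil()  # LIL supports efficient per-item assignment
    for word in words:
        if word in feature_names:
            tdm[:, feature_names.index(word)] = 0
    return tdm.tocsr()  # back to CSR for fast arithmetic

# Toy term-document matrix over three features
names = ['kill', 'cat', 'dog']
m = csr_matrix(np.ones((2, 3)))
zeroed = zero_columns(names, m, ['kill'])
```

The round-trip conversion has a one-off cost, but it avoids repeatedly mutating a CSR structure inside the loop.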

InvalidArgumentError: input must be a vector, got shape: []

一曲冷凌霜 Submitted on 2021-01-24 11:25:06

Question: I'm trying to save the embeddings of text data produced by the Universal Sentence Encoder into a new pandas DataFrame column, but I get the error above. Here is what I am trying to do:

```python
module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"  #@param ["https://tfhub.dev/google/universal-sentence-encoder/4", "https://tfhub.dev/google/universal-sentence-encoder-large/5"]
model = thub.load(module_url)
print("module %s loaded" % module_url)

def embed(input):
    return model(input)
```

then `for t in list(df[`
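This error typically means a bare Python string was passed to the encoder: the model expects a batch (a vector) of strings, and a lone string arrives as a scalar tensor. A minimal sketch of the fix (the `fake_model` below only stands in for the `thub.load(...)` result so the snippet is self-contained) is to wrap the input in a list:

```python
def embed_texts(model, texts):
    """Call a sentence encoder with a proper batch of strings."""
    if isinstance(texts, str):
        # A scalar string triggers "input must be a vector, got shape: []"
        texts = [texts]
    return model(texts)

# Stand-in encoder for illustration; with the real hub model you would
# call e.g. embeddings = embed_texts(model, df["text"].tolist())
fake_model = lambda batch: [t.upper() for t in batch]
result = embed_texts(fake_model, "hello world")
```

Passing the whole column at once (`df["text"].tolist()`) is also much faster than embedding one row at a time in a loop.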

How to remove english text from arabic string in python?

醉酒当歌 Submitted on 2021-01-22 08:52:34

Question: I have an Arabic string containing English text and punctuation. I need to filter out everything but the Arabic text. I tried removing punctuation and English words using the `string` module, but I lost the spacing between Arabic words. Where am I going wrong?

```python
import string

exclude = set(string.punctuation)
main_text = "وزارة الداخلية: لا تتوفر لدينا معلومات رسمية عن سعوديين موقوفين في ليبيا http://alriyadh.com/1031499"
main_text = ''.join(ch for ch in main_text if ch not in exclude)
```

Output after this step: "وزارة الداخلية لا
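Deleting characters outright removes the separators along with the unwanted text. A regex-based sketch (assuming the basic Arabic Unicode block U+0600–U+06FF covers the text) replaces every non-Arabic, non-whitespace character with a space and then collapses the runs, so word boundaries survive:

```python
import re

def keep_arabic(text):
    # Replace anything outside the Arabic block and whitespace with a
    # space, then collapse repeated whitespace so spacing is preserved.
    kept = re.sub(r'[^\u0600-\u06FF\s]', ' ', text)
    return re.sub(r'\s+', ' ', kept).strip()

sample = ("وزارة الداخلية: لا تتوفر لدينا معلومات رسمية عن سعوديين "
          "موقوفين في ليبيا http://alriyadh.com/1031499")
cleaned = keep_arabic(sample)
```

If the text uses Arabic presentation forms or extended letters, the character class would need widening accordingly.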

How to force a pos tag in spacy before/after tagger?

给你一囗甜甜゛ Submitted on 2021-01-21 10:45:06

Question: If I process the sentence "Return target card to your hand" with spaCy and the en_core_web_lg model, it tags the tokens as below:

```
Return  NOUN
target  NOUN
card    NOUN
to      ADP
your    ADJ
hand    NOUN
```

How can I force "Return" to be tagged as a VERB? And how can I do it before the parser, so that the parser can better interpret relations between tokens? There are other situations in which this would be useful. I am dealing with text which contains specific symbols such as {G}. These three
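One approach, sketched here for spaCy 3 with a blank pipeline so it runs standalone, is a small custom component that overwrites `token.pos_` from a rule table; with a full model you would add it via `nlp.add_pipe("force_pos", before="parser")`. The `FORCED_POS` table is hypothetical. Note that in spaCy 3 the statistical tagger fills `tag_` and the built-in `attribute_ruler` maps it to `pos_`, so adding rules to the AttributeRuler is the other idiomatic option:

```python
import spacy
from spacy.language import Language

# Hypothetical rule table: surface form -> coarse POS to force.
FORCED_POS = {"Return": "VERB"}

@Language.component("force_pos")
def force_pos(doc):
    # Overwrite the coarse POS for any token listed in the rule table.
    for token in doc:
        if token.text in FORCED_POS:
            token.pos_ = FORCED_POS[token.text]
    return doc

nlp = spacy.blank("en")   # with en_core_web_lg, use before="parser"
nlp.add_pipe("force_pos")
doc = nlp("Return target card to your hand")
```

Whether a forced tag actually changes the parse depends on the model: the parser consumes its own features, so correcting `pos_` upstream improves downstream rules more reliably than it changes parser decisions.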

How do I preprocess and tokenize a TensorFlow CsvDataset inside the map method?

允我心安 Submitted on 2021-01-21 10:39:09

Question: I made a TensorFlow CsvDataset, and I'm trying to tokenize the data as follows:

```python
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer

os.chdir('/home/nicolas/Documents/Datasets')
fname = 'rotten_tomatoes_reviews.csv'

def preprocess(target, inputs):
    tok = Tokenizer(num_words=5_000, lower=True)
    tok.fit_on_texts(inputs)
    vectors = tok.texts_to_sequences(inputs)
    return vectors,
```
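This fails because `Dataset.map` traces its function into a TensorFlow graph, so plain-Python code such as `Tokenizer.fit_on_texts` cannot run per element there (and re-fitting a tokenizer per element would build a different vocabulary each time anyway). A sketch of one common alternative: adapt a graph-compatible `TextVectorization` layer once, up front, then apply it inside `map`. The toy lists below stand in for the CSV columns:

```python
import tensorflow as tf

# Stand-ins for the (target, review) columns of the CsvDataset.
texts = ["a great movie", "a terrible movie", "a great terrible movie"]
labels = [1, 0, 0]

# Build the vocabulary once, outside the pipeline.
vectorize = tf.keras.layers.TextVectorization(max_tokens=5_000)
vectorize.adapt(texts)

# Apply the adapted layer inside map; it is graph-compatible.
ds = tf.data.Dataset.from_tensor_slices((labels, texts))
ds = ds.batch(2).map(lambda target, inputs: (vectorize(inputs), target))
tokens, target = next(iter(ds))
```

For a real CsvDataset, the `adapt` step would consume a text-only pass over the dataset rather than an in-memory list.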

Sentiment analysis of non-English texts

痴心易碎 Submitted on 2021-01-21 09:25:41

Question: I want to analyze the sentiment of texts written in German. I found many tutorials on how to do this in English, but none on how to apply it to other languages. My idea is to use the TextBlob Python library to first translate the sentences into English and then run sentiment analysis, but I am not sure whether that is the best way to solve this task. Are there other possible approaches?

Answer 1: As Andy has pointed out above, the best
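Translate-then-analyze works but compounds translation errors with sentiment errors. German-native resources exist (for example the textblob-de package, or German models for transformer libraries). The core idea behind lexicon-based analyzers can be sketched in a few lines; the mini-lexicon below is illustrative only, not a real resource:

```python
# Illustrative German polarity lexicon (made up for this sketch).
GERMAN_LEXICON = {
    "gut": 1.0, "großartig": 1.0, "wunderbar": 0.8,
    "schlecht": -1.0, "schrecklich": -1.0, "langweilig": -0.6,
}

def sentiment_de(text):
    """Mean polarity of the known words; 0.0 when none are known."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    scores = [GERMAN_LEXICON[w] for w in words if w in GERMAN_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0
```

Real lexicons additionally handle negation ("nicht gut"), intensifiers, and compounds, which is where dedicated German tooling pays off.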

Replace entity with its label in SpaCy

二次信任 Submitted on 2021-01-21 05:14:24

Question: Is there any way in spaCy to replace an entity detected by spaCy NER with its label? For example:

"I am eating an apple while playing with my Apple Macbook."

I have trained an NER model with spaCy to detect a "FRUITS" entity, and the model successfully detects the first "apple" as "FRUITS", but not the second "Apple". I want to post-process my data by replacing each entity with its label, so I want to replace the first "apple" with "FRUITS". The sentence will become "I am eating an FRUITS while
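A post-processing sketch: walk `doc.ents` (entity spans carry character offsets) and splice each label in place of the entity text. The example builds the entity by hand on a blank pipeline so it runs without the poster's trained model; with a real model you would simply call `nlp(text)` and use the detected entities:

```python
import spacy
from spacy.tokens import Span

def replace_entities(doc):
    """Return doc.text with every entity span replaced by its label."""
    out, last = [], 0
    for ent in doc.ents:  # ents are sorted and non-overlapping
        out.append(doc.text[last:ent.start_char])
        out.append(ent.label_)
        last = ent.end_char
    out.append(doc.text[last:])
    return "".join(out)

# Hand-built entity standing in for trained-NER output.
nlp = spacy.blank("en")
doc = nlp("I am eating an apple while playing with my Apple Macbook.")
doc.ents = [Span(doc, 4, 5, label="FRUITS")]  # the token "apple"
result = replace_entities(doc)
```

Splicing by character offsets rather than `str.replace` avoids accidentally rewriting other occurrences of the same surface form (the second "Apple" here).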
