similarity | 易学教程

Text similarity analysis (Excel)

Question: I have a list of items and I want to identify how similar each one is to the other items in this list. My desired output would be something along the lines of: The percentage shown in the similarity column is purely illustrative. I'm thinking that a test for similarity would be something along the lines of: number of concurrent letters divided by the total number of letters in the matched item. I would be keen to get opinions on that. Is this reasonably doable in Excel?
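
The ratio described above can be sketched in a few lines. This is only an illustration under assumptions: "concurrent letters" is read here as letters the two items have in common regardless of position (a multiset overlap), the sketch is in Python rather than Excel, and the function names `letter_overlap` and `best_matches` are made up for the example.

```python
from collections import Counter


def letter_overlap(item: str, candidate: str) -> float:
    """Shared letters divided by the number of letters in the candidate.

    Assumption: letters are compared case-insensitively and position is
    ignored, so this is a multiset overlap rather than an alignment.
    """
    a = Counter(c for c in item.lower() if c.isalnum())
    b = Counter(c for c in candidate.lower() if c.isalnum())
    total = sum(b.values())
    if total == 0:
        return 0.0
    shared = sum((a & b).values())  # Counter & Counter keeps the minimum counts
    return shared / total


def best_matches(items):
    """For each item, return (closest other item, score). Needs >= 2 items."""
    results = {}
    for i, item in enumerate(items):
        others = [o for j, o in enumerate(items) if j != i]
        best = max(others, key=lambda o: letter_overlap(item, o))
        results[item] = (best, letter_overlap(item, best))
    return results


if __name__ == "__main__":
    for item, (match, score) in best_matches(["apple pie", "apple", "zzz"]).items():
        print(f"{item!r} -> {match!r} ({score:.0%})")
```

Because the ratio divides by the length of the matched item, it is not symmetric: a short item can score 100% against a longer one that contains all of its letters.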

Pandas Similarity Matching

Question: I tried searching for an answer on SO but didn't find any help. Here is what I'm trying to do: I have a dataframe (here is a small example of it): df = pd.DataFrame([[1, 5, 'AADDEEEEIILMNORRTU'], [2, 5, 'AACEEEEGMMNNTT'], [3, 5, 'AAACCCCEFHIILMNNOPRRRSSTTUUY'], [4, 5, 'DEEEGINOOPRRSTY'], [5, 5, 'AACCDEEHHIIKMNNNNTTW'], [6, 5, 'ACEEHHIKMMNSSTUV'], [7, 5, 'ACELMNOOPPRRTU'], [8, 5, 'BIT'], [9, 5, 'APR'], [10, 5, 'CDEEEGHILLLNOOST'], [11, 5, 'ACCMNO'], [12, 5, 'AIK'], [13, 5, 'CCHHLLOORSSSTTUZ'], [14

How to Normalize similarity measures from Wordnet

Question: I am trying to calculate the semantic similarity between two words. I am using WordNet-based similarity measures, i.e. the Resnik measure (RES), the Lin measure (LIN), the Jiang and Conrath measure (JNC) and the Banerjee and Pedersen measure (BNP). To do that, I am using nltk and WordNet 3.0. Next, I want to combine the similarity values obtained from the different measures. To do that I need to normalize the similarity values, as some measures give values between 0 and 1, while others give values greater than 1. So, my
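
One common way to put scores with different ranges on a shared scale is min-max normalization over the set of word pairs being compared. The sketch below is not taken from the question; it assumes the raw scores have already been computed (no nltk calls are made), that each measure is a similarity (higher means more similar), and the helper names and sample numbers are invented for illustration.

```python
def min_max_normalize(scores):
    """Rescale a list of scores linearly to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no spread to rescale, return zeros.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine(measures):
    """Average the normalized scores of several measures per word pair.

    measures: dict mapping a measure name to a list of raw scores,
    one entry per word pair, in the same order for every measure.
    """
    normalized = {name: min_max_normalize(vals) for name, vals in measures.items()}
    n_pairs = len(next(iter(normalized.values())))
    return [
        sum(normalized[name][i] for name in normalized) / len(normalized)
        for i in range(n_pairs)
    ]


if __name__ == "__main__":
    # Hypothetical raw scores for three word pairs.
    raw = {"RES": [0.0, 5.0, 10.0], "LIN": [0.0, 0.5, 1.0]}
    print(combine(raw))
```

Note that min-max scaling depends on the sample: the same raw score maps to a different normalized value if the set of word pairs changes.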

Transform attribute vector into a matrix with differences of elements

Question: Similarly to this previous post, I need to transform an attribute vector into a matrix, this time with differences between pairs of elements, using R. For example, I have a vector which reports the age of N people (from 18 to 90 years). I need to convert this vector into an NxN matrix named A (with people's names on rows and columns), where each cell Aij has the value of |age_i-age_j|, representing the absolute difference in age between the two people i and j. Here is an example with 3 people,
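
The matrix described above, with A[i][j] = |age_i - age_j|, can be sketched as follows. The question asks for R; this illustration uses Python, and the names and ages are invented since the question's own example is cut off.

```python
def abs_difference_matrix(ages):
    """Build a name-by-name table of absolute age differences.

    ages: dict mapping a person's name to their age.
    Returns a dict of dicts so that result[i][j] == |age_i - age_j|.
    """
    names = list(ages)
    return {r: {c: abs(ages[r] - ages[c]) for c in names} for r in names}


if __name__ == "__main__":
    # Hypothetical data: three people.
    A = abs_difference_matrix({"Ann": 18, "Bob": 40, "Cy": 90})
    for name, row in A.items():
        print(name, row)
```

The result is symmetric with zeros on the diagonal. In R itself, an expression along the lines of `abs(outer(age, age, "-"))` would be one way to get the same values.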

How to show a similarity of two columns in percentage using R?

Question: I have a simple question that I am stuck on. I have two data.frames and I want to compare them and show their similarity as a percentage, but I do not know how. Here is a simple example:

    a <- as.matrix(rbinom(10,1,1/2))
    b <- as.matrix(rbinom(10,1,1/2))
    > a
          [,1]
     [1,]    1
     [2,]    0
     [3,]    1
     [4,]    0
     [5,]    1
     [6,]    0
     [7,]    1
     [8,]    1
     [9,]    1
    [10,]    0
    > b
          [,1]
     [1,]    1
     [2,]    0
     [3,]    1
     [4,]    1
     [5,]    0
     [6,]    0
     [7,]    0
     [8,]    0
     [9,]    1
    [10,]    0

I know that table shows the differences/similarities:

    > table(a,b)
       b
    a   0 1
      0 3 1
      1 3 3
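
If "similarity" is taken to mean the share of positions where the two columns hold the same value (an assumption; the question does not define it), the percentage is the number of matching rows divided by the number of rows. A small Python sketch using the `a` and `b` values printed above; the function name is made up.

```python
def percent_agreement(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    a = [1, 0, 1, 0, 1, 0, 1, 1, 1, 0]
    b = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
    print(percent_agreement(a, b))  # 6 of 10 rows agree
```

This matches the diagonal of the `table(a,b)` output: 3 rows where both are 0 plus 3 rows where both are 1, out of 10.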

Similarity between two lists of documents

Question: I need to find the similarity between two lists of short texts in Python. The texts can be 1-4 words long. Each list can contain up to 10K items. I couldn't find how to do this efficiently in spaCy. Maybe other packages can do this? I assume the words are represented by a vector (300d), but any other options are also OK. This task can be done in a loop, but there should surely be a more efficient way. This task fits TensorFlow, PyTorch, and similar packages, but I'm not familiar with
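
One loop-free approach is to stack the text vectors into two matrices and compute all pairwise cosine similarities with a single matrix product. The sketch below assumes each text has already been turned into a fixed-length vector (for example 300-d) by some other tool, and it uses NumPy, which the question does not mention; the function name is illustrative.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return an (len(a), len(b)) matrix of cosine similarities.

    a, b: array-likes of shape (n_texts, n_dims), one row per text.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Normalize rows to unit length; clip guards against all-zero vectors.
    a_unit = a / np.clip(np.linalg.norm(a, axis=1, keepdims=True), 1e-12, None)
    b_unit = b / np.clip(np.linalg.norm(b, axis=1, keepdims=True), 1e-12, None)
    return a_unit @ b_unit.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    list_a = rng.normal(size=(5, 300))  # stand-ins for real text vectors
    list_b = rng.normal(size=(7, 300))
    sims = cosine_similarity_matrix(list_a, list_b)
    print(sims.shape)            # (5, 7)
    print(sims.argmax(axis=1))   # index of the closest text in list_b per row
```

For two lists of 10K texts the full result has 100 million entries, roughly 800 MB as float64, so processing the first list in chunks of rows may be preferable.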

Calculating similarity between two vectors/Strings in R

Question: A similar question may already have been asked in this forum, but I feel my requirement is peculiar. I have a data frame df1 with a variable "WrittenTerms" with 40,000 observations, and another data frame df2 with a variable "SuggestedTerms" with 17,000 observations. I need to calculate the similarity between "written Term" and "suggestedterms". df1$WrittenTerms: head pain, lung cancer, abdminal pain. df2$suggestedterms: cardio attack, breast cancer, abdomen pain, head ache, lung cancer. I need to get
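
A simple way to score each written term against every suggested term is a character-level similarity ratio, keeping the best-scoring suggestion. The sketch below uses Python's standard `difflib` rather than R, with the sample terms from the question; the choice of measure and the function name are assumptions, since the excerpt ends before the desired output is stated.

```python
from difflib import SequenceMatcher


def best_suggestion(term, suggestions):
    """Return (closest suggestion, ratio in [0, 1]) for one written term."""
    scored = [
        (SequenceMatcher(None, term.lower(), s.lower()).ratio(), s)
        for s in suggestions
    ]
    score, match = max(scored)
    return match, score


if __name__ == "__main__":
    written = ["head pain", "lung cancer", "abdminal pain"]
    suggested = ["cardio attack", "breast cancer", "abdomen pain",
                 "head ache", "lung cancer"]
    for term in written:
        match, score = best_suggestion(term, suggested)
        print(f"{term!r} -> {match!r} ({score:.2f})")
```

At the sizes mentioned (40,000 by 17,000 terms, about 680 million comparisons) a plain double loop like this would likely be slow, so some form of pre-filtering or a compiled string-distance library would probably be needed.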

PHP - How do I partially compare elements in 2 arrays

Question: I have 2 arrays: $arr1 = array('Test', 'Hello', 'World', 'Foo', 'Bar1', 'Bar'); and $arr2 = array('hello', 'Else', 'World', 'Tes', 'foo', 'BaR1', 'Bar'); I need to compare the 2 arrays and save the positions of the matching elements to a 3rd array: $arr3 = array(3, 0, 2, 4, 5, 6); // expected result: position in $arr2 of the element matching each element of $arr1. By 'matching' I mean all elements that are identical (e.g. World), or partially the same (e.g. Test & Tes) and also those elements that are alike
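
One way to get this kind of partial matching is to compare the elements case-insensitively with a similarity ratio and, for each element of the first array, keep the position of the best-scoring element in the second array if it clears a threshold. The sketch below is in Python rather than PHP; the 0.8 cutoff and the function name are assumptions chosen for illustration, not something the question specifies.

```python
from difflib import SequenceMatcher


def match_positions(arr1, arr2, cutoff=0.8):
    """For each item in arr1, return the index of its closest match in arr2.

    Comparison is case-insensitive. Items whose best score is below
    `cutoff` are skipped. Assumes arr2 is not empty.
    """
    positions = []
    for item in arr1:
        scores = [
            SequenceMatcher(None, item.lower(), other.lower()).ratio()
            for other in arr2
        ]
        best = max(range(len(arr2)), key=scores.__getitem__)
        if scores[best] >= cutoff:
            positions.append(best)
    return positions


if __name__ == "__main__":
    arr1 = ["Test", "Hello", "World", "Foo", "Bar1", "Bar"]
    arr2 = ["hello", "Else", "World", "Tes", "foo", "BaR1", "Bar"]
    print(match_positions(arr1, arr2))  # [3, 0, 2, 4, 5, 6]
```

In PHP, built-in functions such as `similar_text()` or `levenshtein()` could play the role that `SequenceMatcher` plays here.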

Cannot use the Knowledge academic API

Question: I have a problem when I try to use the similarity function provided by the Academic Knowledge API. I tested the following command to compute the similarity between two strings: curl -v -X GET "https://api.labs.cognitive.microsoft.com/academic/v1.0/similarity?s1={string}&s2={string}" -H "Ocp-Apim-Subscription-Key: {subscription key}" The error that I get is: {"error":{"code":"Unspecified","message":"Access denied due to invalid subscription key. Make sure you are subscribed to an API you are