Formula for comparing 2 columns for containing data and counting their occurrence?

Submitted by ♀尐吖头ヾ on 2020-07-22 05:56:12

Question


I have a problem comparing data in Excel. I asked a similar question earlier (Is there any Excel Formula for comparing 2 columns for containing data and counting their occurrence?), but my problem is still not solved.

So please help me, someone. Here is an example of what I want to get:
[Screenshot #1] [Screenshot #2]

As you can see from these screenshots, the formula returns "1" only for an exact match, but I need a partial match. For example, if I need "Apple" and I have "Apple Inc", the formula must return "1" because the cell contains "Apple".

I will attach a link to this Google Sheet to make my question clearer.

https://docs.google.com/spreadsheets/d/1croUUM3XZTblqpqIva73qX54JeR8oC1cCsMOWyCW1us/edit#gid=0


Answer 1:


You appear to have your COUNTIF conditions inverted. Let's take your first record as an example:

="Apple5"    |    =COUNTIF('Named Focus List'!A:A, "*" & A2 & "*")

Substitute values in:

=COUNTIF({"What do I need";"Apple";"Orange";"Melon"}, "*Apple5*")

So, this counts "How many values in the list {"What do I need";"Apple";"Orange";"Melon"} contain the text "Apple5" anywhere within them?" The answer is, none.

What you actually want to know is "How many of the values in the list {"What do I need";"Apple";"Orange";"Melon"} are contained within the text "Apple5"?"
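The two directions of the "contains" test can be sketched outside the spreadsheet. This is only a rough Python model, not the formula itself: the variable and function names are made up for illustration, and the lower-casing assumes COUNTIF's wildcard matching ignores case.

```python
# Hypothetical stand-ins for the sheet data; the names are illustrative only.
focus_list = ["What do I need", "Apple", "Orange", "Melon"]
cell = "Apple5"

def list_entries_containing(text, entries):
    """Original direction: how many list entries contain `text`?"""
    return sum(text.lower() in entry.lower() for entry in entries)

def list_entries_contained_in(text, entries):
    """Wanted direction: how many list entries are contained in `text`?"""
    return sum(entry.lower() in text.lower() for entry in entries)

print(list_entries_containing(cell, focus_list))    # 0
print(list_entries_contained_in(cell, focus_list))  # 1
```

The first function mirrors the original COUNTIF; the second mirrors the question that actually needs answering.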

Now, at first this seems easy: swap the two arguments around:

=COUNTIF(A2, "*" & 'Named Focus List'!A:A & "*")

However, you will get an error! (Depending on which version of Excel or Google Sheets you use, this may be a #VALUE! or #SPILL! error.)

This is because it will return an array of results, for every entry in your List (i.e. {0,1,0,0}). To add these all together into a single value, we can wrap it all in a SUMPRODUCT: (Doing it this way means we also don't need to worry about pressing Ctrl+Shift+Enter for an Array Formula)

=SUMPRODUCT(COUNTIF(A2, "*" & 'Named Focus List'!A:A & "*"))

Better? Well, slightly. You might notice that your number is now absurdly high. This is because it is also counting every blank row as a match: a blank cell turns the criterion into "**", which matches any text. Oops! (This is but one of many reasons to avoid whole-column calculations.)
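The blank-row effect is easy to reproduce in a small Python model (a sketch only; the three trailing empty strings stand in for the unused rows of column A, and the names are hypothetical). An empty string is "contained" in every text, just as a blank cell yields a wildcard pattern that matches everything.

```python
cell = "Apple5"
# Four real entries followed by three blank rows (a stand-in for the rest of A:A).
column = ["What do I need", "Apple", "Orange", "Melon", "", "", ""]

# Count entries contained in the cell, blanks included.
matches = sum(entry.lower() in cell.lower() for entry in column)
print(matches)  # 4: one real match ("Apple") plus three blank rows
```

On a real sheet the column has far more blank rows than three, hence the absurdly high number.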

There are a couple of ways around this - you could hard-code the List Range manually, but I'm going to use INDEX to find the bottom cell instead:

=SUMPRODUCT(COUNTIF(A2, "*" & 'Named Focus List'!$A$1:INDEX('Named Focus List'!$A:$A, MAX(COUNTA('Named Focus List'!$A:$A),1)) & "*"))

The MAX(.., 1) is just to make sure we always look at at least one cell, and the COUNTA means that if there are 7 values in the column, then we look at the first 7 rows, and if there are 100 values in the column then we look at the first 100 rows. Try not to leave blank cells in the middle of the list!
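To see why blank cells in the middle of the list matter, here is a rough Python model of the range that the INDEX/COUNTA construction selects. The function name used_range and the sample lists are assumptions made for illustration.

```python
def used_range(column):
    """Model of $A$1:INDEX($A:$A, MAX(COUNTA($A:$A), 1))."""
    non_blank = sum(1 for value in column if value != "")  # COUNTA
    last_row = max(non_blank, 1)                           # MAX(..., 1)
    return column[:last_row]                               # rows 1..last_row

tidy = ["What do I need", "Apple", "Orange", "Melon", "", ""]
gappy = ["What do I need", "Apple", "", "Orange", "Melon", ""]

print(used_range(tidy))   # ['What do I need', 'Apple', 'Orange', 'Melon']
print(used_range(gappy))  # ['What do I need', 'Apple', '', 'Orange']
```

With a gap in the list, the range keeps the blank row (which matches everything) and drops "Melon", so the count is wrong in both directions.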




Answer 2:


I like Chronocidal's solution but this way speaks more to me:

=SUM(1*NOT(ISERROR(FIND('Named Focus List'!$A$2:$A$4,Sheet1!A2))))

I think it makes it clearer what we are trying to accomplish. For each list entry in A2 to A4, "find" returns the numerical position at which that entry occurs in Sheet1!A2, or an error if it does not occur. The combination of "iserror" and "not" turns this into a vector that is TRUE wherever Sheet1!A2 contains the entry. Multiplying by 1 turns TRUE into 1 and FALSE into 0. The "sum" function then generates the desired count.
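The same pipeline can be traced step by step in Python. This is a sketch under assumptions: the helper find is a hypothetical stand-in for FIND that returns None where the spreadsheet would give an error, and the list holds the three entries from A2 to A4.

```python
def find(needle, haystack):
    """Rough stand-in for FIND: 1-based position, or None in place of an error."""
    pos = haystack.find(needle)  # str.find is case-sensitive, as FIND is
    return pos + 1 if pos >= 0 else None

focus_list = ["Apple", "Orange", "Melon"]  # 'Named Focus List'!A2:A4
cell = "Apple5"                            # Sheet1!A2

positions = [find(entry, cell) for entry in focus_list]  # FIND(...)
flags = [p is not None for p in positions]               # NOT(ISERROR(...))
count = sum(1 * flag for flag in flags)                  # SUM(1*...)

print(positions)  # [1, None, None]
print(count)      # 1
```

One difference from the COUNTIF approach is worth keeping in mind: FIND is case-sensitive, so "apple" would not be found in "Apple5". SEARCH is the case-insensitive counterpart.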

You can use Chronocidal's clever code to avoid hard coding the range of the list, but I'd be worried about enforcing the no blank cell requirement.




Answer 3:


In D2, enter the following as an array formula (confirm it with Ctrl+Shift+Enter) and copy it down:

=0+(COUNT(SEARCH('Named Focus List'!$A$2:$A$4,$A2))>0)




Answer 4:


use:

=ARRAYFORMULA(N(REGEXMATCH(A2:A, 
 TEXTJOIN("|", 1, 'Named Focus List'!A2:A))))
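What this formula does can be approximated in Python as follows. It is a sketch, not an exact equivalent: Google Sheets uses a different regex engine from Python's re module, and the sample data is invented. For plain alternation of literal names the behaviour should be the same.

```python
import re

# Invented sample data; the trailing "" models a blank row in the list.
focus_list = ["Apple", "Orange", "Melon", ""]
cells = ["Apple5", "Banana", "Big Melon Co"]

# TEXTJOIN("|", 1, ...) joins the entries with "|" and skips empty ones.
pattern = "|".join(entry for entry in focus_list if entry != "")

# REGEXMATCH tests for a match anywhere in the text; N(...) turns TRUE/FALSE into 1/0.
flags = [int(re.search(pattern, cell) is not None) for cell in cells]

print(pattern)  # Apple|Orange|Melon
print(flags)    # [1, 0, 1]
```

Note that the result is a 1/0 flag per row rather than a count of matching entries, and that list entries are used as raw regex, so a name containing characters such as "." or "(" would be interpreted as a pattern.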



Source: https://stackoverflow.com/questions/62777715/formula-for-comparing-2-columns-for-containing-data-and-counting-their-occurence
