So I have a column called chemical formula for like 40,000 entries, and what I want to be able to do is count up how many elements are contained in the chemical formula. So for
Even though this has an excellent (and accepted) VBA solution, I couldn't resist the challenge to do this without using VBA.
I posted a solution earlier, which wouldn't work in all cases. This new code should always work:
=MAX(
IFERROR(IF(FIND(C$1&ROW($1:$99),$B2),ROW($1:$99),0),0),
IFERROR(IF(FIND(C$1&CHAR(ROW($65:$90)),$B2&"Z"),1,0),0)
)
Enter as an array formula: Ctrl + Shift + Enter
Output:
The formula outputs 0 when not found, and I simply used conditional formatting to turn zeroes gray.
How it works
This part of the formula looks for the element, followed by a number between 1 and 99. If found, the number of atoms is returned. Otherwise, 0 is returned. The results are stored in an array:
IFERROR(IF(FIND(C$1&ROW($1:$99),$B2),ROW($1:$99),0),0)
In the case of C13H25NO5, a search for "C" returns this array:
{1,0,0,0,0,0,0,0,0,0,0,0,13,0,0,0,...,0}
1 is the first array element, because C1 is a match. 13 is the thirteenth array element, and that's what we're interested in.
The next part of the formula looks for the element, followed by an uppercase letter, which indicates a new element. (The letters A through Z are characters 65 through 90.) If found, the number 1 is returned. Otherwise, 0 is returned. The results are stored in an array:
IFERROR(IF(FIND(C$1&CHAR(ROW($65:$90)),$B2&"Z"),1,0),0)
"Z" is appended to the chemical formula, so that a match will be found when its last element has no number. (For example, "H2O".) There is no element "Z" in the Periodic Table, so this won't cause a problem.
In the case of C13H25NO5, a search for "N" returns this array:
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0}
1 is the 15th element in the array. That's because it found the letters "NO", and O is the 15th letter of the alphabet.
Taking the maximum value from each array gives us the number of atoms as desired.