问题
I have a list of strings in cells - 1000s of them - and I need to work out the characters per word but separated by word - preferably in 1 swift formula...
For Example: 1. "Black Cup With Handle" > Formula I need > 5,3,4,6
- "Giant Bear Statue" > Formula I need > 5,4,6
I need this for a recurring task which has been macro'd in a very inefficient way to count words into columns (of which we need to use up to 20 for the just encase) but this needs to be tackled.
Usually, we count the spaces and layer nested serach() formulas to piggyback onto one and other to break down the structure then character counts the individual words...
I could alternatively the macro to substitute the spaces for commas and used text to columns but that still leaves me with a prolonged counting process for what im looking for
we obviously use =LEN(A1)-LEN(SUBSTITUTE(A1," ",""))
to count the spaces in the word
we currently then use =SEACRH()
function combined with =MID()
functions (and some bizarre numbers) to reveal each word into its own individual cell
then =LEN
once again bu on all individual words - very long-winded
Im hoping to find a shorter way to do this but feeling there may not be a dynamic enough way to do it with formula alone, hoping someone can prove me wrong!
回答1:
You'll have different options depending on your Excel version.
OPTION 1: TEXTJOIN
I think you are looking for a TEXTJOIN
function. Just bare in mind that you can only use this the more later versions of Excel (see link to documentation) and it could work like this:
Formula in B1
:
=TEXTJOIN(",",TRUE,LEN(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s")))
NOTE: It's an array formula and you need to enter it using CtrlShiftEnter
To make it so that you won't need to use the above key-combo, we can include an INDEX
:
=TEXTJOIN(",",TRUE,INDEX(LEN(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s")),))
Additional Information:
FILTERXML
This function takes (as per documentation) two required arguments:
- A string in valid XML
- A string in valid XPath
Because we want to return an array of elements (words) from the cell, we need to SUBSTITUTE the spaces for end-tags (</..>
) and concatenate that with a start-tag (<..>
) at the start of the string and another end-tag at the end.
I'll have to rely on an XML
explaination on the tags as to why <?><?>
works and it's meaning, because as far as my testing goes I could swap the letters around or replace by another letter with the same results as long as the final Xpath
would resemble the same character. It would be great if someone would be able to complement this answer with a better explanation on this matter.
TEXTJOIN
If you are a Office 365 subscriber or own Excel 2019 you can make use of this function. There are (as per documentation) at least 3 required arguments:
- A delimiter which must be a text string, either empty, or one or more characters enclosed by double quotes, or a reference to a valid text string. If a number is supplied, it will be treated as text.
- The second argument can hold either
TRUE
orFALSE
and determines whether or not you want to exclude/include empty values - The third argument is the text item to be joined. A text string, or array of strings, such as a range of cells.
Now this is where we can join the two functions together, FILTERXML
returning an array which we can use in TEXTJOIN
.
INDEX + LEN
I'll have to explain the use of these functions together. I don't think LEN
and INDEX
will need much of an introduction on their own, but together they work quite nicely. Natively there will be a force called implicit intersection that will prevent LEN
from returning an array of values when you pass an array of values to the function, in this case through our FILTERXML
.
Normally you would disable this mechanism using a key combination of: CtrlShiftEnter, better known as CSE
.
Now what INDEX
does is disabling this implicit intersection making LEN
able to return an array, removing the need to CSE
the formula. INDEX
is one of the functions that has this "power". A more in depth explanation on implicit intersection can be found here
OPTION 2: UDF
Without access to TEXTJOIN
I think you'll need to have a look at using an UDF, possibly looking like below:
Function TEXTJOIN(rng As Range) As String
TEXTJOIN = Join(Application.Evaluate("LEN({""" & Join(Split(rng, " "), """,""") & """})"), ",")
End Function
You can call this in B1
like so: =TEXTJOIN(A1)
Additional Information:
The UDF consists out of three main mechanisms that work together:
JOIN
This funciton takes two parameters, where the first one is required:
- First parameter is a one-dimensional array containing substrings
- The second (optional) parameter is a string character used to separate the substrings in the returned string. If omitted, the space character (" ") is used. If delimiter is a zero-length string (""), all items in the list are concatenated with no delimiters.
The function returns a string value
SPLIT
This function takes a string and delimits it by a specified character/substring. It takes the following arguments:
- 1st: A required string expression containing substrings and delimiters. If expression is a zero-length string(""), Split returns an empty array, that is, an array with no elements and no data.
- 2nd: The optional delimiter which is a string character used to identify substring limits. If omitted, the space character (" ") is assumed to be the delimiter. If delimiter is a zero-length string, a single-element array containing the entire expression string is returned.
- 3rd: An optional limit, a number of substrings to be returned; -1 indicates that all substrings are returned.
- 4th: Compare, also optional, is a numeric value indicating the kind of comparison to use when evaluating substrings. See Settings section for values.
In this case we would only need the first two arguments.
Application.Evaluate
This is IMO one of the most handy mechanisms you can use to pull of a returned array of values without having to loop through items/cells. It may get slow when you feed the function a large array formula, but in this case it will be fine. The funtion converts a Microsoft Excel name into an object or value, and when we pass it an formula, it thus will return the results. In this particular case it will return an array.
回答2:
I'm not totally clear on what end-result you're looking for, or whether you're okay with a VBA solution, but this is my interpretation:
Function lengths(txt As String) As String
Dim wrd
For Each wrd In Split(txt)
If lengths <> "" Then lengths = lengths & ","
lengths = lengths & Len(wrd)
Next wrd
End Function
Paste the code into a VBA module and then, for example, if A1
contained Black Cup With Handle
then in another cell you could use =length(A1)
which would return 5,3,4,6
.
回答3:
If you want to use a VBA code you could try this:
Option Explicit
Sub test()
Dim arr As Variant
Dim str As String
Dim i As Long, j As Long, LastRow As Long
With ThisWorkbook.Worksheets("Sheet1")
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
For i = 1 To LastRow
str = .Range("A" & i).Value
arr = Split(str, " ")
For j = LBound(arr) To UBound(arr)
.Cells(i, j + 3).Value = Len(arr(j))
Next j
Next i
End With
End Sub
Results:
来源:https://stackoverflow.com/questions/58673369/is-there-a-way-to-count-the-number-of-characters-per-word-for-a-string-returnin