What is the difference between these two (String#scan and String#split) in Ruby?
They serve entirely different purposes. String#scan extracts every match of a pattern from a string and returns the matches in an array, while String#split breaks a string up into an array based on a delimiter. The delimiter may be either a static string (like ";" to split on a single semicolon) or a regular expression (like /\s+/ to split on any run of whitespace characters).
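As a minimal sketch of the two kinds of delimiter (the sample strings and variable names here are made up for illustration):

```ruby
# Split on a static string delimiter: a single semicolon
fields = "a;b;c".split(";")
# fields is ["a", "b", "c"]

# Split on a regular expression: one or more whitespace characters
words = "one  two\tthree".split(/\s+/)
# words is ["one", "two", "three"]
```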
The output of String#split doesn't include the delimiter. Rather, everything except the delimiter is returned in the output array, while the output of String#scan includes only what is matched by the pattern.
# A delimited string split on | returns everything surrounding the | delimiters
"a|delimited|string".split("|")
# Prints: ["a", "delimited", "string"]
# The same string scanned for | only returns the matched |
"a|delimited|string".scan("|")
# Prints: ["|", "|"]
Both of the above would also accept a regular expression in place of the simple string "|".
# Split on everything between and including two t's
"a|delimited|string".split(/t.+t/)
# Prints: ["a|delimi", "ring"]
# Search for everything between and including two t's
"a|delimited|string".scan(/t.+t/)
# Prints: ["ted|st"]
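One way to see how the two methods complement each other, sketched here with illustrative variable names: scanning for runs of non-delimiter characters gives the same pieces that splitting on the delimiter does, while scanning for the delimiter itself returns only the delimiters.

```ruby
str = "a|delimited|string"

# split drops the delimiter and keeps what surrounds it
pieces_from_split = str.split("|")

# scan keeps only what the pattern matches; here, runs of anything but |
pieces_from_scan = str.scan(/[^|]+/)

# Scanning for the delimiter itself returns just the delimiters
delimiters = str.scan("|")

puts pieces_from_split == pieces_from_scan
p delimiters
```

This equivalence holds for this simple input; it is not a general rule, since split and scan treat empty fields and capture groups differently.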