Problem
"aaa".split('a') == []
"aaa".split('a').join('a') == ""
Standard split is often ‘clever’, but it is not logical and not symmetric to join. To fix this, here is a naive alternative that is ‘dumb’ but logical.
Solution
class String
  # https://grosser.it/2011/08/28/ruby-string-naive-split-because-split-is-to-clever/
  # "   ".split(' ') == []
  # "   ".naive_split(' ') == ['','','','']
  # "".split(' ') == []
  # "".naive_split(' ') == ['']
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end
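As a quick check of the symmetry claim, here is a minimal sketch that repeats the definition above so it runs on its own, then sends a few strings through naive_split and back through join. The sample strings and variable names are illustrative only.

```ruby
class String
  # Same definition as above, repeated so the snippet runs standalone.
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end

# Illustrative inputs: each one should survive a split/join round trip.
samples = ['aaa', 'a,b,,', '', ',']
round_trips = samples.map { |s| s.naive_split(',').join(',') == s }

# The example from the problem statement now keeps its empty fields.
aaa_parts = 'aaa'.naive_split('a')   # => ["", "", "", ""]
rebuilt   = aaa_parts.join('a')      # => "aaa"
```

With plain split the last two lines would give [] and "", which is the asymmetry described in the problem.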
I’ve got a simpler implementation for you:
class String
  def naive_split(pattern)
    split(pattern, -1)
  end
end
Or in other words: it is already part of the standard library, though a bit hidden.
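For reference, a small sketch of what the negative limit changes, using only the standard String#split. The comma-separated sample is an illustrative input, not one from the post.

```ruby
# Without a limit, trailing empty strings are dropped.
without_limit = 'a,b,,'.split(',')     # => ["a", "b"]

# A negative limit keeps them, so join can rebuild the input.
with_limit = 'a,b,,'.split(',', -1)    # => ["a", "b", "", ""]
rebuilt    = with_limit.join(',')      # => "a,b,,"

# The empty string still yields an empty array, even with -1.
empty_case = ''.split(',', -1)         # => []
```

The last case is the one the post's version handles separately by returning [''] when the result is empty.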
That would not work:
" ".split(' ', -1) == [""]
Probably need to escape that regexp.
pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
If you don’t, I’m pretty sure this’ll break it:
"hello?".naive_split('?')
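A minimal sketch of why the escape matters, assuming the string pattern is interpolated into a regexp as in the line above. It uses only Regexp.escape and Regexp.new from the standard library; the variable names are illustrative.

```ruby
# Regexp.escape turns metacharacters into literals.
escaped = Regexp.escape('?')                # => "\\?"
parts   = 'hello?'.split(/#{escaped}/, -1)  # => ["hello", ""]

# Building the pattern from the raw string instead gives an invalid regexp,
# because '?' is a repeat operator with nothing to repeat.
begin
  Regexp.new('?')
  unescaped_ok = true
rescue RegexpError
  unescaped_ok = false
end
```

Here unescaped_ok ends up false, while the escaped pattern splits on the literal question mark.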
Thanks, just added it 🙂
Both #split and #join understand an empty separator (''). However, #split eats the separator string while #scan does not, so you can either #scan for 'a' or #split on the empty string '':
> "aaa".scan('a')
=> ["a", "a", "a"]
OR
> "aaa".split('')
=> ["a", "a", "a"]
#join inserts new characters when joining, so .join('a') will insert multiple 'a's into the string (which contradicts symmetry). But you can insert the empty string, so whichever way you split, its reverse is .join('') and not .join('a').
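The point of this comment can be sketched with standard String#scan, String#split and Array#join; the variable names are illustrative.

```ruby
# Both approaches keep the characters instead of consuming them.
chars_by_scan  = 'aaa'.scan('a')     # => ["a", "a", "a"]
chars_by_split = 'aaa'.split('')     # => ["a", "a", "a"]

# Joining with the empty string restores the original.
rebuilt = chars_by_split.join('')    # => "aaa"

# Joining with 'a' adds a separator between the elements.
longer = chars_by_split.join('a')    # => "aaaaa"
```

This treats 'a' as content rather than as a separator, which is a different reading of the problem than the one naive_split addresses.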