Ruby String Naive Split because split is too clever

Problem

"aaa".split('a') == []
"aaa".split('a').join('a') == ""

The standard split is often ‘clever’, but it is neither logical nor symmetric to join. To fix this, here is a naive alternative that behaves ‘dumb’ but logically.
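A short sketch of where the asymmetry comes from: by default, String#split drops trailing empty fields, and when every field is empty (as in "aaa" split on 'a') nothing is left. Passing a negative limit keeps those fields. The sample strings here are only illustrative.

```ruby
# Default split silently drops trailing empty strings.
p "a,b,,".split(',')              # => ["a", "b"]

# A negative limit keeps every field, including trailing empty ones.
p "a,b,,".split(',', -1)          # => ["a", "b", "", ""]

# With all fields kept, join restores the original string.
p "aaa".split('a', -1)            # => ["", "", "", ""]
p "aaa".split('a', -1).join('a')  # => "aaa"
```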

Solution

class String
  # https://grosser.it/2011/08/28/ruby-string-naive-split-because-split-is-to-clever/
  # "    ".split(' ') == []
  # "    ".naive_split(' ') == ['', '', '', '', '']  (four separators give five fields)
  # "".split(' ') == []
  # "".naive_split(' ') == ['']
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end
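As a rough check of the symmetry claim, the sketch below splits and re-joins a few strings and compares the result with the input. The helper `round_trips?` and the sample strings are hypothetical and not part of the original post; the method is repeated in compact form only so the snippet runs on its own.

```ruby
# Compact copy of the method above so this snippet is self-contained.
class String
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end

# Hypothetical helper: does split-then-join give back the input?
# Assumes the separator is a plain String, since join needs one.
def round_trips?(string, separator)
  string.naive_split(separator).join(separator) == string
end

samples = ["aaa", "", "a", "xax", "    ", "hello?"]
p samples.all? { |s| round_trips?(s, 'a') }  # => true
p round_trips?("    ", ' ')                  # => true
p round_trips?("hello?", '?')                # => true
```

The round trip only makes sense for String separators; with a Regexp that matches more than one literal string, join cannot know which text to put back.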

5 thoughts on “Ruby String Naive Split because split is too clever”

  1. I’ve got a simpler implementation for you:

    class String
      def naive_split(pattern)
        split(pattern, -1)
      end
    end

    Or in other words: it is already part of the standard library, though a bit hidden.

  2. Probably need to escape that regexp.

    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)

    If you don’t, I’m pretty sure this’ll break it:

    "hello?".naive_split('?')

  3. Both #split and #join understand an empty separator (''). However, #split eats the separator string and #scan does not, so either #scan on 'a' or #split on the empty string '':
    > "aaa".scan('a')
    => ["a", "a", "a"]
    OR
    > "aaa".split('')
    => ["a", "a", "a"]
    #join inserts new characters when joining, so .join('a') will insert multiple 'a's into the string (which contradicts symmetry). But you can insert the empty string, so whichever way you split, the reverse is .join('') and not .join('a').
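The points raised in the first two comments can be checked with a small sketch. It assumes nothing beyond core Ruby; the variable name `escaped` is only illustrative.

```ruby
# Comment 1: a bare split with a negative limit keeps trailing empty fields...
p "aaa".split('a', -1)  # => ["", "", "", ""]
# ...but an empty receiver still yields an empty array, which is why the
# post's version falls back to [''].
p "".split('a', -1)     # => []

# Comment 2: an unescaped '?' is a repeat operator with nothing to repeat,
# so building a Regexp from it raises.
begin
  Regexp.new('?')
rescue RegexpError => e
  puts "unescaped pattern raised #{e.class}"
end

# Escaping turns '?' into a literal question mark.
escaped = Regexp.new(Regexp.escape('?'))
p "hello?".split(escaped, -1)  # => ["hello", ""]
```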
