Ruby String Naive Split because split is too clever

Problem

"aaa".split('a') == []
"aaa".split('a').join('a') == ""

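The standard split is often ‘clever’, but it is not logical and not symmetric to join: it drops trailing empty fields, so joining the pieces does not give back the original string. To fix this, here is a naive alternative that behaves ‘dumb’ but logically.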

Solution

class String
  # https://grosser.it/2011/08/28/ruby-string-naive-split-because-split-is-to-clever/
  # "    ".split(' ') == []
  # "    ".naive_split(' ') == ['','','','','']
  # "".split(' ') == []
  # "".naive_split(' ') == ['']
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end
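
To make the symmetry claim concrete, here is a minimal sketch that round-trips a few strings through split/join and naive_split/join. The sample inputs are hypothetical and chosen only to cover repeated, trailing, and leading separators; the method body is repeated so the snippet runs on its own.

```ruby
class String
  # Repeated from the post so this snippet runs on its own.
  def naive_split(pattern)
    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)
    result = split(pattern, -1)
    result.empty? ? [''] : result
  end
end

# Hypothetical sample inputs (not from the post): repeated, trailing,
# and leading separators, plus the empty string.
samples = [["aaa", "a"], ["a,b,,", ","], [",a", ","], ["", ","]]

samples.each do |string, separator|
  clever = string.split(separator).join(separator)
  naive  = string.naive_split(separator).join(separator)
  puts "#{string.inspect}: split+join gives #{clever.inspect}, " \
       "naive_split+join gives #{naive.inspect}"
end
```

For each of these samples the naive round trip reproduces the input, while the standard one loses trailing separators: "aaa" comes back as "" and "a,b,," comes back as "a,b".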

5 thoughts on “Ruby String Naive Split because split is too clever”

  1. I’ve got a simpler implementation for you:

    class String
      def naive_split(pattern)
        split(pattern, -1)
      end
    end

    Or, in other words: it is already part of the standard library, though a bit hidden.

  2. Probably need to escape that regexp.

    pattern = /#{Regexp.escape(pattern)}/ unless pattern.is_a?(Regexp)

    If you don’t, I’m pretty sure this’ll break it:

    "hello?".naive_split('?')

  3. Both #split and #join understand an empty separator (''). However, #split eats the separator string and #scan does not, so either you #scan on 'a' or you #split on an empty string (''):
    > "aaa".scan('a')
    => ["a", "a", "a"]
    OR
    > "aaa".split('')
    => ["a", "a", "a"]
    #join inserts new characters when joining, so .join('a') will insert multiple 'a's into the string (which contradicts symmetry). But you can insert the empty string, so whichever way you split, its reverse is .join('') and not .join('a').
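
The points raised in comments 1 and 2 can be checked directly. This is only a sketch: regexp_build_error is a hypothetical helper introduced here for illustration, not part of the post or of Ruby's standard library.

```ruby
# Hypothetical helper: returns the error class raised while compiling
# `source` as a regexp, or nil if it compiles.
def regexp_build_error(source)
  Regexp.new(source)
  nil
rescue RegexpError => e
  e.class
end

# Comment 2: a bare "?" is a quantifier with nothing to repeat.
p regexp_build_error("?")                 # unescaped
p regexp_build_error(Regexp.escape("?"))  # escaped

# With the escaped pattern the split works and keeps the trailing field.
p "hello?".split(Regexp.new(Regexp.escape("?")), -1)

# Comment 1: the -1 limit keeps empty fields...
p "aaa".split("a", -1)
# ...but an empty receiver still yields an empty array, which is why
# naive_split falls back to [''].
p "".split("a", -1)
```

So the one-line version from comment 1 covers most cases, and the extra lines in naive_split exist for the empty string and for interpolating a String into a Regexp safely.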
