Ruby String split with regex
Solution 1
I think this would do it:
a.split(/\.(?=[\w])/)
I don't know how much you know about regex, but the (?=[\w]) is a lookahead that says "only match the dot if the next character is a word character (a letter, digit, or underscore)". A lookahead doesn't consume the text it matches; it just "looks". So the result is exactly what you're looking for:
> a.split(/\.(?=[\w])/)
=> ["foo", "bar", "size", "split('.')", "last"]
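To see what "doesn't grab the text" means in practice, here is a small comparison. It is only a sketch on a plain dotted string, and the variable names are illustrative:

```ruby
chain = "foo.bar.size"

# Without a lookahead, the character after the dot is part of the match,
# so split discards it together with the dot.
consuming = chain.split(/\.\w/)

# With a lookahead, only the dot is matched; the next character is
# checked but stays in the result.
lookahead = chain.split(/\.(?=\w)/)

p consuming  # ["foo", "ar", "ize"]
p lookahead  # ["foo", "bar", "size"]
```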
Solution 2
I'm afraid that regular expressions won't take you very far. Consider, for example, the following expressions (which are also valid Ruby):
"(foo.bar.size.split( '.' )).last"
"(foo.bar.size.split '.').last"
"(foo.bar.size.split '( . ) . .(). .').last"
The problem is that the list of calls is actually a tree of calls. The simplest solution is probably to use a Ruby parser and transform the parse tree according to your needs (in this example we recursively descend into the call tree, gathering the calls into a list):
# gem install ruby_parser
# gem install awesome_print
require 'ruby_parser'
require 'ap'
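# Assumes call nodes shaped like s(:call, receiver, :name, s(:arglist, ...)),
# which is the layout the outputs below reflect.
def calls_as_list code
  tree = RubyParser.new.parse(code)
  t = tree
  calls = []
  while t
    # gather arguments if present
    args = nil
    if t[3][0] == :arglist
      args = t[3][1..-1].to_a
    end
    # append all information to our list
    calls << [t[2].to_s, args]
    # descend to next call (the receiver; nil at the start of the chain)
    t = t[1]
  end
  calls.reverse
end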
p calls_as_list "foo.bar.size.split('.').last"
#=> [["foo", []], ["bar", []], ["size", []], ["split", [[:str, "."]]], ["last", []]]
p calls_as_list "puts 3, 4"
#=> [["puts", [[:lit, 3], [:lit, 4]]]]
And to show the parse tree of any input:
ap RubyParser.new.parse("puts 3, 4")
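If installing a gem is not an option, the same idea can be sketched with Ripper, the parser that ships with Ruby 1.9 and later. Note that this is a swapped-in technique, not the code above: `call_names` is a hypothetical helper, it collects only the method names (no arguments), and it assumes a simple left-to-right chain. Ripper's node layout is an implementation detail and may differ between Ruby versions.

```ruby
require 'ripper'

# Hypothetical helper: walk a simple call chain and return the method names.
def call_names(code)
  sexp = Ripper.sexp(code)
  raise ArgumentError, "could not parse: #{code}" unless sexp

  node = sexp[1][0]   # first statement of the program
  names = []
  while node
    case node[0]
    when :method_add_arg
      node = node[1]            # unwrap "call with arguments"
    when :call
      names << node[3][1]       # node[3] is the method name token
      node = node[1]            # descend into the receiver
    when :vcall, :fcall, :var_ref
      names << node[1][1]       # bare identifier at the start of the chain
      node = nil
    else
      node = nil                # anything else is outside this sketch
    end
  end
  names.reverse
end

p call_names("foo.bar.size.split('.').last")
```

Anything that is not a plain chain (blocks, operators, commands without parentheses) simply stops the walk, so treat it as a starting point rather than a complete solution.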
Solution 3
a = "foo.bar.size.split('.').last"
p a.split(/(?<!')\.(?!')/)
#=> ["foo", "bar", "size", "split('.')", "last"]
You are looking for Lookahead and Lookbehind assertions. http://www.regular-expressions.info/lookaround.html
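It is worth checking where this pattern stops working. The sketch below (variable names are illustrative) runs it against the original string and against one where the quoted argument contains more than a lone dot:

```ruby
pattern = /(?<!')\.(?!')/

# A dot that sits directly between two quotes is protected.
ok = "foo.bar.size.split('.').last".split(pattern)

# A dot surrounded by other characters inside the quotes is still split on.
broken = "foo.bar.size.split('o.b').last".split(pattern)

p ok      # ["foo", "bar", "size", "split('.')", "last"]
p broken  # ["foo", "bar", "size", "split('o", "b')", "last"]
```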
Solution 4
I don't have a Ruby environment here, so I tried it with Python's re.split():
In : re.split("(?<!')\.(?!')",a)
Out: ['foo', 'bar', 'size', "split('.')", 'last']
The regex above uses both a negative lookbehind and a negative lookahead, so that a dot enclosed in single quotes is not treated as a separator.
Of course, for the example you gave, either the lookbehind or the lookahead alone is sufficient. Choose whichever fits your requirements.
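Translated back to Ruby, the two single-assertion variants look like this (a sketch; the variable names are illustrative). Since the question mentions Ruby 1.8.7, the lookahead-only form may be the more portable choice: to my knowledge, lookbehind is only supported by the regex engine in Ruby 1.9 and later.

```ruby
a = "foo.bar.size.split('.').last"

# Only a negative lookahead: skip a dot that is followed by a quote.
lookahead_only = a.split(/\.(?!')/)

# Only a negative lookbehind: skip a dot that is preceded by a quote.
lookbehind_only = a.split(/(?<!')\./)

p lookahead_only   # ["foo", "bar", "size", "split('.')", "last"]
p lookbehind_only  # ["foo", "bar", "size", "split('.')", "last"]
```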
Haris Krajina
Updated on September 10, 2020
Comments
-
Haris Krajina over 3 years
This is Ruby 1.8.7, but it should be the same for 1.9.x.
I am trying to split a string, for example:
a = "foo.bar.size.split('.').last" # trying to split into ["foo", "bar", "size", "split('.')", "last"]
Basically, I want to split it into the commands it represents. I am trying to do this with a Regexp but am not sure how; my idea was to use:
a.split(/[a-z\(\)](\.)[a-z\(\)]/)
Here I am trying to use the group (\.) to split on, but this does not seem to be a good approach.
-
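For reference, this is what that pattern does on a shortened input (a sketch; `parts` is an illustrative name). The characters on either side of the dot are part of the match and are lost, and because String#split includes captured groups in its result, the dots themselves show up as elements:

```ruby
parts = "foo.bar.size".split(/[a-z\(\)](\.)[a-z\(\)]/)

p parts  # ["fo", ".", "a", ".", "ize"]
```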
sawa over 11 years
It is not as easy as you think.
-
iconoclast over 9 years
@sawa: you closed a question because you think it's too hard?
-
sawa over 9 years
@iconoclast I don't remember, but not because of the reason you think.
-
iconoclast over 9 years
@sawa I see no legitimate reason to close this question. What am I missing?
-
sawa over 9 years
@iconoclast It is not constructive to do such thing. See Matt's answer and comments under Jason Swett's answer. But the reason is not that either.
-
iconoclast over 9 years
How does that justify closing the question? How is it constructive to shutdown all attempts to solve difficult problems? The question is clearly not an opinion-based question. The main thing that is opinionated is your claim that this is not a good idea.
-
reducing activity over 7 years
@sawa - "It is not constructive to do such thing". Maybe in this particular example. But this is top result for googling "Ruby split with regexp" (see duckduckgo.com/?q=Ruby+split+with+regexp). I see no reason whatsoever to close this question.
-
-
Haris Krajina over 11 years
Wow, excellent, and thank you for the info about lookahead. No, I did not know that; it is an excellent thing to learn and seems very useful.
-
sawa over 11 years
This will split a string like "foo.bar.size.split('.bar').last" into ["foo", "bar", "size", "split('", "bar')", "last"].
-
sawa over 11 years
As you may have noticed, this will not work correctly for ["foo", "bar", "size", "split('o.b')", "last"].
-
Jason Swett over 11 years
Good point. Hats off to the person who can figure out how to make this work with any argument inside the split - it's beyond my skill level.
-
Haris Krajina over 11 years
That is true @sawa. As you assume, this is for metaprogramming purposes, so I will experience problems down the road. I am looking for a way to make it ' aware.
-
Jason Swett over 11 years
Not just ' but probably "! Might want to account for the whole split(...), depending on what your requirements are.
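One way to make the split aware of both quote styles and of the whole split(...) argument list is to drop the regex and scan character by character. The following is only a sketch under stated assumptions: `split_calls` is a hypothetical helper, it only tracks single quotes, double quotes, backslash escapes, and parenthesis depth, and it knows nothing else about Ruby syntax.

```ruby
# Hypothetical helper: split on dots that are outside quotes and parentheses.
def split_calls(source)
  parts = [String.new]
  depth = 0        # current parenthesis nesting depth
  quote = nil      # the quote character we are inside, or nil
  escaped = false  # true right after a backslash inside a quoted string

  source.each_char do |ch|
    if quote
      parts.last << ch
      if escaped
        escaped = false
      elsif ch == "\\"
        escaped = true
      elsif ch == quote
        quote = nil            # closing quote
      end
    elsif ch == "'" || ch == '"'
      quote = ch               # opening quote
      parts.last << ch
    elsif ch == "("
      depth += 1
      parts.last << ch
    elsif ch == ")"
      depth -= 1
      parts.last << ch
    elsif ch == "." && depth.zero?
      parts << String.new      # top-level dot: start a new piece
    else
      parts.last << ch
    end
  end
  parts
end

p split_calls("foo.bar.size.split('.bar').last")
# ["foo", "bar", "size", "split('.bar')", "last"]
```

This still falls short of a real parser: a float literal such as 1.5 would be split at its dot, and an expression wrapped in outer parentheses is returned as a single piece.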