Ruby on ZiGzAg About GitHub LinkedIn Twitter RSS

Grep in Ruby – A powerful Enumerable method

31 Mar 2010 – ZhuHai

If you were a Unix shell fan, I bed that the “grep” command would be in your top-used list. But as for the Ruby programming, are you still prefer “grep” to the other methods come with the “Enumerable” module?
Do you know in most of the situation, using “grep” is more handy and powerful than the commonly used “map” or “select” methods? I will try to unveil the power of “grep” method in this post.

First of all, let’s review what “grep” do for an Enumerable object:
(Quoted from the Rdoc document of Ruby)

    enum.grep(pattern)                   => array 
    enum.grep(pattern) {| obj | block }  => array 
    Returns an array of every element in enum for which Pattern === element.
    If the optional block is supplied, each matching element is passed to it,
    and the block's result is stored in the output array.  

Simple enough,right? Here I will show you some convenient use cases:

Simpler select – Filtering by pattern matching rather than giving a block to “select”.

Most of the time, we use the “select” method for filtering the elements in an Enumerable object by giving a block as criterion. Such as using ['a','1','b'].select{|x| /^\d*$/ === x} to filter out the numeric string. But by applying “grep” method with a Regexp pattern instead of a block, we get a shorter version: ['a','1','b'].grep(/^\d*$/). This shortcut not only works for the RegExp pattern matching, But also works for all objects that overwrote the “===” method. Such as “Range” and “Class”. Therefore we can easily filter out elements like:

It is very easy to customize your filtering as well, just have you categorization logic defined as case equal(“===”) in your own class, then the instances of that class can be easily filtered out from an Enumerable object using the “grep” method.

Easier transforming using grep with Regexp and $n

It is not uncommon that we need to select some elements from a String collection according to some pattern and want to strip out part of the origin. Like the below example:
Given a String Array like ['hello.rb','world.rb','public.html'], and what we expected is the Ruby files without the extension name like: ['hello','world'].
Without thinking it too much, we can achieve this goal by combining the “select” and “map” methods just like #1 line below. But after we get familiar with the “grep” method and convention of the Regexp, we can have a more compact and elegant expression like #2 line:

  1. ['hello.rb','world.rb','public.html'].select{|x|x=~/\.rb/}.map{|x|x[0..-4]}
  2. ['hello.rb','world.rb','public.html'].grep(/(.*)\.rb/){$1}

Selective bulk transforming – combination of “select” and “map”

Among the Ruby methods of Enumerable, normally we use “select” for filtering and use “map” for bulk transforming. But as the examples showed before, the “grep” method can act as a combination of those two methods. Here it is called “selective bulk transforming”. A simple example is to pickup the numeric string and transfer them into Integer instances. See the following two versions for comparison:

  1. ['a','1','b','2'].select{|x| /^\d*$/ === x}.map{|x| x.to_i}
  2. ['a','1','b','2'].grep(/^\d*$/){|x| x.to_i}

Or, even we make them shorter by the &symbol block, the version using “grep” looks more lovely:

  1. ['a','1','b','2'].select{|x| /^\d*$/ === x}.map(&:to_i)
  2. ['a','1','b','2'].grep(/^\d*$/,&:to_i)

Conclusion

Although the definition and implementation of the “grep” method from Enumerable module is simple, but by combing with other Ruby expression like case equal(“===”), Regexp, and “&symbol” block, It become a very handy and powerful method. Therefore next time when you come across some situation need to use “select + map”, keep “grep” in your mind beforehand.

Comments

Fork me on GitHub