inject vs each
Discussion

As I began to study Ruby, I found it very useful to contrast some Ruby methods with the usual way in which we develop the equivalent functionality in other languages. The reason is that, once the correspondence is established, it becomes more natural to think in the Ruby way spontaneously, at least for that method (rather than, "ok, I need to review the doc for... either collect, detect, or inject, one of those is probably what is needed").

[Incidentally: I found the same when studying human languages: realizing what a certain idiom in a new language corresponded to, exactly, in another language, was essential to begin using it spontaneously (or, if a translation still occurred in the brain, it became instantaneous!)].

So, I start by discussing the most mysterious method, inject, comparing it to the equivalent functionality in, e.g., Javascript. We start from Javascript (top) and then rewrite it in Ruby (bottom):

Suppose we want to add and multiply elements of an Array:

 // In Javascript (I write long lines of code to better compare with Ruby below):
 var arr = [2, 4, 8];

 var mult = 1; for (var i=0; i < arr.length; i++) { mult *= arr[i]; }
 var sum  = 0; for (var i=0; i < arr.length; i++) { sum  += arr[i]; }
    
 # In Ruby (note: the for loop and the array access disappear):
 arr = [2, 4, 8]

 mult = arr.inject { |mult, item|  mult * item }
 sum  = arr.inject { |sum,  item|  sum  + item }
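
To see what inject actually hands to the block on each call, a small trace helps. This is only a sketch: the names steps, acc and product are mine and do not appear in the example above.

```ruby
arr = [2, 4, 8]
steps = []   # records [accumulator, item] for every call of the block

product = arr.inject do |acc, item|
  steps << [acc, item]
  acc * item   # the block's return value becomes the next accumulator
end

steps.each { |acc, item| puts "acc=#{acc} item=#{item}" }
puts product
```

With no initial value, the block runs only twice for the three elements: the first element (2) is used as the starting accumulator, so the calls receive (2, 4) and then (8, 8), and the final result is 64.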
    

Comparing the Javascript code to Ruby, we can deduce several things:

  • inject is (in spite of its extravagant name) an 'accumulate' function
  • the partial result of the operation (that I will call the 'accumulator') is returned each time by the block to the function, which, at the next iteration, gives it back as the first parameter to the block! Oh, and the value of the accumulator given to the block the first time is the first element of the collection!

    It was useful to me at the beginning (hey, even now) to picture this exchange as a ping pong game, where the method serves each time both the accumulator (a bit like the server announcing the current score) and an array item, and the code block returns the new accumulator value.

    Ok, really cool; but, after the first moment of wonder, a disturbing question comes to mind: why exactly do we need this ping-pong? We can certainly reach the same result with our unassuming friend each; only one variable changes place:

    # using each instead of inject
    arr  = [2, 4, 8]
    
    mult = 1 
    arr.each { |item| mult *= item}
    
    puts mult    # --> 64
       
    # inject
    arr  = [2, 4, 8]
    
    
    mult = arr.inject {|mult, item| mult * item }
    
    puts mult  # --> 64
       
  • This sounds bizarre: each can do the same work as inject, just by initializing a variable!? Do we need inject then? (The slight sense of oddity may increase when we discover that we sometimes need to initialize the accumulator: see next.) However, let us continue experimenting before drawing a conclusion.
  • One can observe that something seems to be missing in inject: suppose that we have to compute an algorithm with edge cases where the iteration may not happen even once, like factorial(n) for values of n <= 1. The method handles this by accepting an argument that sets the initial value of the accumulator.

    It is useful to contrast the implementation of the method factorial(n) with each and with inject (by the way, we are aware that factorial is the moment where even sane programmers feel compelled to exhibit recursion skills; but we will bravely resist the pleasure of wasting large amounts of memory and cpu cycles for no reason whatsoever):

    # In Ruby, with each
    
    def factorial(n)
      fact = 1
      (1..n).each { |i| fact *= i }
      fact
    end
    
    factorial(6)  # -> 720
    factorial(0)  # -> 1
    
    # with inject 
    
    def factorial(n)
      (1..n).inject(1) { |fact, v| fact * v }
    end
    
    
    
    factorial(6)  # -> 720
    factorial(0)  # -> 1
    

    We must say that inject begins to shine a bit more: not only do we save the variable that declares the accumulator, but the method itself returns the value of the computation (while the method with the each loop needs to list the accumulator value at the end). We can see that inject produces very compact methods.

    However, it is now important to understand when we need to initialize the accumulator explicitly [people often use this form of inject even when it is not necessary, perhaps to be on the safe side; but then the initial value needs to be right. In the example with the sum/multiply over an array, the initial value, if provided, would need to be 0 for the sum and 1 for the multiplication; simple, but unnecessary]. In general, there are 2 cases in which it is necessary to specify the initial value:

    1. the accumulator does not have the same type of data as the elements of the array; for example, an iteration to find the size of the longest word in an array; if we do not set the value of the accumulator, it will be set to the first word (and not to its size).
    2. the iteration may not be performed even once, because of an empty range (as in the case above). This means that if the range values are parameters (i.e., not under our control), we always need to initialize!
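
Both cases can be checked with a short sketch. The word list and the variable names are hypothetical, chosen only for illustration.

```ruby
words = ["ox", "giraffe", "cat"]

# Case 1: the accumulator (an Integer) has a different type from the
# elements (Strings), so the initial value 0 is required.
longest = words.inject(0) { |max, word| word.size > max ? word.size : max }

# Case 2: an empty range. Without an initial value inject returns nil;
# with one, it returns that value and never calls the block.
without_initial = (1..0).inject { |acc, i| acc * i }
with_initial    = (1..0).inject(1) { |acc, i| acc * i }

puts longest         # 7
p without_initial    # nil
puts with_initial    # 1
```

The nil in the second case is the reason the factorial above passes 1 to inject: without it, factorial(0) would return nil rather than 1.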

Conclusion: do we really gain from using inject (rather than each, which gets the job done in what seems a more natural way)? Does avoiding the declaration of an extra variable justify a method so .. subtle? (But wait, we also save the extra line that returns the result, as the method returns the value of the computation.)

David Black has a pragmatic answer to this question (see "Ruby for Rails", p 418, the 'def balance' example): the 2 versions (with each or inject) produce exactly the same result. Which you use is up to you, although it is a good idea to make sure you understand both of them.

Well, it is exactly what we tried to do.

[Note added some months after this page was written]

Rails code uses inject for methods which perform (and return) prodigious computations in 1 line. If one is interested in understanding Rails code (i.e., not just browsing it), it is definitely useful to be fluent with this method.
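
The following is not taken from the Rails source; it is only a hypothetical sketch of the kind of one-line inject one may meet there, here building a Hash from pairs. The data and the names pairs and attributes are assumptions made for the example.

```ruby
# Hypothetical data: attribute name/value pairs.
pairs = [[:name, "Ann"], [:age, 30]]

# Build a Hash in one expression. The block must return the Hash,
# because its return value is passed back as the next accumulator.
attributes = pairs.inject({}) { |hash, (key, value)| hash.merge(key => value) }

p attributes
```

Note that the initial value {} is needed here for the first reason given earlier: the accumulator (a Hash) does not have the same type as the elements (two-element Arrays).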


