momer@soryy:/$ cd /home/soryy

momer@soryy:~$ for dir in presentations posts; do echo $dir/:; ls -lath $dir | tail -5; done

presentations/:

-rw-r--r-- 1 momer momer 22 Oct 2014 groupcache-in-depth-overview.pres
posts/:
-rw-r--r-- 1 momer momer 623 07 Dec 2014 apache-cassandra-introduct....blog
-rw-r--r-- 1 momer momer 15136 09 Aug 2014 not-another-go/golang-net/h....blog
-rw-r--r-- 1 momer momer 2409 09 Aug 2014 indepth-golang-resources-a....blog
-rw-r--r-- 1 momer momer 9847 31 Jul 2014 ajax/javascript-enabled-par....blog
-rw-r--r-- 1 momer momer 10214 05 Jul 2014 common-mistakes-made-with-g....blog
-rw-r--r-- 1 momer momer 3374 16 Jun 2014 docker-resolving-dns-issue....blog
-rw-r--r-- 1 momer momer 3538 25 Apr 2014 why-jruby.blog
-rw-r--r-- 1 momer momer 9274 16 Mar 2014 apis-with-devise.blog

Why JRuby

Preface

There are many different implementations of Ruby [1], and each has its own set of advantages and disadvantages.

Why JRuby?

JRuby has many advantages, including:

  • Performance benefits in some applications [2]
  • True system threads [3]
  • Access to the Java Toolchain [4]

Real Threads?

As explained in the article linked earlier (http://www.restlessprogrammer.com/2013/02/multi-threading-in-jruby.html), JRuby runs on the Java Virtual Machine (JVM) [5], allowing it to provide 1:1 system threads [6].

While Matz's Ruby Interpreter (MRI) provides threads, they're constrained by the MRI implementation of a Global Interpreter Lock (GIL) [7], which lets only one thread execute Ruby code at a time.

Why does this matter?

Let's consider this thought experiment:

You have many large .csv files, each about 1 GB in size. You'd like to process each of them using a simple process:

  1. Split each file into workable pieces
  2. Read each piece concurrently

So, you end up writing something like this [8]:

require 'csv'

# One thread per file; map (not each) so that `threads` holds the Thread objects
threads = Dir[File.join("path/to/my", "*.csv")].map do |file|
  Thread.new do
    CSV.open(file, 'r+') do |csv|
      # ...
    end
  end
end

# Waits for threads to finish, returns each thread in turn when done
threads.map(&:join)
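
The snippet above leaves the per-file work elided. For something you can run end to end, here's a toy sketch of the split-then-read idea. It is not the code behind the original experiment: the helper names `split_into_pieces`, `sum_column`, and `demo_total`, the `amount` column, and the piece size are all made up for illustration, and the input is a ten-row temp file rather than a 1 GB one.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper, step 1: split the data rows into pieces of `size` rows.
def split_into_pieces(path, size)
  CSV.foreach(path, headers: true).each_slice(size).to_a
end

# Hypothetical helper, step 2: process every piece in its own thread.
def sum_column(pieces, column)
  threads = pieces.map do |rows|
    Thread.new { rows.sum { |row| row[column].to_i } }
  end
  # Thread#value joins the thread and returns the result of its block.
  threads.sum(&:value)
end

# Builds a tiny stand-in for one of the big .csv files and totals a column.
def demo_total
  Tempfile.create(['demo', '.csv']) do |file|
    file.puts 'id,amount'
    (1..10).each { |i| file.puts "#{i},#{i * 10}" }
    file.flush
    sum_column(split_into_pieces(file.path, 3), 'amount')
  end
end

puts demo_total
```

Using `Thread#value` instead of `join` is just a convenience here, since each thread's result is needed afterwards.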

But the process is taking forever! So, you open up your Activity Monitor (or Task Manager) to see what's going on. Maybe the program hung!

[Screenshot: example of 100% CPU core utilization in MRI Ruby]

Nope. You see that your ruby process, of which there's only one entry in your Activity Monitor, is running at 100%. It even has more than one thread! But then, on your processor view, you notice that only one core is being maxed out.

What gives? MRI Ruby is doing its best to give the impression of a program running in multiple threads. In fact, the process really is running multiple threads; however, because the Global Interpreter Lock lets only one of them execute Ruby code at a time, you can think of the MRI implementation as running these threads within the confines of a single parent thread.
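
If you'd like to poke at this yourself, here's a minimal sketch that times the same CPU-bound work run serially and then in threads. The `fib` helper and the `JOBS` list are stand-ins I made up for "real" work. On MRI I'd expect the two timings to land in the same ballpark, but the actual numbers depend on your machine and Ruby version, so none are quoted here.

```ruby
require 'benchmark'

# Hypothetical CPU-bound busy work: a deliberately naive recursive Fibonacci.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Four equally sized jobs (an arbitrary choice for this sketch).
JOBS = [25, 25, 25, 25].freeze

# Runs the jobs one after another on the calling thread.
def run_serial(jobs)
  jobs.map { |n| fib(n) }
end

# Runs each job in its own thread and collects the results in order.
def run_threaded(jobs)
  jobs.map { |n| Thread.new { fib(n) } }.map(&:value)
end

serial_time   = Benchmark.realtime { run_serial(JOBS) }
threaded_time = Benchmark.realtime { run_threaded(JOBS) }

puts "engine:   #{RUBY_ENGINE}"
puts format('serial:   %.3fs', serial_time)
puts format('threaded: %.3fs', threaded_time)
```

Running the same file under both `ruby` and `jruby` is the interesting part: the script itself doesn't change, only the engine underneath it.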

You've probably guessed by now that JRuby is an implementation which is able to bypass this MRI limitation.

In JRuby, each thread runs as a thread on the JVM, where it's usually mapped to a native operating system thread after some fancy footwork by the JVM [6].
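
One way to make use of that is a small pool of worker threads, sized to the number of cores. The sketch below is only an illustration under my own assumptions: `parallel_map` is a hypothetical helper, not something from JRuby or the standard library, and sizing the pool with `Etc.nprocessors` is just one reasonable default. The same code runs on MRI too; it simply won't spread CPU-bound work across cores there.

```ruby
require 'etc'

# Hypothetical helper: maps `items` through the block using a fixed pool of threads.
def parallel_map(items, workers: Etc.nprocessors, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # workers get nil from pop once the queue is drained

  results = Array.new(items.size)
  lock = Mutex.new

  pool = Array.new(workers) do
    Thread.new do
      # Queue#pop returns nil when the queue is closed and empty.
      while (job = queue.pop)
        item, index = job
        value = block.call(item)
        # Guard the shared array; plain Arrays are not guaranteed thread-safe.
        lock.synchronize { results[index] = value }
      end
    end
  end

  pool.each(&:join)
  results
end

squares = parallel_map([1, 2, 3, 4]) { |n| n * n }
puts squares.inspect
```

The Mutex around the shared array is there because, once threads truly run in parallel, unguarded shared state is where things tend to go wrong.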

So in JRuby, running a similarly contrived example, we'd see something like this:

[Screenshot: example of >100% CPU core utilization in JRuby]

OK, I admit, the examples in the screenshots aren't exactly real-world use cases, but you get the point: JRuby tangibly grants you more power. Now, all you have to do is not fuck it up! Check out the list below for reading material to get you started on that last point: not screwing up threads, state, etc. in JRuby.

Mo