Quiz: which ruby program uses more CPU?

Program A:

sleep(100)

Program B:

t = Thread.new do
  sleep(100)
end

t.join

Which program uses more CPU? Hint: this is ruby 1.8.6.

The simple test

Run both programs overnight (surround the sleep(100) with a repeating while), and ‘ps aux’ to see cpu usage.

The complex test that explains what is happening

Run both programs with valgrind / kcachegrind, or pop into gdb.

The answer

Now, neither of these programs use a lot of CPU, but if you leave them running for long periods of time, one program will continue to use CPU cycles and the other will not. While they should both be blocking for 100 seconds, one is waking up every 10ms. Why?

By default ruby creates one operating system thread that runs your ruby code. If you create one or more threads, however, ruby creates a second operating system thread. This second thread wakes up every 10ms, checks to see if it trapped an interrupt signal, checks to see if the thread was canceled, and then goes back to sleep. You can verify this by popping into gdb and looking at where your threads are. This second thread, while not using very much CPU, does use some. Hence, Program B uses more CPU.

Morals

Don’t use threads if you don’t have to, but *really* don’t worry too much about it. If you’re that short on CPU cycles, you should probably be getting some more hardware.

Tagged | 1 Comment

Ruby Daemons: Verifying Good Behavior

If your ruby daemon isn’t doing something useful, it should be sleeping. A misbehaving daemon wakes up repeatedly, uses your CPU unnecessarily, and then goes back to sleep without doing anything helpful. It is easy to write a misbehaving daemon in Ruby, since you’re pretty well-insulated from what is actually happening at the operating system level. Let’s fix that.

Use ps

Watch your daemon using ps (I use ‘ps aux’). Look in the time field. If the time is increasing but your daemon is supposed to be sleeping, you have a problem. A greatly misbehaving daemon will also show up in the STAT field as running. Caught in the act! Now to figure out what it is doing.

Use strace

Use strace to watch the outgoing system calls of your daemon. Grab your daemon’s pid from the ps screen and use ’strace -p<your_pid>’. If your daemon is well-behaved, you should see nothing here if your daemon is sleeping. Otherwise, you’ll see a bunch of activity. Grab your copy of Stevens’s Advanced Programming in the UNIX Environment and debug.

Example one: insomniac daemon / incorrect select timeout using ruby inotify

inotify is a unix library for sending events when something changes in the filesystem (i.e. a file is moved, renamed, deleted, touched, etc). Ruby inotify is a useful plugin for using the inotify library in ruby. A simple inotify daemon may simply wait for something to happen to a file (say it is renamed) and then it performs some action (logs the rename operation? renames the file back to its original name?). If nothing is happening to the file, the daemon should be doing an equivalent amount of nothing.

Download ruby inotify version 0.0.2 from http://raa.ruby-lang.org/project/ruby-inotify/
Compile using rake.
Run ruby ./examples/watcher.rb ./tests/

This will start up an example script that watches the ./tests directory for any changes. Now create a new file in the ./tests directory to make sure that the script notices it. The script noticed it? Great. As you can see this script isn’t really doing anything from a functional standpoint except waiting. Or, we should say it shouldn’t be doing anything besides waiting. Let’s see.

Use ps to grab the script’s pid and strace it. You should see a scrolling list of selects, something like this repeated indefinitely:

select(5, [3], [], [], {0, 44})         = 0 (Timeout)
select(5, [3], [], [], {0, 0})          = 0 (Timeout)
select(5, [3], [], [], {0, 3999})       = 0 (Timeout)

See your copy of Stevens’s Advanced Programming in the UNIX Environment for good information on what this means, but in a nutshell select in combination with inotify is watching the directory for events. The last argument to select (the {0,44}, {0,0}, or{0,3999}) is a timeout value where the first digit is seconds and the second digit is microseconds. So {0,3999} is a timeout value of 0 seconds and 3999 microseconds, or about 4ms. Select is returning 0 (Timeout), which means select waited until it timed out without an event happening. This is being repeated over and over and over again. Effectively, we’re polling for events every 4ms instead of simply waiting for the event to happen.

We’re going to skip over the steps where we use gdb or ddd to find the code where this is occurring. Instead, open ./ext/inotify.c. Ruby inotify works by having an event loop (starting on line 129 with the while(1)) that checks for events (see event_check: line 30), processes the events (line 134+) and repeats. Note how if event_check returns 0, there are no events to process. Now look at event_check (lines 36-49). On line 47, you can see the rb_thread_select call; this will go through ruby’s select multiplexer and eventually end up in the select calls we saw earlier using strace. In fact, you can see on lines 41-42 the timeout being set and passed to rb_thread_select. So here is the 4ms poll. Time to fix it.

Take this line:

r = rb_thread_select (fd+1, &rfds, NULL, NULL, &timeout);

and change the &timeout to NULL, i.e.:

r = rb_thread_select (fd+1, &rfds, NULL, NULL, NULL);

A NULL timeout means infinity here so once this function is called, we’ll wait infinitely long for an event. This is fine. Save, go back to your command line, cntl-c out of your strace and your example script, and use rake to recompile ruby-inotify.

Rerun ruby ./examples/watcher.rb ./tests/
Use ps to grab the pid and strace it.

There should be no scrolling selects, there should only be activity on the strace when you actually create files in the ./tests/ directory and the script is actually doing work. Good script.

Example two: excited daemon / unnecessary signal redefinition

Check out this daemon example in ruby: http://blog.sosedoff.com/2009/01/24/writing-very-simple-daemon-in-ruby/. Download the example code, and you’ll also need the daemonize.rb script in the same directory as the example.

Run ‘ruby daemon.rb’ (you may need a sudo for this). Every second, the daemon will output a “Daemon working” message and then go back to sleep for another second. The code that does this looks like this:

def daemon_log(str)
  puts "[#{Time.now.strftime("%m/%d/%Y-%H:%M:%S")}] #{str}"
end

...

while $daemon[:work] do
   daemon_log("Daemon working")
   daemon_handle_signals
   sleep($daemon[:delay_sleep])
end

The daemon is running; now grab the pid using ps, and strace it. Every second you’ll see something like

rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
write(1, "[02/23/2009-14:17:47] Daemon work"..., 37) = 37
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGTERM, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGKILL, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, 8) = -1 EINVAL (Invalid argument)
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGINT, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, ~[KILL STOP RTMIN RT_1], 8) = 0
rt_sigaction(SIGTSTP, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, {0x33a9a96c40, [], SA_RESTORER, 0x38c5e0f0f0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(0, NULL, NULL, NULL, {1, 0}

)     = 0 (Timeout)

Here is what is clear to us: the line beginning with ’stat’ is getting the time, the line beginning with ‘write’ is writing the log, and near the end there is this line ’select(0, NULL, NULL, NULL, {1, 0}’ that is doing a one second delay. But what is rt_sigprocmask and rt_sigaction and why is it happening every second? See the sigprocmask and sigaction section of your Steven’s book for real details, but effectively our application is registering to receive signals every second. This would be fine if we were handling a signal and then needed to handle future signals differently depending upon the first signal; however, this daemon is pretty simple, so let’s look back at the code:

while $daemon[:work] do
   daemon_log("Daemon working")
   daemon_handle_signals
   sleep($daemon[:delay_sleep])
end

Let’s look at daemon_handle_signals (lines 83-105):

def daemon_handle_signals
  # termination signal
  Signal.trap("TERM") do
    daemon_log("TERM signal received.")
    daemon_terminate
  end  

  # kill signal
  Signal.trap("KILL") do
    daemon_log("KILL signal received.")
    daemon_terminate
  end

  # keyboard interruption
  Signal.trap("INT") do
    daemon_log("SIGINT signal received.")
    daemon_terminate
  end

  Signal.trap("TSTP") do
    daemon_log("SIGTSTP signal received.")
  end
end

Each of these Signal.trap() lines is registering a block with the rt_sigaction call we see in strace. This code really only needs to be run once at program boot, not each time though the event loop. Once Signal.trap() is called and passed a block, the block will be evaluated when the process is signaled, regardless of where our application is in its codebase. Go back to this piece of code (lines 70-74):

while $daemon[:work] do
   daemon_log("Daemon working")
   daemon_handle_signals
   sleep($daemon[:delay_sleep])
end

And move the daemon_handle_signals call above the loop, like this:

daemon_handle_signals
while $daemon[:work] do
   daemon_log("Daemon working")
   sleep($daemon[:delay_sleep])
end

Terminate the old daemons and straces if you haven’t yet, and start them back up. You should see a much smaller strace happening every second, but the signals still work. Easiest one to try is cntl-c which sends a INT signal to the proccess and causes it to quit.

A good daemon is one that does no unnecessary work.
Always strace your daemons to make sure they’re doing what you expect.
Go ahead and write daemons in ruby.

Tagged , | 1 Comment

Ruby: Overriding Class.new to make custom wrapper pattern

If you have a class Content and you call Content.new, you’ll get back an object of class Content. This is normal and expected. If you want something that is abnormal and unexpected to occur, this recipe is for you. While here we just implement a simple wrapper pattern, you could just as easily have Content.new return objects of class DefinitelyNotContent or BombsIfTheProgrammerTriesToUseThisAsContent.


class Wrapper
  attr_accessor :content

  def initialize(content)
    @content = content
  end

  def content_value
    @content.value
  end
end

class Content
  attr_accessor :value

  class << self
    alias_method :new_orig, :new

    def new(*args)
      o = new_orig *args
      Wrapper.new(o)
    end
  end

  def initialize(default)
    @value = default
  end

end

class ComplicatedContent < Content
  alias_method :simple_value, :value
  def value
    simple_value.reverse
  end
end

c = Content.new("a_to_z")
puts "Content.new("test"):"
puts "- class = #{c.class}"
puts "- value = #{c.content_value}"

c = ComplicatedContent.new("a_to_z")
puts "ComplicatedContent.new("test"):"
puts "- class = #{c.class}"
puts "- value = #{c.content_value}"
Tagged , | 1 Comment

Rails / Merb performance comparison (on mongrel, jruby, tomcat, glassfish)

[*****UPDATE 9/21/08: I've added Rails 2.1.1 and Rails 2.2 head stats for C MRI Ruby on Mongrel. Rails 2.2 tests pretty slow and the results should not be taken seriously until the Rails team comes to a point release and optimizes. I've also added tests with the MySQL C gem and Merb seems to get more of a speed boost than Rails does.

[*****UPDATE 9/21/08: I've done some additional optimization on the JRuby Mongrel tests, and JRuby Mongrel is now approaching the speed of MRI Mongrel. On a production platform, JRuby Mongrel might be faster than MRI Mongrel since Apple's Java implementation is so slow.]

[*****UPDATE 9/19/08: the Merb team did some profiling of Merb yesterday and they fixed some slowdowns that were recently introduced. I've rerun all the Merb tests using Merb head as of 9/18/08 @ 5:36 PM and added a new column with the Merb head data. Merb is now faster in all cases: see the results below.]

I just finished running benchmarks on Merb 0.9.5 and Rails 2.0.2 across a number of servers in both MRI C Ruby 1.8.6 and JRuby 1.1.4.

The test consisted of one action that retrieves one model object via ActiveRecord and renders the contents of that object via a simple template in erb. I’ve uploaded both apps.

I did the benchmarking from a different machine using apache benchmark; specifically, I called “ab -n 1000″ repeatedly until the results converged. Apache benchmark wasn’t using any concurrency (-c command line option) so the benchmark should have been testing how fast the app runs in one core. The benchmark was run on my MacBookPro 2.4Ghz Intel Core 2 Duo w/ 4 GB of ram.

*****DISCLAIMER****: the absolute requests per second (rps) figures reported here will be significantly slower than what you would get on a production machine. The figures here are only useful when comparing speed differences between frameworks and environments.


C MRI Ruby 1.8.6 + Mongrel 1.1.4

Environment Rails head (rps) Rails 2.1.1 (rps) Rails 2.0.2 (rps) Merb 0.9.5 (rps) Merb head (rps) Notes
-e production - - 250 221 361
-e production (session :off) - - 280 n/a n/a Merb only establishes sessions when needed.
-e production (with mysql c library) 211* 263 278 - 440
-e production (with mysql c library and session :off) 211* 296 321 n/a n/a Merb only establishes sessions when needed.

* The results for Rails head 2.2 are preliminary and should be ignored until the Rails team releases an optimized point release.

Results: Rails is 13% faster than Merb even when it is using sessions.

Merb head is running ~58% faster than Rails 2.0.2.


Jruby 1.1.4 + Mongrel 1.1.5

Environment Rails (rps) Merb (rps) Merb head (rps) Notes
-e production 176 143 267
-e production (session :off) 202 n/a n/a
-e production (jruby -J-server) 171 156 267 To get a performance increase when using -J-server, it is important to optimize memory settings (see next line).
-e production (jruby -J-server, -Xmn256m -Xms1024m -Xmx1024m) 205 - 296 Memory settings used (256 MB for garbage collection, 1024MB for heap) seems to be optimal on my system.
-e production (jruby thread pooling) n/a 155 279
-e production (jruby -J-server and thread pool of 25) n/a 162 298
-e production (jruby -J-server, thread pool of 25, -Xmn256m -Xms1024m -Xmx1024m) n/a - 334 Memory settings used (256 MB for garbage collection, 1024MB for heap) seems to be optimal on my system.

Results: Rails is about 9% faster than Merb running with java server mode enabled and with a thread pool.

Merb head is running 44% + faster than Rails to ~ 62% faster with all JRuby optimizations. The fastest rps here is Merb head (server mode, thread pool, memory settings) at 334 rps, which is approaching the rps of Merb in MRI Ruby of 361 rps (using ruby mysql). The Ruby MySQL adapter that ships with Rails 2.0.2 is very slow (see benchmarks above in C MRI Ruby + Mongrel) and the JRuby / Mongrel environment should be faster with a native Java database adapter (maybe the JDBC adapter?). Note also that the MacOSX implementation of java is notoriously slow (I’ve heard estimates of 30% slower than a linux java) so if the same tests were run on a different JVM, the JRuby Mongrel speed might be faster than the C MRI Ruby speed.


Jruby 1.1.4 + Tomcat 6.0.14 / Glassfish 9.1u2

Environment Rails (rps) Merb (rps) Merb head (rps) Notes
Glassfish production 130 110 185 Couldn’t get JRuby thread pool working.
Tomcat production 144 75 197 Couldn’t get JRuby thread pool working.
There was some Tomcat/Merb specific problem here that was fixed in Merb head.

Results: Rails is about 18% faster than Merb running on Glassfish. Rails is about 2x faster on Tomcat, though I expect that there is some sort of Tomcat-JRuby-Merb-specific bug here.

Merb head is running ~36%+ faster than Rails in Tomcat and Glassfish. The previous results where Tomcat was running 2X slower are now gone, as the Merb fixes eliminated that discrepancy.


Conclusion

In these tests, Rails is running around 9-18% faster than Merb in general (with the exception of Tomcat, which should be looked into). This implies that Merb has a little more overhead than Rails does, which is a surprise. Merb should have lower overhead, so why is it showing up slower in these tests?

With the updates by the Merb team, Merb is now running 36-70% faster than Rails, depending on the environment.

Tagged , , , | Comments closed

Merb, not rails: a practical introduction to a new framework.

As promised, here are the slides from my talk last night on Merb at the East Bay Ruby Meetup.

Merb, not rails? A practical introduction to a new framework.

***Warning: the performance results in some of the slides are invalid; refer instead to the updated merb / rails performance comparison for accurate data.

Tagged , | Comments closed

Quick Tutorial: Bundling a merb application to a .war

  • Make sure you have the JRuby warbler gem installed.
  • Create a warble config file from the root directory of your merb application. The command should look something like this: jruby -S warble config. The file will be placed at config/warble.rb.
  • Edit line 55 of warble.rb to read: config.webxml.booter = :merb
  • Edit line 33 of warble.rb to read: config.gems = ["merb-core","merb-more"]. Add other required gems, if necessary.
  • At the root directory of your web application, run: jruby -S warble.
  • Deploy the war.
Tagged , | Comments closed

Using Logging and Test::Unit together

If you use Tim Pease’s ruby Logging library in conjunction with Test::Unit, you must make sure to require ‘logging’ before you require ‘test/unit’. Both Logging and Test::Unit make use of Ruby’s at_exit method hook: Logging uses it to close any appenders (e.g. if you are writing a log to a file, it closes the file), and Test::Unit uses it to actually run the unit tests. Ruby executes any at_exit methods in reverse order, so if you require ‘logging’ after you require ‘test/unit’ then your logs are closed before the unit tests are run. If you try to log anything in the code you are testing, you’ll get an error like this:

  1) Error:
test_at_exit_conflict(AtExitConflictTest):
RuntimeError: appender '<Logging::Appenders::Stdout: stdout>' is closed
method append in appender.rb at line 107
method log_event in logger.rb at line 366
method each in logger.rb at line 366
method log_event in logger.rb at line 366
method debug in (eval) at line 6
method test_at_exit_conflict in logging-testunit-atexit.rb at line 11

This code will throw the above exception:

require "test/unit"
require "logging"

$log = Logging::Logger["testlogger"]
$log.add_appenders(Logging::Appender.stdout)

class AtExitConflictTest < Test::Unit::TestCase
  def test_at_exit_conflict
    $log.debug("test")
    assert(true)
  end
end

Reverse the order of the require ‘test/unit’ and require ‘logging’ to solve the problem.

Comments closed

OpenCms 7 XStandard WYSIWYG Module v0.6

This is an initial release of an XStandard WYSIWYG Module for OpenCms 7+. From the XStandard website:

XStandard is the leading standards-compliant plug-in WYSIWYG editor for desktop applications and browser-based content management systems (IE/Firefox/Safari/Opera). The editor generates clean XHTML Strict or 1.1, and uses CSS for formatting, to ensure the clean separation of content from presentation. The editor is keyboard accessible, and markup generated by XStandard meets the most demanding accessibility requirements.

You should use the XStandard editor instead of the FCKEditor that comes with OpenCms if you want your application to generate well-formed, accessible, and standards-compliant XHTML. In some cases, such as for any public website of a federal agency of the USA, this is legally required.

Read More »

Comments closed