
Can someone with more webserver knowledge please explain how puma works from a high level? I know very little about web servers so the following may not even make sense!

I have looked at the source, and it appears a thread pool will listen to incoming requests, and pass them to a reactor then move on to handle more requests. Another thread polls the sockets and writes the data to the response stream when ready. [Note: this all may be completely wrong!]

If that description is correct, does it mean it's possible to achieve event-loop levels of concurrent connections along with good old CPU concurrency as well?
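For illustration, here is a toy sketch of the pattern described above — a hypothetical ToyServer, not Puma's actual implementation. A pool of worker threads handles requests, and a single reactor thread watches idle keep-alive sockets with IO.select, handing them back to the pool when data arrives:

```ruby
require "socket"

class ToyServer
  def initialize(port, pool_size: 4)
    @listener = TCPServer.new("127.0.0.1", port)
    @work = Queue.new   # sockets with a request ready to be processed
    @idle = Queue.new   # keep-alive sockets the reactor should watch
    @pool = pool_size.times.map { Thread.new { worker_loop } }
    @reactor = Thread.new { reactor_loop }
  end

  def port
    @listener.addr[1]
  end

  def run
    # New connections go straight to the worker pool.
    loop { @work << @listener.accept }
  end

  private

  def worker_loop
    while (sock = @work.pop)
      if sock.gets                      # simplified request "parse"
        sock.puts "HTTP/1.1 200 OK"     # toy response: status line only
        @idle << sock                   # keep-alive: hand back to the reactor
      else
        sock.close                      # client went away (EOF)
      end
    end
  end

  def reactor_loop
    watched = []
    loop do
      watched << @idle.pop until @idle.empty?  # drain newly idle sockets
      watched << @idle.pop if watched.empty?   # block until there is work
      ready, = IO.select(watched, nil, nil, 0.1)
      # Sockets with data pending go back to the pool; the rest stay watched.
      (ready || []).each { |s| watched.delete(s); @work << s }
    end
  end
end
```

The key property is that a slow keep-alive client costs only an entry in the reactor's watch list, not a blocked worker thread — which is what allows event-loop-scale connection counts on top of a small thread pool.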



I'm one of the authors behind Phusion Passenger (https://www.phusionpassenger.com/), a polyglot web server for Ruby, Python and Node.js.

What you said is basically correct. And yes, it is possible to combine event loops with threads in the way you described to get CPU concurrency as well. Whether it is actually helpful depends on the use case. The Phusion Passenger core has been evented (similar to Nginx and Node.js) since version 4.0. We considered a multithreaded-evented architecture as well, but it turned out to be less beneficial than we hoped: the applications themselves already use plenty of CPU, and we rely on shared in-memory state for proper load balancing between processes, which introduces a source of contention. In the end, the single-threaded evented architecture in Phusion Passenger turned out to be more than enough.


How does Passenger deal with blocking I/O from applications if it is single threaded? Does the entire event loop block while doing I/O?


There are two components at work. One is the Phusion Passenger core, which is evented and uses only non-blocking I/O. At no point does the event loop block.

The other component is the Ruby application process. I believe you are talking about this component.

Concurrency on the Ruby application process side is handled using two strategies:

1. Multiprocessing. 1 process per concurrent connection, with a buffering layer. This is architecturally the same as how Unicorn works when behind an Nginx reverse proxy.

2. Multithreading. 1 thread per concurrent connection, possibly with multiple processes. This is architecturally similar to Puma, though not entirely the same. It should be noted that multithreading support is available in the Enterprise version only.
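A toy illustration of the two strategies (this is my own sketch, not Passenger's code — the function names are made up):

```ruby
require "socket"

# Strategy 1: one process per concurrent connection.
def serve_forking(listener)
  loop do
    sock = listener.accept
    pid = fork do           # child handles this client; parent keeps accepting
      sock.puts "handled by pid #{Process.pid}"
      sock.close
    end
    Process.detach(pid)     # reap the child so it doesn't become a zombie
    sock.close              # parent's copy of the descriptor
  end
end

# Strategy 2: one thread per concurrent connection (possibly across processes).
# MRI's GVL is released during blocking I/O, so threads overlap on I/O waits.
def serve_threading(listener)
  loop do
    sock = listener.accept
    Thread.new(sock) do |s|
      s.puts "handled by thread #{Thread.current.object_id}"
      s.close
    end
  end
end
```

Forking buys crash isolation at the cost of memory per connection; threading is lighter but requires the app to be thread-safe, which is why it pairs naturally with the buffering/queueing layer in front.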


Passenger spawns copies of your application and (I think this is the default now; it wasn't before) puts incoming requests in a queue, so that as app instances finish processing requests they snatch a new one off the queue. If they are all busy, requests back up on the queue until an instance becomes available.
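The queueing behaviour described above can be sketched in a few lines (a minimal model, not Passenger's code — "instances" here are just threads standing in for app processes):

```ruby
requests = Queue.new
results  = Queue.new

# Two app "instances"; each snatches the next request as soon as it is free.
instances = 2.times.map do |i|
  Thread.new do
    while (req = requests.pop)            # blocks until a request is queued
      sleep 0.01                          # simulate the app doing work
      results << "instance #{i} handled #{req}"
    end
  end
end

5.times { |n| requests << "req-#{n}" }    # more requests than instances: they back up
handled = 5.times.map { results.pop }     # wait for all of them to be processed
2.times { requests << nil }               # nil = shutdown signal
instances.each(&:join)
```

Because the queue is shared, a fast instance naturally takes on more requests than a slow one — the load balancing falls out of the pull model for free.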


Thanks for the update. I'm likely going to try and deconstruct and rebuild some of it to really get my head around evented architectures -- will try and blog it. Years of threads had me in bliss until this non-blocking renaissance came along :)


I'm looking at the 'Puma' section in Jesse Storimer's book "Working With TCP Sockets" and that seems to be the general idea behind it: a thread pool for concurrency, with an evented reactor monitoring persistent connections.

I'm not affiliated with the author, but this was such a nice book to get an intro to webservers and their architectures/tradeoffs from.

Really fueled my love of sockets :)


Thanks for reminding me about Jesse's book(s), I knew there was something I was meant to do!




