
Can someone with more webserver knowledge please explain how puma works from a high level? I know very little about web servers so the following may not even make sense!

I have looked at the source, and it appears a thread pool will listen to incoming requests, and pass them to a reactor then move on to handle more requests. Another thread polls the sockets and writes the data to the response stream when ready. [Note: this all may be completely wrong!]

If that description is correct, does it mean it's possible to achieve event-loop levels of concurrent connections along with good old CPU concurrency as well?
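For illustration, here is a toy sketch of the pattern described above — a hypothetical ToyServer, not Puma's actual implementation. A pool of worker threads handles requests, and a single reactor thread watches idle keep-alive sockets with IO.select, handing them back to the pool when data arrives:

```ruby
require "socket"

class ToyServer
  def initialize(port, pool_size: 4)
    @listener = TCPServer.new("127.0.0.1", port)
    @work = Queue.new   # sockets with a request ready to be processed
    @idle = Queue.new   # keep-alive sockets the reactor should watch
    @pool = pool_size.times.map { Thread.new { worker_loop } }
    @reactor = Thread.new { reactor_loop }
  end

  def port
    @listener.addr[1]
  end

  def run
    # New connections go straight to the worker pool.
    loop { @work << @listener.accept }
  end

  private

  def worker_loop
    while (sock = @work.pop)
      if sock.gets                      # simplified request "parse"
        sock.puts "HTTP/1.1 200 OK"     # toy response: status line only
        @idle << sock                   # keep-alive: hand back to the reactor
      else
        sock.close                      # client went away (EOF)
      end
    end
  end

  def reactor_loop
    watched = []
    loop do
      watched << @idle.pop until @idle.empty?  # drain newly idle sockets
      watched << @idle.pop if watched.empty?   # block until there is work
      ready, = IO.select(watched, nil, nil, 0.1)
      # Sockets with data pending go back to the pool; the rest stay watched.
      (ready || []).each { |s| watched.delete(s); @work << s }
    end
  end
end
```

The key property is that a slow keep-alive client costs only an entry in the reactor's watch list, not a blocked worker thread — which is what allows event-loop-scale connection counts on top of a small thread pool.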



I'm one of the authors behind Phusion Passenger (https://www.phusionpassenger.com/), a polyglot web server for Ruby, Python and Node.js.

What you said is basically correct. And yes, it is possible to combine event loops with threads in the way you described to get CPU concurrency as well. Whether it is actually helpful depends on the use case. The Phusion Passenger core has been evented (similar to Nginx and Node.js) since version 4.0. We considered a multithreaded-evented architecture as well, but it turned out to be less beneficial than we hoped: the applications themselves already use plenty of CPU, and we rely on shared in-memory state for proper load balancing between processes, which introduces a source of contention. In the end, the single-threaded evented architecture in Phusion Passenger turned out to be more than enough.


How does Passenger deal with blocking I/O from applications if it is single threaded? Does the entire event loop block while doing I/O?


There are two components at work. One is the Phusion Passenger core, which is evented and uses only non-blocking I/O. At no point does the event loop block.

The other component is the Ruby application process. I believe you are talking about this component.

Concurrency on the Ruby application process side is handled using two strategies:

1. Multiprocessing. 1 process per concurrent connection, with a buffering layer. This is architecturally the same as how Unicorn works when behind an Nginx reverse proxy.

2. Multithreading. 1 thread per concurrent connection, possibly with multiple processes. This is architecturally similar to Puma, though not entirely the same. It should be noted that multithreading support is available in the Enterprise version only.
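A toy illustration of the two strategies (this is my own sketch, not Passenger's code — the function names are made up):

```ruby
require "socket"

# Strategy 1: one process per concurrent connection.
def serve_forking(listener)
  loop do
    sock = listener.accept
    pid = fork do           # child handles this client; parent keeps accepting
      sock.puts "handled by pid #{Process.pid}"
      sock.close
    end
    Process.detach(pid)     # reap the child so it doesn't become a zombie
    sock.close              # parent's copy of the descriptor
  end
end

# Strategy 2: one thread per concurrent connection (possibly across processes).
# MRI's GVL is released during blocking I/O, so threads overlap on I/O waits.
def serve_threading(listener)
  loop do
    sock = listener.accept
    Thread.new(sock) do |s|
      s.puts "handled by thread #{Thread.current.object_id}"
      s.close
    end
  end
end
```

Forking buys crash isolation at the cost of memory per connection; threading is lighter but requires the app to be thread-safe, which is why it pairs naturally with the buffering/queueing layer in front.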


Passenger spawns copies of your application and (I think this is the default now; it wasn't before) puts incoming requests in a queue, so that as app instances finish processing requests they snatch a new one off the queue. If they are all busy, requests back up on the queue until an instance becomes available.
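The queueing behaviour described above can be sketched in a few lines (a minimal model, not Passenger's code — "instances" here are just threads standing in for app processes):

```ruby
requests = Queue.new
results  = Queue.new

# Two app "instances"; each snatches the next request as soon as it is free.
instances = 2.times.map do |i|
  Thread.new do
    while (req = requests.pop)            # blocks until a request is queued
      sleep 0.01                          # simulate the app doing work
      results << "instance #{i} handled #{req}"
    end
  end
end

5.times { |n| requests << "req-#{n}" }    # more requests than instances: they back up
handled = 5.times.map { results.pop }     # wait for all of them to be processed
2.times { requests << nil }               # nil = shutdown signal
instances.each(&:join)
```

Because the queue is shared, a fast instance naturally takes on more requests than a slow one — the load balancing falls out of the pull model for free.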


Thanks for the update. I'm likely going to try and deconstruct and rebuild some of it to really get my head around evented architectures -- will try and blog it. Years of threads had me in bliss until this non-blocking renaissance came along :)


I'm looking at the 'Puma' section in Jesse Storimer's book "Working With TCP Sockets" and that seems to be the general idea behind it: a thread pool for concurrency, with an evented reactor monitoring persistent connections.

I'm not affiliated with the author, but this was such a nice book to get an intro to webservers and their architectures/tradeoffs from.

Really fueled my love of sockets :)


Thanks for reminding me about Jesse's book(s), I knew there was something I was meant to do!




