A deep dive into Rack for Ruby
posted by Ayush Newatia
30 October, 2024
Rack is the foundation for every popular Ruby web framework in existence. It standardises an interface between a Ruby application and a web server.
This mechanism allows us to pair any Rack-compliant web server (Puma, Unicorn, Falcon, etc.) with any Rack-compliant web framework (Rails, Sinatra, Roda, Hanami, etc.).
Separating the concerns like this is immensely powerful and provides a lot of flexibility in choice.
It does, however, come with limitations. Rack 2 operated on the assumption that every request must produce a response and close the connection. It provided no facility for persistent connections, which would enable features like WebSockets.
Developers had to rely on a hacky escape hatch to take over connections from Rack and implement WebSockets or similar persistent connections. This all changes with Rack 3.
The basics
Before we look into how that works, let’s backtrack and take a closer look at Rack itself.
A barebones Rack app
A basic Rack app looks like:
class App
  def call(env)
    [200, { "content-type" => "text/plain" }, ["Hello World"]]
  end
end

run App.new
The env is a hash containing request-specific information such as HTTP headers. When a request is made, the call method is called and we return an array representing the response.
The first element is the HTTP response code, in this case 200. The second element is a Hash containing any Rack and HTTP response headers we wish to send. And the last element is an array of strings representing the response body.
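To get a feel for what env contains, here's a quick sketch (my own addition, not from the original demo) that echoes a few of the CGI-style keys defined by the Rack spec back to the client:

class App
  def call(env)
    # A few of the keys Rack provides; HTTP request headers arrive as HTTP_* keys
    info = [
      "Method: #{env["REQUEST_METHOD"]}",
      "Path: #{env["PATH_INFO"]}",
      "Query: #{env["QUERY_STRING"]}",
      "User-Agent: #{env["HTTP_USER_AGENT"]}"
    ].join("\n")

    [200, { "content-type" => "text/plain" }, [info]]
  end
end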
Let’s organise this app into a folder and run it.
$ mkdir rack-demo
$ cd rack-demo
$ bundle init
$ bundle add rack rackup
$ touch app.rb
$ touch config.ru
Fill in app.rb with the following:
class App
  def call(env)
    [200, { "content-type" => "text/plain" }, ["Hello World"]]
  end
end
And config.ru with:
require_relative "app"
run App.new
We can run this app with the default WEBrick server:
$ bundle exec rackup
The server will run on port 9292. We can verify this with a curl command.
$ curl localhost:9292
Hello World
That’s got the basic app running! WEBrick is a development-only server so let’s swap it out for Puma.
Changing web servers
$ bundle add puma
Now try running rackup again and you'll see it has automatically detected Puma in the bundle and started that instead of WEBrick!
$ bundle exec rackup
Puma starting in single mode...
* Puma version: 6.4.2 (ruby 3.2.2-p53) ("The Eagle of Durango")
* Min threads: 0
* Max threads: 5
* Environment: development
* PID: 45877
* Listening on http://127.0.0.1:9292
* Listening on http://[::1]:9292
Use Ctrl-C to stop
I recommend starting Puma directly instead of using rackup, as that allows us to pass configuration arguments should we want to.
$ bundle exec puma -w 4
[45968] Puma starting in cluster mode...
[45968] * Puma version: 6.4.2 (ruby 3.2.2-p53) ("The Eagle of Durango")
[45968] * Min threads: 0
[45968] * Max threads: 5
[45968] * Environment: development
[45968] * Master PID: 45968
[45968] * Workers: 4
[45968] * Restarts: (✔) hot (✔) phased
[45968] * Listening on http://0.0.0.0:9292
[45968] Use Ctrl-C to stop
[45968] - Worker 0 (PID: 45981) booted in 0.0s, phase: 0
[45968] - Worker 1 (PID: 45982) booted in 0.0s, phase: 0
[45968] - Worker 2 (PID: 45983) booted in 0.0s, phase: 0
[45968] - Worker 3 (PID: 45984) booted in 0.0s, phase: 0
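If command-line flags get unwieldy, Puma also reads a config/puma.rb file automatically. A rough equivalent of the flags above might look like this (my addition, not part of the original demo):

# config/puma.rb
workers 4                  # forked worker processes (cluster mode)
threads 0, 5               # min and max threads per worker
port 9292                  # the same port rackup uses by default
environment "development"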
This basic app demonstrates the Rack interface. An incoming HTTP request is parsed into the env Hash and provided to the application. The application processes the request and supplies an Array as the response, which the server formats and sends to the client.
Rack compliance in frameworks
Every compliant web framework follows the Rack spec under the hood and provides an access point to go down to this level.
In Rails, we can send a Rack response in a controller as:
class HomeController
  def index
    self.response = [200, {}, ["I'm Home!"]]
  end
end
Similarly in Roda:
route do |r|
  r.on "home" do
    r.halt [200, {}, ["I'm Home!"]]
  end
end
Every Rack compliant framework will have a slightly different syntax for accomplishing this, but since they’re all sending Rack responses under the hood, they will have an API for the developer to access that response.
You can find the full Rack specification on GitHub. It’s relatively accessible for a technical specification.
As this demo shows, Rack operates under the assumption that a request comes in, is processed by a web application, and a response is sent back. Throwing persistent connections into the mix totally breaks this model, yet Rack-compliant frameworks like Rails implement WebSockets.
Socket hijacking
So far, we've set up the most basic Rack app possible and learned how to process a request and send a response.
Next, we'll learn how to take over connections from Rack so we can hold them open and enable protocols such as WebSockets.
First, let’s look at how an HTTP connection actually works.
When a plain HTTP request is made, a TCP socket is opened and the request is sent to the server. The server responds and closes the connection. All communication is in plain text.
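To see that plain-text exchange for yourself, here's a small sketch (my addition) that sends a raw HTTP request to the app from earlier over a TCP socket:

require "socket"

socket = TCPSocket.new("localhost", 9292)
socket.write("GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
puts socket.read   # prints the status line, the response headers and "Hello World"
socket.close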
Using a technique called socket hijacking, we can take control of a socket from Rack when a request comes in. Rack offers two techniques for socket hijacking:
- Partial hijack: Rack sends the HTTP response headers and hands over the connection to the application.
- Full hijack: Rack simply hands over the connection to the application without writing anything to the socket.
Partial hijacking
This is how you do a partial hijack:
class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain", "rack.hijack" => body }, []]
  end
end
Run the above app and curl it; you'll see it write the time at 1-second intervals.
$ curl -i localhost:9292
Full hijacking
This is how you’d do a full hijack:
class App
  def call(env)
    headers = [
      "HTTP/1.1 200 OK",
      "Content-Type: text/plain"
    ]

    stream = env["rack.hijack"].call
    stream.write(headers.map { |header| header + "\r\n" }.join)
    stream.write("\r\n")
    stream.flush

    begin
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [-1, {}, []]
  end
end
This is a bad practice. Don't do this. I'll say it again: DO NOT do this unless you really, really know what you're doing. This approach is rife with gotchas and weird behaviour.
Streaming bodies
While full hijacking is a terrible idea, partial hijacking is a useful tool. But it still feels hacky, so Rack 3 formally adopted that approach into the spec by introducing the concept of streaming bodies.
class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
Here we provide a block as the response body rather than an Array. Rack will keep the connection open until the block finishes executing.
There’s a huge gotcha here when using Puma. Puma is a multi-threaded server and assigns a thread for each incoming request. We’re taking over the socket from Rack, but we’re still tying up a Puma thread as long as the connection is open.
Puma concurrency can be configured but threads are limited and tying one up for long periods is not a good idea. Let’s see this in action first.
$ bundle exec puma -w 1 -t 1:1
In two separate terminal windows, run the following command at the same time:
$ curl localhost:9292
You’ll see that one request is immediately served but the other is held until the first one completes. This is because we started Puma with a single worker and single thread meaning it can only serve a single request at a time.
We can get round this by creating our own thread.
class App
  def call(env)
    body = proc do |stream|
      Thread.new do
        5.times do
          stream.write "#{Time.now}\n\n"
          sleep 1
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
Now if you try the above experiment again, you’ll see both curl requests are served concurrently because they’re not tying up a Puma thread.
Once again I must warn against this approach unless you know what you’re doing. These demonstrations are largely academic as systems programming is a deep and complex topic.
Falcon web server
Since the threading problem is specific to the Puma web server, let's look at another option: Falcon. This is a new, highly concurrent, Rack-compliant web server built on the async gem. It uses Ruby Fibers instead of Threads, which are cheaper to create and have much lower overhead.
The async gem hooks into all Ruby I/O and other waiting operations such as sleep, and uses these points to switch between Fibers, ensuring a program is never held up doing nothing.
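As a tiny illustration of that (my addition; it assumes the async gem has been added to the bundle), the following script runs two sleeping tasks concurrently and finishes in roughly one second rather than two:

require "async"

Async do |task|
  2.times do |i|
    task.async do
      sleep 1                      # yields to the event loop instead of blocking
      puts "Task #{i} woke up at #{Time.now}"
    end
  end
end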
Revert your app to the previous version where we’re not spawning a new thread:
class App
  def call(env)
    body = proc do |stream|
      5.times do
        stream.write "#{Time.now}\n\n"
        sleep 1
      end
    ensure
      stream.close
    end

    [200, { "content-type" => "text/plain" }, body]
  end
end
Then remove Puma and install Falcon.
$ bundle remove puma
$ bundle add falcon
Run the Falcon server. We need to explicitly bind it because it only serves https traffic by default.
$ bundle exec falcon serve -n 1 -b http://localhost:9292
The server is only using a single thread, which you can confirm with the below command. You'll need to grab your specific pid from Falcon's logs.
$ top -pid <pid> -stats pid,th
The thread count printed by the above command will be 2 because MRI uses an additional thread internally.
Try the earlier experiment again by running two curl requests simultaneously.
$ curl localhost:9292
You’ll see they’re both served at the same time thanks to Ruby Fibers!
Falcon is relatively new, and the Fiber scheduler it relies on was only introduced in Ruby 3.0. Since Falcon is Rack-compliant, it can be used with Rails too, but the docs recommend using it with v7.1 or newer only. As such, it's a bit risky to use Falcon in production, but it's a very exciting development in the Ruby world in my opinion. I can't wait to see its progress in the next few years.
We’ve now learned how to create persistent connections in Rack and how to run them in a way that doesn’t block other requests, but the use cases so far have been academic and contrived. Let’s look at how we can use this technique in a practical way.
SSEs and WebSockets
The web has two formalised specifications for communication over a persistent connection: Server Sent Events (SSEs) and WebSockets.
WebSockets are very widely used and quite popular, but SSEs are not nearly as well known so let’s dig into those first.
Server Sent Events
SSEs enable a client to hold an open connection to the server, but only the server is able to publish messages over it. It isn't a bi-directional protocol.
SSEs are consumed in the browser via a JavaScript API, so let's modify our app to serve an HTML page with the required script:
class App
  def call(env)
    req = Rack::Request.new(env)
    path = req.path_info

    case path
    when "/"
      sse_js(env)
    else
      # fall back for any other path (e.g. the browser's favicon request)
      [404, { "content-type" => "text/plain" }, ["Not Found"]]
    end
  end

  private

  def sse_js(env)
    body = <<~HTML
      <!DOCTYPE html>
      <html>
        <head>
          <meta charset="utf-8">
          <meta name="viewport" content="width=device-width, initial-scale=1">
          <title>SSE - Demo</title>
          <script type="text/javascript">
            const eventSource = new EventSource("/sse")

            eventSource.addEventListener("message", event => {
              document.body.insertAdjacentHTML(
                "beforeend",
                `<p>${event.data}</p>`
              )
            })
          </script>
        </head>
        <body>
        </body>
      </html>
    HTML

    [200, { "content-type" => "text/html" }, [body]]
  end
end
The API is encapsulated in the EventSource class, and new messages from the server trigger events which we're listening for. Next, we need to build the endpoint to send the events:
class App
  def call(env)
    req = Rack::Request.new(env)
    path = req.path_info

    case path
    when "/"
      sse_js(env)
    when "/sse"
      sse(env)
    else
      [404, { "content-type" => "text/plain" }, ["Not Found"]]
    end
  end

  private

  def sse_js(env)
    # ...
  end

  def sse(env)
    body = proc do |stream|
      Thread.new do
        5.times do
          stream.write "data: #{Time.now}!\n\n"
          sleep 1
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/event-stream" }, body]
  end
end
From the server's point of view, it's fairly similar to the earlier streaming bodies example. It's worth noting the content-type header and the format of the string written back to the client.
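For reference, each SSE message is made up of one or more field: value lines terminated by a blank line, which is why we write a trailing \n\n. A small helper sketch (my own addition, not part of the demo) that also handles the optional event and id fields could look like:

def sse_message(data, event: nil, id: nil)
  lines = []
  lines << "event: #{event}" if event
  lines << "id: #{id}" if id
  data.to_s.each_line { |line| lines << "data: #{line.chomp}" }
  lines.join("\n") + "\n\n"   # the blank line marks the end of the message
end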
Run the server (make sure you switch back to Puma):
$ bundle exec puma
Open up a web browser to localhost:9292 and you'll see the time written to the document 5 times at 1-second intervals.
This technique is great when the server just needs to notify the client about updates. The above example was fairly contrived as it uses a loop so let’s look at how we can use this technique in a real application.
Ruby Queues
Ruby provides a Queue data structure for communication between threads. We can use that to publish data back to a client. Let's stick with the same use case of publishing the current time 5 times at 1-second intervals, but this time we'll publish it from a background thread.
class App
  def call(env)
    # ...
  end

  private

  def sse_js(env)
    # ...
  end

  def sse(env)
    queue = Queue.new
    trigger_background_loop(queue)

    body = proc do |stream|
      Thread.new do
        loop do
          data = queue.pop
          stream.write "data: #{data}!\n\n"
        end
      ensure
        stream.close
      end
    end

    [200, { "content-type" => "text/event-stream" }, body]
  end

  def trigger_background_loop(queue)
    Thread.new do
      5.times do
        queue.push(Time.now)
        sleep 1
      end
    end
  end
end
In the above example, we spawn another background thread to push the current time to the queue every second. In the SSE thread, we call queue.pop, which blocks until something is added to the queue.
Using this technique, we can make use of a pub/sub system such as Redis to add data to the queue from a background thread, which is then published to the client.
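As a sketch of that idea (my addition; it assumes the redis gem and a hypothetical time_updates channel), the background loop could be swapped for a Redis subscription that feeds the same queue:

require "redis"

def trigger_redis_subscription(queue)
  Thread.new do
    # subscribe blocks this thread; every published message is pushed onto the
    # queue, where the SSE streaming thread picks it up via queue.pop
    Redis.new.subscribe("time_updates") do |on|
      on.message do |_channel, message|
        queue.push(message)
      end
    end
  end
end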
Next, let’s look at WebSockets.
WebSockets
WebSockets are a binary, bi-directional protocol for client-server communication. They're widely used in the modern web and underpin Rails' Action Cable framework.
A WebSocket is created using an HTTP connection, but as a protocol, it’s completely independent of HTTP.
To create a WebSocket connection, the client must make an HTTP request with the headers:
Connection: Upgrade
Upgrade: websocket
The server will respond with the status 101, meaning Switching Protocols. The same TCP connection used for the HTTP request is now upgraded to a WebSocket connection.
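For illustration, a complete handshake looks roughly like this (the key/accept pair below is the example from the WebSocket RFC):

GET / HTTP/1.1
Host: localhost:9292
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=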
We won’t get into the nitty-gritty of the WebSocket protocol in this post. It’s fairly fiddly since its a binary protocol. Starr Horne has written an amazing article on the topic if you’re curious.
Let’s look at how we can upgrade a TCP socket to a WebSocket connection.
Upgrading from HTTP to WebSockets
As described above, we'll need to send a 101 response, after which we need to write to the socket using WebSockets' binary protocol for the communication to work.
require "digest/sha1"

class App
  def call(env)
    req = Rack::Request.new(env)
    key = req.get_header("HTTP_SEC_WEBSOCKET_KEY")
    response_key = Digest::SHA1.base64digest([key, "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"].join)

    body = proc do |stream|
      response = "Hello world!"
      output = [0b10000001, response.size, response]
      stream.write output.pack("CCA#{response.size}")
    ensure
      stream.close
    end

    [101, { "Upgrade" => "websocket", "Connection" => "upgrade", "Sec-WebSocket-Accept" => response_key }, body]
  end
end
We have to compute a response key to securely establish the connection. The UUID used to generate it is a global constant found in the specification. We won't go into the binary format of the string we're writing into the WebSocket connection, but it's all described in Starr Horne's article if you're curious.
Demo
Run the server:
$ bundle exec puma
The easiest way to create a connection is to use a WebSocket client. I recommend websocat.
$ websocat ws://127.0.0.1:9292/
You'll see the string Hello world! printed out! The connection is now active. In theory, we can write and receive messages over this socket now. We haven't implemented anything on the server to receive or publish messages though, so it won't yet work in practice.
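To receive messages, the server would also have to parse incoming WebSocket frames, which clients always mask. As a rough sketch of what that involves (my own addition; it only handles small text frames and ignores close, ping and pong frames), something like this could be called on the stream inside the body proc before closing it:

def read_frame(stream)
  first_byte, second_byte = stream.read(2).bytes
  opcode = first_byte & 0b00001111    # 0x1 means a text frame
  length = second_byte & 0b01111111   # only valid for payloads under 126 bytes
  mask = stream.read(4).bytes         # client-to-server frames are always masked
  payload = stream.read(length).bytes

  # unmask by XOR-ing each payload byte with the corresponding mask byte
  decoded = payload.each_with_index.map { |byte, i| byte ^ mask[i % 4] }
  [opcode, decoded.pack("C*")]
end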
Conclusion
That concludes our deep dive into Rack. We looked at what Rack is and how to set up a basic Rack app. We then learned about socket hijacking to maintain persistent connections.
And lastly, we used two specifications provided by the web platform to communicate over persistent connections: Server Sent Events (SSEs) and WebSockets.
SSEs are a unidirectional protocol for sending data from the server to the client and, just like HTTP, it's plain text. WebSockets are a bi-directional, binary protocol for communication between a server and a client.
Always remember that persistent connections come with challenges when using a threaded web server like Puma. A persistent connection ties up a thread and can cause significant performance issues unless you open a sizeable can of worms to implement your own threading mechanism.
This post was initially published on AppSignal's blog.