RubyKaigi 2025
Supply Chain Security & AI/LLM Security
Ruby & Rails Kafka Processing Framework
An efficient Amazon SQS processor for Ruby
(no logo yet, working on it)
Use the right tools for the job...
...but also make Ruby the right tool to do more!
Karafka is multi-threaded
Ruby has a Global VM Lock (GVL)
80%
I/O-bound tasks
Perfect for threads
20%
CPU-intensive tasks
Limited by the GVL
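A minimal sketch of the split above, not taken from the talk: `io_task` and `cpu_task` are hypothetical stand-ins for real work, and the exact timings will vary by machine and Ruby version. Waiting on I/O releases the GVL, so threads overlap their waits; pure Ruby computation holds it, so threads mostly take turns.

```ruby
require "benchmark"

# Hypothetical stand-in for a network or disk wait (releases the GVL)
def io_task
  sleep 0.2
end

# Hypothetical stand-in for pure Ruby computation (holds the GVL)
def cpu_task
  300_000.times.sum { |i| i * i }
end

def run_sequential(count, &task)
  Benchmark.realtime { count.times { task.call } }
end

def run_threaded(count, &task)
  Benchmark.realtime do
    Array.new(count) { Thread.new { task.call } }.each(&:join)
  end
end

io_seq  = run_sequential(4) { io_task }
io_thr  = run_threaded(4) { io_task }
cpu_seq = run_sequential(4) { cpu_task }
cpu_thr = run_threaded(4) { cpu_task }

puts format("I/O  sequential: %.2fs  threaded: %.2fs", io_seq, io_thr)
puts format("CPU  sequential: %.2fs  threaded: %.2fs", cpu_seq, cpu_thr)
```

On CRuby the threaded I/O run should finish in roughly the time of a single wait, while the threaded CPU run is expected to land close to the sequential one.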
Users often misunderstand their CPU boundaries
Hidden CPU costs:
"If these applications were really spending the overwhelming majority of their time waiting on I/O, it would be impossible for YJIT to perform this well overall."
- Jean Boussier (byroot)
Even in stream processing, I/O may be just a fraction of the total workload
Add CPU-friendly Parallelization to Karafka and Shoryuken
Ractors
Processes (fork)
Both bypass the Global VM Lock for true parallel execution
JSON parsing benchmark (Bug #19288)
                user     system      total        real
no Ractor:  1.742630   0.023948   1.766578 (  1.770248)
Ractor:    12.724407   1.142691  13.867098 (  4.854311)
Ractors are ~3x slower for CPU-intensive operations
"Like any good Rubyist, I KISS..."
def fork_workers(count = 4)
  count.times do
    pid = fork do
      Karafka::Cli::Server.call
    end
    @worker_pids << pid
  end
end

def stop_workers
  @worker_pids.each do |pid|
    Process.kill(:TERM, pid)
  end
end

def wait_for_workers
  @worker_pids.each do |pid|
    Process.waitpid(pid)
  end
end
It worked!
(in development)
I showed my work to KJ Tsanaktsidis from the Ruby Core team...
"Pidfds are absolutely the right professional API!"
There were some flaws in my approach
PID Reuse
$ ps aux | grep ruby
user 12345 2.5 0.8 Sl 10:25 0:18 ruby app.rb
$ kill -9 12345
bash: kill: (12345) - No such process
The same race condition can affect Ruby code!
Signal Race Conditions
Puma, Karafka, Shoryuken
User code, gems, libraries
Signal handlers are global per-process - creating fundamental conflicts
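A small sketch of why that conflict arises, using only `Signal.trap` from Ruby core. The `:framework` and `:library` labels are illustrative, not code from any of the projects named above. Installing a second handler for the same signal silently replaces the first one for the whole process.

```ruby
calls = []

# A framework installs its handler first
Signal.trap(:USR1) { calls << :framework }

# Later, a library installs its own handler for the same signal.
# Signal.trap returns the handler it just replaced.
previous = Signal.trap(:USR1) { calls << :library }

Process.kill(:USR1, Process.pid)
sleep 0.1 # give the interpreter a moment to run the handler

replaced_a_proc = previous.is_a?(Proc)

# Restore the default disposition so the sketch leaves no trace
Signal.trap(:USR1, "DEFAULT")

puts "handlers that ran: #{calls.inspect}"
```

Only the last handler runs. A cooperative library can call `previous.call` from inside its own handler, but nothing forces other code in the process to do the same.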
# Process level handler
Signal.trap(:CHLD) do
  begin
    pid, status = Process.waitpid2(-1, Process::WNOHANG)
    if pid
      puts "MAIN: Signal handler reaped child" \
           " PID #{pid} (exit code: #{status.exitstatus})"
    end
  rescue Errno::ECHILD
    # No children left to reap
  end
end

# Library level handler
begin
  waited_pid, status = Process.waitpid2(pid)
  puts "LIBRARY: Successfully reaped PID #{waited_pid}"
  return status
rescue Errno::ECHILD
  puts "LIBRARY: ERROR - Child PID #{pid} was reaped"
  return nil
end
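The race between the two handlers can be reproduced deterministically. This is a sketch rather than the code from the talk: a plain `waitpid2(-1)` call stands in for the CHLD trap, and the exit code 7 is arbitrary. Whoever waits first consumes the exit status, and the later, PID-specific wait fails.

```ruby
# Child exits immediately with a recognizable status; exit! skips
# any at_exit handlers inherited from the parent
pid = fork { exit!(7) }

# "Process level" reaper: waits for ANY child, like the CHLD trap
reaped_pid, status = Process.waitpid2(-1)

# "Library level" wait for the specific PID now fails, because the
# exit status was already consumed above
library_status =
  begin
    Process.waitpid2(pid).last
  rescue Errno::ECHILD
    nil
  end

puts "reaper got PID #{reaped_pid}, exit code #{status.exitstatus}"
puts "library got #{library_status.inspect}"
```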
Even with pidfd, exit code races remain:
Whichever caller reaches waitpid or waitid first "wins"; every other caller gets an ECHILD error
What pidfd CAN'T solve:
What pidfd DOES solve:
For complete process management, careful architectural design is still required!
Process Hierarchy Limitations
# Poll every second
Thread.new do
  loop do
    begin
      # Try sending signal 0 to check existence
      Process.kill(0, pid)
      puts "Process #{pid} is running"
    rescue Errno::ESRCH
      puts "Process #{pid} has exited"
      break
    end
    sleep 1 # CPU waste, latency
  end
end
In UNIX, "everything is a file"
file = File.open("example.txt")
puts file.fileno # Returns the file descriptor number
| Syscall | Purpose |
|---|---|
| pidfd_open(pid, flags) | Get a file descriptor for a process |
| pidfd_send_signal(pidfd, sig, info, flags) | Send signals via the file descriptor |
| waitid(P_PIDFD, ...) | Clean up terminated processes |
Works with standard file descriptor APIs (poll, select, epoll)
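A hedged sketch of what that looks like from Ruby without any gems. It uses `Kernel#syscall`, which Ruby itself discourages and which is not what the talk uses, purely to keep the example self-contained. The syscall number 434 is assumed to be `pidfd_open` for the running architecture, as on x86_64, and `open_pidfd` is a hypothetical helper name. On non-Linux systems or kernels without pidfd the sketch reports `:unsupported`.

```ruby
PIDFD_OPEN_NR = 434 # assumed pidfd_open syscall number (Linux x86_64)

# Hypothetical helper: returns an IO wrapping a pidfd for the given
# PID, or nil when pidfd is not available on this system
def open_pidfd(pid)
  return nil unless RUBY_PLATFORM.include?("linux")

  IO.for_fd(syscall(PIDFD_OPEN_NR, pid, 0))
rescue NotImplementedError, SystemCallError
  nil
end

child = fork do
  sleep 0.3
  exit!(0)
end

outcome =
  if (pidfd = open_pidfd(child))
    # Zero timeout: nil means "not readable", so the child is alive
    alive_at_start = IO.select([pidfd], nil, nil, 0).nil?
    # Block (up to 5s) until the pidfd turns readable on termination
    exited = !IO.select([pidfd], nil, nil, 5).nil?
    pidfd.close
    alive_at_start && exited ? :observed_exit : :unexpected
  else
    :unsupported
  end

# Reap the child in either case to avoid leaving a zombie
Process.waitpid(child)
puts "pidfd outcome: #{outcome}"
```

The blocking `IO.select` wakes up as soon as the child terminates, with no polling loop and no sleep interval to tune.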
Ruby has no pidfd support
How to use pidfd from Ruby?
Process.spawn_handle
Process::Handle class
"Pidfds are absolutely the right professional API!"
- KJ Tsanaktsidis, Ruby Core contributor
require 'ffi'

class Pidfd
  extend FFI::Library

  begin
    ffi_lib FFI::Library::LIBC

    # Both pidfd calls go through the generic syscall(2) wrapper
    attach_function :fdpid_open, :syscall,
                    %i[long int uint], :int
    attach_function :fdpid_signal, :syscall,
                    %i[long int int pointer uint], :int
    attach_function :waitid,
                    %i[int int pointer uint], :int

    # Syscall numbers (Linux x86_64)
    PIDFD_OPEN = 434
    PIDFD_SEND_SIGNAL = 424
    API_SUPPORTED = true
  rescue LoadError
    API_SUPPORTED = false
  end
end
def self.supported?
  # If FFI failed to load
  return false unless API_SUPPORTED
  # Won't work on macOS or Windows
  return false if RUBY_DESCRIPTION.include?('darwin')
  return false if RUBY_DESCRIPTION.match?(/mswin|ming|cygwin/)

  # Not all OSes may support this (BSD)
  new(::Process.pid)
  true
rescue Errors::PidfdOpenFailedError
  false
end

def initialize(pid)
  @mutex = Mutex.new
  @pid = pid
  # Call syscall to get pidfd
  @pidfd = open(pid)
  # Wrap as Ruby IO for polling
  @pidfd_io = IO.new(@pidfd)
end

private

def open(pid)
  pidfd = fdpid_open(
    pidfd_open_syscall,
    pid,
    0
  )
  return pidfd if pidfd != -1

  raise Errors::PidfdOpenFailedError, pidfd
end
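def alive?
  @pidfd_select ||= [@pidfd_io]
  # pidfd becomes readable when process terminates
  # nil means not readable = still alive
  check = -> { !@cleaned && IO.select(@pidfd_select, nil, nil, 0).nil? }
  # Ruby's Mutex is not reentrant and signal calls alive? while
  # already holding the lock, so lock only if we do not own it yet
  @mutex.owned? ? check.call : @mutex.synchronize(&check)
end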
def signal(sig_name)
  @mutex.synchronize do
    return false if @cleaned
    # Never signal dead processes
    return false unless alive?

    result = fdpid_signal(
      pidfd_signal_syscall,
      @pidfd,
      Signal.list.fetch(sig_name),
      nil,
      0
    )
    return true if result.zero?

    raise Errors::PidfdSignalFailedError, result
  end
end

def cleanup
  @mutex.synchronize do
    return if @cleaned

    # Reap the process
    begin
      waitid(P_PIDFD, @pidfd, nil, WEXITED)
    rescue Errno::ECHILD
      # Ignore
      # should not happen unless traps overwritten
    end

    # Clean up resources
    @pidfd_io.close
    @pidfd_select = nil
    @pidfd_io = nil
    @pidfd = nil
    @cleaned = true
  end
end
pidfd.alive? # Check process state
pidfd.signal() # Send signals safely
pidfd.cleanup() # Prevent zombie processes
def control
  @nodes.each do |node|
    if node.alive?
      next if terminate_if_hanging(node)
      next if stop_if_not_healthy(node)
      next if stop_if_not_responding(node)
    else
      next if cleanup_one(node)
      next if restart_after_timeout(node)
    end
  end
end

def on_statistics_emitted(_event)
  periodically do
    Kernel.exit!(orphaned_exit_code) if node.orphaned?
    node.healthy
  end
end
No orphaned processes if supervisor crashes
def orphaned?
  # Check if supervisor is still alive
  !@parent_pidfd.alive?
end
Each worker tracks its parent with pidfd
In most cases, pidfd is overkill, even in Karafka
Where pidfd Truly Shines
Standard Approach is Usually Fine
pidfd is a solution for specific edge cases, not a general replacement for PIDs
Questions?