runit monitoring a process that forks many processes
I've been trying to use runit to manage an in-house workflow. We have a
Ruby script that we run four times as a backgrounded process. It ends up
consuming a lot of memory over time because of various memory leaks, so
ideally something like runit with chpst could kill the process and
restart it once it hits a memory threshold.
The Ruby script forks a subprocess for each instance of the script we
want to run.
    #!/usr/bin/env ruby

    def fork_log(msg, level = 'INFO')
      puts "[ forker #{Time.now.ctime} #{level} ] #{msg}"
    end

    def shutdown_pids(pids)
      pids.each do |pid|
        begin
          fork_log("shutting down #{pid}")
          Process.kill 9, pid
          Process.wait
        rescue Exception => ex
          fork_log("exception in Process.kill => #{ex}", 'WARN')
        end
      end
    end

    pids = Array.new
    iterations = ARGV[0].to_i
    cmd = 'sleep-runit'

    iterations.times do |t|
      pids << fork do
        trap 'TERM' do
          fork_log 'trapped a TERM', 'WARN'
        end
        fork_log("forking #{cmd} #{t}")
        `#{cmd} #{t}`
      end
      Process.detach(pids.last)
    end

    begin
      trap 'TERM' do
        fork_log 'trapped a TERM', 'WARN'
      end
      Process.waitall
    rescue Exception => intr
      fork_log("system interrupt caught during waiting => #{intr}", 'WARN')
      fork_log('shutting down processes.', 'WARN')
      shutdown_pids(pids)
    ensure
      fork_log('shutting down processes.', 'WARN')
    end
The runit run script (forker is the script above; it is in /usr/bin and
on the PATH):
    #!/bin/sh
    # merge stderr and stdout
    exec 2>&1
    exec chpst -uwww forker 4
When I issue sudo sv stop forker, I get:
    timeout: run: forker: (pid 17815) 38s, want down, got TERM
When I check ps -ef for ruby processes, I still see forker and all of its
children. I can start it up, though, no problem.
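One likely reason for the timeout, offered as an assumption rather than a confirmed diagnosis: sv stop sends TERM to forker, but the TERM trap in the script only logs a message, so Process.waitall keeps waiting and the process never exits. The commands started with backticks are also grandchildren of forker, so signalling the forked Ruby children would not necessarily stop them. A minimal sketch of one alternative follows: start each worker with Process.spawn so the recorded pid is the worker itself, and on TERM forward the signal, reap the workers, and exit. The helper names spawn_workers, forward_and_reap, and supervise are hypothetical and are not part of the original script.

```ruby
#!/usr/bin/env ruby
# Sketch only: spawn_workers, forward_and_reap and supervise are
# hypothetical helpers, not part of the original forker script.

# Start `count` workers. Passing the command and its argument separately
# makes Process.spawn exec the program directly (no shell in between),
# so each returned pid belongs to the worker itself.
def spawn_workers(cmd, count)
  Array.new(count) { |i| Process.spawn(cmd, i.to_s) }
end

# Send `signal` to every worker, then reap each one.
# Returns the pids that were reaped here.
def forward_and_reap(pids, signal = 'TERM')
  pids.each do |pid|
    begin
      Process.kill(signal, pid)
    rescue Errno::ESRCH
      # The worker is already gone; nothing to signal.
    end
  end
  pids.select do |pid|
    begin
      Process.wait(pid)
      true
    rescue Errno::ECHILD
      false # Already reaped elsewhere.
    end
  end
end

# Run the workers until TERM arrives. The trap handler only sets a flag;
# the main loop does the forwarding, reaping and exiting.
def supervise(cmd, count)
  stop = false
  trap('TERM') { stop = true }
  pids = spawn_workers(cmd, count)
  sleep 0.5 until stop
  forward_and_reap(pids)
  exit 0
end
```

With this shape, the script would end with something like supervise('sleep-runit', ARGV[0].to_i). The polling loop does not notice workers that exit by themselves; a fuller version would reap those as well.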