I have a Ruby program that loads up two very large yaml files, so I can get some speed-up by taking advantage of the multiple cores by forking off some processes. I've tried lo…
One problem is you need to use Process.wait to wait for your forked processes to complete. The other is that you can't do interprocess communication through variables. To see this:
@one = nil
@two = nil
@hash = {}

pidA = fork do
  sleep 1
  @one = 1
  @hash[:one] = 1
  p [:one, @one, :hash, @hash]  #=> [:one, 1, :hash, {:one=>1}]
end

pidB = fork do
  sleep 2
  @two = 2
  @hash[:two] = 2
  p [:two, @two, :hash, @hash]  #=> [:two, 2, :hash, {:two=>2}]
end

Process.wait(pidB)
Process.wait(pidA)

p [:one, @one, :two, @two, :hash, @hash]  #=> [:one, nil, :two, nil, :hash, {}]
Each forked child gets its own copy of the parent's memory, so assignments made inside the children never show up in the parent. One way to do interprocess communication is using a pipe (IO::pipe). Open it before you fork, then have each side of the fork close the end of the pipe it doesn't use. From ri IO::pipe:
rd, wr = IO.pipe

if fork
  wr.close
  puts "Parent got: <#{rd.read}>"
  rd.close
  Process.wait
else
  rd.close
  puts "Sending message to parent"
  wr.write "Hi Dad"
  wr.close
end
produces:
Sending message to parent
Parent got: <Hi Dad>
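Applied to your two YAML files, that pattern looks roughly like the sketch below. The file names and the load_in_child helper are placeholders of mine, not anything standard; each child parses one file and ships the parsed object back to the parent with Marshal over its pipe.
require 'yaml'

# Hypothetical helper: fork a child to parse one file and return
# [child pid, read end of the pipe it will write its result to].
def load_in_child(path)
  rd, wr = IO.pipe
  pid = fork do
    rd.close                              # child only writes
    Marshal.dump(YAML.load_file(path), wr)  # parse, then serialize to the pipe
    wr.close
  end
  wr.close                                # parent only reads
  [pid, rd]
end

jobs = ['first.yml', 'second.yml'].map { |path| load_in_child(path) }

one, two = jobs.map do |pid, rd|
  data = Marshal.load(rd)                 # blocks until the child has written its result
  rd.close
  Process.wait(pid)
  data
end
Marshal round-trips the hashes, arrays and strings that YAML.load_file produces, so the parent gets real Ruby objects back rather than text it has to parse again.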
If you want to share variables, use threads:
@one = nil
@two = nil
@hash = {}

threadA = Thread.fork do
  sleep 1
  @one = 1
  @hash[:one] = 1
  p [:one, @one, :hash, @hash]  #=> [:one, 1, :hash, {:one=>1}] (usually)
end

threadB = Thread.fork do
  sleep 2
  @two = 2
  @hash[:two] = 2
  p [:two, @two, :hash, @hash]  #=> [:two, 2, :hash, {:one=>1, :two=>2}] (usually)
end

threadA.join
threadB.join

p [:one, @one, :two, @two, :hash, @hash]  #=> [:one, 1, :two, 2, :hash, {:one=>1, :two=>2}]
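For your two files, a thread version could look something like this sketch (file names are placeholders again); Thread#value joins the thread and returns whatever its block produced:
require 'yaml'

threads = ['first.yml', 'second.yml'].map do |path|
  Thread.new { YAML.load_file(path) }  # each thread parses one file
end
one, two = threads.map(&:value)        # value waits for each thread and returns its result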
However, I'm not sure threading will get you much of a gain when you're IO bound, and on MRI the global VM lock keeps the two parses from actually running on separate cores anyway.