I'm currently writing a small shell in C++.
Jobs and the PIDs associated with them are stored within a queue of job pointers (job *). When a new job is run, information about it is added to the queue. Since multiple jobs can be handled simultaneously and new jobs can be entered at the shell's console at any time, I have a signal handler to wait on jobs which are terminated.
When a job is terminated, I need to remove its information from the active job queue and move it to my deque of terminated jobs. However, it is possible that a user's new job is being added to the queue when another job stops.
In such a case, their insert queue operation would be suspended and my signal handler would be called, which would perform its pop operation.
I'm trying to understand how I can resolve this potential race condition, as I imagine corruption can occur during this process. I cannot use a mutex, as a deadlock would occur if the interrupted parent process is using the queue at the time.
I see some information about C++11 being capable of atomic operations as declared by the user, along with information regarding tasklets. I'm not sure if these are relevant to my question though.
Interestingly enough, an example shell (MSH - http://code.google.com/p/mini-shell-msh/) which I am using as a reference does not appear to do any handling of such conditions. The signal handler immediately modifies the job list, along with the main console. Perhaps there is something I am overlooking here?
As always, all feedback is appreciated.
There are several ways to avoid the race condition:
- Use a wait-free (atomic) queue for job pointers;
- use any other kind of queue, but protect it with sigprocmask (in the non-handler code) and with a proper sa_mask value in the sigaction call;
- don't use a signal handler at all; use a non-portable system call that lets you work with signals synchronously. In Linux this is possible with signalfd; I'm not sure about other platforms.
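To illustrate the third option, here is a minimal Linux-only sketch. The function names open_sigchld_fd and reap_children are made up for this example. The idea is that SIGCHLD stays blocked for the whole process, so no handler ever runs, and the shell's main loop reads the signal from a file descriptor at a point where touching the job queue is safe.

```cpp
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h>

// Block SIGCHLD and return a descriptor that becomes readable whenever
// a SIGCHLD is pending. Returns -1 on failure. (Hypothetical helper.)
int open_sigchld_fd() {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    // The signal must be blocked, otherwise it is delivered the normal way.
    if (sigprocmask(SIG_BLOCK, &mask, nullptr) == -1) return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Wait up to timeout_ms for a SIGCHLD, then reap every exited child.
// Returns the number of children reaped. (Hypothetical helper.)
int reap_children(int sfd, int timeout_ms) {
    pollfd pfd{sfd, POLLIN, 0};
    if (poll(&pfd, 1, timeout_ms) <= 0) return 0;

    signalfd_siginfo info;
    if (read(sfd, &info, sizeof info) != sizeof info) return 0;

    int reaped = 0;
    // One pending SIGCHLD may stand for several exited children, so loop.
    while (waitpid(-1, nullptr, WNOHANG) > 0) {
        ++reaped;  // a real shell would move the matching job between queues here
    }
    return reaped;
}
```

In a real shell the poll() call would presumably also watch STDIN_FILENO, so that console input and child exits are handled by the same loop, one at a time. A child that is about to exec should also unblock SIGCHLD first, since the signal mask is inherited.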
You need to disable signals with sigprocmask() around your critical sections in the non-handler code. This is analogous to device drivers in the kernel disabling interrupts in the user half of the driver while updating structures shared with an interrupt handler.
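A rough sketch of what that could look like follows. The job struct, the SigchldBlock guard and the helper functions are hypothetical names invented for this example. It also swaps the question's queue and deque for std::list, on the assumption that the handler should not allocate memory: splice() only relinks an existing node. Strictly speaking the C++ standard promises very little about library calls inside a signal handler, so treat this as an illustration of the blocking pattern and not a portable guarantee.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <list>

struct job { pid_t pid; };  // hypothetical stand-in for the shell's job record

// Shared between the main code and the SIGCHLD handler.
static std::list<job*> active_jobs;
static std::list<job*> terminated_jobs;

// RAII guard: SIGCHLD is blocked for as long as the object lives.
class SigchldBlock {
public:
    SigchldBlock() {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGCHLD);
        sigprocmask(SIG_BLOCK, &set, &old_);
    }
    ~SigchldBlock() { sigprocmask(SIG_SETMASK, &old_, nullptr); }
    SigchldBlock(const SigchldBlock&) = delete;
    SigchldBlock& operator=(const SigchldBlock&) = delete;
private:
    sigset_t old_;  // mask to restore when the critical section ends
};

// Handler: reap every exited child and move its job to the terminated list.
extern "C" void on_sigchld(int) {
    const int saved_errno = errno;  // waitpid may overwrite errno
    pid_t pid;
    while ((pid = waitpid(-1, nullptr, WNOHANG)) > 0) {
        for (auto it = active_jobs.begin(); it != active_jobs.end(); ++it) {
            if ((*it)->pid == pid) {
                // Relinks the node; no allocation or deallocation happens.
                terminated_jobs.splice(terminated_jobs.end(), active_jobs, it);
                break;
            }
        }
    }
    errno = saved_errno;
}

bool install_sigchld_handler() {
    struct sigaction sa {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);  // add any other signal whose handler touches the lists
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}

// Critical section in non-handler code: the handler cannot run until the
// guard is destroyed, so the child cannot be reaped before it is listed.
bool launch_job(job* j) {
    SigchldBlock guard;
    const pid_t pid = fork();
    if (pid == 0) _exit(0);  // stand-in for exec; a real child would restore the mask first
    if (pid < 0) return false;
    j->pid = pid;
    active_jobs.push_back(j);
    return true;
}

std::size_t terminated_count() {
    SigchldBlock guard;  // reads need the same protection as writes
    return terminated_jobs.size();
}
```

A SIGCHLD that arrives while the guard is alive stays pending and is delivered as soon as the old mask is restored, so no child exit is lost; the handler simply runs a little later.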
Source: https://stackoverflow.com/questions/8146108/signal-handler-accessing-queue-data-structure-race-condition