Automatically identify (and kill) processes with long processing time

Posted by 半城伤御伤魂 on 2019-12-13 04:46:07

Question


I'm running a script that daily downloads, builds and checks a program I'm contributing to. The "check" part involves performing a suite of test runs and comparing the results with the reference.

As long as the program finishes (with or without errors), everything is OK (I mean that I can handle that). But in some cases a test run apparently gets stuck in an infinite loop and I have to kill it. This is quite inconvenient for a job that's supposed to run unattended. If this happens at some point, the tests will not progress any further and, worse, the next day a new job will be launched, which might suffer from the same problem.

Manually, I can identify the "stuck" process with, for instance, ps -u username: anything with more than, say, 15 minutes in the TIME column should be killed. Note that this is not just the "age" of the process, but the processing (CPU) time it has used. I don't want to kill the wrapper script or the ssh session.

Before trying to write some complicated script that periodically runs ps -u username, parses the output and kills what needs to be killed, is there some easier or pre-cooked solution?

EDIT:

From the replies in the suggested thread, I have added this line to the user's crontab, which seems to work so far:

10,40 * * * * ps -eo uid,pid,time | egrep "^ *`id -u` " | egrep ' ([0-9]+-)?[0-9]{2}:[2-9][0-9]:[0-9]{2}' | awk '{print $2}' | xargs -I{} kill {}

It runs every half hour (at *:10 and *:40), identifies processes belonging to the user (id -u in backticks, because $UID is not available in dash) and with processing time longer than 20 minutes ([2-9][0-9]), and kills them.

The time parsing is not perfect: it would not catch a process whose accumulated time has passed one or more full hours while its minutes field is below 20 (for example 01:05:00). Since the job runs every 30 minutes, a process should be killed before it gets that far.
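One way to avoid that gap might be to convert the TIME column to seconds and compare it with a numeric limit instead of matching it with a regular expression. The following is only a sketch under stated assumptions: the function names pids_over_limit and to_seconds and the 1200-second limit are hypothetical, and it assumes ps prints TIME as [[dd-]hh:]mm:ss, as the regular expression above already does.

```shell
#!/bin/sh
# Sketch: pick PIDs by CPU time in seconds rather than by a regex on TIME.
# pids_over_limit and to_seconds are hypothetical names, not from the original post.

# Reads "uid pid time" lines (as printed by: ps -eo uid,pid,time) on stdin.
# Prints the PIDs owned by uid $1 whose CPU time exceeds $2 seconds.
pids_over_limit() {
    awk -v uid="$1" -v limit="$2" '
        # Convert [[dd-]hh:]mm:ss to seconds.
        function to_seconds(t,    d, f, n, days) {
            days = 0
            if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }  # optional "dd-" prefix
            n = split(t, f, ":")
            if (n == 3) return days * 86400 + f[1] * 3600 + f[2] * 60 + f[3]
            if (n == 2) return days * 86400 + f[1] * 60 + f[2]
            return days * 86400 + f[1]
        }
        # The header line is skipped because "UID" never equals a numeric uid.
        $1 == uid && to_seconds($3) > limit { print $2 }
    '
}

# Possible use (left commented out so that running this file kills nothing):
# ps -eo uid,pid,time | pids_over_limit "$(id -u)" 1200 | xargs -I{} kill {}
```

Comparing the uid field as a whole also avoids matching other users whose uid merely starts with the same digits. Like the crontab line, this would still kill any of the user's processes that exceed the limit, so a wrapper script or ssh session that accumulates that much CPU time would need to be excluded separately.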

Source: https://stackoverflow.com/questions/19167168/automatically-identify-and-kill-processes-with-long-processing-time
