What is the best way to validate a crontab entry with PHP? Should I be using a regex, or an external library? I've got a PHP script that adds/removes entries from a crontab.
Use the pattern:
/^((?:[1-9]?\d|\*)\s*(?:(?:[\/-][1-9]?\d)|(?:,[1-9]?\d)+)?\s*){5}$/
In PHP:
<?php
$cron = "*/5 1-2 3 3,4,5 *";
$result = preg_match( "/^((?:[1-9]?\d|\*)\s*(?:(?:[\/-][1-9]?\d)|(?:,[1-9]?\d)+)?\s*){5}$/", $cron, $matches);
print_r($matches);
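A quick way to see what this pattern does and does not check is to run it against a few inputs. The sketch below reuses the pattern unchanged; the helper name `matchesSimplePattern` and the sample strings are made up for illustration. It suggests the pattern checks the overall shape of the entry but not the numeric range of each field.

```php
<?php
// Pattern copied verbatim from the answer above; the sample strings are made up.
$pattern = '/^((?:[1-9]?\d|\*)\s*(?:(?:[\/-][1-9]?\d)|(?:,[1-9]?\d)+)?\s*){5}$/';

// Hypothetical helper: true when the whole string matches the pattern.
function matchesSimplePattern($pattern, $cron)
{
    return preg_match($pattern, $cron) === 1;
}

$samples = array(
    '*/5 1-2 3 3,4,5 *', // well-formed: matches
    '* * * *',           // only four fields: no match
    '99 99 99 99 99',    // out of range in every field, but still matches
);

foreach ($samples as $sample) {
    $verdict = matchesSimplePattern($pattern, $sample) ? 'match' : 'no match';
    echo str_pad($sample, 20), $verdict, "\n";
}
```

If range checking matters, a stricter per-field pattern or a real parser is probably the better fit.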
Thanks to Jordi Salvat i Alabart, who posted a great solution.
I have only modified the existing solution posted by Jordi Salvat i Alabart. It worked well for me, but I wanted to extract particular parts of a crontab record through capture groups, so I made the inner groups non-capturing; only the groups that hold a whole field remain capturing. It is easy to see which capture group to use when you test the output regex at: http://www.regexplanet.com/advanced/java/index.html
<?php
/**
* @author Jordi Salvat i Alabart - with thanks to <a href="www.salir.com">Salir.com</a>.
*/
function buildRegexp() {
$numbers = array(
'min' => '[0-5]?\d',
'hour' => '[01]?\d|2[0-3]',
'day' => '0?[1-9]|[12]\d|3[01]',
'month' => '[1-9]|1[012]',
'dow' => '[0-6]'
);
foreach ($numbers as $field => $number) {
$range = "(?:$number)(?:-(?:$number)(?:\/\d+)?)?";
$field_re[$field] = "\*(?:\/\d+)?|$range(?:,$range)*";
}
$field_re['month'].='|jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec';
$field_re['dow'].='|mon|tue|wed|thu|fri|sat|sun';
$fields_re = '(' . join(')\s+(', $field_re) . ')';
$replacements = '@reboot|@yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly';
return '^\s*(' .
'$' .
'|#' .
'|\w+\s*=' .
"|$fields_re\s+" .
"|($replacements)\s+" .
')' .
'([^\\s]+)\\s+' .
'(.*)$';
}
This code generates regex:
^\s*($|#|\w+\s*=|(\*(?:\/\d+)?|(?:[0-5]?\d)(?:-(?:[0-5]?\d)(?:\/\d+)?)?(?:,(?:[0-5]?\d)(?:-(?:[0-5]?\d)(?:\/\d+)?)?)*)\s+(\*(?:\/\d+)?|(?:[01]?\d|2[0-3])(?:-(?:[01]?\d|2[0-3])(?:\/\d+)?)?(?:,(?:[01]?\d|2[0-3])(?:-(?:[01]?\d|2[0-3])(?:\/\d+)?)?)*)\s+(\*(?:\/\d+)?|(?:0?[1-9]|[12]\d|3[01])(?:-(?:0?[1-9]|[12]\d|3[01])(?:\/\d+)?)?(?:,(?:0?[1-9]|[12]\d|3[01])(?:-(?:0?[1-9]|[12]\d|3[01])(?:\/\d+)?)?)*)\s+(\*(?:\/\d+)?|(?:[1-9]|1[012])(?:-(?:[1-9]|1[012])(?:\/\d+)?)?(?:,(?:[1-9]|1[012])(?:-(?:[1-9]|1[012])(?:\/\d+)?)?)*|jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s+(\*(?:\/\d+)?|(?:[0-6])(?:-(?:[0-6])(?:\/\d+)?)?(?:,(?:[0-6])(?:-(?:[0-6])(?:\/\d+)?)?)*|mon|tue|wed|thu|fri|sat|sun)\s+|(@reboot|@yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly)\s+)([^\s]+)\s+(.*)$
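To illustrate which capture group holds which part, here is a minimal sketch that runs the generated regex against one line. The `buildRegexp()` body is repeated in condensed form only so the snippet runs on its own; `parseCrontabLine` and its array keys are hypothetical names, and the sample command is made up. With this regex, group 1 is the whole schedule, groups 2 to 6 are the five time fields, group 7 is the `@` shortcut, group 8 the command and group 9 its arguments.

```php
<?php
// Same construction as buildRegexp() above, repeated so this sketch runs on its own.
function buildRegexp()
{
    $numbers = array(
        'min'   => '[0-5]?\d',
        'hour'  => '[01]?\d|2[0-3]',
        'day'   => '0?[1-9]|[12]\d|3[01]',
        'month' => '[1-9]|1[012]',
        'dow'   => '[0-6]',
    );
    $field_re = array();
    foreach ($numbers as $field => $number) {
        $range = "(?:$number)(?:-(?:$number)(?:\/\d+)?)?";
        $field_re[$field] = "\*(?:\/\d+)?|$range(?:,$range)*";
    }
    $field_re['month'] .= '|jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec';
    $field_re['dow'] .= '|mon|tue|wed|thu|fri|sat|sun';
    $fields_re = '(' . join(')\s+(', $field_re) . ')';
    $replacements = '@reboot|@yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly';
    return '^\s*($|#|\w+\s*=' . "|$fields_re\s+|($replacements)\s+)" . '([^\s]+)\s+(.*)$';
}

// Hypothetical helper: returns the named parts of a line, or null if it does not match.
// Parts that did not take part in the match come back as empty strings.
function parseCrontabLine($line)
{
    if (preg_match('/' . buildRegexp() . '/', $line, $m) !== 1) {
        return null;
    }
    return array(
        'min'       => $m[2],
        'hour'      => $m[3],
        'day'       => $m[4],
        'month'     => $m[5],
        'dow'       => $m[6],
        'shortcut'  => $m[7],
        'command'   => $m[8],
        'arguments' => $m[9],
    );
}

print_r(parseCrontabLine('*/5 1-2 3 3,4,5 * /usr/bin/php /path/to/script.php'));
```

Note that the regex expects whitespace after the command, so a line consisting of a schedule and a bare command with no arguments and no trailing newline would not match.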
Or a Java alternative to generate this regex (without the @X stuff):
public static String buildRegex(){
// numbers intervals and regex
Map<String, String> numbers = new HashMap<String, String>();
numbers.put("min", "[0-5]?\\d");
numbers.put("hour", "[01]?\\d|2[0-3]");
numbers.put("day", "0?[1-9]|[12]\\d|3[01]");
numbers.put("month", "[1-9]|1[012]");
numbers.put("dow", "[0-6]");
Map<String, String> field_re = new HashMap<String, String>();
// expand regex to contain different time specifiers
for(String field : numbers.keySet()){
String number = numbers.get(field);
String range = "(?:"+number+")(?:-(?:"+number+")(?:\\/\\d+)?)?";
field_re.put(field, "\\*(?:\\/\\d+)?|"+range+"(?:,"+range+")*");
}
// add string specifiers
String monthRE = field_re.get("month");
monthRE = monthRE + "|jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec";
field_re.put("month", monthRE);
String dowRE = field_re.get("dow");
dowRE = dowRE + "|mon|tue|wed|thu|fri|sat|sun";
field_re.put("dow", dowRE);
StringBuilder fieldsReSB = new StringBuilder();
fieldsReSB.append("^\\s*(")
.append("$")
.append("|#")
.append("|\\w+\\s*=")
.append("|")
.append("(")
.append(field_re.get("min")).append(")\\s+(")
.append(field_re.get("hour")).append(")\\s+(")
.append(field_re.get("day")).append(")\\s+(")
.append(field_re.get("month")).append(")\\s+(")
.append(field_re.get("dow"))
.append(")")
.append("\\s+)")
.append("([^\\s]+)\\s+")
.append("(.*)$");
return fieldsReSB.toString();
}
Thanks to Jordi Salvat i Alabart and ph4r05.
I have slightly modified the existing PHP solution posted above. A Perl alternative to generate the regex:
sub _BuildRegex {
my $number = {
'min' => '[0-5]?\d',
'hour' => '[01]?\d|2[0-3]',
'day' => '0?[1-9]|[12]\d|3[01]',
'month' => '[1-9]|1[012]',
'dow' => '[0-6]'
};
my $field_re = {};
foreach my $nmb ( qw/min hour day month dow/ ) {
my $range = "(?:$number->{$nmb})(?:-(?:$number->{$nmb})(?:\\/\\d+)?)?";
$field_re->{$nmb} = "\\*(?:\\/\\d+)?|$range(?:,$range)*";
}
$field_re->{'month'} .='|[jJ]an|[fF]eb|[mM]ar|[aA]pr|[mM]ay|[jJ]un|[jJ]ul|[aA]ug|[sS]ep|[oO]ct|[nN]ov|[dD]ec';
$field_re->{'dow'} .= '|[mM]on|[tT]ue|[wW]ed|[tT]hu|[fF]ri|[sS]at|[sS]un';
my $ff = [];
push @$ff, $field_re->{$_} foreach ( qw/min hour day month dow/ );
my $fields_req = '(' . join(')\s+(', @$ff) . ')';
my $replacements = '@reboot|@yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly';
return '^\s*(' .
'$' .
'|#' .
'|\w+\s*=' .
"|$fields_req\\s+" .
"|($replacements)\\s+" .
')' .
'([^\\s]+)\\s+' .
'(.*)$';
}
Hmmm, interesting problem.
If you're going to really validate it, a regex isn't going to be enough: you'll have to actually parse the entry and validate each of the scheduling bits. That's because each bit can be a number, a month/day of the week string, a range (2-7), a set (3, 4, Saturday), a Vixie cron-style shortcut (60/5) or any combination of the above -- any single regex approach is going to get very hairy, fast.
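As a rough illustration of the parse-and-validate idea, the sketch below splits a schedule into its five fields, splits each field on commas, and checks every item against that field's range. All function names are hypothetical. It assumes day of week accepts 0-7 (with 0 and 7 both meaning Sunday), allows three-letter names anywhere a number is allowed, and only accepts a `/step` after `*` or a range; real cron implementations differ on these points, so treat it as a starting point rather than a reference validator.

```php
<?php
// Allowed range and optional names per field, in crontab order.
function cronFieldSpecs()
{
    // 'jan' => 1 ... 'dec' => 12
    $months = array_flip(array(1 => 'jan', 'feb', 'mar', 'apr', 'may', 'jun',
                               'jul', 'aug', 'sep', 'oct', 'nov', 'dec'));
    // 'sun' => 0 ... 'sat' => 6
    $days = array_flip(array('sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat'));
    return array(
        array('min' => 0, 'max' => 59, 'names' => array()),  // minute
        array('min' => 0, 'max' => 23, 'names' => array()),  // hour
        array('min' => 1, 'max' => 31, 'names' => array()),  // day of month
        array('min' => 1, 'max' => 12, 'names' => $months),  // month
        array('min' => 0, 'max' => 7,  'names' => $days),    // day of week (assumed 0-7)
    );
}

// Converts a number or name to an int, or returns null if it is not allowed in this field.
function cronValueToInt($value, array $spec)
{
    $key = strtolower($value);
    if (isset($spec['names'][$key])) {
        return $spec['names'][$key];
    }
    if (!ctype_digit($value)) {
        return null;
    }
    $n = (int) $value;
    return ($n >= $spec['min'] && $n <= $spec['max']) ? $n : null;
}

// Validates one comma-separated item: "*", "*/n", "a", "a-b" or "a-b/n".
function isValidCronItem($item, array $spec)
{
    $parts = explode('/', $item);
    if (count($parts) > 2) {
        return false;
    }
    $hasStep = count($parts) === 2;
    if ($hasStep && (!ctype_digit($parts[1]) || (int) $parts[1] < 1)) {
        return false;
    }
    if ($parts[0] === '*') {
        return true;
    }
    $bounds = explode('-', $parts[0]);
    if (count($bounds) > 2) {
        return false;
    }
    $low = cronValueToInt($bounds[0], $spec);
    if ($low === null) {
        return false;
    }
    if (count($bounds) === 1) {
        return !$hasStep; // a step after a single value is rejected in this sketch
    }
    $high = cronValueToInt($bounds[1], $spec);
    return $high !== null && $low <= $high;
}

// Validates the five scheduling fields only (no command part).
function isValidCronSchedule($schedule)
{
    $specs = cronFieldSpecs();
    $fields = preg_split('/\s+/', trim($schedule));
    if (count($fields) !== count($specs)) {
        return false;
    }
    foreach ($fields as $i => $field) {
        foreach (explode(',', $field) as $item) {
            if (!isValidCronItem($item, $specs[$i])) {
                return false;
            }
        }
    }
    return true;
}

var_dump(isValidCronSchedule('*/5 1-2 3 3,4,5 *'));
var_dump(isValidCronSchedule('60 * * * *'));
```

Keeping the per-field limits in one table makes it fairly easy to adjust the rules for a particular cron flavour.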
Just using the crontab program of Vixie cron to validate isn't sufficient, because it actually doesn't validate completely! I can get crontab to accept all sorts of illegal things.
Dave Taylor's Wicked Cool Shell Scripts has a sh script that does partial validation; I found the discussion interesting. You might also use or adapt the code.
I also turned up links to two PHP classes that do what you say (whose quality I haven't evaluated):
Another approach (depending on what your app needs to do) might be to have PHP construct the crontab entry programmatically and insert it, so you know it's always valid, rather than try to validate an untrusted string. Then you would just need to make a "build a crontab entry" UI, which could be simple if you don't need really complicated scheduling combinations.
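A minimal sketch of that "build it instead of validating it" idea might look like the following. The function name `buildCrontabEntry`, the array keys, and the sample command are all hypothetical; it only supports a single integer or `*` per field, which is an assumption about how simple the scheduling UI would be.

```php
<?php
// Hypothetical helper: builds "min hour day month dow command" from typed values.
// A missing or null field becomes "*"; anything else must be an int within range.
function buildCrontabEntry(array $schedule, $command)
{
    $limits = array(
        'min'   => array(0, 59),
        'hour'  => array(0, 23),
        'day'   => array(1, 31),
        'month' => array(1, 12),
        'dow'   => array(0, 6),
    );
    $parts = array();
    foreach ($limits as $field => $limit) {
        $value = isset($schedule[$field]) ? $schedule[$field] : null;
        if ($value === null) {
            $parts[] = '*';
            continue;
        }
        if (!is_int($value) || $value < $limit[0] || $value > $limit[1]) {
            throw new InvalidArgumentException("Invalid value for $field");
        }
        $parts[] = (string) $value;
    }
    // A crontab entry is a single line, so reject empty commands and embedded newlines.
    $command = trim($command);
    if ($command === '' || preg_match('/[\r\n]/', $command) === 1) {
        throw new InvalidArgumentException('Command must be a non-empty single line');
    }
    return implode(' ', $parts) . ' ' . $command;
}

echo buildCrontabEntry(array('min' => 30, 'hour' => 2), '/usr/bin/php /path/to/script.php'), "\n";
// prints: 30 2 * * * /usr/bin/php /path/to/script.php
```

Because the schedule never exists as free text, the only string that still needs care is the command itself.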
There is a nice PHP library, mtdowling/cron-expression, that can be used for Cron expression validation.
To install this library via composer:
composer require mtdowling/cron-expression
To check if a Cron expression is valid:
$isValid = Cron\CronExpression::isValidExpression($expression);
Who said regular expressions can't do that?
Courtesy of my employer, Salir.com, here's a PHPUnit test which does such validation. Feel free to modify & distribute. I'd appreciate it if you kept the @author notice & link to the web site.
<?php
/**
* @author Jordi Salvat i Alabart - with thanks to <a href="www.salir.com">Salir.com</a>.
*/
abstract class CrontabChecker extends PHPUnit_Framework_TestCase {
protected function assertFileIsValidUserCrontab($file) {
$f= @fopen($file, 'r', 1);
$this->assertTrue($f !== false, 'Crontab file must exist');
while (($line= fgets($f)) !== false) {
$this->assertLineIsValid($line);
}
}
protected function assertLineIsValid($line) {
$regexp= $this->buildRegexp();
// preg_match() returns false on error, so compare against 1 rather than "not 0".
$this->assertTrue(preg_match("/$regexp/", $line) === 1);
}
private function buildRegexp() {
$numbers= array(
'min'=>'[0-5]?\d',
'hour'=>'[01]?\d|2[0-3]',
'day'=>'0?[1-9]|[12]\d|3[01]',
'month'=>'[1-9]|1[012]',
'dow'=>'[0-7]'
);
foreach($numbers as $field=>$number) {
$range= "($number)(-($number)(\/\d+)?)?";
$field_re[$field]= "\*(\/\d+)?|$range(,$range)*";
}
$field_re['month'].='|jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec';
$field_re['dow'].='|mon|tue|wed|thu|fri|sat|sun';
$fields_re= '('.join(')\s+(', $field_re).')';
$replacements= '@reboot|@yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly';
return '^\s*('.
'$'.
'|#'.
'|\w+\s*='.
"|$fields_re\s+\S".
"|($replacements)\s+\S".
')';
}
}