I'm using PHP's preg_match_all() to search a string imported using file_get_contents(). The regex returns matches but I would like to know at which line number those matches a
Well, it's kind of late and maybe you already solved this, but I had to do it and it's fairly simple.
Using the PREG_OFFSET_CAPTURE flag in preg_match will return the position of the match as a byte offset into the string.
Let's call that offset $charpos, so:
$before = substr($content, 0, $charpos); // all the text before the match (also works when $charpos is 0)
$line_number = substr_count($before, "\n") + 1; // linefeeds before the match, plus one
Voilà!
You can't do this with regexes alone, at least not cleanly. What you can do is use the PREG_OFFSET_CAPTURE flag of preg_match_all and post-process the entire file.
I mean that once you have the array of matched strings and the starting offset of each one, you just count how many \r\n, \n or \r sequences lie between the beginning of the file and the offset of each match. The line number of the match is the number of distinct EOL terminators (\r\n | \n | \r) plus 1.
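The counting described above can be sketched roughly as follows. The helper name lineNumberAt and the sample text are made up for illustration; the only assumption is that offsets come from PREG_OFFSET_CAPTURE, which reports byte offsets.

```php
<?php
// Hypothetical helper: 1-based line number of a byte offset in $text.
function lineNumberAt(string $text, int $offset): int
{
    // \r\n is listed first so a Windows line ending counts as one terminator.
    return preg_match_all('/\r\n|\n|\r/', substr($text, 0, $offset)) + 1;
}

// Sample text mixing all three EOL styles (illustrative only).
$text = "one\r\ntwo foo\nthree\rfoo four";
preg_match_all('/foo/', $text, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is a [matched string, byte offset] pair.
foreach ($matches[0] as [$match, $offset]) {
    echo $match, ' on line ', lineNumberAt($text, $offset), "\n";
}
```

This rescans the text before every match, so for a large file with many matches it is probably worth building a table of line starts once instead.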
You can use preg_match_all to find offsets of every linefeed and then compare them against the offsets you already have.
// read file to buffer
$data = file_get_contents($datafile);
// find all linefeeds in buffer
preg_match_all("/\n/", $data, $lfall, PREG_OFFSET_CAPTURE);
$lfs = $lfall[0];
// create an array mapping every offset to its line number
$offsets = array();
$linenum = 1;
$offset = 0;
foreach ($lfs as $lfrow) {
    $lfoffset = intval($lfrow[1]);
    for (; $offset <= $lfoffset; $offset++) {
        $offsets[$offset] = $linenum; // offset => linenum
    }
    $linenum++;
}
// offsets after the last linefeed belong to the final line
for ($len = strlen($data); $offset < $len; $offset++) {
    $offsets[$offset] = $linenum;
}
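A lighter variant of the same idea, as a sketch only: rather than storing one array entry per byte, keep just the offset at which each line starts and binary-search it for every match offset. The helper names lineStarts and lineOf, the sample data and the pattern are all hypothetical.

```php
<?php
// Hypothetical helper: byte offsets at which each line starts.
function lineStarts(string $data): array
{
    $starts = [0];
    preg_match_all("/\n/", $data, $lf, PREG_OFFSET_CAPTURE);
    foreach ($lf[0] as $row) {
        $starts[] = $row[1] + 1; // the next line begins right after the linefeed
    }
    return $starts;
}

// Hypothetical helper: binary search for the last line start <= $offset.
function lineOf(array $starts, int $offset): int
{
    $lo = 0;
    $hi = count($starts) - 1;
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi + 1, 2);
        if ($starts[$mid] <= $offset) {
            $lo = $mid;
        } else {
            $hi = $mid - 1;
        }
    }
    return $lo + 1; // line numbers are 1-based
}

// Illustrative data: look for runs of two or more spaces.
$data = "alpha\nbeta  gamma\n\ndelta  end";
$starts = lineStarts($data);
preg_match_all('/\x20{2,}/', $data, $m, PREG_OFFSET_CAPTURE);
foreach ($m[0] as $hit) {
    echo "offset {$hit[1]} is on line ", lineOf($starts, $hit[1]), "\n";
}
```

The table needs one entry per line instead of one per byte, and each lookup costs a logarithmic number of comparisons.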
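I think that first of all you need to read the file into an array, with each element standing for one line, and then do something like this:
$List = file($filename); // file() expects a path, not the contents; $filename is a placeholder
for ($i = 0; $i < count($List); $i++) {
    if (preg_match_all($pattern, $List[$i], $matches)) { // $pattern is your regex; your work here
        echo $i + 1; // echo the line number where preg_match_all() matched ($i is zero-based)
    }
}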
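If the contents are already in a string, a similar per-line check can be sketched with preg_split and preg_grep, which is a different technique from the loop above. It relies on preg_grep preserving the keys of the input array, so each key plus one is a line number. The sample text and pattern are illustrative, and this only suits patterns that never span more than one line.

```php
<?php
// $text stands in for the contents returned by file_get_contents().
$text = "first line\nsecond  line\nthird line\nfourth  line";
$pattern = '/\x20{2,}/'; // illustrative pattern: two or more spaces

$lines = preg_split('/\R/', $text);  // split on any newline sequence
$hits = preg_grep($pattern, $lines); // keeps the original array keys

foreach ($hits as $index => $line) {
    echo ($index + 1), ': ', $line, "\n"; // keys are 0-based
}
```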
Using preg_match_all with the PREG_OFFSET_CAPTURE flag is necessary to solve this problem. The code comments explain what kind of array preg_match_all returns and how the line numbers can be calculated:
// Given string to do a match with
$string = "\n\nabc\nwhatever\n\ndef";
// Match "abc" and "def" in a string
if (preg_match_all("#(abc).*(def)#si", $string, $matches, PREG_OFFSET_CAPTURE)) {
    // Now $matches[0][0][0] contains the complete matching string
    // $matches[1][0][0] contains the results for the first substring (abc)
    // $matches[2][0][0] contains the results for the second substring (def)
    // $matches[0][0][1] contains the string position of the complete matching string
    // $matches[1][0][1] contains the string position of the first substring (abc)
    // $matches[2][0][1] contains the string position of the second substring (def)

    // First (abc) match line number
    // Cut off the original string at the matching position, then count
    // number of line breaks (\n) for that subset of a string
    $line = substr_count(substr($string, 0, $matches[1][0][1]), "\n") + 1;
    echo $line . "\n";

    // Second (def) match line number
    // Cut off the original string at the matching position, then count
    // number of line breaks (\n) for that subset of a string
    $line = substr_count(substr($string, 0, $matches[2][0][1]), "\n") + 1;
    echo $line . "\n";
}
This will print 3 for the first substring and 6 for the second substring. You can change \n to \r\n or \r if you use different newlines.
This works but performs a new preg_match_all
on every line which could be quite expensive.
$file = 'file.txt';
$log = array();
$line = 0;
$pattern = '/\x20{2,}/';
if (is_readable($file)) {
    $handle = fopen($file, 'rb');
    if ($handle) {
        // read one line at a time and keep a running line counter
        while (($subject = fgets($handle)) !== false) {
            $line++;
            if (preg_match_all($pattern, $subject, $matches)) {
                $log[] = array(
                    'str' => $subject,
                    'file' => realpath($file),
                    'line' => $line,
                    'matches' => $matches,
                );
            }
        }
        if (!feof($handle)) {
            echo "Error: unexpected fgets() fail\n";
        }
        fclose($handle);
    }
}
Alternatively, you could read the file once to get the line offsets, then perform the preg_match_all on the entire file and capture the match offsets.
$file = 'file.txt';
$pattern = '/\x20{2,}/';
$lines = array(0); // $lines[n] = offset just past the end of line n
if (is_readable($file)) {
    $handle = fopen($file, 'rb');
    if ($handle) {
        $subject = "";
        while (($line = fgets($handle)) !== false) {
            $subject .= $line;
            $lines[] = strlen($subject);
        }
        if (!feof($handle)) {
            echo "Error: unexpected fgets() fail\n";
        }
        fclose($handle);
        if ($subject && preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE)) {
            // each() was removed in PHP 8, so walk $lines with a plain counter instead.
            // Matches arrive in ascending offset order, so we continue where we left off.
            $line = 1;
            foreach ($matches[0] as $value) {
                while ($value[1] >= $lines[$line]) {
                    $line++;
                }
                echo "match is on line: " . $line . "\n";
            }
        }
    }
}