I have a form that allows the user to either upload a text file or copy/paste the contents of the file into a textarea. I can easily differentiate between the two and put wh
preg_split the variable containing the text, and iterate over the returned array:
foreach(preg_split("/((\r?\n)|(\r\n?))/", $subject) as $line){
// do stuff with $line
}
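To make the behaviour on mixed line endings concrete, here is a minimal sketch. The helper name splitLines and its $skipEmpty parameter are hypothetical, and the pattern is a simplified equivalent of the one above (it matches \r\n, a lone \n, or a lone \r). Blank lines come back as empty strings unless PREG_SPLIT_NO_EMPTY is passed.

```php
<?php
// Hypothetical helper wrapping the preg_split approach.
function splitLines(string $text, bool $skipEmpty = false): array
{
    // PREG_SPLIT_NO_EMPTY drops empty pieces, i.e. blank lines.
    $flags = $skipEmpty ? PREG_SPLIT_NO_EMPTY : 0;
    // \r\n is tried first so a Windows line ending counts as one break.
    return preg_split('/\r\n|\r|\n/', $text, -1, $flags);
}

$mixed = "a\r\nb\n\nc\rd";
$all = splitLines($mixed);            // the blank line is kept as ''
$nonEmpty = splitLines($mixed, true); // the blank line is dropped
```

Whether blank lines should be kept depends on the data; keeping them is the safer default if line numbers matter.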
strtok: One of the suggested solutions uses strtok, but unfortunately it doesn't point out a potential memory issue (though it claims to be memory efficient). According to the manual:
Note that only the first call to strtok uses the string argument. Every subsequent call to strtok only needs the token to use, as it keeps track of where it is in the current string.
It does this by holding on to the string in memory. If you're working with large files, you need to release that memory once you're done looping through the file.
<?php
function process($str) {
    // the first call passes the string; later calls only pass the delimiters
    $line = strtok($str, PHP_EOL);
    while ($line !== false) {
        /* do something with $line here... */
        // get the next line
        $line = strtok(PHP_EOL);
    }
    // the bit that frees up memory
    strtok('', '');
}
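One way to make sure the reset call always runs, even when the caller stops looping early, is to wrap the strtok loop in a generator with a finally block. This is only a sketch: strtokLines is a hypothetical name, and because strtok has a single shared state, the loop body must not start another strtok tokenization of its own.

```php
<?php
// Hypothetical generator: yields tokens from strtok and always resets
// strtok's internal string afterwards, even if the caller breaks early.
function strtokLines(string $str, string $delims = "\r\n"): Generator
{
    try {
        $line = strtok($str, $delims);
        while ($line !== false) {
            yield $line;
            $line = strtok($delims);
        }
    } finally {
        // Replace the string strtok is holding with an empty one.
        strtok('', '');
    }
}

foreach (strtokLines("one\ntwo\nthree") as $line) {
    echo $line, PHP_EOL;
}
```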
According to the manual, for the file upload part you can use the file function:
//Create the array
$lines = file( $some_file );
foreach ( $lines as $line ) {
//do something here.
}
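By default each element returned by file still ends with its newline. A short sketch of the two flags that change this; the temp file here only stands in for the uploaded file, whose path in a real upload handler would typically come from the tmp_name entry in $_FILES.

```php
<?php
// A temp file stands in for the uploaded file in this sketch.
$some_file = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($some_file, "alpha\n\nbeta\n");

// FILE_IGNORE_NEW_LINES strips the trailing newline from each element;
// FILE_SKIP_EMPTY_LINES (combined with it) drops blank lines.
$lines = file($some_file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($lines as $line) {
    echo $line, PHP_EOL;
}

unlink($some_file);
```

Note that file reads the whole file into an array at once, so it suits small to medium uploads better than very large ones.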
I would like to propose a significantly faster (and memory efficient) alternative: strtok rather than preg_split.
$separator = "\r\n";
$line = strtok($subject, $separator);
while ($line !== false) {
# do something with $line
$line = strtok( $separator );
}
Testing the performance, I iterated 100 times over a test file with 17 thousand lines: preg_split took 27.7 seconds, whereas strtok took 1.4 seconds.
Note that although $separator is defined as "\r\n", strtok treats it as a set of characters and will separate on either one. As of PHP 4.1.0 it also skips empty lines/tokens.
See the strtok manual entry: http://php.net/strtok
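A small sketch of that behaviour, using a made-up input string with mixed line endings and one blank line. The blank line never shows up in the result, which matters if empty lines are meaningful in your data.

```php
<?php
// Made-up sample input: Windows, Unix, and old Mac endings plus a blank line.
$subject = "first\r\n\r\nsecond\nthird\r";
$separator = "\r\n"; // treated as the set {\r, \n}, not as a sequence

$lines = [];
$line = strtok($subject, $separator);
while ($line !== false) {
    $lines[] = $line;
    $line = strtok($separator);
}

// The blank line between "first" and "second" does not appear in $lines.
print_r($lines);
```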
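foreach(preg_split('~[\r\n]+~', $text) as $line){
    // compare with '' rather than empty(), which would also skip a line containing just "0"
    if($line === '' or ctype_space($line)) continue; // skip empty and whitespace-only lines
    // if(!strlen($line = trim($line))) continue; // or trim by force and skip empty
    // use $line here (it is only trimmed if you use the trim variant above)
}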
^ this is how you break lines properly, cross-platform compatible with Regexp :)
It's overly-complicated and ugly but in my opinion this is the way to go:
$fp = fopen("php://memory", 'r+');
fputs($fp, $data);
rewind($fp);
while(($line = fgets($fp)) !== false){
// deal with $line
}
fclose($fp);
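fgets returns each line with its line ending still attached, and returns false at the end of the input. A sketch that wraps the stream approach in a generator and strips the endings; streamLines is a hypothetical name. It assumes lines end in \n or \r\n, since fgets does not break on a lone \r.

```php
<?php
// Hypothetical helper: iterate over a string line by line via a stream.
function streamLines(string $data): Generator
{
    $fp = fopen('php://memory', 'r+');
    fwrite($fp, $data);
    rewind($fp);
    try {
        // Compare against false so a line such as "0" does not end the loop.
        while (($line = fgets($fp)) !== false) {
            // fgets keeps the line ending; strip \r and \n from the right.
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($fp);
    }
}

foreach (streamLines("a\r\n0\n\nb") as $line) {
    echo $line, PHP_EOL;
}
```

Unlike the strtok approach, this keeps blank lines, at the cost of copying the data into the stream first.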
Kyril's answer is best considering you need to be able to handle newlines on different machines.
"I'm mostly looking for useful PHP functions, not an algorithm for how to do it. Any suggestions?"
I use these a lot: