PHP - alpha sort lines from several files in one directory and save them to files of “x” lines max in alpha named folders

不想你离开。 submitted on 2019-12-25 04:51:03

Question


The code below goes through the files in a directory, reads them, and saves their lines to a new directory in files of at most 500 lines each. This works great for me (thanks, Daniel), but I need a modification: I would like to save to alphanumerically named folders.

I assume the first step is to sort the array alphanumerically (the lines are already lowercase).

Then grab all of the lines from each $incoming."/*.txt" file that start with "a" and put them into a folder at $save500."/a", with a maximum of 500 lines per file. (I guess it would be best to start with whatever sorts first, so "0" rather than "a", right?)

All the lines that start with a number go into $save500."/num".

None of the lines will start with anything other than a-z or 0-9.
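The bucketing rule described above can be sketched roughly like this. It is only an illustration, not the code in use: `bucketFor()` and the sample lines are made-up names and data, and it assumes every line is non-empty and starts with a-z or 0-9 as stated.

```php
<?php
// Hypothetical helper: maps a line to its folder name under $save500.
function bucketFor(string $line): string
{
    $first = $line[0] ?? '';
    // Digits share one "num" folder; each letter gets a folder of its own.
    return ctype_digit($first) ? 'num' : $first;
}

// Made-up sample data standing in for the merged lines.
$lines = ['zebra', '42nd', 'apple', '7up', 'avocado'];
sort($lines, SORT_STRING); // byte order, so digits sort before letters

$buckets = [];
foreach ($lines as $line) {
    $buckets[bucketFor($line)][] = $line;
}
// $buckets now holds 'num' => ['42nd', '7up'], 'a' => ['apple', 'avocado'], 'z' => ['zebra']
```

Each bucket could then be passed through array_chunk($bucket, 500) exactly as the existing code does for the whole array.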

This will let me search my files for a match more efficiently with this flat-file method, by narrowing the search down to one folder.

$nextfile = 0;
if (glob("" . $incoming . "/*.txt") != false) {
    $nextfile = count(glob("" . $save500 . "/*.txt"));
    $nextfile++;
}
else {
    $nextfile = 1;
}

$files = glob($incoming . "/*.txt");
$lines = array();
foreach ($files as $file) {
    $lines = array_merge($lines, file($file, FILE_SKIP_EMPTY_LINES | FILE_IGNORE_NEW_LINES));
}
$lines = array_unique($lines);
/*this would put them all in one file*/
/*file_put_contents($dirname."/done/allofthem.txt", implode("\n", $lines));*/
/*this breaks them into files of 500*/
foreach (array_chunk($lines, 500) as $chunk) {
    file_put_contents($save500 . "/" . $nextfile . ".txt", implode("\n", $chunk));
    $nextfile++;
}

Each file still needs to hold at most 500 lines.

I will graduate to MySQL later on; I have only been doing this for a couple of months.

As if that were not enough, I even thought of splitting on the first two characters, making directories with subdirectories from a/0 through z/z!

The approach above could be the wrong one, since there have been no responses.

But I want a word like aardvark appended to 1.txt in the a/a folder, unless 1.txt already has 500 lines, in which case it goes into 2.txt in a/a.

Likewise, xenia would be appended to 1.txt in the x/e folder, unless that file already has 500 lines, in which case 2.txt is created and the word is saved there.
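A rough sketch of that append-with-rollover behaviour follows. `appendWord()` is a hypothetical name, and the sketch assumes every word has at least two characters and that the files in each folder are numbered 1.txt, 2.txt, ... without gaps.

```php
<?php
// Hypothetical helper: appends $word to the highest-numbered file in
// $root/<1st char>/<2nd char>/, moving on to the next number once that
// file already holds $max lines. Returns the path that was written to.
function appendWord(string $root, string $word, int $max = 500): string
{
    $dir = $root . '/' . $word[0] . '/' . $word[1];
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true); // recursive: creates a/ and a/a/ in one call
    }

    // With gap-free numbering, the file count is the current file number.
    $n = max(1, count(glob($dir . '/*.txt') ?: []));
    $path = $dir . '/' . $n . '.txt';

    if (is_file($path) && count(file($path, FILE_IGNORE_NEW_LINES)) >= $max) {
        $path = $dir . '/' . ($n + 1) . '.txt';
    }

    file_put_contents($path, $word . "\n", FILE_APPEND | LOCK_EX);
    return $path;
}

// Demo in a throwaway temp directory rather than the real $save500.
$root = sys_get_temp_dir() . '/save500_' . uniqid();
echo appendWord($root, 'aardvark'), "\n"; // path ends in /a/a/1.txt
echo appendWord($root, 'xenia'), "\n";    // path ends in /x/e/1.txt
```

Counting the lines of the current file on every call is simple but slow for bulk imports; for a one-off conversion, grouping and chunking in memory first is likely the faster route.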

I will then be able to search for those words more efficiently, without loading a ton into memory or looping through files and lines that cannot contain a match.
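For illustration, a lookup against that two-level layout might look like the sketch below. `wordExists()` is a hypothetical name, and it assumes the search word has at least two characters and that each file holds one word per line.

```php
<?php
// Hypothetical lookup: scans only the folder that the word's first two
// characters point to, reading one line at a time.
function wordExists(string $root, string $word): bool
{
    $dir = $root . '/' . $word[0] . '/' . $word[1];

    // glob() can return false on some systems, hence the fallback to [].
    foreach (glob($dir . '/*.txt') ?: [] as $path) {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            continue;
        }
        while (($line = fgets($handle)) !== false) {
            if (rtrim($line, "\r\n") === $word) {
                fclose($handle);
                return true;
            }
        }
        fclose($handle);
    }
    return false;
}
```

Because fgets() streams each file, at most one line is held in memory at a time, and a word whose folder does not exist is rejected without opening any file.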

Thanks everyone!


Answer 1:


I wrote some code that should do what you're looking for. It's no performance beauty, but it should do the job. Try it in a safe environment; I can't guarantee against data loss ;)

Comment if there are any errors. It's pretty late here and I have to get some sleep ;)

NOTE: This only works if every line has at least 2 characters! ;)

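// Carried over from the question. $nextfile is reset to 1 for every
// subfolder further down, so the value computed here is never used.
$nextfile=0;

if (glob("" . $incoming . "/*.txt") != false){
  $nextfile = count(glob("" . $save500 . "/*.txt"));
  $nextfile++;
}
else
{
  $nextfile = 1;
}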



$files = glob($incoming."/*.txt");
$lines = array();
foreach($files as $file){
  $lines = array_merge($lines, file($file, FILE_SKIP_EMPTY_LINES | FILE_IGNORE_NEW_LINES));
}


$lines = array_unique($lines);


/*this would put them all in one file*/
/*file_put_contents($dirname."/done/allofthem.txt", implode("\n", $lines));*/
/*this breaks them into files of 500*/

// sort array
sort($lines);

// outer grouping
$groups     = groupArray($lines, 0);
$group_keys = array_keys($groups);

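foreach($group_keys as $cKey) {
  // inner grouping
  $groups[$cKey] = groupArray($groups[$cKey], 1);

  foreach($groups[$cKey] as $innerKey => $innerArray) {
    // file_put_contents() does not create missing folders, so create them first
    $dir = $save500 . "/" . $cKey . "/" . $innerKey;
    if (!is_dir($dir)) {
      mkdir($dir, 0777, true);
    }

    // numbering restarts at 1 in every subfolder
    $nextfile = 1;
    foreach(array_chunk($innerArray, 500) as $chunk) {
      file_put_contents($dir . "/" . $nextfile . ".txt", implode("\n", $chunk));
      $nextfile++;
    }
  }

}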


function groupArray($data, $offset) {

  $grouped = array();

  foreach($data as $cLine) {
    $key = substr($cLine, $offset, 1);
    if(!isset($grouped[$key])) {
      $grouped[$key] = array($cLine);
    } 
    else
    {
      $grouped[$key][] = $cLine;
    }
  }

  return $grouped;
}
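
The two-character restriction noted above comes from substr() finding nothing at offset 1 in a one-character line, which leaves an empty folder name. One possible workaround, sketched here under the assumption that a placeholder folder is acceptable, is a variant of groupArray(); the name `groupArraySafe()` and the "_" placeholder are made up for this example.

```php
<?php
// Variant of groupArray(): lines too short to have a character at
// $offset are filed under a placeholder key instead of an empty one.
function groupArraySafe(array $data, int $offset, string $placeholder = '_'): array
{
    $grouped = [];
    foreach ($data as $line) {
        $key = strlen($line) > $offset ? $line[$offset] : $placeholder;
        $grouped[$key][] = $line;
    }
    return $grouped;
}

// 'a' has no second character, so it is grouped under '_'.
$groups = groupArraySafe(['a', 'ab', 'ac', 'b1'], 1);
```

PHP stores a digit key such as "1" as the integer 1, but it converts back to the same text when concatenated into a path, so the folder names are unaffected.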


Source: https://stackoverflow.com/questions/3704165/php-alpha-sort-lines-from-several-files-in-one-directory-and-save-them-to-file
