Parsing e-mail-like headers (similar to RFC822)

猫巷女王i 2021-01-20 14:51

Problem / Question

There is a database of bot information that I would like to parse. It is said to be similar to RFC822 messages.

Before I re-invent the

2 Answers
  •  野趣味 (original poster)
    2021-01-20 15:47

    Assuming that $data contains the sample data you pasted above, here is the parser:

    
    // $data is assumed to hold the raw records to parse
    // (the sample data from the question).
    
    $parsed = array();
    // Records are separated by one or more blank lines.
    $blocks = preg_split('/\n{2,}/', trim($data));
    foreach ($blocks as $i => $block) {
        $parsed[$i] = array();
        // Match "Name: value" at the start of a line; the value may continue
        // on following lines that begin with a space or tab. Anchoring at the
        // line start keeps values such as URLs from being read as new fields.
        preg_match_all('/^([\w.-]+): *(.*(?:\n[ \t]+.*)*)/m',
                       $block, $matches, PREG_SET_ORDER);
        foreach ($matches as $match) {
            // Unfold continuation lines into a single space-separated value.
            $parsed[$i][$match[1]] = preg_replace('/\n[ \t]+/', ' ',
                                                  trim($match[2]));
        }
    }
    
    print_r($parsed);
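
For comparison, the same idea can be sketched without one large regular expression: walk the input line by line and treat any line that begins with whitespace as a continuation of the previous field, which is how RFC 822 folds long headers. This is only a minimal sketch under assumptions. The function name parseHeaderBlocks and the sample records are hypothetical and are not taken from the actual database; a repeated field name within one record simply overwrites the earlier value.

```php
<?php
// Hypothetical helper: parse blank-line-separated records of "Name: value" lines.
function parseHeaderBlocks(string $data): array
{
    $records = [];
    $current = [];
    $lastKey = null;

    // Normalise line endings, then walk the input one line at a time.
    $lines = explode("\n", str_replace(["\r\n", "\r"], "\n", $data));
    foreach ($lines as $line) {
        if (trim($line) === '') {
            // A blank line ends the current record.
            if ($current !== []) {
                $records[] = $current;
            }
            $current = [];
            $lastKey = null;
        } elseif (($line[0] === ' ' || $line[0] === "\t") && $lastKey !== null) {
            // Leading whitespace marks a folded continuation of the previous field.
            $current[$lastKey] = trim($current[$lastKey] . ' ' . trim($line));
        } elseif (preg_match('/^([\w.-]+):\s*(.*)$/', $line, $m)) {
            // A new "Name: value" field.
            $lastKey = $m[1];
            $current[$lastKey] = trim($m[2]);
        }
        // Lines that fit none of the cases are ignored in this sketch.
    }
    if ($current !== []) {
        $records[] = $current;
    }
    return $records;
}

// Made-up sample input, for illustration only.
$sample = "robot-id: example\n"
        . "robot-description: first line\n"
        . "  second line\n"
        . "\n"
        . "robot-id: other\n"
        . "robot-url: http://example.com/\n";

print_r(parseHeaderBlocks($sample));
```

Because each field name is matched only at the start of a line, a value that itself contains a colon (such as a URL) stays part of the value instead of being mistaken for a new field.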
    
