PHP library for creating/manipulating fixed-width text files

走了就别回头了 2021-02-05 09:27

We have a web application that does time-tracking, payroll, and HR. As a result, we have to write a lot of fixed-width data files for export into other systems (state tax filing

7 Answers
  • 2021-02-05 09:35

    I have happily used this class for a similar purpose before. It is a php-classes file, but it is very well rated and has been tried and tested by many. It is not new (2003), but it still does a very fine job and has a decent, clean API that looks somewhat like the example you posted, with many other goodies added.

    If you can disregard the German in the examples and the age of the code, it is a very decent piece of code.

    Posted from the example:
    
    
    //CSV-Datei mit Festlängen-Werten 
    echo "<p>Import aus der Datei fixed.csv</p>"; 
    $csv_import2 = new CSVFixImport; 
    $csv_import2->setFile("fixed.csv"); 
    $csv_import2->addCSVField("Satzart", 2); 
    $csv_import2->addCSVField("Typ", 1); 
    $csv_import2->addCSVField("Gewichtsklasse", 1); 
    $csv_import2->addCSVField("Marke", 4); 
    $csv_import2->addCSVField("interne Nummer", 4); 
    
    
    $csv_import2->addFilter("Satzart", "==", "020"); 
    $csv_import2->parseCSV(); 
    if($csv_import2->isOK())
    { 
        echo "Anzahl der Datensätze: <b>" . $csv_import2->CSVNumRows() . "</b><br>"; 
        echo "Anzahl der Felder: <b>" . $csv_import2->CSVNumFields() . "</b><br>"; 
        echo "Name des 1.Feldes: <b>" . $csv_import2->CSVFieldName(0) . "</b><br>"; 
    
        $csv_import2->dumpResult(); 
    }
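
For comparison, the same field layout can also be read with PHP's built-in `unpack()`, without any class. This is only a minimal sketch: the sample line is made up, and the lowercase keys are stand-ins for the field names used above.

```php
<?php
// Field widths mirror the addCSVField() calls above: 2, 1, 1, 4, 4.
// "A<n>" reads n bytes as a space-padded string and drops trailing whitespace.
$layout = 'A2satzart/A1typ/A1gewichtsklasse/A4marke/A4nummer';

// Made-up sample record, 12 characters wide.
$line = '02PLVW  0042';

$fields = unpack($layout, $line);

foreach ($fields as $name => $value) {
    echo $name, ': "', $value, '"', PHP_EOL;
}
```

Because the `A` code trims trailing padding, a field that is legitimately all blanks comes back as an empty string.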
    

    My 2 cents, good-luck!

  • 2021-02-05 09:40

    If this is a text file with separated fields, you will need to write it yourself. That is probably not a big problem, and good organization will save a lot of time.

    1. You need a universal way of defining structures, e.g. XML.
    2. You need something to generate the output. Personally, I prefer Smarty templating for this.

    So this one:

    <group>
      <entry>123</entry>
      <entry>123</entry>
      <entry>123</entry>
    </group>
    

    can easily be rendered into text with this template:

    {section name=x1 loop=$level1_arr}
      {* output the root's fields here *}
      {section name=x2 loop=$level1_arr[x1].level2_arr}
        {* output the entry's fields here *}
      {/section}
    {/section}
    

    This is just an idea.

    But imagine:

    1. You need XML.
    2. You need a template.

    That is, two definitions are enough to abstract any text structure.
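
To make the idea concrete without Smarty, here is a rough sketch in plain PHP that reads a record definition from XML and pads each value to its declared width. The `<record>` and `<field>` elements, their `name` and `length` attributes, and the `renderRecord()` function are all assumptions made up for this illustration.

```php
<?php
// Hypothetical record definition; element and attribute names are assumptions.
$definition = '<record>'
    . '<field name="id" length="1"/>'
    . '<field name="year" length="4"/>'
    . '<field name="name" length="10"/>'
    . '</record>';

// Build one fixed-width line from the definition and a row of data.
function renderRecord(SimpleXMLElement $record, array $data)
{
    $line = '';
    foreach ($record->field as $field) {
        $name   = (string) $field['name'];
        $length = (int) $field['length'];
        $value  = isset($data[$name]) ? (string) $data[$name] : '';
        // Cut values that are too long, then pad short ones with spaces.
        $line .= str_pad(substr($value, 0, $length), $length);
    }
    return $line;
}

$record = simplexml_load_string($definition);
$line = renderRecord($record, array('id' => 'A', 'year' => 2011, 'name' => 'Willie'));

echo '[', $line, ']', PHP_EOL;
```

Here the XML carries the structure and the PHP function plays the role of the template, so a new file format only needs a new definition.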

  • 2021-02-05 09:41

    I don't know of a library that does exactly what you want, but it should be rather straightforward to roll your own classes that handle this. Assuming that you are mainly interested in writing data in these formats, I would use the following approach:

    (1) Write a lightweight formatter class for fixed-width strings. It must support user-defined record types and should be flexible with regard to allowed formats.

    (2) Instantiate this class for every file format you use and add the required record types.

    (3) Use this formatter to format your data.

    As you suggested, you could define the record types in XML and load this XML file in step (2). I don't know how experienced you are with XML, but in my experience XML formats often cause a lot of headaches (probably due to my own incompetence regarding XML). If you are going to use these classes only in your PHP program, there's not much to gain from defining your format in XML. Using XML is a good option if you will need to use the file format definitions in many other applications as well.

    To illustrate my ideas, here is how I think you would use this suggested formatter class:

    <?php
    include 'FixedWidthFormatter.php'; // contains the FixedWidthFormatter class
    include 'icesa-format-declaration.php'; // contains $icesaFormatter
    $file = fopen("icesafile.txt", "w");
    
    fputs ($file, $icesaFormatter->formatRecord( 'A-RECORD', array( 
        'year' => 2011, 
        'tein' => '12-3456789-P',
        'tname'=> 'Willie Nelson'
    )));
    // output: A2011123456789UTAX     Willie Nelson                                     
    
    // etc...
    
    fclose ($file);
    ?>
    

    The file icesa-format-declaration.php could contain the declaration of the format somehow like this:

    <?php
    $icesaFormatter = new FixedWidthFormatter();
    $icesaFormatter->addRecordType( 'A-RECORD', array(
        // the first field is the record identifier
        // for A records, this is simply the character A
        'record-identifier' => array(
            'value' => 'A',  // constant string
            'length' => 1 // not strictly necessary
                          // used for error checking
        ),
        // the year is a 4 digit field
        // it can simply be formatted printf style
        // sourceField defines which key from the input array is used
        'year' =>  array(
            'format' => '% -4d',  // 4 characters, left justified, space padded
            'length' => 4,
            'sourceField' => 'year'
        ),
        // the EIN is a more complicated field
        // we must strip hyphens and suffixes, so we define
        // a closure that performs this formatting
        'transmitter-ein' => array(
            'formatter'=> function($EIN){
                $cleanedEIN =  preg_replace('/\D+/','',$EIN); // remove anything that's not a digit
                return sprintf('%-9s', $cleanedEIN); // left justified, padded with blanks; %s keeps leading zeros
            },
            'length' => 9,
            'sourceField' => 'tein'
        ),
        'tax-entity-code' => array(
            'value' => 'UTAX',  // constant string
            'length' => 4
        ),
        'blanks' => array(
            'value' => '     ',  // constant string
            'length' => 5
        ),
        'transmitter-name' =>  array(
            'format' => '% -50s',  // 50 characters, left justified, space padded
            'length' => 50,
            'sourceField' => 'tname'
        ),
        // etc. etc.
    ));
    ?>
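
The printf-style formats used in such a declaration can be tried out in isolation. The snippet below is a small sketch with made-up values. It shows left-justified padding (space is the default pad character, so `%-4d` behaves like `% -4d`), and two things to watch: `%d` casts its argument to an integer, so a digit string loses a leading zero, and a width alone never truncates a value that is too long.

```php
<?php
// Left-justified, space-padded integer and string fields.
$year = sprintf('%-4d', 2011);
$name = sprintf('%-10s', 'Willie');

// %d casts to integer first, so a leading zero is dropped.
$einAsNumber = sprintf('%-9d', '012345678');
// %s keeps the digits exactly as given.
$einAsString = sprintf('%-9s', '012345678');

// A width never truncates; a precision on %s does.
$tooLong   = sprintf('%-4s', 'Nelson');
$truncated = sprintf('%-4.4s', 'Nelson');

var_dump($year, $name, $einAsNumber, $einAsString, $tooLong, $truncated);
```

For identifiers that are digits but not really numbers, such as an EIN or SSN, `%s` is usually the safer choice.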
    

    Then you only need the FixedWidthFormatter class itself, which could look like this:

    <?php
    
    class FixedWidthFormatter {
    
        var $recordTypes = array();
    
        function addRecordType( $recordTypeName, $recordTypeDeclaration ){
            // perform some checking to make sure that $recordTypeDeclaration is valid
            $this->recordTypes[$recordTypeName] = $recordTypeDeclaration;
        }
    
        function formatRecord( $type, $data ) {
            if (!array_key_exists($type, $this->recordTypes)) {
                trigger_error("Undefined record type: '$type'");
                return "";
            }
            $output = '';
            $typeDeclaration = $this->recordTypes[$type];
            foreach($typeDeclaration as $fieldName => $fieldDeclaration) {
                // there are three possible field variants:
                //  - constant fields
                //  - fields formatted with printf
                //  - fields formatted with a custom function/closure
                if (array_key_exists('value',$fieldDeclaration)) {
                    $value = $fieldDeclaration['value'];
                } else if (array_key_exists('format',$fieldDeclaration)) {
                    $value = sprintf($fieldDeclaration['format'], $data[$fieldDeclaration['sourceField']]);
                } else if (array_key_exists('formatter',$fieldDeclaration)) {
                    $value = $fieldDeclaration['formatter']($data[$fieldDeclaration['sourceField']]);
                } else {
                    trigger_error("Invalid field declaration for field '$fieldName' record type '$type'");
                    return '';
                }
    
                // check if the formatted value has the right length
                if (strlen($value)!=$fieldDeclaration['length']) {
                    trigger_error("The formatted value '$value' for field '$fieldName' record type '$type' is not of correct length ({$fieldDeclaration['length']}).");
                    return '';
                }
                $output .= $value;
            }
            return $output . "\n";
        }
    }
    
    
    ?>
    

    If you need read support as well, the Formatter class could be extended to allow reading as well, but this might be beyond the scope of this answer.
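
As a rough idea of what read support could look like, the sketch below slices a line into fields using only the `length` entries of a record type declaration. The function `parseFixedWidthRecord()` is hypothetical and not part of the class above, and the shortened declaration is just for illustration.

```php
<?php
// Hypothetical helper: split one line into named fields by walking the
// declaration in order and cutting 'length' characters for each field.
function parseFixedWidthRecord(array $typeDeclaration, $line)
{
    $fields = array();
    $offset = 0;
    foreach ($typeDeclaration as $fieldName => $fieldDeclaration) {
        $length = $fieldDeclaration['length'];
        // Trailing padding is removed; leading characters are kept as they are.
        $fields[$fieldName] = rtrim(substr($line, $offset, $length));
        $offset += $length;
    }
    return $fields;
}

// Shortened declaration with the same keys and lengths as the A-RECORD above.
$declaration = array(
    'record-identifier' => array('length' => 1),
    'year'              => array('length' => 4),
    'transmitter-ein'   => array('length' => 9),
    'tax-entity-code'   => array('length' => 4),
);

$fields = parseFixedWidthRecord($declaration, 'A2011123456789UTAX');
print_r($fields);
```

This is why keeping `length` in every field declaration pays off, even where it is not strictly needed for writing.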

  • 2021-02-05 09:43

    I'm sorry I can't point you to a specific class. I have seen something that does this, but I can't remember where. Still, it should be simple for a coder to build.

    Here is how I have seen it work in an example:

    PHP reads in the data.

    PHP then uses a flag (e.g. $_GET['type']) to decide how to output the data, e.g. printer, HTML, or Excel.

    So you build a template file for each version and, depending on the flag, load and use the matching template. As for fixed width, this is an HTML thing, not PHP, so it should be done in the templates' CSS.

    From this you can output your data however a user requires it.

    Smarty templates are quite good for this, together with the PHP header() function to send the content type when required.
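
The flag-based dispatch described above might be sketched like this in plain PHP, with closures standing in for the templates. Everything here is hypothetical: the `$renderers` table, the `renderAs()` function, the `fixed` format with its 10-character columns, and the sample row.

```php
<?php
// Hypothetical renderers, keyed by the value of the output flag.
// Each entry holds a content type and a function that builds the body.
$renderers = array(
    'html' => array('text/html', function (array $rows) {
        $body = '';
        foreach ($rows as $row) {
            $cells = array_map('htmlspecialchars', $row);
            $body .= '<tr><td>' . implode('</td><td>', $cells) . "</td></tr>\n";
        }
        return "<table>\n" . $body . "</table>\n";
    }),
    'fixed' => array('text/plain', function (array $rows) {
        $body = '';
        foreach ($rows as $row) {
            // Every column is cut and padded to 10 characters in this sketch.
            foreach ($row as $value) {
                $body .= str_pad(substr($value, 0, 10), 10);
            }
            $body .= "\n";
        }
        return $body;
    }),
);

function renderAs(array $renderers, $type, array $rows)
{
    if (!isset($renderers[$type])) {
        $type = 'html'; // fall back to a default for unknown flags
    }
    list($contentType, $render) = $renderers[$type];
    return array($contentType, $render($rows));
}

// In a web request the flag would come from $_GET['type'], and the content
// type would be sent with header('Content-Type: ' . $contentType).
list($contentType, $body) = renderAs($renderers, 'fixed', array(array('Nelson', '2011')));
echo $body;
```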

  • 2021-02-05 09:47

    I don't know of any PHP library that specifically handles fixed-width records. But there are some good libraries for filtering and validating a row of data fields if you can do the job of breaking up each line of the file yourself.

    Take a look at the Zend_Filter and Zend_Validate components from Zend Framework. I think both components are fairly self-contained and require only Zend_Loader to work. If you want you can pull just those three components out of Zend Framework and delete the rest of it.

    Zend_Filter_Input acts like a collection of filters and validators. You define a set of filters and validators for each field of a data record which you can use to process each record of a data set. There are lots of useful filters and validators already defined and the interface to write your own is pretty straightforward. I suggest the StringTrim filter for removing padding characters.

    To break up each line into fields I would extend the Zend_Filter_Input class and add a method called setDataFromFixedWidth(), like so:

    class My_Filter_Input extends Zend_Filter_Input
    {
        public function setDataFromFixedWidth($record, array $recordRules)
        {
            if (array_key_exists('regex', $recordRules)) {
                $recordRules = array($recordRules);
            }
    
            foreach ($recordRules as $rule) {
                $matches = array();
                if (preg_match($rule['regex'], $record, $matches)) {
                    // $matches[0] is the whole match, so drop it before pairing with the field names
                    $data = array_combine($rule['fields'], array_slice($matches, 1));
                    return $this->setData($data);
                }
            }
    
            return $this->setData(array());
        }
    
    }
    

    And define the various record types with simple regular expressions and matching field names. ICESA might look something like this:

    $recordRules = array(
        array(
            'regex'  => '/^(A)(.{4})(.{9})(.{4})/',  // This is only the first four fields, obviously
            'fields' => array('recordId', 'year', 'federalEin', 'taxingEntity',),
        ),
        array(
            'regex'  => '/^(B)(.{4})(.{9})(.{8})/',
            'fields' => array('recordId', 'year', 'federalEin', 'computer',),
        ),
        array(
            'regex'  => '/^(E)(.{4})(.{9})(.{9})/',
            'fields' => array('recordId', 'paymentYear', 'federalEin', 'blank1',),
        ),
        array(
            'regex'  => '/^(S)(.{9})(.{20})(.{12})/',
            'fields' => array('recordId', 'ssn', 'lastName', 'firstName',),
        ),
        array(
            'regex'  => '/^(T)(.{7})(.{4})(.{14})/',
            'fields' => array('recordId', 'totalEmployees', 'taxingEntity', 'stateQtrTotal'),
        ),
        array(
            'regex'  => '/^(F)(.{10})(.{10})(.{4})/',
            'fields' => array('recordId', 'totalEmployees', 'totalEmployers', 'taxingEntity',),
        ),
    );
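
To see what one of these rules produces, here is a small standalone sketch that applies the A-record rule to a made-up line, outside of Zend Framework. Note that `preg_match()` stores the whole match at index 0, so that element is dropped before the captured groups are paired with the field names.

```php
<?php
// The A-record rule from the list above.
$rule = array(
    'regex'  => '/^(A)(.{4})(.{9})(.{4})/',
    'fields' => array('recordId', 'year', 'federalEin', 'taxingEntity'),
);

// Made-up sample line.
$line = 'A2011123456789UTAX     Willie Nelson';

$data = array();
if (preg_match($rule['regex'], $line, $matches)) {
    // Skip $matches[0] (the full match) and keep only the captured groups.
    $data = array_combine($rule['fields'], array_slice($matches, 1));
}

print_r($data);
```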
    

    Then you can read your data file line by line and feed it into the input filter:

    $input = new My_Filter_Input($inputFilterRules, $inputValidatorRules);
    foreach (file($filename) as $line) {
        $input->setDataFromFixedWidth($line, $recordRules);
        if ($input->isValid()) {
            // do something useful
        }
        else {
            // scream and shout
        }
    }
    

    To format data for writing back to the file, you would probably want to write your own StringPad filter that wraps the internal str_pad function. Then for each record in your data set:

    $output = new My_Filter_Input($outputFilterRules);
    foreach ($dataset as $record) {
        $output->setData($record);
        $line = implode('', $output->getEscaped()) . "\n";
        fwrite($outputFile, $line);
    }
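
The StringPad filter mentioned above could look roughly like the following. This is a sketch, not an existing Zend component: the class name `My_Filter_StringPad` and its constructor arguments are assumptions. Under Zend Framework 1 the class would additionally declare `implements Zend_Filter_Interface`, which requires a `filter($value)` method. The interface is left out here so the snippet runs on its own.

```php
<?php
// Hypothetical filter that wraps str_pad() to produce fixed-width values.
class My_Filter_StringPad
{
    private $length;
    private $padString;
    private $padType;

    public function __construct($length, $padString = ' ', $padType = STR_PAD_RIGHT)
    {
        $this->length    = $length;
        $this->padString = $padString;
        $this->padType   = $padType;
    }

    public function filter($value)
    {
        // Cut values that are too long, then pad to the fixed width.
        $value = substr((string) $value, 0, $this->length);
        return str_pad($value, $this->length, $this->padString, $this->padType);
    }
}

// Text fields are usually left-justified, numeric fields right-justified.
$nameFilter   = new My_Filter_StringPad(10);
$amountFilter = new My_Filter_StringPad(8, '0', STR_PAD_LEFT);

$name   = $nameFilter->filter('Nelson');
$amount = $amountFilter->filter(1250);

echo '[', $name, '][', $amount, ']', PHP_EOL;
```

Truncating silently is a design choice in this sketch. For tax filings it may be safer to raise an error when a value does not fit.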
    

    Hope this helps!

  • 2021-02-05 09:53

    Perhaps the dbase functions are what you want to use. They are not OOP, but it probably would not be too difficult to build a class that would act on the functions provided in the dbase set.

    Take a look at the link below for details on dbase functionality available in PHP. If you're just looking to create a file for import into another system, these functions should work for you. Just make sure you pay attention to the warnings. Some of the key warnings are:

    • There is no support for indexes or memo fields.
    • There is no support for locking.
    • Two concurrent web server processes modifying the same dBase file will very likely ruin your database.

    http://php.net/manual/en/book.dbase.php
