We have a web application that does time-tracking, payroll, and HR. As a result, we have to write a lot of fixed-width data files for export into other systems (state tax filing
I don't know of a library that does exactly what you want, but it should be fairly straightforward to roll your own classes for this. Assuming you are mainly interested in writing data in these formats, I would use the following approach:
(1) Write a lightweight formatter class for fixed-width strings. It must support user-defined record types and should be flexible about the formats it allows.
(2) Instantiate this class for every file format you use and add the required record types.
(3) Use this formatter to format your data.
As you suggested, you could define the record types in XML and load this XML file in step (2). I don't know how experienced you are with XML, but in my experience XML formats often cause a lot of headaches (probably due to my own incompetence regarding XML). If you are going to use these classes only in your PHP program, there is not much to gain from defining your format in XML. XML is a good option if you need to use the file format definitions in many other applications as well.
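If you do go the XML route, a loader along these lines could turn the definitions into the same nested arrays used in the declaration further down. This is only a sketch under assumptions: the XML layout (the format, record and field elements and their attributes) and the loadRecordTypes function are invented for illustration, and PHP's SimpleXML extension is assumed to be available. Closure-based formatters cannot be expressed in XML this way, so fields like the EIN would still need to be declared in PHP.

```php
<?php
// Hypothetical XML layout; element and attribute names are assumptions.
$xml = <<<XML
<format>
  <record name="A-RECORD">
    <field name="record-identifier" length="1" value="A"/>
    <field name="year" length="4" format="%-4d" sourceField="year"/>
    <field name="transmitter-name" length="50" format="%-50s" sourceField="tname"/>
  </record>
</format>
XML;

// Convert the XML into nested arrays: record name => field name => declaration.
function loadRecordTypes($xmlString) {
    $recordTypes = array();
    $doc = simplexml_load_string($xmlString);
    foreach ($doc->record as $record) {
        $fields = array();
        foreach ($record->field as $field) {
            $declaration = array('length' => (int) $field['length']);
            // copy only the optional attributes that are actually present
            foreach (array('value', 'format', 'sourceField') as $key) {
                if (isset($field[$key])) {
                    $declaration[$key] = (string) $field[$key];
                }
            }
            $fields[(string) $field['name']] = $declaration;
        }
        $recordTypes[(string) $record['name']] = $fields;
    }
    return $recordTypes;
}

$recordTypes = loadRecordTypes($xml);
```

Each entry of $recordTypes could then be passed to addRecordType in step (2).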
To illustrate my ideas, here is how I think you would use this suggested formatter class:
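<?php
// NOTE: the opening lines of this example were lost. The includes, the
// output file name and the $formatter variable are reconstructed assumptions.
include 'FixedWidthFormatter.php';
include 'icesa-format-declaration.php';

$file = fopen('icesa.txt', 'w'); // hypothetical output path

fwrite($file, $formatter->formatRecord( 'A-RECORD', array(
    'year' => 2011,
    'tein' => '12-3456789-P',
    'tname'=> 'Willie Nelson'
)));
// output: A2011123456789UTAX     Willie Nelson
// (the name is followed by blanks up to 50 characters)

// etc...
fclose ($file);
?>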
The file icesa-format-declaration.php could contain the declaration of the format, somewhat like this:
<?php
// NOTE: the line creating the formatter was lost; the $formatter variable
// name is a reconstructed assumption.
$formatter = new FixedWidthFormatter();
$formatter->addRecordType( 'A-RECORD', array(
    // the first field is the record identifier
    // for A records, this is simply the character A
    'record-identifier' => array(
        'value'  => 'A', // constant string
        'length' => 1    // not strictly necessary,
                         // used for error checking
    ),
    // the year is a 4 digit field
    // it can simply be formatted printf style
    // sourceField defines which key from the input array is used
    'year' => array(
        'format'      => '% -4d', // 4 characters, left justified, space padded
        'length'      => 4,
        'sourceField' => 'year'
    ),
    // the EIN is a more complicated field
    // we must strip hyphens and suffixes, so we define
    // a closure that performs this formatting
    'transmitter-ein' => array(
        'formatter' => function($EIN){
            $cleanedEIN = preg_replace('/\D+/','',$EIN); // remove anything that's not a digit
            // left justified and padded with blanks;
            // %s rather than %d so that leading zeros are not dropped
            return sprintf('%-9s', $cleanedEIN);
        },
        'length'      => 9,
        'sourceField' => 'tein'
    ),
    'tax-entity-code' => array(
        'value'  => 'UTAX', // constant string
        'length' => 4
    ),
    'blanks' => array(
        'value'  => str_repeat(' ', 5), // constant string of 5 blanks
        'length' => 5
    ),
    'transmitter-name' => array(
        'format'      => '% -50s', // 50 characters, left justified, space padded
        'length'      => 50,
        'sourceField' => 'tname'
    ),
    // etc. etc.
));
?>
Then you only need the FixedWidthFormatter class itself, which could look like this:
<?php
// NOTE: the class header and the addRecordType signature were lost and are
// reconstructed from the surviving body and from the calls shown above.
class FixedWidthFormatter {

    // record type name => array of field declarations
    protected $recordTypes = array();

    function addRecordType( $recordTypeName, $recordTypeDeclaration ) {
        $this->recordTypes[$recordTypeName] = $recordTypeDeclaration;
    }

    function formatRecord( $type, $data ) {
        if (!array_key_exists($type, $this->recordTypes)) {
            trigger_error("Undefined record type: '$type'");
            return "";
        }
        $output = '';
        $typeDeclaration = $this->recordTypes[$type];
        foreach($typeDeclaration as $fieldName => $fieldDeclaration) {
            // there are three possible field variants:
            // - constant fields
            // - fields formatted with printf
            // - fields formatted with a custom function/closure
            if (array_key_exists('value',$fieldDeclaration)) {
                $value = $fieldDeclaration['value'];
            } else if (array_key_exists('format',$fieldDeclaration)) {
                $value = sprintf($fieldDeclaration['format'], $data[$fieldDeclaration['sourceField']]);
            } else if (array_key_exists('formatter',$fieldDeclaration)) {
                $value = $fieldDeclaration['formatter']($data[$fieldDeclaration['sourceField']]);
            } else {
                trigger_error("Invalid field declaration for field '$fieldName' record type '$type'");
                return '';
            }
            // check if the formatted value has the right length
            // (skipped when the declaration has no 'length' entry)
            if (isset($fieldDeclaration['length']) && strlen($value) != $fieldDeclaration['length']) {
                trigger_error("The formatted value '$value' for field '$fieldName' record type '$type' is not of correct length ({$fieldDeclaration['length']}).");
                return '';
            }
            $output .= $value;
        }
        return $output . "\n";
    }
}
?>
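One thing to keep in mind about the length check in formatRecord: sprintf pads values that are too short, but it does not shorten values that are too long, so an overlong name would fail the check. A small sketch of the difference, using a 10-character width purely for illustration:

```php
<?php
// sprintf pads short values but leaves long values untouched.
$padded  = sprintf('%-10s', 'Willie');             // "Willie" plus 4 blanks
$tooLong = sprintf('%-10s', 'Willie Hugh Nelson'); // still 18 characters

// Adding a precision (".10") cuts the string off at 10 characters.
$truncated = sprintf('%-10.10s', 'Willie Hugh Nelson'); // "Willie Hug"

// For numeric fields, the "0" flag pads with zeros on the left.
$zeroPadded = sprintf('%09d', 1234567); // "001234567"
```

Whether silent truncation is acceptable depends on the target file format; failing the length check may well be the safer behaviour. Note also that sprintf and strlen count bytes, so multi-byte (e.g. UTF-8) names would need separate handling.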
If you need read support as well, the formatter class could be extended to parse records too, but this might be beyond the scope of this answer.
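As a rough starting point only, reading could reuse the same declarations by slicing each line according to the 'length' entries. The parseRecord function below is hypothetical and not part of the class above; it assumes every field declares a 'length' and that values are padded with trailing blanks.

```php
<?php
// Hypothetical helper: split one fixed-width line into named fields,
// using only the 'length' entry of each field declaration.
function parseRecord(array $typeDeclaration, $line) {
    $fields = array();
    $offset = 0;
    foreach ($typeDeclaration as $fieldName => $fieldDeclaration) {
        $length = $fieldDeclaration['length'];
        // substr returns the raw slice; rtrim drops the trailing blank padding
        $fields[$fieldName] = rtrim(substr($line, $offset, $length));
        $offset += $length;
    }
    return $fields;
}

// Field lengths taken from the A-RECORD declaration in this answer.
$declaration = array(
    'record-identifier' => array('length' => 1),
    'year'              => array('length' => 4),
    'transmitter-ein'   => array('length' => 9),
    'tax-entity-code'   => array('length' => 4),
    'blanks'            => array('length' => 5),
    'transmitter-name'  => array('length' => 50),
);
$line = 'A2011123456789UTAX' . str_repeat(' ', 5) . str_pad('Willie Nelson', 50);
$record = parseRecord($declaration, $line);
// $record['year'] is '2011', $record['transmitter-name'] is 'Willie Nelson'
```

This only recovers the raw strings. Undoing a custom formatter (such as restoring the hyphens of an EIN) would need a matching parser closure per field.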