Strategies to handle a file with multiple fixed formats

北海茫月 2021-01-12 22:40

This question is not Perl-specific (although the unpack function will most probably figure into my implementation).

I have to deal with files where multipl

6 answers
  •  臣服心动
    2021-01-12 23:10

    This sounds like the sort of thing a state machine is good at. One way to do a state machine in Perl is as an object, where each state is a method. The object gives you a place to store the structure you're building, and any intermediate state you need (like the filehandle you're reading from).

    my $state = 'expect_fmt1';
    while (defined $state) {
      $state = $object->$state();
    }
    ...
    sub expect_fmt1 {
      my $self = shift;
      # read format 1, parse it, store it in object
      return 'expect_fmt2';
    }
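
To make the dispatch loop above concrete, here is a minimal runnable sketch. The class name Parser, the two record layouts (a header line "H" plus an 8-character name, then a detail line "D" plus a 3-digit code and a 5-character label), and the sample data are all invented for illustration; the real formats would come from your own file specification.

```perl
use strict;
use warnings;

package Parser;   # hypothetical class name

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, records => [] }, $class;
}

# Drive the machine: each state method returns the name of the next
# state, or nothing (undef in scalar context) to stop.
sub run {
    my $self  = shift;
    my $state = 'expect_header';
    while (defined $state) {
        $state = $self->$state();
    }
    return $self->{records};
}

# Assumed layout: 1-char tag "H", then an 8-char space-padded name.
sub expect_header {
    my $self = shift;
    my $line = readline $self->{fh};
    return unless defined $line;              # end of input: stop
    chomp $line;
    my ($tag, $name) = unpack 'A1 A8', $line;
    die "expected header, got '$line'\n" unless $tag eq 'H';
    push @{ $self->{records} }, { name => $name };
    return 'expect_detail';
}

# Assumed layout: 1-char tag "D", 3-char code, 5-char label.
sub expect_detail {
    my $self = shift;
    my $line = readline $self->{fh};
    die "missing detail line\n" unless defined $line;
    chomp $line;
    my ($tag, $code, $label) = unpack 'A1 A3 A5', $line;
    die "expected detail, got '$line'\n" unless $tag eq 'D';
    @{ $self->{records}[-1] }{qw(code label)} = ($code, $label);
    return 'expect_header';
}

package main;

# Sample data read through an in-memory filehandle.
my $data = "HALPHA   \nD001first\nHBETA    \nD002secnd\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $records = Parser->new($fh)->run;
printf "%s %s %s\n", @{$_}{qw(name code label)} for @$records;
```

Each state only knows which state comes next, so adding a third format means adding one method and changing one return value.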
    

    Some thoughts on handling the cases where you have to look at the line before deciding what to do with it:

    If the file is small enough, you could slurp it into an arrayref in the object. That makes it easy for a state to examine a line without removing it.
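
A rough sketch of the slurping approach, assuming a plain hash as a stand-in for the parser object. The helper names peek_line and next_line, the "A"/"B" line types, and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Invented input: lines starting with "A" use one layout, all others another.
my $data = "A12345\nBxy\nA67890\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
chomp(my @lines = <$fh>);

# Stand-in for the parser object: the slurped lines plus a read position.
my $self = { lines => \@lines, pos => 0 };

# Look at the current line without consuming it (undef past the end).
sub peek_line { my $s = shift; return $s->{lines}[ $s->{pos} ] }

# Consume and return the current line.
sub next_line { my $s = shift; return $s->{lines}[ $s->{pos}++ ] }

my @seen;
while (defined(my $line = peek_line($self))) {
    # Decide on the format first, then let the chosen branch consume the line.
    my $kind = substr($line, 0, 1) eq 'A' ? 'fmt_a' : 'fmt_b';
    push @seen, "$kind:" . next_line($self);
}
print "$_\n" for @seen;
```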

    If the file is too big for easy slurping, you can have a method for reading the next line along with a cache in your object that allows you to put it back:

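    use IO::Handle;  # provides the getline method on filehandles

    # Return the next line, preferring lines that were put back.
    sub get_line {
      my $self = shift;
      my $cache = $self->{line_cache};
      return shift @$cache if @$cache;
      return $self->{filehandle}->getline;
    }
    # Put one or more lines back so the next get_line returns them first.
    sub unget_line { my $self = shift; unshift @{ $self->{line_cache} }, @_ }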
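
As a usage sketch of the put-back idea: the reader below collects the lines that follow each HEAD line and pushes a line back when it turns out to belong to the next group. The class name LineReader, the HEAD marker, and the sample data are assumptions made up for this example.

```perl
use strict;
use warnings;
use IO::Handle;   # provides the getline method on filehandles

package LineReader;   # hypothetical class name

sub new {
    my ($class, $fh) = @_;
    return bless { filehandle => $fh, line_cache => [] }, $class;
}

# Return a cached (put-back) line if there is one, else read a new one.
sub get_line {
    my $self  = shift;
    my $cache = $self->{line_cache};
    return shift @$cache if @$cache;
    return $self->{filehandle}->getline;
}

# Put lines back so the next get_line call returns them first.
sub unget_line { my $self = shift; unshift @{ $self->{line_cache} }, @_ }

package main;

my $data = "HEAD\nitem 1\nitem 2\nHEAD\nitem 3\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $reader = LineReader->new($fh);

my @groups;
while (defined(my $line = $reader->get_line)) {
    chomp $line;
    next unless $line eq 'HEAD';
    my @items;
    while (defined(my $next = $reader->get_line)) {
        if ($next =~ /^HEAD/) {            # belongs to the next group
            $reader->unget_line($next);    # put it back untouched
            last;
        }
        chomp $next;
        push @items, $next;
    }
    push @groups, \@items;
}
print scalar(@$_), " item(s): @$_\n" for @groups;
```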
    

    Or, you could split the states that involve this decision into two states. The first state reads the line, stores it in $self->{current_line}, decides what format it is, and returns the state that parses & stores that format (which gets the line to parse from $self->{current_line}).
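
The two-state split might look roughly like this. The state names, the "N" (number) and "T" (text) record types, and the field widths are invented for illustration only.

```perl
use strict;
use warnings;

package Parser;   # hypothetical class name

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, out => [] }, $class;
}

# First state: read a line, stash it, and choose the parsing state
# from its leading type character.
sub read_and_classify {
    my $self = shift;
    my $line = readline $self->{fh};
    return unless defined $line;              # end of input stops the loop
    chomp $line;
    $self->{current_line} = $line;
    return $line =~ /^N/ ? 'parse_number' : 'parse_text';
}

# Second states: parse the stashed line; neither touches the filehandle.
sub parse_number {
    my $self = shift;
    my (undef, $n) = unpack 'A1 A4', $self->{current_line};
    push @{ $self->{out} }, $n + 0;           # numify, dropping leading zeros
    return 'read_and_classify';
}

sub parse_text {
    my $self = shift;
    my (undef, $t) = unpack 'A1 A*', $self->{current_line};
    push @{ $self->{out} }, uc $t;
    return 'read_and_classify';
}

package main;

my $data = "N0042\nThello\nN0007\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $parser = Parser->new($fh);
my $state  = 'read_and_classify';
$state = $parser->$state() while defined $state;
print join(',', @{ $parser->{out} }), "\n";
```

Compared with the put-back cache, this keeps all reading in one place, at the cost of one extra state per decision point.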
