A way to find the size and location of padding in a struct?

南笙 2020-12-16 02:49

I'm trying to write a tool that will take as input some C code containing structs. It will compile the code, then find and output the size and offset of any padding the com

10 answers
  • 2020-12-16 02:56

    Say you have the following module.h:

    typedef void (*handler)(void);
    
    struct foo {
      char a;
      double b;
      int c;
    };
    
    struct bar {
      float y;
      short z;
    };
    

    A Perl program to generate unpack templates begins with the customary front matter:

    #! /usr/bin/perl
    
    use warnings;
    use strict;
    
    sub usage { "Usage: $0 header\n" }
    

    The sub structs feeds the header to ctags and collects struct members from its output. The result is a hash whose keys are names of structs and whose values are arrays of pairs of the form [$member_name, $type].

    Note that it handles only a few C types: float, char, int, double, and short. Any other member type makes it die.

    sub structs {
      my($header) = @_;
    
      open my $fh, "-|", "ctags", "-f", "-", $header
        or die "$0: could not start ctags";
    
      my %struct;
      while (<$fh>) {
        chomp;
        my @f = split /\t/;
        next unless @f >= 5 &&
                    $f[3] eq "m" &&
                    $f[4] =~ /^struct:(.+)/;
    
        my $struct = $1;
        die "$0: unknown type in $f[2]"
          unless $f[2] =~ m!/\^\s*(float|char|int|double|short)\b!;
    
        # [ member-name => type ]
        push @{ $struct{$struct} } => [ $f[0] => $1 ];
      }
    
      wantarray ? %struct : \%struct;
    }
    

    Assuming that the header can be included by itself, generate_source generates a C program that prints offsets to the standard output, fills structs with dummy values, and writes the raw structures to the standard output preceded by their respective sizes in bytes.

    sub generate_source {
      my($struct,$header) = @_;
    
      my $path = "/tmp/my-offsets.c";
      open my $fh, ">", $path
        or die "$0: open $path: $!";
    
      print $fh <<EOStart;
    #include <stdio.h>
    #include <stddef.h>
    #include <$header>
    void print_buf(void *b, size_t n) {
      char *c = (char *) b;
      printf("%zu\\n", n);
      while (n--) {
        fputc(*c++, stdout);
      }
    }
    
    int main(void) {
    EOStart
    
      my $id = "a1";
      my %id;
      foreach my $s (sort keys %$struct) {
        $id{$s} = $id++;
        print $fh "struct $s $id{$s};\n";
      }
    
      my $value = 0;
      foreach my $s (sort keys %$struct) {
        for (@{ $struct->{$s} }) {
          print $fh <<EOLine;
    printf("%zu\\n", offsetof(struct $s,$_->[0]));
    $id{$s}.$_->[0] = $value;
    EOLine
          ++$value;
        }
      }
    
      print $fh qq{printf("----\\n");\n};
    
      foreach my $s (sort keys %$struct) {
        print $fh "print_buf(&$id{$s}, sizeof($id{$s}));\n";
      }
      print $fh <<EOEnd;
      return 0;
    }
    EOEnd
    
      close $fh or warn "$0: close $path: $!";
      $path;
    }
    

    Generate a template for unpack, where the parameter $members is a value in the hash returned by structs that has been augmented with offsets (i.e., arrayrefs of the form [$member_name, $type, $offset]):

    sub template {
      my($members) = @_;
    
      my %type2tmpl = (
        char => "c",
        double => "d",
        float => "f",
        int => "i!",
        short => "s!",
      );
    
      join " " =>
      map '@![' . $_->[2] . ']' . $type2tmpl{ $_->[1] } =>
      @$members;
    }
    

    Finally, we reach the main program where the first task is to generate and compile the C program:

    die usage unless @ARGV == 1;
    my $header = shift;
    
    my $struct = structs $header;
    my $src    = generate_source $struct, $header;
    
    (my $cmd = $src) =~ s/\.c$//;
    system("gcc -I`pwd` -o $cmd $src") == 0
      or die "$0: gcc failed";
    

    Now we read the generated program's output and decode the structs:

    my @todo = map @{ $struct->{$_} } => sort keys %$struct;
    
    open my $fh, "-|", $cmd
      or die "$0: start $cmd failed: $!";
    while (<$fh>) {
      last if /^-+$/;
      chomp;
      my $m = shift @todo;
      push @$m => $_;
    }
    
    if (@todo) {
      die "$0: unfilled:\n" .
          join "" => map "  - $_->[0]\n", @todo;
    }
    
    foreach my $s (sort keys %$struct) {
      chomp(my $length = <$fh> || die "$0: unexpected end of input");
      my $bytes = read $fh, my($buf), $length;
      if (defined $bytes) {
        die "$0: unexpected end of input" unless $bytes;
        print "$s: @{[unpack template($struct->{$s}), $buf]}\n";
      }
      else {
        die "$0: read: $!";
      }
    }
    

    Output:

    $ ./unpack module.h 
    bar: 0 1
    foo: 2 3 4

    For reference, the C program generated for module.h is

    #include <stdio.h>
    #include <stddef.h>
    #include <module.h>
    void print_buf(void *b, size_t n) {
      char *c = (char *) b;
      printf("%zu\n", n);
      while (n--) {
        fputc(*c++, stdout);
      }
    }
    
    int main(void) {
    struct bar a1;
    struct foo a2;
    printf("%zu\n", offsetof(struct bar,y));
    a1.y = 0;
    printf("%zu\n", offsetof(struct bar,z));
    a1.z = 1;
    printf("%zu\n", offsetof(struct foo,a));
    a2.a = 2;
    printf("%zu\n", offsetof(struct foo,b));
    a2.b = 3;
    printf("%zu\n", offsetof(struct foo,c));
    a2.c = 4;
    printf("----\n");
    print_buf(&a1, sizeof(a1));
    print_buf(&a2, sizeof(a2));
      return 0;
    }
    
  • 2020-12-16 03:01

    There is no C++ language feature to iterate through the members of a struct, so I think you're out of luck.

    You might be able to cut down some of the boilerplate with a macro, but I think you're stuck specifying all the members explicitly.

  • 2020-12-16 03:02

    I prefer to read and write into a buffer, then have a function load the structure members from the buffer. This is more portable than reading directly into a structure or using memcpy. This approach also removes any worry about compiler padding and can be adjusted to handle endianness.

    A correct and robust program is worth more than any time spent compacting binary data.

  • 2020-12-16 03:10

    Hack up Convert::Binary::C.

  • 2020-12-16 03:13

    Isn't this what pahole does?

  • 2020-12-16 03:14

    You might try pstruct.

    I've never used it, but I was looking for some way you might be able to use stabs and this sounds like it would fit the bill.

    If it doesn't, I would suggest looking at other ways to parse out stabs info.
