Pattern decoding

情话喂你 2021-01-16 03:02

I need a little help with the following. I have this kind of data file:

0 0    # <--- Group 1 -- 1 house (0) and 1 room (0)

0 0    # <--- Group 2 -- 2 ho         


        
3 Answers
  • 2021-01-16 03:07

    I would first define a House class and a Group class:

    class House:
        def __init__(self, rooms):
            self.rooms = rooms
    
    
    class Group:
        def __init__(self, index, houses):
            self.index = index
            # houses maps each house number to its room count; build the list in house-number order.
            self.houses = [House(houses[house_nr]) for house_nr in sorted(houses)]
    
        def __str__(self):
            return 'Group {}'.format(self.index)
    
        def __repr__(self):
            return 'Group {}'.format(self.index)
    

    Then parse the data into this hierarchical structure:

    import collections

    with open('in.txt') as f:
        groups = []
    
        # Variable to accumulate current group.
        group = collections.defaultdict(int)
    
        i = 1
        for line in f:
            if not line.strip():
                # Empty line found, create a new group.
                groups.append(Group(i, group))
                # Reset accumulator.
                group = collections.defaultdict(int)
                i += 1
                continue
    
            house_nr, room_nr = line.split()
            # Key by integer so sorted() orders houses numerically (as strings, '10' sorts before '2').
            group[int(house_nr)] += 1
        # Create the last group at EOF
        groups.append(Group(i, group))
    

    Then you can do stuff like this:

    found = filter(
        lambda g:
            len(g.houses) == 1 and # Group contains one house
            g.houses[0].rooms == 1, # First house contains one room
        groups)
    print(list(found)) # Prints [Group 1, Group 5, Group 6]
    
    found = filter(
        lambda g:
            len(g.houses) == 2 and # Group contains two houses
            g.houses[0].rooms == 3 and # First house contains three rooms
            g.houses[1].rooms == 2, # Second house contains two rooms
        groups)
    print(list(found)) # Prints [Group 2]
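
The two filters above differ only in the list of room counts they look for, so they can be folded into one helper. This is a minimal sketch, not part of the original answer: `parse_groups`, `find_groups` and the `SAMPLE` string are hypothetical names, the sample data is copied from the `cat file` listing in the second answer, and groups are held as plain lists of room counts instead of `Group`/`House` objects.

```python
import collections

# Sample data as listed in the thread (assumed to be the full input file).
SAMPLE = """\
0 0

0 0
0 1
0 2
1 0
1 1

0 0
0 1
0 2

0 0
1 0
2 0
3 0

0 0

0 0
"""


def parse_groups(text):
    """Return one list per group: room counts ordered by house number."""
    groups = []
    counts = collections.Counter()
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current group (empty groups are skipped).
            if counts:
                groups.append([counts[h] for h in sorted(counts)])
                counts = collections.Counter()
            continue
        house_nr, _room_nr = line.split()
        counts[int(house_nr)] += 1
    if counts:
        # Close the last group at end of input.
        groups.append([counts[h] for h in sorted(counts)])
    return groups


def find_groups(groups, pattern):
    """Return 1-based numbers of the groups whose room counts equal pattern."""
    wanted = list(pattern)
    return [i for i, rooms in enumerate(groups, start=1) if rooms == wanted]


groups = parse_groups(SAMPLE)
print(find_groups(groups, [1]))     # one house with one room
print(find_groups(groups, [3, 2]))  # two houses with three and two rooms
```

The pattern is order-sensitive: `[3, 2]` means house 0 has three rooms and house 1 has two. Sort both sides before comparing if the house order should not matter.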
    
  • I don't know what your expected output is, but I have decoded your number pattern into a readable group/house/rooms format. Any further "query" can be run on this output.

    See below:

    kent$  cat file
    0 0
    
    0 0
    0 1
    0 2
    1 0
    1 1
    
    0 0
    0 1
    0 2
    
    0 0
    1 0
    2 0
    3 0
    
    0 0
    
    0 0
    

    awk:

    kent$  awk 'BEGIN{RS=""} 
            { print "\ngroup "++g; 
            delete a;
            for(i=1;i<=NF;i++) if(i%2) a[$i]++;
            for(x in a) printf "House#: %s , Room(s): %s \n", x, a[x]; }' file
    

    This gives the output:

    group 1
    House#: 0 , Room(s): 1 
    
    group 2
    House#: 0 , Room(s): 3 
    House#: 1 , Room(s): 2 
    
    group 3
    House#: 0 , Room(s): 3 
    
    group 4
    House#: 0 , Room(s): 1 
    House#: 1 , Room(s): 1 
    House#: 2 , Room(s): 1 
    House#: 3 , Room(s): 1 
    
    group 5
    House#: 0 , Room(s): 1 
    
    group 6
    House#: 0 , Room(s): 1 
    

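    Note that the generated format can be changed to fit your "filter" or "query". awk does not guarantee the order in which for(x in a) visits the houses, so sort the lines if the order matters.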

    UPDATE

    OP's comment:

    I need to know, the number of the group(s) which have/has for example 1 house with one room. The output would be in the above case: 1, 5 ,6

    As I said, the awk output can be adjusted to suit your query criteria for the next step. I changed the awk command above to:

    awk 'BEGIN{RS=""} 
            {print "";  gid=++g; 
            delete a;
            for(i=1;i<=NF;i++) if(i%2) a[$i]++;
            for(x in a) printf "%s %s %s\n", gid,x, a[x]; }' file
    

    This outputs:

    1 0 1
    
    2 0 3
    2 1 2
    
    3 0 3
    
    4 0 1
    4 1 1
    4 2 1
    4 3 1
    
    5 0 1
    
    6 0 1
    

    The format is groupIdx houseIdx numberOfRooms, with a blank line between groups. Save the text above to a file named decoded.txt.

    Your query can then be run on this text:

    kent$  awk 'BEGIN{RS="\n\n"}{if (NF==3 && $3==1)print $1}' decoded.txt
    1
    5
    6
    

    The last awk command prints the group number when the group block contains exactly one line (NF==3, i.e. the record has three fields) and that line's room count ($3) is 1.
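
For comparison, the same query can be run on the decoded text in Python. This is a rough sketch under stated assumptions: `groups_with_single_house` is a hypothetical name, and the `DECODED` string stands in for the contents of decoded.txt (without the leading blank line the awk command prints).

```python
# Contents of decoded.txt as shown above: "groupIdx houseIdx numberOfRooms".
DECODED = """\
1 0 1

2 0 3
2 1 2

3 0 3

4 0 1
4 1 1
4 2 1
4 3 1

5 0 1

6 0 1
"""


def groups_with_single_house(decoded, rooms):
    """Return ids of groups that have exactly one house with `rooms` rooms."""
    matches = []
    # Blocks are separated by one blank line, like awk's RS="\n\n".
    for block in decoded.strip().split("\n\n"):
        fields = block.split()
        # Exactly three fields means the block holds a single house line.
        if len(fields) == 3 and int(fields[2]) == rooms:
            matches.append(int(fields[0]))
    return matches


print(groups_with_single_house(DECODED, 1))
```

Passing `3` instead of `1` would select groups with a single three-room house.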

  • 2021-01-16 03:12

    Perl solution. It converts the input into this format:

    1|0
    2|1 2
    3|2
    4|0 0 0 0
    5|0
    6|0
    

    The first column is the group number. The second column lists, for every house in the group, its number of rooms minus one (the last room index seen for that house), sorted in ascending order. To find groups with two houses that have 2 and 3 rooms, just grep for '|1 2$'; to find groups with a single one-room house, grep for '|0$'.

    #!/usr/bin/perl
    #-*- cperl -*-
    
    #use Data::Dumper;
    
    use warnings;
    use strict;
    
    sub report {
        print join ' ', sort {$a <=> $b} @_;
        print "\n";
    }
    
    my $group = 1;
    my @last = (0);
    print '1|';
    my @houses = ();
    while (<>) {
        if (/^$/) { # group end
            report(@houses, $last[1]);
            undef @houses;
            print ++$group, '|';
            @last = (0);
        } else {
            my @tuple = split;
            if ($tuple[0] != $last[0]) { # new house
                push @houses, $last[1];
            }
            @last = @tuple;
        }
    }
    
    report(@houses, $last[1]);
    

    It relies on the fact that, for each house, only the last line matters: its room index is the number of rooms minus one.
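
The "last line per house" idea can also be sketched in Python. This is an illustration only, not a translation of the Perl script: `signatures` and `SAMPLE` are hypothetical names, the sample data is the file listed in the awk answer, and rooms are assumed to be numbered from 0 in increasing order within each house.

```python
# Sample input as listed in the thread, one "house room" pair per line.
SAMPLE = (
    "0 0\n\n"
    "0 0\n0 1\n0 2\n1 0\n1 1\n\n"
    "0 0\n0 1\n0 2\n\n"
    "0 0\n1 0\n2 0\n3 0\n\n"
    "0 0\n\n"
    "0 0\n"
)


def signatures(text):
    """Return 'group|sorted last room indices' strings, one per group."""
    result = []
    last_room = {}  # house number -> last room index seen in this group

    def flush():
        # Emit the current group, if any, and reset the accumulator.
        if last_room:
            rooms = " ".join(str(r) for r in sorted(last_room.values()))
            result.append(f"{len(result) + 1}|{rooms}")
            last_room.clear()

    for line in text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the group
            continue
        house, room = (int(x) for x in line.split())
        last_room[house] = room  # later lines overwrite earlier ones
    flush()  # last group at end of input
    return result


for sig in signatures(SAMPLE):
    print(sig)
```

Filtering with `sig.endswith("|0")` then plays the role of grep '|0$'.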
