Pattern decoding

I need a little help with the following. I have this kind of data file:

0 0    # <--- Group 1 -- 1 house (0) and 1 room (0)

0 0    # <--- Group 2 -- 2 ho


                      
3 Answers

鱼传尺愫 (2021-01-16 03:07)

I would first define a House class and a Group class:

class House:
    def __init__(self, rooms):
        self.rooms = rooms


class Group:
    def __init__(self, index, houses):
        self.index = index
        # houses maps house number -> number of rooms; build the House
        # objects in house-number order.
        self.houses = [House(houses[house_nr]) for house_nr in sorted(houses)]

    def __str__(self):
        return 'Group {}'.format(self.index)

    def __repr__(self):
        return 'Group {}'.format(self.index)


Then parse the data into this hierarchical structure:

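import collections

with open('in.txt') as f:
    groups = []

    # Accumulates house number -> number of rooms for the current group.
    group = collections.defaultdict(int)

    i = 1
    for line in f:
        if not line.strip():
            # Blank line found: close the current group, unless it is empty
            # (e.g. two blank lines in a row).
            if group:
                groups.append(Group(i, group))
                i += 1
            # Reset accumulator.
            group = collections.defaultdict(int)
            continue

        house_nr, room_nr = line.split()
        # Use int keys so that houses sort numerically (10 after 9).
        group[int(house_nr)] += 1

    # Create the last group at EOF, unless the file ended with a blank line.
    if group:
        groups.append(Group(i, group))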
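
As a variation on the parsing loop above, here is a minimal sketch that takes any iterable of lines instead of a file and uses `itertools.groupby` to split it into blank-line-separated runs. The function name `parse_groups` and the `SAMPLE` string are illustrative assumptions (the sample mirrors the data file shown elsewhere in this thread), and it returns plain dicts rather than `Group` objects.

```python
import itertools


def parse_groups(lines):
    """Return one {house_nr: room_count} dict per blank-line-separated group."""
    groups = []
    # groupby splits the stream into alternating runs of blank and non-blank lines.
    for is_blank, run in itertools.groupby(lines, key=lambda s: not s.strip()):
        if is_blank:
            continue  # any number of consecutive blank lines is just a separator
        rooms = {}
        for line in run:
            house_nr = int(line.split()[0])  # only the house column is counted
            rooms[house_nr] = rooms.get(house_nr, 0) + 1
        groups.append(rooms)
    return groups


# Assumed sample input, including trailing blank lines.
SAMPLE = (
    "0 0\n\n"
    "0 0\n0 1\n0 2\n1 0\n1 1\n\n"
    "0 0\n0 1\n0 2\n\n"
    "0 0\n1 0\n2 0\n3 0\n\n"
    "0 0\n\n"
    "0 0\n\n\n"
)

print(parse_groups(SAMPLE.splitlines()))
```

Because blank runs are skipped as a whole, leading, trailing, or repeated blank lines do not produce empty groups.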


Then you can do stuff like this:

found = filter(
    lambda g:
        len(g.houses) == 1 and # Group contains one house
        g.houses[0].rooms == 1, # First house contains one room
    groups)
print(list(found)) # Prints [Group 1, Group 5, Group 6]

found = filter(
    lambda g:
        len(g.houses) == 2 and # Group contains two houses
        g.houses[0].rooms == 3 and # First house contains three rooms
        g.houses[1].rooms == 2, # Second house contains two rooms
    groups)
print(list(found)) # Prints [Group 2]
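
If many such queries are needed, one option is to index the groups by their "signature" (the tuple of room counts per house) so that each query becomes a dictionary lookup. This is only a sketch: `room_counts` is assumed sample data chosen to match the results printed above, and `index` is a hypothetical name.

```python
import collections

# Room counts per house for each group, in house order (assumed sample data).
room_counts = [[1], [3, 2], [3], [1, 1, 1, 1], [1], [1]]

# Map each signature (tuple of room counts) to the 1-based group numbers
# that have exactly that layout.
index = collections.defaultdict(list)
for group_nr, counts in enumerate(room_counts, start=1):
    index[tuple(counts)].append(group_nr)

print(index[(1,)])    # one house with one room -> [1, 5, 6]
print(index[(3, 2)])  # three rooms, then two rooms -> [2]
```

Use `index.get(signature, [])` for lookups that may miss, since indexing a `defaultdict` with an unknown key inserts an empty list.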

                                                                        
                                                        
            
            
              
                
不要未来只要你来 (2021-01-16 03:11)

I don't know what your expected output is, but I have decoded your number pattern into a readable group/house/rooms format. Any further "query" can be run against this content.

See below:

kent$  cat file
0 0

0 0
0 1
0 2
1 0
1 1

0 0
0 1
0 2

0 0
1 0
2 0
3 0

0 0

0 0


awk:

kent$  awk 'BEGIN{RS=""} 
        { print "\ngroup "++g; 
        delete a;
        for(i=1;i<=NF;i++) if(i%2) a[$i]++;
        for(x in a) printf "House#: %s , Room(s): %s \n", x, a[x]; }' file


We get this output:

group 1
House#: 0 , Room(s): 1 

group 2
House#: 0 , Room(s): 3 
House#: 1 , Room(s): 2 

group 3
House#: 0 , Room(s): 3 

group 4
House#: 0 , Room(s): 1 
House#: 1 , Room(s): 1 
House#: 2 , Room(s): 1 
House#: 3 , Room(s): 1 

group 5
House#: 0 , Room(s): 1 

group 6
House#: 0 , Room(s): 1 


Note that the generated format can be changed to fit your "filter" or "query".

UPDATE

OP's comment:


  I need to know, the number of the group(s) which have/has for example
  1 house with one room. The output would be in the above case: 1, 5 ,6


As I said, the awk output can be adjusted to your query criteria for the next step. I have now changed the awk command above to:

awk 'BEGIN{RS=""} 
        {print "";  gid=++g; 
        delete a;
        for(i=1;i<=NF;i++) if(i%2) a[$i]++;
        for(x in a) printf "%s %s %s\n", gid,x, a[x]; }' file


This will output:

1 0 1

2 0 3
2 1 2

3 0 3

4 0 1
4 1 1
4 2 1
4 3 1

5 0 1

6 0 1


The format is groupIdx houseIdx numberOfRooms, with a blank line between groups. Save the text above to a file named decoded.txt.
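
For readers who prefer Python, the same decoding step might look roughly like the sketch below. The function name `decode` and the `RAW` string are assumptions for illustration; `RAW` mirrors the sample file shown earlier. Unlike awk's `for (x in a)`, whose iteration order is unspecified, this version sorts the houses.

```python
import collections
import re

# Assumed sample input, same content as the file shown earlier.
RAW = (
    "0 0\n\n"
    "0 0\n0 1\n0 2\n1 0\n1 1\n\n"
    "0 0\n0 1\n0 2\n\n"
    "0 0\n1 0\n2 0\n3 0\n\n"
    "0 0\n\n"
    "0 0\n"
)


def decode(text):
    """Return (group_idx, house_idx, number_of_rooms) rows for the raw text."""
    rows = []
    # Like awk's RS="": records are separated by one or more blank lines.
    records = [r for r in re.split(r"\n\s*\n", text.strip()) if r]
    for group_idx, record in enumerate(records, start=1):
        fields = record.split()
        # Odd-numbered awk fields (i%2) are the house numbers: fields[0::2].
        counts = collections.Counter(int(h) for h in fields[0::2])
        for house_idx in sorted(counts):
            rows.append((group_idx, house_idx, counts[house_idx]))
    return rows


for row in decode(RAW):
    print(*row)
```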

Your query can then be run on this text:

kent$  awk 'BEGIN{RS="\n\n"}{if (NF==3 && $3==1)print $1}' decoded.txt
1
5
6


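The last awk command prints the group number when the record has exactly three fields (NF==3), meaning the group block is a single line with one house, and the room count ($3) is 1. Note that a multi-character RS such as "\n\n" is not portable across awk implementations; RS="" (paragraph mode), as used in the earlier commands, should work here as well.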
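
The same query over the decoded text can be sketched in Python as follows. The name `single_house_groups` and its `rooms` parameter are hypothetical, and `DECODED` is a copy of the decoded output shown above.

```python
import itertools

# Copy of the decoded output: groupIdx houseIdx numberOfRooms.
DECODED = """\
1 0 1

2 0 3
2 1 2

3 0 3

4 0 1
4 1 1
4 2 1
4 3 1

5 0 1

6 0 1
"""


def single_house_groups(decoded_text, rooms=1):
    """Return the groups that consist of exactly one house with `rooms` rooms."""
    rows = [
        tuple(map(int, line.split()))
        for line in decoded_text.splitlines()
        if line.strip()
    ]
    result = []
    # Rows of one group are adjacent, so groupby on the first column works.
    for group_idx, block in itertools.groupby(rows, key=lambda row: row[0]):
        block = list(block)
        if len(block) == 1 and block[0][2] == rooms:
            result.append(group_idx)
    return result


print(single_house_groups(DECODED))  # [1, 5, 6]
```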
                                                                        
                                                        
            
            
              
                
臣服心动 (2021-01-16 03:12)

Perl solution. It converts the input into this format:

1|0
2|1 2
3|2
4|0 0 0 0
5|0
6|0


The first column is the group number; the second column lists the number of rooms (minus one) of each house in the group, sorted. To search for groups with two houses that have 2 and 3 rooms, you can just grep '|1 2$'; to search for groups with just one house with one room, grep '|0$'.

#!/usr/bin/perl
#-*- cperl -*-

#use Data::Dumper;

use warnings;
use strict;

sub report {
    print join ' ', sort {$a <=> $b} @_;
    print "\n";
}

my $group = 1;
my @last = (0);
print '1|';
my @houses = ();
while (<>) {
    if (/^$/) { # group end
        report(@houses, $last[1]);
        undef @houses;
        print ++$group, '|';
        @last = (0);
    } else {
        my @tuple = split;
        if ($tuple[0] != $last[0]) { # new house
            push @houses, $last[1];
        }
        @last = @tuple;
    }
}

report(@houses, $last[1]);


It relies on the fact that, for each house, only the last line matters: its room number is the house's room count minus one.
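
The "last line per house" idea can also be sketched in Python. This is a variation rather than a line-by-line translation of the Perl script: it keeps a dict of the latest room number per house instead of detecting house changes, and the names `signatures`, `_format`, and `SAMPLE` are assumptions for illustration.

```python
def _format(group_nr, last_room):
    """Build 'group|sorted room numbers', e.g. '2|1 2'."""
    return "{}|{}".format(group_nr, " ".join(map(str, sorted(last_room.values()))))


def signatures(lines):
    """Yield one 'group|last room number of each house' string per group."""
    group_nr = 1
    last_room = {}  # house number -> room number on that house's latest line
    for line in lines:
        if not line.strip():
            if last_room:  # ignore repeated or trailing blank lines
                yield _format(group_nr, last_room)
                group_nr += 1
                last_room = {}
            continue
        house_nr, room_nr = map(int, line.split())
        last_room[house_nr] = room_nr  # later lines overwrite earlier ones
    if last_room:
        yield _format(group_nr, last_room)


# Assumed sample input, same content as the data file discussed in this thread.
SAMPLE = (
    "0 0\n\n"
    "0 0\n0 1\n0 2\n1 0\n1 1\n\n"
    "0 0\n0 1\n0 2\n\n"
    "0 0\n1 0\n2 0\n3 0\n\n"
    "0 0\n\n"
    "0 0\n"
)

result = list(signatures(SAMPLE.splitlines()))
print("\n".join(result))
# Equivalent of grep '|1 2$': groups whose houses have 2 and 3 rooms.
print([s for s in result if s.endswith("|1 2")])
```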
                                                                        
                                                        
            
            
              
                