Perl - find a word and its context and save them in an associative array

眼角桃花 2021-01-15 04:15

I have an array like this (this is just a small sample; the real one has 2000 or more lines like these):

@list = (
        "affaire,chose,question",
        "ca


        
2 Answers
  • 2021-01-15 04:40

    I think I see what you're trying to do: index semantic links between words followed by lists of synonyms. Am I correct? :-)

    Where a word appears in more than one synonym list, you create a hash entry with that word as the key and, as its values, the key words for which it was originally listed as a synonym ... or something like that. With a hash of arrays, as in the solution by @Lee Duhem, you get a list (array) of synonyms for each key word. This is a common pattern, although you do end up with a lot of hash entries.

    I've been playing with a neat module by @miyagawa called Hash::MultiValue that takes a different approach to accessing the list of values associated with each hash key: a multi-value hash. A few nice features are that you can create a hash of array references on the fly from the multi-value hash, "flatten" the hash into a list of pairs, and write callbacks to go with the ->each() method, so it's pretty flexible. I believe the module has no dependencies (other than for testing). Plus it's by @miyagawa (and other contributors), so using it and reading it is good for you :-)
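
    As a rough illustration of the features mentioned above, here is a minimal sketch, assuming Hash::MultiValue is installed from CPAN. The sample words are only illustrative; `get`, `get_all`, `as_hashref_multi`, `flatten` and `each` are the module's documented methods as far as I know.

```perl
use strict;
use warnings;
use Hash::MultiValue;

# Illustrative data: the key "chose" is stored twice, with two values.
my $mv = Hash::MultiValue->new(
    chose => 'truc',
    chose => 'bidule',
    fille => 'dame',
);

# get() returns the last value stored for a key; get_all() returns all of them.
my $last = $mv->get('chose');
my @all  = $mv->get_all('chose');

# as_hashref_multi() builds a plain hash of array references on the fly.
my $hoa = $mv->as_hashref_multi;

# flatten() returns the key/value pairs as one flat list.
my @pairs = $mv->flatten;

# each() calls the callback once per key/value pair.
my @seen;
$mv->each(sub { my ($key, $value) = @_; push @seen, "$key=$value" });

print "last value for chose: $last\n";
print "all values for chose: @all\n";
print "hash of arrays, chose: @{ $hoa->{chose} }\n";
print "flattened pairs: @pairs\n";
print "seen by each(): @seen\n";
```

    Note that the callback passed to `each` receives a single key/value pair per call, so a key with several values is visited several times.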

    I'm no expert and I'm not sure it's appropriate for what you want - as a variation on Lee's approach you might have something like:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Hash::MultiValue;
    
    my $words_hash = Hash::MultiValue->new();
    
    # set up the mvalue hash: the first word on each line is the key,
    # the remaining words are its synonyms
    while (my $line = <DATA>) {
      chomp $line;                      # strip the newline before splitting
      next unless length $line;         # skip empty lines
      my ($key, @synonyms) = split /,/, $line;
      $words_hash->add($key => @synonyms);
    }
    
    for my $key (keys %{ $words_hash }) {
      print "$key --> ", join(", ", $words_hash->get_all($key)), "\n";
    }
    
    print "\n";
    
    sub synonymize {
      my $bonmot = shift;
    
      # start with the values stored directly under the key "$bonmot"
      my @bonmot_syns = $words_hash->get_all($bonmot);
    
      # now grab the other values of any list that contains "$bonmot",
      # but leave out the synonyms' own synonyms;
      # exact comparison (eq) avoids matching substrings of longer words
      foreach my $key (keys %{ $words_hash }) {
        next if $key eq $bonmot;
        my @values = $words_hash->get_all($key);
        if (grep { $_ eq $bonmot } @values) {
          push @bonmot_syns, grep { $_ ne $bonmot } @values;
        }
      }
    
      # add the keys whose values contain the target word
      # (the callback is called once per key/value pair)
      $words_hash->each(
        sub { push @bonmot_syns, $_[0] if $_[1] eq $bonmot }
      );
    
      print "synonymes pour \"$bonmot\": @bonmot_syns \n";
    }
    
    # find synonyms
    synonymize("chose");
    synonymize("truc");
    synonymize("matière");
    
    __DATA__
    affaire,chose,question
    cause,chose,matière
    chose,truc,bidule
    fille,demoiselle,femme,dame
    

    Output:

    fille --> demoiselle, femme, dame
    affaire --> chose, question
    cause --> chose, matière
    chose --> truc, bidule
    
    synonymes pour "chose": truc bidule question matière affaire cause 
    synonymes pour "truc": bidule chose 
    synonymes pour "matière": chose cause
    

    Tie::Hash::MultiValue is another alternative. Kudos to @Lee for a quick clean solution :-)

  • 2021-01-15 04:40

    For each element in @list, split it on commas, use each field as a key of %te, and push the other fields onto the array stored under that key:

    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    use Data::Dumper;
    
    my @list = (
        "affaire,chose,question",
        "cause,chose,matière",
    );
    
    my %te;
    
    foreach my $str (@list) {
        my @field = split /,/, $str;
        foreach my $key (@field) {
            my @other = grep { $_ ne $key } @field;
            push @{$te{$key}}, @other;
        }
    }
    
    print Dumper(\%te);
    

    Output:

    $ perl t.pl
    $VAR1 = {
              'question' => [
                              'affaire',
                              'chose'
                            ],
              'affaire' => [
                             'chose',
                             'question'
                           ],
              'matière' => [
                              'cause',
                              'chose'
                            ],
              'cause' => [
                           'chose',
                           'matière'
                         ],
              'chose' => [
                           'affaire',
                           'question',
                           'cause',
                           'matière'
                         ]
            };
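
    Once %te is built, looking up a word's context is a single hash access. Here is a minimal sketch using only core Perl. `synonyms_for` is a hypothetical helper name, and the last two entries of @list are added purely for illustration: one comes from the sample data in the other answer, the other is made up to show a repeated pair. The helper also drops duplicates, which can appear when two words share more than one list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list = (
    "affaire,chose,question",
    "cause,chose,matière",
    "chose,truc,bidule",
    "truc,chose,machin",    # hypothetical extra line: repeats the pair chose/truc
);

# Same hash-of-arrays construction as in the answer.
my %te;
foreach my $str (@list) {
    my @field = split /,/, $str;
    foreach my $key (@field) {
        push @{ $te{$key} }, grep { $_ ne $key } @field;
    }
}

# Hypothetical helper: returns the words that share a list with $word,
# without duplicates, in first-seen order. Unknown words give an empty list.
sub synonyms_for {
    my ($word) = @_;
    my %seen;
    return grep { !$seen{$_}++ } @{ $te{$word} || [] };
}

for my $word (qw(chose truc inconnu)) {
    my @syns = synonyms_for($word);
    print "$word: ", (@syns ? join(", ", @syns) : "(none)"), "\n";
}
```

    The `|| []` fallback keeps the lookup from failing on a word that is not in the hash, and it avoids creating an empty entry for that word as a side effect.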
    