Updating a row in a data file with values from another row

Submitted by 元气小坏坏 on 2019-12-12 04:32:53

Question


I have tab-delimited data containing the results of device identification from user agents (UAs). In several rows the device is identified incorrectly, and I need to replace it with the correct one.

For instance, there are cases where an iPhone or HTC Wildfire UA is identified as another phone. For these cases I need to update the device information with the correct device by searching for certain keywords in the UA. For example:

781 Mozilla/5.0 (Linux; U; Android 2.1-update1; fi-fi; HTC_Wildfire_A3333 Build/ERE27) AppleWebKit/530.17 (KHTML, like Gecko) Version/4.0 Mobile Safari/530.17  htc_wildfire_ver1_suba3333    HTC       Wildfire    Android

This row is correct, but a similar one is wrong:

775 Mozilla/5.0 (Linux; U; Android 2.1-update1; fi-fi; HTC Wildfire Build/ERE27) AppleWebKit/525.10+ (KHTML, like Gecko) Version/3.0.4 Mobile Safari/523.12.2 (AdMob-ANDROID-20100709)   T-Mobile       Pulse   Android

So I need to do something like the following. I know that if the UA column contains both HTC and Wildfire, the device is that phone. I want to find every row whose UA contains the strings HTC and Wildfire but whose columns 3 and 4 (manufacturer and model) are wrong, and overwrite them with the correct device information from row 781, which I know is correct. I would hard-code in the script that row 781 is the reference row, and for every misidentified row I would copy in the values from column 3 onwards of row 781.

Of course, this is just one case; there are several like it, and I would repeat the same logic for each. There are also other columns besides these four that I have not shown.

How would I accomplish this in a Perl script (preferred, though a Bash solution is also OK)?


Answer 1:


  1. Create a file (devices) with all distinct (UA, Manufacturer, Model) triples by looping over the input file, storing the triple as keys in a hash; write sorted keys into devices
  2. Manually edit devices (delete 'wrong' lines)
  3. Load devices into a hash, using UA as the key and (Manufacturer, Model) as the value. Loop over the input file, use the UA field of the current line to look up the device, and overwrite both fields with the good value from the hash (if necessary).

    use strict;
    use warnings;

    # Toy log: each entry is [ UA, Model ]; some models are wrong
    my @Log = (
        [ 'HTC', 'badModelHTC'  ]
      , [ 'ABC', 'badModelABC' ]
      , [ 'HTC', 'goodModelHTC' ]
      , [ 'ABC', 'badModelABC' ]
      , [ 'ABC', 'goodModelABC' ]
      , [ 'HTC', 'goodModelHTC' ]
      , [ 'ABC', 'badModelABC' ]
    );
    my %Devs;
    # Step 1: collect the distinct (UA, Model) pairs as hash keys
    print "----------- Log org\n";
    for (@Log) {
      printf "%s %s\n", @{$_};
      my $key = join '-', @{$_};
      $Devs{ $key } = $_->[ 1 ];
    }
    # Step 2: drop the wrong pairs
    print "----------- Devs org\n";
    for (sort( keys( %Devs ) )) {
      printf "%s => %s\n", $_, $Devs{ $_ };
      if (/bad/) {
          delete $Devs{ $_ };  # fake manual removal
      }
    }
    # fake manual shortening of keys: 'UA-Model' becomes 'UA'
    my %Tmp = %Devs;
    %Devs = ();
    for (keys %Tmp) {
      $Devs{ (split( /-/, $_))[ 0 ] } = $Tmp{ $_ };
    }
    print "----------- Devs corrected\n";
    for (sort( keys( %Devs ) )) {
      printf "%s => %s\n", $_, $Devs{ $_ };
    }
    # Step 3: overwrite each log entry's model with the good one for its UA
    print "----------- Log corrected\n";
    for (@Log) {
      $_->[ 1 ] = $Devs{ $_->[ 0 ] };
      printf "%s %s\n", @{$_};
    }

Output:

    ----------- Log org
    HTC badModelHTC
    ABC badModelABC
    HTC goodModelHTC
    ABC badModelABC
    ABC goodModelABC
    HTC goodModelHTC
    ABC badModelABC
    ----------- Devs org
    ABC-badModelABC => badModelABC
    ABC-goodModelABC => goodModelABC
    HTC-badModelHTC => badModelHTC
    HTC-goodModelHTC => goodModelHTC
    ----------- Devs corrected
    ABC => goodModelABC
    HTC => goodModelHTC
    ----------- Log corrected
    HTC goodModelHTC
    ABC goodModelABC
    HTC goodModelHTC
    ABC goodModelABC
    ABC goodModelABC
    HTC goodModelHTC
    ABC goodModelABC
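
The demo above keys on a toy UA string. For the layout described in the question, the keyword idea (UA contains both HTC and Wildfire, so copy column 3 onwards from a known-good row) might look roughly like the sketch below. This is an illustration under assumptions, not part of the original answer: the `@rules` table, the `fix_rows` subroutine, the five-column layout (ID, UA, manufacturer, model, OS), and row 900 are hypothetical, and the UAs are abbreviated from the question's rows.

```perl
use strict;
use warnings;

# Hypothetical rule table: every keyword must appear in the UA, and
# good_row names the ID (column 1) of a row known to be correct.
my @rules = (
    { keywords => [ 'HTC', 'Wildfire' ], good_row => 781 },
);

# For each rule, overwrite column 3 onwards of every matching row with
# the corresponding columns of the rule's reference row.
sub fix_rows {
    my ($rows, $rules) = @_;

    # Index the rows by ID so reference rows can be looked up directly.
    my %by_id = map { ( $_->[0] => $_ ) } @$rows;

    for my $rule (@$rules) {
        my $good = $by_id{ $rule->{good_row} } or next;
        my @fix  = @{$good}[ 2 .. $#{$good} ];

        for my $row (@$rows) {
            next if $row == $good;          # leave the reference row alone
            my $ua = $row->[1];
            next unless defined $ua;
            # Skip the row if any keyword is missing (case-sensitive match).
            next if grep { index($ua, $_) < 0 } @{ $rule->{keywords} };
            @$row = ( @{$row}[ 0, 1 ], @fix );
        }
    }
    return $rows;
}

# Sample tab-delimited lines (assumed layout; row 900 is made up).
my @lines = (
    "781\tMozilla/5.0 (Linux; U; Android 2.1-update1; fi-fi; HTC_Wildfire_A3333 Build/ERE27)\tHTC\tWildfire\tAndroid",
    "775\tMozilla/5.0 (Linux; U; Android 2.1-update1; fi-fi; HTC Wildfire Build/ERE27)\tT-Mobile\tPulse\tAndroid",
    "900\tMozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X)\tApple\tiPhone\tiOS",
);

# The -1 limit keeps trailing empty fields, so column counts are preserved.
my @rows = map { [ split /\t/, $_, -1 ] } @lines;

fix_rows(\@rows, \@rules);
print join("\t", @$_), "\n" for @rows;
```

With this sample, row 775 ends up with HTC, Wildfire, Android in columns 3 to 5, while rows 781 and 900 are printed unchanged. Further cases would be added as extra entries in `@rules`; for a real file, the `@lines` array could be replaced by reading the input line by line and applying `chomp` before the `split`.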


Source: https://stackoverflow.com/questions/4931783/updating-a-row-in-a-data-file-with-values-from-another-row
