Question
I have a file with the following data:
cell input out type fun level
AI20 A1,A2 Z comb ((A1A2)) 2
IA2 A1,A2,A3 Z comb ((!A1A2)A3) 3
XOR A1,A2,B1 Z comb (((A1A2)B1) 3
IAD A1,A2,A3 Z comb (!((A1A2)A3)) 3
INV I1 ZN comb (!I1) 1
BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4
From this data, I want to print rows whose level field (6th column) sums to 7.
To get a level sum of 7, we can select the AI20, BUF, and INV rows, whose levels give 2+4+1=7, and print them. Or we can select the XOR, IAD, and INV rows, giving 3+3+1=7, and print them. Any random selection of rows works, but the level sum needs to be 7.
The output can be:
cell input out type fun level
AI20 A1,A2 Z comb ((A1A2)) 2
INV I1 ZN comb (!I1) 1
BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4
Or the output can also be:
cell input out type fun level
XOR A1,A2,B1 Z comb (((A1A2)B1) 3
IAD A1,A2,A3 Z comb (!((A1A2)A3)) 3
INV I1 ZN comb (!I1) 1
I tried it using awk:
awk '{{ sum[i] += $6} for (i=1;i<8;i++) print $0}' file
But this prints each row 7 times instead of the desired output.
Part 2. This problem is a continuation of Part 1.
file2 contains the following data:
cell input out type fun level
CLK C Z seq Cq 1
DFk C,Cp Q seq IQ 1
DFR D,C Qn seq IN 1
SKN SE,Q Qp seq Iq 1
Desired output for Part 2:
cell input out type fun level
AI20 A1,A2 Z comb ((A1A2)) 2
INV I1 ZN comb (!I1) 1
BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4
CLK C Z seq Cq 1
XOR A1,A2,B1 Z comb (((A1A2)B1) 3
IAD A1,A2,A3 Z comb (!((A1A2)A3)) 3
INV I1 ZN comb (!I1) 1
DFk C,Cp Q seq IQ 1
IA2 A1,A2,A3 Z comb ((!A1A2)A3) 3
XOR A1,A2,B1 Z comb (((A1A2)B1) 3
INV I1 ZN comb (!I1) 1
For Part 2, whenever a selection of rows from file1 reaches a level sum of 7, the first line from file2 is inserted after it. Then the level-sum-7 condition is checked again, and if it holds, the second line from file2 is inserted. Then it is checked once more, and if it holds, the third line from file2 is inserted. This is executed 3 times.
Answer 1:
Here is an awk solution for this job:
cat rnd.awk
function rnd(max) { # generate a random integer between 2 and max
    return int(rand() * (max - 1)) + 2
}
BEGIN {
    srand() # seed the random number generator
}
NR == 1 { # for the header row
    print # print the header record
    next
}
{
    rec[NR] = $0 # save each record in rec array with NR as key
    num[NR] = $NF # save last column in num array with NR as key
}
END {
    while (1) { # loop until a selection sums to exactly 7
        r = rnd(NR) # generate a random number between 2 and NR
        if (!seen[r]++) # first time this record is picked
            s += num[r] # add its level to the running sum
        if (s == 7) # if sum is 7 then break the loop
            break
        else if (s > 7) { # if sum > 7 then discard the selection and restart
            delete seen
            s = 0
            continue
        }
    }
    for (j in seen) # for each key in seen print the saved record
        print rec[j]
}
Use it as:
awk -f rnd.awk file
cell input out type fun level
AI20 A1,A2 Z comb ((A1A2)) 2
INV I1 ZN comb (!I1) 1
BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4
and again:
awk -f rnd.awk file
cell input out type fun level
IA2 A1,A2,A3 Z comb ((!A1A2)A3) 3
XOR A1,A2,B1 Z comb (((A1A2)B1) 3
INV I1 ZN comb (!I1) 1
Answer 2:
There are two places where efficiency is important in this problem:
- Generation of all the possible combinations;
- Retrieving the right lines once the combination is known.
The first issue depends heavily on the number of distinct values that "level" can take. If you have hundreds of different values, the number of combinations that give the requested sum is going to be very large, so that is the part of the algorithm you want to optimize.
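To make the first issue concrete, here is a minimal sketch of enumerating the combinations of level values that reach a target. It assumes the levels are positive integers, and `level_combinations` is a hypothetical name chosen for illustration. It is a plain backtracking search, not an optimized one.

```python
from collections import Counter

def level_combinations(levels, target):
    """Return every multiset of level values that sums to target.

    levels: the level value of each row (duplicates allowed).
    Each result is a sorted tuple; a value is used at most as many
    times as it occurs in levels. Assumes positive integer levels.
    """
    available = Counter(levels)
    values = sorted(available)
    results = []

    def extend(start, remaining, chosen):
        if remaining == 0:
            results.append(tuple(chosen))
            return
        for i in range(start, len(values)):
            v = values[i]
            if v > remaining:
                break  # values are sorted, so nothing later fits either
            if chosen.count(v) < available[v]:
                chosen.append(v)
                extend(i, remaining - v, chosen)  # i, not i + 1: v may repeat
                chosen.pop()

    extend(0, target, [])
    return results

# Levels of the six rows in the question's file
print(level_combinations([2, 3, 3, 3, 1, 4], 7))
# prints [(1, 2, 4), (1, 3, 3), (3, 4)]
```

With only four distinct level values the search is tiny, but the number of results grows quickly as the set of distinct values grows, which is the concern raised above.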
The second part depends on the number of lines you have in the file. To address it, I would create a hash table whose keys are the "level" values and whose values are arrays of strings, each string being one of your lines. Once you have a given combination, you can generate (virtually infinite) selections almost instantaneously with the following steps:
- retrieve the array of strings associated with each level value present in the combination;
- from each array of strings, retrieve a random string;
- repeat the process to get as many combinations of strings as you want for a given combination of level numbers.
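The steps above can be sketched roughly as follows. This is only an illustration: `build_table` and `random_rows` are hypothetical names, the level is assumed to be the last whitespace-separated field, and `random.sample` is used so that a level appearing twice in a combination (such as 3 in 1+3+3) yields two different lines.

```python
import random
from collections import Counter, defaultdict

def build_table(lines):
    """Map each level value (last field) to the list of lines having it."""
    table = defaultdict(list)
    for line in lines:
        table[int(line.split()[-1])].append(line)
    return table

def random_rows(table, combination):
    """Pick distinct random lines matching a combination such as (1, 3, 3)."""
    picked = []
    for level, count in Counter(combination).items():
        # sample() draws without replacement, so a repeated level
        # never returns the same line twice
        picked.extend(random.sample(table[level], count))
    return picked

lines = [
    "AI20 A1,A2 Z comb ((A1A2)) 2",
    "IA2 A1,A2,A3 Z comb ((!A1A2)A3) 3",
    "XOR A1,A2,B1 Z comb (((A1A2)B1) 3",
    "IAD A1,A2,A3 Z comb (!((A1A2)A3)) 3",
    "INV I1 ZN comb (!I1) 1",
    "BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4",
]
table = build_table(lines)
for row in random_rows(table, (1, 3, 3)):
    print(row)
```

Building the table is a single pass over the file; after that, each new selection costs only a few list lookups.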
Answer 3:
The following function returns a random combination of rows whose level column sums to the target (currently 7, as per your question). It works with any dataframe (as long as it has a numerical column 'level') and any target:
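import random

def get_one(df, target):
    # Retry until a random pick lands exactly on the target.
    # Note: this loops forever if no subset of rows sums to target.
    while True:
        indices = []
        total = 0
        while total < target:
            # rows whose level still fits in the remaining budget
            fits = df[(df['level'] <= target - total) & (df['level'] > 0)]
            candidates = list(set(fits.index) - set(indices))
            if not candidates:  # dead end (e.g. 2 + 3 + 1 = 6): start over
                break
            ind1 = random.choice(candidates)
            indices.append(ind1)
            total += df.loc[ind1, 'level']
        if total == target:
            return df.loc[indices, :]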
To get a result, just run the function with df and your target as parameters:
>>> get_one(df, 7)
cell input out type fun level
AI20 A1,A2 Z comb ((A1A2)) 2
INV I1 ZN comb (!I1) 1
BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4
If you want a different total, change the target parameter, for example:
>>> get_one(df, 10)
>>> get_one(df, 15)
etc
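Part 2 of the question is not covered by the function above. A rough sketch in plain Python (no pandas) follows, assuming whitespace-separated rows with the level in the last field. It follows the prose description of Part 2, appending one line of file2 after every group that sums to the target. `pick_rows`, `interleave`, and the `rounds` parameter are hypothetical names introduced only for this illustration.

```python
import random

def pick_rows(rows, target):
    """Randomly pick distinct rows whose last field sums to target.

    Shuffles the rows and takes a prefix; restarts on overshoot.
    Loops forever if no subset of rows sums to target.
    """
    while True:
        pool = rows[:]
        random.shuffle(pool)
        picked, total = [], 0
        for row in pool:
            picked.append(row)
            total += int(row.split()[-1])
            if total >= target:
                break
        if total == target:
            return picked

def interleave(rows1, rows2, target, rounds):
    """Emit `rounds` groups from rows1, each followed by the next row of rows2."""
    out = []
    for extra in rows2[:rounds]:
        out.extend(pick_rows(rows1, target))
        out.append(extra)
    return out

file1_rows = [
    "AI20 A1,A2 Z comb ((A1A2)) 2",
    "IA2 A1,A2,A3 Z comb ((!A1A2)A3) 3",
    "XOR A1,A2,B1 Z comb (((A1A2)B1) 3",
    "IAD A1,A2,A3 Z comb (!((A1A2)A3)) 3",
    "INV I1 ZN comb (!I1) 1",
    "BUF A1,A2,A3,B1 Z comb (!(((A1A2)A3)B1)) 4",
]
file2_rows = [
    "CLK C Z seq Cq 1",
    "DFk C,Cp Q seq IQ 1",
    "DFR D,C Qn seq IN 1",
    "SKN SE,Q Qp seq Iq 1",
]

print("cell input out type fun level")
for row in interleave(file1_rows, file2_rows, 7, 3):
    print(row)
```

Each group is drawn independently, so the same file1 row can appear in more than one group, as INV does in the desired output of the question.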
Source: https://stackoverflow.com/questions/64680226/print-rows-with-condition-on-field-data