I have two lists, let's say:
keys1 = ['A', 'B', 'C', 'D', 'E', 'H', 'I']
keys2 = ['A', 'B', 'E', 'F', 'G', 'H',
I recently ran into a similar issue while implementing a feature, so I first tried to define the problem clearly. If I understand it correctly, the problem statement is:
Write a function merge_lists that merges a list of lists with overlapping items while preserving the order of the items.
If item A comes before item B in every list where they occur together, then A must precede B in the final list as well.
If items A and B swap order between lists, i.e. A precedes B in some lists and B precedes A in others, then their order in the final list should match their order in the first list where they occur together. That is, if A precedes B in l1 and B precedes A in l2, then A should precede B in the final list.
If items A and B never occur together in any list, then their order is decided by the position of the list in which each one first occurs. That is, if item A is in l1 and l3 and item B is in l2 and l6, then A first appears in l1, before B first appears in l2, so the final list must have A before B.
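The first two rules can be read as one statement: for any two items that ever share a list, the first list containing both fixes their order. Here is a minimal sketch of that reading. The helper name `pairwise_order` is hypothetical and used only for illustration, and each list is assumed to contain no repeated items. Pairs missing from the mapping never share a list, so they fall under the last rule.

```python
from itertools import combinations

def pairwise_order(list_of_lists):
    """Map each co-occurring pair to the (earlier, later) order fixed by
    the first list that contains both items."""
    order = {}
    for l in list_of_lists:
        # combinations() yields pairs in positional order: a is before b in l.
        for a, b in combinations(l, 2):
            pair = frozenset((a, b))
            if pair not in order:  # an earlier list has not decided this pair
                order[pair] = (a, b)
    return order

l1 = ["A", "B", "C"]
l2 = ["C", "B", "D"]
order = pairwise_order([l1, l2])
print(order[frozenset(("B", "C"))])    # ('B', 'C'): l1 is consulted before l2
print(order[frozenset(("B", "D"))])    # ('B', 'D'): only l2 contains both
print(frozenset(("A", "D")) in order)  # False: A and D never share a list
```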
l1 = ["Type and Size", "Orientation", "Material", "Locations", "Front Print Type", "Back Print Type"]
l2 = ["Type and Size", "Material", "Locations", "Front Print Type", "Front Print Size", "Back Print Type", "Back Print Size"]
l3 = ["Orientation", "Material", "Locations", "Color", "Front Print Type"]
merge_lists([l1,l2,l3])
['Type and Size', 'Orientation', 'Material', 'Locations', 'Color', 'Front Print Type', 'Front Print Size', 'Back Print Type', 'Back Print Size']
l1 = ["T", "V", "U", "B", "C", "I", "N"]
l2 = ["Y", "V", "U", "G", "B", "I"]
l3 = ["X", "T", "V", "M", "B", "C", "I"]
l4 = ["U", "P", "G"]
merge_lists([l1,l2,l3, l4])
['Y', 'X', 'T', 'V', 'U', 'M', 'P', 'G', 'B', 'C', 'I', 'N']
l1 = ["T", "V", "U", "B", "C", "I", "N"]
l2 = ["Y", "U", "V", "G", "B", "I"]
l3 = ["X", "T", "V", "M", "I", "C", "B"]
l4 = ["U", "P", "G"]
merge_lists([l1,l2,l3, l4])
['Y', 'X', 'T', 'V', 'U', 'M', 'P', 'G', 'B', 'C', 'I', 'N']
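Expected outputs like these are easy to get wrong by hand, so a small checker can help. The sketch below uses the hypothetical name `respects_first_order` and tests only the pairwise rules: items that share a list must keep the order of the first list containing both. It does not check the rule for items that never share a list, and it assumes that no list repeats an item and that the merged list contains every item.

```python
from itertools import combinations

def respects_first_order(merged, list_of_lists):
    """Return True if every pair of items that shares a list appears in
    `merged` in the order given by the first list containing both."""
    position = {item: i for i, item in enumerate(merged)}
    decided = set()
    for l in list_of_lists:
        for a, b in combinations(l, 2):  # a comes before b in l
            pair = frozenset((a, b))
            if pair in decided:
                continue  # an earlier list already fixed this pair
            decided.add(pair)
            if position[a] > position[b]:
                return False
    return True

l1 = ["T", "V", "U", "B", "C", "I", "N"]
l2 = ["Y", "U", "V", "G", "B", "I"]
l3 = ["X", "T", "V", "M", "I", "C", "B"]
l4 = ["U", "P", "G"]
expected = ['Y', 'X', 'T', 'V', 'U', 'M', 'P', 'G', 'B', 'C', 'I', 'N']
print(respects_first_order(expected, [l1, l2, l3, l4]))        # True
print(respects_first_order(expected[::-1], [l1, l2, l3, l4]))  # False
```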
I arrived at a reasonable solution that produced correct results for all the data I had. (It may be wrong for other data sets; I'll leave that for others to comment on.) Here is the solution:
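def remove_duplicates(l):
    # dict.fromkeys keeps first-seen order; list(set(l)) would make the order
    # of tied items in the final list arbitrary.
    return list(dict.fromkeys(l))

def flatten(list_of_lists):
    return [item for sublist in list_of_lists for item in sublist]

def difference(list1, list2):
    # Items of list1 that are not in list2, in list1's order.
    result = []
    for item in list1:
        if item not in list2:
            result.append(item)
    return result

def preceding_items_list(l, item):
    # Everything that comes before item in l (empty if item is absent).
    if item not in l:
        return []
    return l[:l.index(item)]

def precedes_in_first_common_list(a, b, list_of_lists):
    # True if a comes before b in the first list that contains both.
    for l in list_of_lists:
        if a in l and b in l:
            return l.index(a) < l.index(b)
    return False

def merge_lists(list_of_lists):
    final_list = []
    item_predecessors = {}
    unique_items = remove_duplicates(flatten(list_of_lists))
    item_priorities = {}
    for item in unique_items:
        preceding_items = remove_duplicates(flatten([preceding_items_list(l, item) for l in list_of_lists]))
        # When two items swap order between lists, keep only the order from the
        # first list in which they occur together. Building a new list also
        # avoids removing from preceding_items while iterating over it.
        item_predecessors[item] = [p_item for p_item in preceding_items
                                   if precedes_in_first_common_list(p_item, item, list_of_lists)]
    print("Item predecessors ", item_predecessors)
    items_to_be_checked = difference(unique_items, item_priorities.keys())
    loop_ctr = -1
    while len(items_to_be_checked) > 0:
        loop_ctr += 1
        print("Starting loop {0}".format(loop_ctr))
        print("items to be checked ", items_to_be_checked)
        for item in items_to_be_checked:
            predecessors = item_predecessors[item]
            if len(predecessors) == 0:
                item_priorities[item] = 0
            else:
                # An item can be placed once all of its predecessors are placed.
                if all(pred in item_priorities for pred in predecessors):
                    item_priorities[item] = max([item_priorities[p] for p in predecessors]) + 1
        print("item_priorities at end of loop ", item_priorities)
        remaining = difference(unique_items, item_priorities.keys())
        if len(remaining) == len(items_to_be_checked):
            # No item was placed in this pass: the orderings form a cycle
            # (e.g. A before B, B before C, C before A), so stop instead of
            # looping forever.
            raise ValueError("Cyclic ordering among items: {0}".format(remaining))
        items_to_be_checked = remaining
        print("items to be checked at end of loop ", items_to_be_checked)
        print()
    # sorted() is stable, so items with equal priority keep first-seen order.
    final_list = sorted(unique_items, key=lambda item: item_priorities[item])
    return final_list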
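The priority loop above is essentially a topological sort of the "must come before" relation. As a cross-check, and not the method used above, Python 3.9+ ships `graphlib.TopologicalSorter`, which returns some valid order and raises `CycleError` when the constraints contradict each other. The sketch below uses hand-written, hypothetical predecessor mappings; the order it gives for items with no constraint between them may differ from the priority-based result.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps to the items that must come before it (hypothetical data).
predecessors = {
    "B": {"A"},
    "C": {"A", "B"},
    "D": {"B"},
}
order = list(TopologicalSorter(predecessors).static_order())
print(order.index("A") < order.index("B") < order.index("C"))  # True

# Lists such as [A, B], [B, C], [C, A] yield constraints that no single
# ordering can satisfy; static_order() reports this as a CycleError.
cyclic = {"B": {"A"}, "C": {"B"}, "A": {"C"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```

A cycle like the second mapping is the one case the priority loop cannot resolve, since no item in the cycle ever has all of its predecessors placed.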
I've also open-sourced the code as part of a library named toolspy, so you can just do this:
pip install toolspy
from toolspy import merge_lists
lls=[['a', 'x', 'g'], ['x', 'v', 'g'], ['b', 'a', 'c', 'x']]
merge_lists(lls)