Question
I am trying to extract specific lines from a .txt file, corresponding to 7 particular devices (0-6), and then operate on that data.
Here is an example:
From a very large file, I extract an event (here 169139), which contains information from 6 of the 7 devices (here 1, 2, 3, 4, 5 and 6; device 0 has no data). For each event, I don't know a priori how many devices will be active: it can be all of them, none, or some.
=== 169139 ===
Start: 4.80374e+19
End: 4.80374e+19
--- 1 ---
Pix 9, 66
--- 2 ---
Pix 11, 31
Pix 12, 31
--- 3 ---
Pix 17, 53
Pix 16, 53
Pix 16, 54
--- 4 ---
Pix 44, 64
--- 5 ---
Pix 49, 133
Pix 48, 133
--- 6 ---
Pix 109, 143
Pix 108, 143
Pix 108, 144
Pix 109, 144
The events are easily iterable and I can select the whole information on the screen until the next one (here, the next line from the .txt would be === 169140 ===).
I am able to extract information from a particular device using the following code:
def start_stop_plane(list, dev):
    start_reading = [i for i in range(len(list)) if list[i] == "--- " + str(dev) + " ---"][0]
    stop_reading = [i for i in range(len(list)) if list[i] == "--- " + str(int(dev)+1) + " ---"][0]
    return list[start_reading:stop_reading]
Here, list is the full event shown in the first block above, as a list of lines. It is produced in a similar manner to the code above, using === instead of --- as the marker (i.e., the flag between events).
My problem: this works for everything from 0 to 5. For 6 it crashes because there is no int(dev)+1 marker. I tried putting an or in the stop_reading line to identify an occurrence of ===, but it did not work.
In this case, how can I signal the end of the list and make sure I don't lose any device?
Answer 1:
You should prepare your "--- plane ---" marker and let Python find it for you using built-ins such as the in operator and list.index.
To get the data lines up to the next marker, you could use takewhile from itertools:
data="""=== 169139 ===
Start: 4.80374e+19
End: 4.80374e+19
--- 1 ---
Pix 9, 66
--- 2 ---
Pix 11, 31
Pix 12, 31
--- 3 ---
Pix 17, 53
Pix 16, 53
Pix 16, 54
--- 4 ---
Pix 44, 64
--- 5 ---
Pix 49, 133
Pix 48, 133
--- 6 ---
Pix 109, 143
Pix 108, 143
Pix 108, 144
Pix 109, 144""".split("\n")
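from itertools import takewhile

def planeData(data, plane):
    marker = f"--- {plane} ---"
    if marker not in data:
        return []  # this plane has no data in the event
    start = data.index(marker) + 1  # first line after the marker
    # Collect lines until the next "---" marker or the end of the list
    return list(takewhile(lambda d: not d.startswith("---"), data[start:]))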
output:
for line in planeData(data,0): print(line)
# nothing printed
for line in planeData(data,5): print(line)
# Pix 49, 133
# Pix 48, 133
for line in planeData(data,6): print(line)
# Pix 109, 143
# Pix 108, 143
# Pix 108, 144
# Pix 109, 144
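When every device of an event is needed, calling planeData once per device rescans the list each time. As an alternative, here is a minimal sketch that groups the lines by device in a single pass. It assumes the input holds exactly one event and that device markers look like "--- N ---"; the names split_event and MARKER are hypothetical and not part of the original answer.

```python
import re

# Device header line, e.g. "--- 3 ---" (assumed format, taken from the sample event)
MARKER = re.compile(r"^--- (\d+) ---$")

def split_event(lines):
    """Group the lines of ONE event by device number."""
    devices = {}
    current = None  # stays None while reading the event header lines
    for line in lines:
        line = line.strip()
        match = MARKER.match(line)
        if match:
            # A new device block starts here
            current = int(match.group(1))
            devices[current] = []
        elif current is not None:
            # Data line belonging to the most recent device marker
            devices[current].append(line)
    return devices

event = """=== 169139 ===
Start: 4.80374e+19
End: 4.80374e+19
--- 1 ---
Pix 9, 66
--- 5 ---
Pix 49, 133
Pix 48, 133
--- 6 ---
Pix 109, 143""".split("\n")

devices = split_event(event)
print(devices.get(0, []))  # device 0 is absent from this event
print(devices[5])
print(devices[6])
```

Looking up a device with devices.get(dev, []) covers the "no data" case without raising, and the last device needs no special handling because the loop simply runs to the end of the list.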
Answer 2:
You could use the string method index on the whole event text (a single string rather than a list of lines):
Code
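def start_stop_dev(lst, dev):
    """Return the text block for device dev, or "" if the device is absent.

    lst is the whole event as one string (assuming you meant dev rather than plane).
    """
    try:
        start_reading = lst.index("--- " + str(dev) + " ---")
    except ValueError:
        return ""  # no such device in this event
    try:
        # Stop just before the newline that precedes the next device's marker
        stop_reading = lst.index("--- " + str(dev + 1) + " ---") - 1
    except ValueError:
        # Marker for dev+1 not found: read to the end of the event
        stop_reading = len(lst)
    return lst[start_reading:stop_reading]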
Test
lst= """=== 169139 ===
Start: 4.80374e+19
End: 4.80374e+19
--- 1 ---
Pix 9, 66
--- 2 ---
Pix 11, 31
Pix 12, 31
--- 3 ---
Pix 17, 53
Pix 16, 53
Pix 16, 54
--- 4 ---
Pix 44, 64
--- 5 ---
Pix 49, 133
Pix 48, 133
--- 6 ---
Pix 109, 143
Pix 108, 143
Pix 108, 144
Pix 109, 144"""
# Retrieve and print data for each device
print('----------------Individual Device String Info-------------')
for dev in range(7):
    print(f'device {dev}\n{start_stop_dev(lst, dev)}')
print('----------------Splits of String Info----------------------')
for dev in range(7):
    dev_lst = start_stop_dev(lst,dev).split("\n")
    print(f'dev {dev}: {dev_lst}')
Output
----------------Individual Device String Info-------------
device 0
device 1
--- 1 ---
Pix 9, 66
device 2
--- 2 ---
Pix 11, 31
Pix 12, 31
device 3
--- 3 ---
Pix 17, 53
Pix 16, 53
Pix 16, 54
device 4
--- 4 ---
Pix 44, 64
device 5
--- 5 ---
Pix 49, 133
Pix 48, 133
device 6
--- 6 ---
Pix 109, 143
Pix 108, 143
Pix 108, 144
Pix 109, 144
----------------Splits of String Info----------------------
dev 0: ['']
dev 1: ['--- 1 ---', 'Pix 9, 66']
dev 2: ['--- 2 ---', 'Pix 11, 31', 'Pix 12, 31']
dev 3: ['--- 3 ---', 'Pix 17, 53', 'Pix 16, 53', 'Pix 16, 54']
dev 4: ['--- 4 ---', 'Pix 44, 64']
dev 5: ['--- 5 ---', 'Pix 49, 133', 'Pix 48, 133']
dev 6: ['--- 6 ---', 'Pix 109, 143', 'Pix 108, 143', 'Pix 108, 144', 'Pix 109, 144']
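One caveat with start_stop_dev: it looks for the marker of device dev+1 specifically, so if that device is absent while a later one is present, the slice runs to the end of the event and includes the later devices' data. A possible variant, sketched below, stops at the next marker of any device instead. It assumes every marker starts a line with "--- "; the name device_text and the shortened sample event are hypothetical.

```python
def device_text(event, dev):
    """Return the text for device dev (marker line included), or "" if absent."""
    marker = f"--- {dev} ---"
    start = event.find(marker)
    if start == -1:
        return ""  # no such device in this event
    # Look for the next marker of ANY device; find returns -1 for the last block
    stop = event.find("\n--- ", start + len(marker))
    return event[start:] if stop == -1 else event[start:stop]

# Shortened sample event with gaps: devices 3 and 5 are missing
event = (
    "=== 1 ===\n"
    "Start: 0\n"
    "--- 2 ---\n"
    "Pix 11, 31\n"
    "Pix 12, 31\n"
    "--- 4 ---\n"
    "Pix 44, 64\n"
    "--- 6 ---\n"
    "Pix 109, 143"
)

for dev in range(7):
    print(f"dev {dev}: {device_text(event, dev).split(chr(10))}")
```

Because the end of the block is found relative to the start marker rather than by device number, the last device and devices followed by a gap are handled by the same code path.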
Source: https://stackoverflow.com/questions/61428166/extracting-specific-text-between-strings