I expect Perl to be faster. Just out of curiosity, can you try the following?
#!/usr/bin/python
import re
import glob
import sys
import os

exists_re = re.compile(r'^(.*?) INFO.*Such a record already exists', re.I)
location_re = re.compile(r'^AwbLocation (.*?) insert into', re.I)

for mask in sys.argv[1:]:
    for fname in glob.glob(mask):
        if os.path.isfile(fname):
            f = open(fname)
            for line in f:
                mex = exists_re.search(line)
                if mex:
                    xlogtime = mex.group(1)
                mloc = location_re.search(line)
                if mloc:
                    print fname, xlogtime, mloc.group(1)
            f.close()
Update in reaction to "it is too complex".
Of course it looks more complex than the Perl version. Perl was built around regular expressions, so you can hardly find an interpreted language that handles them faster. The Perl syntax...
while (<>) {
...
}
... also hides a lot of things that have to be done somehow in a more general language. On the other hand, it is quite easy to make the Python code more readable if you move the unreadable part out:
#!/usr/bin/python
import re
import glob
import sys
import os

def input_files():
    '''The generator loops through the files defined by masks from cmd.'''
    for mask in sys.argv[1:]:
        for fname in glob.glob(mask):
            if os.path.isfile(fname):
                yield fname

exists_re = re.compile(r'^(.*?) INFO.*Such a record already exists', re.I)
location_re = re.compile(r'^AwbLocation (.*?) insert into', re.I)

for fname in input_files():
    with open(fname) as f:     # Now the f.close() is done automatically
        for line in f:
            mex = exists_re.search(line)
            if mex:
                xlogtime = mex.group(1)
            mloc = location_re.search(line)
            if mloc:
                print fname, xlogtime, mloc.group(1)
Here the def input_files() could be placed elsewhere (say, in another module), and it can be reused. It is even possible to mimic Perl's while (<>) {...} easily, though not with the same syntax:
#!/usr/bin/python
import re
import glob
import sys
import os

def input_lines():
    '''The generator loops through the lines of the files defined by masks from cmd.'''
    for mask in sys.argv[1:]:
        for fname in glob.glob(mask):
            if os.path.isfile(fname):
                with open(fname) as f: # now the f.close() is done automatically
                    for line in f:
                        yield fname, line

exists_re = re.compile(r'^(.*?) INFO.*Such a record already exists', re.I)
location_re = re.compile(r'^AwbLocation (.*?) insert into', re.I)

for fname, line in input_lines():
    mex = exists_re.search(line)
    if mex:
        xlogtime = mex.group(1)
    mloc = location_re.search(line)
    if mloc:
        print fname, xlogtime, mloc.group(1)
Then the last for loop may look as simple (in principle) as Perl's while (<>) {...}. Such readability enhancements are more difficult in Perl.
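As a side note, the standard library already has something close to Perl's while (<>): the fileinput module chains the lines of several files into one stream. The following is only a minimal sketch of the same scan built on it, not a drop-in replacement for the scripts above. It is written for Python 3, the name scan_logs is my own (hypothetical) helper, and fileinput does not expand masks itself, so on a shell that leaves wildcards alone you would still need glob.glob().

```python
import fileinput
import re

exists_re = re.compile(r'^(.*?) INFO.*Such a record already exists', re.I)
location_re = re.compile(r'^AwbLocation (.*?) insert into', re.I)

def scan_logs(files):
    '''Yield (fname, xlogtime, location) for each AwbLocation line.

    scan_logs is a hypothetical helper name, used here for illustration only.
    '''
    # Unlike the scripts above, xlogtime starts as None, so an AwbLocation
    # line seen before any "already exists" line does not raise NameError.
    xlogtime = None
    # FileInput chains the lines of all the given files, much like Perl's <>.
    with fileinput.FileInput(files=files) as lines:
        for line in lines:
            mex = exists_re.search(line)
            if mex:
                xlogtime = mex.group(1)
            mloc = location_re.search(line)
            if mloc:
                # filename() reports the file the current line came from.
                yield lines.filename(), xlogtime, mloc.group(1)
```

It would be driven by something like for fname, xlogtime, loc in scan_logs(sys.argv[1:]): print(fname, xlogtime, loc). I would not expect it to be faster than the generators above; it only saves writing the file loop by hand.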
Anyway, none of this will make the Python program faster. Perl will still be faster here, because Perl is a file/text cruncher. But, in my opinion, Python is a better programming language for more general purposes.