Question
I have 243607 IPs in a log file. My function prints the unique IPs in one continuous run, so I can't check whether they really are unique. I want each IP printed on a separate line. I'm new to Python and can't figure out how to do it. Is there a way?
I also want the count of the IPs printed.
def unique_ips():
    f = open('epiclogs.txt', 'r')
    ips = set()
    for line in f:
        if not line.isspace():
            ip = line.split()[0]
            ips.add(ip)
    return ips

if __name__ == '__main__':
    print(unique_ips())
Answer 1:
The requirements are incomplete:
- The format of the log file is unknown.
- The format of the output file is unspecified (e.g. should it be sorted?).
My assumptions
- The IP addresses are located in the first column
- The output format should be '[count] [ip address]'
Test data
10.1.10.190 http://example.com/t1 404
10.1.10.171 http://example.com/t1 404
10.1.10.180 http://example.com/t2 200
10.1.10.190 http://example.com/t1 404
10.1.11.180 http://example.com/t3 302
Program
#!/usr/bin/env python
#
# Counts the IP addresses of a log file.
#
# Assumption: the IP address is logged in the first column.
# Example line: 10.1.10.190 http://example.com/t1 404
#
import sys

def extract_ip(line):
    '''Extracts the IP address from the line.
    Currently it is assumed that the IP address is logged in
    the first column and the columns are space separated.'''
    return line.split()[0]

def increase_count(ip_dict, ip_addr):
    '''Increases the count of the IP address.
    If an IP address is not in the given dictionary,
    it is initially created and the count is set to 1.'''
    if ip_addr in ip_dict:
        ip_dict[ip_addr] += 1
    else:
        ip_dict[ip_addr] = 1

def read_ips(infilename):
    '''Read the IP addresses from the file and store (count)
    them in a dictionary - returns the dictionary.'''
    res_dict = {}
    # open() replaces the Python 2-only file(); "with" closes the file.
    with open(infilename) as log_file:
        for line in log_file:
            if line.isspace():
                continue
            ip_addr = extract_ip(line)
            increase_count(res_dict, ip_addr)
    return res_dict

def write_ips(outfilename, ip_dict):
    '''Write out the count and the IP addresses.'''
    # items() works in both Python 2 and 3 (iteritems() is Python 2 only).
    with open(outfilename, "w") as out_file:
        for ip_addr, count in ip_dict.items():
            out_file.write("%5d\t%s\n" % (count, ip_addr))

def parse_cmd_line_args():
    '''Return the in and out file name.
    If there are more or less than two parameters,
    a usage message is printed and the program exits.'''
    if len(sys.argv) != 3:
        print("Usage: %s [infilename] [outfilename]" % sys.argv[0])
        sys.exit(1)
    return sys.argv[1], sys.argv[2]

def main():
    infilename, outfilename = parse_cmd_line_args()
    ip_dict = read_ips(infilename)
    write_ips(outfilename, ip_dict)

if __name__ == "__main__":
    main()
Comment
I like small functions - each of them does exactly one thing. IMHO this makes the program easier to understand.
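As a possible alternative to the hand-written dictionary counting above, here is a minimal sketch using collections.Counter from the standard library. It is not part of the original answer: the function names count_ips and format_counts are made up for illustration, and the sort order (most frequent first, ties broken by the IP string) is only an assumption, since the question does not say how the output should be sorted. It keeps the same assumption that the IP is the first space-separated column.

```python
from collections import Counter


def count_ips(lines):
    """Count how often each first-column value (assumed to be the IP) occurs."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank or whitespace-only lines
            counts[fields[0]] += 1
    return counts


def format_counts(counts):
    """Return '[count]<TAB>[ip]' rows, most frequent first, ties ordered by IP string."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ["%5d\t%s" % (count, ip) for ip, count in ordered]


# The test data from this answer, used as an in-memory stand-in for the log file.
sample = [
    "10.1.10.190 http://example.com/t1 404",
    "10.1.10.171 http://example.com/t1 404",
    "10.1.10.180 http://example.com/t2 200",
    "10.1.10.190 http://example.com/t1 404",
    "10.1.11.180 http://example.com/t3 302",
]

for row in format_counts(count_ips(sample)):
    print(row)
```

Because count_ips accepts any iterable of lines, an open file object could be passed in place of the sample list.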
Answer 2:
I haven't checked whether your code works, but I added a few lines to it that should accomplish your task.
Try this:
def unique_ips():
    f = open('epiclogs.txt', 'r')
    fout = open('uniqueip.txt', 'w')  # Added
    ips = set()
    for line in f:
        if not line.isspace():
            ip = line.split()[0]
            if ip not in ips:  # write each IP only the first time it is seen
                ips.add(ip)
                fout.write("%s\n" % ip)  # Added
    f.close()  # Added
    fout.flush()  # Added
    fout.close()  # Added
    return ips

if __name__ == '__main__':
    print(unique_ips())
Answer 3:
unique_ips() returns a set, which means each IP address appears only once. If you want to see the addresses line by line in a file, you can replace the line that prints unique_ips() with:
if __name__ == '__main__':
    # open() replaces the Python 2-only file(); "with" closes the file.
    with open('ip_addresses', 'w') as f:
        for ip in unique_ips():
            f.write(ip + '\n')
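If the goal is just to see the addresses on screen, one per line, together with the count the question asks for, something along these lines might do. It is a sketch, not part of the answer: format_unique is a made-up name, the "Total unique IPs" label is arbitrary, and the IPs are sorted as plain strings (not numerically) purely so the output is stable.

```python
def format_unique(ips):
    """Return one IP per line (sorted as plain strings) followed by a count line."""
    ordered = sorted(ips)
    return "\n".join(ordered + ["Total unique IPs: %d" % len(ordered)])


# Hypothetical stand-in for the set returned by unique_ips().
example_ips = {"10.1.10.190", "10.1.10.171", "10.1.10.180"}
print(format_unique(example_ips))
```

With the real function this would be called as print(format_unique(unique_ips())). Since a set holds each value only once, the length of the set is the number of distinct IPs.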
Source: https://stackoverflow.com/questions/9630096/how-to-print-the-output-returned-from-a-function-in-new-lines-using-python