Extract column from CSV file to use as nodelist in NetworkX

Question

I have a CSV file with two columns: user and locations. I want to create two lists, one with only the users and the other with only the locations, so that I can pass them to the draw_networkx_nodes(nodelist=...) function in networkx. The goal is to draw users and locations as separate sets of nodes with different shapes and colors (all users will be blue boxes and all locations will be red circles).

Also, my CSV file has a header row, and I do not want the header names to end up in either list.

Answer 1:

Since you provided neither sample input nor expected output, I am making some assumptions about them. Assume that the input file is called data.csv:

user,location
john,seattle
alan,los angeles
trish,new york

The script that splits the CSV into two files is called csv_split.py:

import csv

with open('data.csv') as csv_in,          \
     open('users.txt', 'w') as users_out, \
     open('locations.txt', 'w') as locations_out:
    csv_dict_reader = csv.DictReader(csv_in)
    for line in csv_dict_reader:
        users_out.write(line['user'] + '\n')
        locations_out.write(line['location'] + '\n')

Discussion

My code is for demonstration purposes only, so it does no error checking.
The csv.DictReader() class treats the first line as the header and uses its field names as the keys for each row.
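
Since the question asks for two lists rather than two files, the same DictReader approach can collect the columns in memory. The following is a minimal sketch, not part of the original answer: the helper name read_columns is hypothetical, and the sample data is held in an io.StringIO object instead of data.csv so the snippet runs on its own.

```python
import csv
import io

def read_columns(csv_file):
    """Return (users, locations) from an open CSV file that has a header row."""
    users, locations = [], []
    # DictReader consumes the header line, so it never appears in the lists
    for row in csv.DictReader(csv_file):
        users.append(row['user'])
        locations.append(row['location'])
    return users, locations

# Sample rows from this answer, kept in memory (assumption: same column names)
sample = io.StringIO(
    "user,location\n"
    "john,seattle\n"
    "alan,los angeles\n"
    "trish,new york\n"
)
users, locations = read_columns(sample)
print(users)
print(locations)
```

To read a real file, pass an open file object instead, for example with open('data.csv', newline='') as f: users, locations = read_columns(f).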

Answer 2:

Building on top of Hai Vu's answer:

import csv

def reader(filename):
    # Yield every line of the file except the header (line 0)
    with open(filename) as f:
        for lineno, line in enumerate(f):
            if lineno > 0:
                yield line

filename = "locations.csv"
# zip(*rows) transposes the rows into one tuple per column
users, locations = zip(*csv.reader(reader(filename)))
print("users     =", users)
print("locations =", locations)

Gives:

users     = ('john', 'alan', 'trish', 'jack')
locations = ('seattle', 'los angeles', 'new york', 'london')

From:

user,location
john,seattle
alan,los angeles
trish,new york
jack,london
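
To connect these lists back to the drawing goal in the question, here is a rough sketch of how they might be passed to draw_networkx_nodes. It rests on several assumptions that are not in the original answers: the helper names load_edges and draw_users_and_locations are hypothetical, each CSV row is treated as an edge between a user and a location, no user shares a name with a location, and both networkx and matplotlib are installed.

```python
import csv
import io

def load_edges(csv_file):
    """Return a list of (user, location) pairs, skipping the header row."""
    rows = csv.reader(csv_file)
    next(rows, None)  # discard the header line
    return [(user, location) for user, location in rows]

def draw_users_and_locations(edges, out_path="graph.png"):
    """Draw users as blue squares and locations as red circles."""
    # Third-party imports live here so load_edges works without them
    import matplotlib.pyplot as plt
    import networkx as nx

    users = sorted({user for user, _ in edges})
    locations = sorted({location for _, location in edges})

    G = nx.Graph()
    G.add_edges_from(edges)  # assumption: one edge per CSV row
    pos = nx.spring_layout(G, seed=0)

    # One call per node group, each with its own nodelist, color, and shape
    nx.draw_networkx_nodes(G, pos, nodelist=users,
                           node_color="blue", node_shape="s")
    nx.draw_networkx_nodes(G, pos, nodelist=locations,
                           node_color="red", node_shape="o")
    nx.draw_networkx_edges(G, pos)
    nx.draw_networkx_labels(G, pos)
    plt.axis("off")
    plt.savefig(out_path)

sample = io.StringIO(
    "user,location\n"
    "john,seattle\n"
    "alan,los angeles\n"
    "trish,new york\n"
    "jack,london\n"
)
edges = load_edges(sample)
print(edges)
# draw_users_and_locations(edges)  # needs networkx and matplotlib
```

The sets are sorted only to make the node order predictable; draw_networkx_nodes accepts any list of nodes that are present in the graph.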

Source: https://stackoverflow.com/questions/15009615/extract-column-from-csv-file-to-use-as-nodelist-in-networkx

Tags

python

csv

header

networkx

nodelist