Adding UUID for each row being imported from a CSV file

忘掉有多难 2021-01-14 20:05

We want to import 100,000 rows from a .csv file into a Cassandra table.

The rows have no unique value, so we want to add a UUID to each imported row.

1 Answer
  • 2021-01-14 20:41

    There's no way to do that directly with CQL's COPY command. Instead, you could process the CSV file outside of Cassandra first.

    For example, here's a Python script that reads in.csv, appends a UUID column to each row (treating the first row as a header), and writes the result to out.csv:

    #!/usr/bin/env python3
    # Read in.csv and write out.csv with one extra column holding a UUID.
    
    import csv
    import uuid
    
    # newline='' lets the csv module handle line endings itself (Python 3).
    with open('in.csv', newline='') as fin, open('out.csv', 'w', newline='') as fout:
        reader = csv.reader(fin, delimiter=',', quotechar='"')
        writer = csv.writer(fout, delimiter=',', quotechar='"')
    
        firstrow = True
        for row in reader:
            if firstrow:
                # The first row is assumed to be the header.
                row.append('UUID')
                firstrow = False
            else:
                row.append(str(uuid.uuid4()))
            writer.writerow(row)
    

    The resulting file could be imported using CQL COPY (after you've created your schema accordingly). If you use this example, make sure to read up on Python's uuid functions to choose the one you need (probably uuid1 or uuid4).
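
    To make the uuid1/uuid4 choice concrete, here is a minimal sketch of the difference, using only Python's standard uuid module. The variable names are made up for illustration. Which one fits depends on your schema: this assumes that a column declared as timeuuid needs a version 1 value, while a plain uuid column can hold either.

```python
import uuid

# uuid1: version 1, built from a timestamp and the host's node ID.
time_based = uuid.uuid1()

# uuid4: version 4, built from random bits.
random_based = uuid.uuid4()

# Each UUID reports its own version; str() gives the 36-character text form
# that ends up in the CSV column.
print(time_based.version, str(time_based))
print(random_based.version, str(random_based))
```

    If you need version 1 values, swapping uuid.uuid4() for uuid.uuid1() in the script above should be the only change required.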
