Question
I'm trying to load a .csv file into an Apache Cassandra database with Python. The COPY command passed to session.execute doesn't seem to work: it raises an "unexpected indent" error at =','. I read up on this and found that COPY is not supported when used this way.
In this script, time_test and p are two float variables:
from cassandra.cluster import Cluster
cluster = Cluster()
session = cluster.connect('myKEYSPACE')
rows = session.execute('COPY table_test (time_test, p)
from'/home/mypc/Desktop/testfile.csv' with delimiter=',' and header=true;
')
print('DONE')
Thank you for your help!
Answer 1:
The main problem here is that COPY is not a CQL command but a cqlsh command, so it can't be executed via session.execute.
I recommend using DSBulk to load data into Cassandra: it's very flexible, performant, and doesn't require programming. In the simplest case, when the columns in the CSV header map directly onto the column names in the database, the command line is very simple:
dsbulk load -url file.csv -k keyspace -t table -header true
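If the load has to stay inside the Python script, one possible alternative is to read the CSV with the standard csv module and insert each row through a prepared statement. This is only a minimal sketch, not part of the original answer: it assumes the CSV header uses the column names time_test and p, reuses the table and keyspace names from the question, and the helper names read_rows, load_csv and main are made up for illustration. Row-by-row inserts like this are likely to be much slower than DSBulk on large files.

```python
import csv


def read_rows(path):
    """Yield (time_test, p) float pairs from a CSV file that has a header row."""
    with open(path, newline="") as f:
        # Assumption: the header names match the table columns exactly.
        for record in csv.DictReader(f, delimiter=","):
            yield float(record["time_test"]), float(record["p"])


def load_csv(session, path):
    """Insert every CSV row via a prepared statement; return the number of rows."""
    insert = session.prepare(
        "INSERT INTO table_test (time_test, p) VALUES (?, ?)"
    )
    count = 0
    for time_test, p in read_rows(path):
        session.execute(insert, (time_test, p))
        count += 1
    return count


def main():
    # Imported here so the helpers above can be used without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster()
    session = cluster.connect('myKEYSPACE')  # keyspace name taken from the question
    loaded = load_csv(session, '/home/mypc/Desktop/testfile.csv')
    print('DONE, rows inserted:', loaded)
    cluster.shutdown()
```

Calling main() runs the load against a local cluster. load_csv only needs an object with prepare and execute methods, so it can be exercised with a stand-in session before pointing it at a real database.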
There is a series of blog posts about DSBulk that covers a lot of topics:
- https://www.datastax.com/blog/2019/03/datastax-bulk-loader-introduction-and-loading
- https://www.datastax.com/blog/2019/04/datastax-bulk-loader-more-loading
- https://www.datastax.com/blog/2019/04/datastax-bulk-loader-common-settings
- https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
- https://www.datastax.com/blog/2019/07/datastax-bulk-loader-counting
- https://www.datastax.com/blog/2019/12/datastax-bulk-loader-examples-loading-other-locations
Source: https://stackoverflow.com/questions/64666874/problem-to-load-csv-files-into-apache-cassandra-with-python