What is the most efficient way to insert multiple rows into a Cassandra column family? Is it possible to do this in a single call?
Right now my approach is to add insertions for multiple columns and then execute, so a single call persists one row. I am looking for a strategy that lets me do a batch insert.
CQL contains a BEGIN BATCH...APPLY BATCH statement that allows you to group multiple inserts so that a developer can create and execute a series of requests (see http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0).
The following worked for me (Java):
// Each fragment ends with a space so the concatenated CQL keeps its keywords separated.
PreparedStatement ps = session.prepare(
    "BEGIN BATCH " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "APPLY BATCH");
// Bind the values for all three inserts in order, then run the batch in one call.
session.execute(ps.bind(uid, mid1, title1, body1, uid, mid2, title2, body2, uid, mid3, title3, body3));
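When the number of rows varies, the same BEGIN BATCH...APPLY BATCH text can be assembled in a loop instead of being written out by hand. The following is a minimal sketch in plain Java with no driver dependency; the class name BatchCql and the method buildBatch are hypothetical names introduced here for illustration, not part of any Cassandra API.

```java
final class BatchCql {
    // Hypothetical helper: repeats one parameterized INSERT `rows` times
    // between BEGIN BATCH and APPLY BATCH, one statement per line.
    static String buildBatch(String insert, int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        StringBuilder cql = new StringBuilder("BEGIN BATCH\n");
        for (int i = 0; i < rows; i++) {
            // The newline after each statement keeps keywords from running together.
            cql.append("  ").append(insert).append(";\n");
        }
        return cql.append("APPLY BATCH").toString();
    }

    public static void main(String[] args) {
        String insert =
            "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?)";
        // Prints a three-row batch that could be passed to session.prepare(...).
        System.out.println(buildBatch(insert, 3));
    }
}
```

The resulting string would then be prepared and bound exactly as in the snippet above, with four bind values per row.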
If you don't know in advance which statements you want to execute, you can use the following syntax (Scala):
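val statement: PreparedStatement = session.prepare("INSERT INTO people (name,age) VALUES (?,?)")
val batchStmt = new BatchStatement()
// statement.bind creates a new BoundStatement on each call; re-binding one shared
// BoundStatement would add the same object twice, holding only the last values.
// The bound values must match the column types of the people table.
batchStmt.add(statement.bind("User A", "10"))
batchStmt.add(statement.bind("User B", "12"))
session.execute(batchStmt)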
Note: BatchStatement can only hold up to 65536 statements. I learned that the hard way. :-)
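One way to stay under that limit is to split the work into several smaller batches before executing them. Below is a minimal sketch in plain Java; BatchChunks and chunk are hypothetical names, the items stand in for whatever you would add to a BatchStatement, and the chunk size of 2 in main is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BatchChunks {
    // Hypothetical helper: splits items into consecutive lists of at most
    // maxPerBatch elements, preserving the original order.
    static <T> List<List<T>> chunk(List<T> items, int maxPerBatch) {
        if (maxPerBatch < 1) {
            throw new IllegalArgumentException("maxPerBatch must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("User A", "User B", "User C", "User D", "User E");
        // Each inner list would become one BatchStatement in real code.
        for (List<String> batch : chunk(rows, 2)) {
            System.out.println(batch);
        }
    }
}
```

In driver code, each chunk would be bound and added to its own BatchStatement, and each batch executed separately.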
PreparedStatement and binding values might be a better option. Below are a couple of good articles on uses and misuses of Batch:
There is a batch insert operation in Cassandra. You can batch together inserts, even in different column families, to make insertion more efficient.
In Hector, you can use HFactory.createMutator, then use the add methods on the returned Mutator to add operations to your batch. When ready, call execute().
If you're using CQL, then you group things into a batch by starting the batch with BEGIN BATCH and ending with APPLY BATCH.
You can add your multiple insert statements to a file and execute the file with 'cqlsh -f'.
You can also perform a batch insert with CQL in Cassandra, as described at the link below: http://www.datastax.com/documentation/cassandra/1.2/index.html#cassandra/cql_reference/batch_r.html
Source: https://stackoverflow.com/questions/17885238/how-to-multi-insert-rows-in-cassandra