COPY cassandra table from csv file

花落未央 2021-02-15 11:46

I'm setting up a demo landscape for Cassandra, Apache Spark and Flume on my Mac (Mac OS X Yosemite with Oracle jdk1.7.0_55). The landscape shall work as a proof of concept for

2 Answers
  • 2021-02-15 12:44

    Loading a CSV file into a Cassandra table

    Step 1) Download cassandra-loader from this URL:
    sudo wget https://github.com/brianmhess/cassandra-loader/releases/download/v0.0.23/cassandra-loader

    Step 2) Make it executable:
    sudo chmod +x cassandra-loader

    The example below uses the following names:

    a) CSV file name: "pt_bms_tkt_success_record_details_new_2016_12_082017-01-0312-30-01.csv"

    b) Keyspace name: "bms_test"

    c) Table name: "pt_bms_tkt_success_record_details_new"

    d) Columns: "trx_id......trx_day"

    Step 3) The CSV file and cassandra-loader are both located in "cassandra3.7/bin/".

    Step 4) Run the loader from that directory, passing the file, the host, and the full schema (keyspace.table followed by the column list):
    [stp@ril-srv-sp3 bin]$ ./cassandra-loader -f pt_bms_tkt_success_record_details_new_2016_12_082017-01-0312-30-01.csv -host 192.168.1.29 -schema "bms_test.pt_bms_tkt_success_record_details_new(trx_id,max_seq,trx_type,trx_record_type,trx_date,trx_show_date,cinema_str_id,session_id,ttype_code,item_id,item_var_sequence,trx_booking_id,venue_name,screen_by_tnum,price_group_code,area_cat_str_code,area_by_tnum,venue_capacity,amount_currentprice,venue_class,trx_booking_status_committed,booking_status,amount_paymentstatus,event_application,venue_cinema_companyname,venue_cinema_name,venue_cinema_type,venue_cinema_application,region_str_code,venue_city_name,sub_region_str_code,sub_region_str_name,event_code,event_type,event_name,event_language,event_genre,event_censor_rating,event_release_date,event_producer_code,event_item_name,event_itemvariable_name,event_quantity,amount_amount,amount_bookingfee,amount_deliveryfee,amount_additionalcharges,amount_final,amount_tax,offer_isapplied,offer_type,offer_name,offer_amount,payment_lastmode,payment_lastamount,payment_reference1,payment_reference2,payment_bank,customer_loginid,customer_loginstring,offer_referral,customer_mailid,customer_mobile,trans_str_sales_status_at_venue,trans_mny_trans_value_at_venue,payment_ismypayment,click_recordsource,campaign,source,keyword,medium,venue_multiplex,venue_state,mobile_type,transaction_range,life_cyclestate_from,transactions_after_offer,is_premium_transaction,city_type,holiday_season,week_type,event_popularity,transactionrange_after_discount,showminusbooking,input_source_name,channel,time_stamp,life_cyclestate_to,record_status,week_name,number_of_active_customers,event_genre1,event_genre2,event_genre3,event_genre4,event_language1,event_language2,event_language3,event_language4,event_release_date_range,showminusbooking_range,reserve1,reserve2,reserve3,reserve4,reserve5,payment_mode,payment_type,date_of_first_transaction,transaction_time_in_hours,showtime_in_hours,trx_day)";
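
    With this many columns, typing the -schema argument by hand is error-prone. The following is a minimal Python sketch, not part of cassandra-loader, that assembles the same command from a column list. The helper name build_loader_command is hypothetical, it uses only the -f, -host and -schema flags shown above, and it assumes the columns are listed in the order they appear in the CSV file.

```python
import shlex


def build_loader_command(csv_path, host, keyspace, table, columns):
    """Return the cassandra-loader argument list for one CSV file.

    Hypothetical helper: it only reproduces the -f, -host and -schema
    flags used in the command above.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # -schema has the form keyspace.table(col1,col2,...)
    schema = f"{keyspace}.{table}({','.join(columns)})"
    return ["./cassandra-loader", "-f", csv_path, "-host", host, "-schema", schema]


if __name__ == "__main__":
    cmd = build_loader_command(
        "pt_bms_tkt_success_record_details_new_2016_12_082017-01-0312-30-01.csv",
        "192.168.1.29",
        "bms_test",
        "pt_bms_tkt_success_record_details_new",
        # Shortened for illustration; a real run lists every column.
        ["trx_id", "max_seq", "trx_type"],
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

    The returned list can also be passed directly to subprocess.run, which avoids shell quoting issues with the parentheses in the schema string.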

  • 2021-02-15 12:45

    cqlsh's COPY command can be touchy. However, the COPY documentation includes this requirement:

    The number of columns in the CSV input is the same as the number of columns in the Cassandra table metadata.

    Keeping that in mind, I did manage to get your data to import with a COPY FROM by also naming the columns whose fields are empty in the CSV (processstarttimeuuid and processendtimeuuid):

    aploetz@cqlsh:stackoverflow> COPY process (processuuid, processid, processnumber, 
    processname, processstarttime, processstarttimeuuid, processendtime, 
    processendtimeuuid, processstatus, orderer, vorgangsnummer, vehicleid, fin, reference, 
    referencetype) FROM 'Process_BulkData.csv' WITH DELIMITER = ';' AND HEADER = TRUE;
    
    1 rows imported in 0.018 seconds.
    aploetz@cqlsh:stackoverflow> SELECT * FROM process ;
    
     processuuid                          | fin               | orderer | processendtime            | processendtimeuuid | processid         | processname        | processnumber | processstarttime          | processstarttimeuuid | processstatus | reference  | referencetype | vehicleid | vorgangsnummer
    --------------------------------------+-------------------+---------+---------------------------+--------------------+-------------------+--------------------+---------------+---------------------------+----------------------+---------------+------------+---------------+-----------+----------------
     0f0d1498-d149-4fcc-87c9-f12783fdf769 | WAU2345CX67890876 |    SIXT | 2011-02-16 22:05:00+-0600 |               null | AbmeldungKl‰rfall | Abmeldung Kl‰rfall |             1 | 2011-02-02 22:05:00+-0600 |                 null |      Finished | KLA-BR4278 |      internal |    A-XA 1 |           4278
    
    (1 rows)
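
    Because COPY FROM expects every row to carry one field per named column, it can help to check the file before importing. The following is a minimal Python sketch, assuming a ';'-delimited file with a header row as in the command above. The function names and the sample data are hypothetical and not part of cqlsh.

```python
import csv
import io


def check_column_count(csv_text, expected_columns, delimiter=";"):
    """Return (line_number, field_count) for each row with the wrong field count."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    expected = len(expected_columns)
    mismatches = []
    for row in reader:
        # Skip blank lines; an empty field between delimiters still counts.
        if row and len(row) != expected:
            mismatches.append((reader.line_num, len(row)))
    return mismatches


def build_copy_statement(table, columns, csv_path, delimiter=";", header=True):
    """Build a cqlsh COPY FROM statement naming every column explicitly."""
    header_flag = "TRUE" if header else "FALSE"
    return (
        f"COPY {table} ({', '.join(columns)}) FROM '{csv_path}' "
        f"WITH DELIMITER = '{delimiter}' AND HEADER = {header_flag};"
    )


if __name__ == "__main__":
    # Hypothetical sample: line 2 has an empty third field, line 3 lacks it.
    sample = "id;name;note\n1;a;\n2;b\n"
    print(check_column_count(sample, ["id", "name", "note"]))
    print(build_copy_statement("process", ["processuuid", "processid"], "Process_BulkData.csv"))
```

    A row such as "1;a;" passes the check because the trailing empty field still counts as a column, which is why naming the empty columns in the COPY statement makes the import succeed.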
    