Solutions to put different values for a row-key but the same timestamps in hbase?


I'm new to HBase. I'm facing a problem when bulk loading data from a text file into HBase. Assume I have the following table:

Key_id | f1:c1 | f2:c2
row1


                      
2 Answers

攒了一身酷 (2021-01-14 02:34)

Q1: HBase maintains cell versions using timestamps. If you don't provide a timestamp, HBase assigns a default one (the current time on the server).

In the put request you can also set a custom timestamp if you have such a requirement. It does not affect performance.
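
To make the versioning behaviour concrete, here is a toy model in plain Java. It is not HBase code: the class `VersionedCell` and its methods are hypothetical names invented for illustration, and it prunes old versions at write time, which is a simplification. It shows that two writes to the same cell with the same timestamp leave only the last value, while different timestamps are kept as separate versions.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one cell (one row key plus one column); NOT the HBase API. */
class VersionedCell {
    // Versions keyed by timestamp, sorted newest-first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
    private final int maxVersions;

    VersionedCell(int maxVersions) {
        if (maxVersions <= 0) {
            throw new IllegalArgumentException("maxVersions must be positive");
        }
        this.maxVersions = maxVersions;
    }

    /** Stores a value at an explicit timestamp; an equal timestamp replaces the old value. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Drop the oldest versions beyond the configured limit.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    /** Returns the value with the highest timestamp, or null if the cell is empty. */
    String latest() {
        Map.Entry<Long, String> newest = versions.firstEntry();
        return newest == null ? null : newest.getValue();
    }

    /** Number of versions currently retained. */
    int versionCount() {
        return versions.size();
    }
}
```

So if every line of the input file must survive as its own version, each put for the same row and column needs a distinct timestamp; if only the last value matters, a shared timestamp or a limit of one version is enough.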

Q2: You can do it in two ways:

1) A simple Java client with the batching technique shown below.
2) MapReduce ImportTsv (batch client).

Ex. #1: Simple Java client with the batching technique.

I used HBase puts in batched List objects of 100000 records while parsing JSON (similar to your standalone CSV client).

Below is the code snippet through which I achieved this. The same thing can be done while parsing other formats as well.

You may need to call this method in two places:

1) With each full batch of 100000 records.

2) To process the remainder, when fewer than 100000 records are left.

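  // Writes one batch of puts to the table, then clears the list so the caller can reuse it.
  // HBaseConnection, getTable and LOG are helpers of the surrounding class (not shown).
  public void addRecord(final ArrayList<Put> puts, final String tableName) throws Exception {
        if (puts == null || puts.isEmpty()) {
            return; // nothing to write
        }
        final int count = puts.size();
        // try-with-resources closes the table even if put() fails.
        try (HTable table = new HTable(HBaseConnection.getHBaseConfiguration(), getTable(tableName))) {
            table.put(puts);
            LOG.info("INSERT record[s] " + count + " to table " + tableName + " OK.");
        } catch (final Throwable e) {
            LOG.error("INSERT of " + count + " record[s] to table " + tableName + " failed", e);
        } finally {
            LOG.info("Processed ---> " + count);
            puts.clear();
        }
    }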
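
The two call sites described above (once per full batch, once for the remainder) can be sketched with a small generic helper. This is only an illustration in plain Java: `BatchWriter` and its `sink` callback are hypothetical names, and the sink is assumed to be a wrapper around a method such as addRecord that handles its own exceptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Collects items and hands them to a sink in fixed-size batches. */
class BatchWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumed: wraps the real write, e.g. addRecord(...)
    private final List<T> buffer = new ArrayList<>();

    BatchWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Call site 1: flushes automatically once a full batch has accumulated. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Call site 2: invoke once after the input is exhausted to write the remainder. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Pass a copy so the sink may clear or keep the list it receives.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a batch size of 100000, the parsing loop would call add for every record and call flush once at the end, so a trailing partial batch is not lost.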


Note: Internally, the batch size is controlled by hbase.client.write.buffer, which you can set as below in one of your config XML files:

<property>
         <name>hbase.client.write.buffer</name>
         <value>20971520</value> <!-- 20 MB (20 * 1024 * 1024 bytes) -->
 </property>


The default value of this property is 2 MB (2097152 bytes). Once the buffer is full, all buffered puts are flushed and actually inserted into your table.
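
Because the property is given in bytes, it is easy to be off by a factor of ten: 2 MB is 2097152 bytes, while 20971520 bytes is 20 MB. The following plain-Java sketch does the arithmetic and gives a rough idea of how many puts fit into the buffer before a flush. `WriteBufferMath` is a hypothetical helper, and the average put size is an assumed figure you would have to measure for your own data.

```java
/** Back-of-the-envelope arithmetic for a byte-sized write buffer; not HBase code. */
class WriteBufferMath {
    /** Converts megabytes (1 MB = 1024 * 1024 bytes) to bytes. */
    static long mbToBytes(long mb) {
        return mb * 1024L * 1024L;
    }

    /** Rough number of puts that fit into the buffer before it fills up. */
    static long putsPerFlush(long bufferBytes, long avgPutBytes) {
        if (avgPutBytes <= 0) {
            throw new IllegalArgumentException("avgPutBytes must be positive");
        }
        return bufferBytes / avgPutBytes;
    }
}
```

For example, with an assumed average of 1024 bytes per put, a 2 MB buffer holds about 2048 puts, far fewer than a client-side list of 100000 records.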


  Furthermore, whether you use the MapReduce client or the standalone client
  with the batch technique, batching is controlled by the buffer property above.

                                                                        
                                                        
            
            
              
                
北海茫月 (2021-01-14 02:57)

If you need to overwrite a record, you can configure the HBase table to keep only one version (set VERSIONS to 1 on the column family).

This page explains how to do Bulk loading to hbase at maximum possible speed:

How to use hbase bulk loading and why
                                                                        
                                                        
            
            
              
                