I'm looking for ways to reduce memory consumption by SQLite3 in my application.
At each execution it creates a table with the following schema:
(main TEXT NOT NULL PRIMARY KEY UNIQUE, count INTEGER DEFAULT 0)
It seems that the high memory consumption may be caused by too many operations being concentrated in one big transaction. Committing smaller transactions, e.g. one per 1M operations, may help: 5M operations per transaction consumes too much memory.
However, you'd have to balance operation speed against memory usage.
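As a rough sketch of what per-batch commits could look like with the C API (do_one_operation and BATCH_SIZE are placeholders, not from your code; error handling omitted):
#include <sqlite3.h>

#define BATCH_SIZE 1000000  /* hypothetical: 1M operations per transaction */

extern void do_one_operation(sqlite3 *db, long i);  /* placeholder for your INSERT/UPDATE */

void run_batched(sqlite3 *db, long total_ops)
{
    sqlite3_exec(db, "BEGIN", 0, 0, 0);
    for (long i = 0; i < total_ops; i++) {
        do_one_operation(db, i);
        if ((i + 1) % BATCH_SIZE == 0) {   /* commit this batch, start the next */
            sqlite3_exec(db, "COMMIT", 0, 0, 0);
            sqlite3_exec(db, "BEGIN", 0, 0, 0);
        }
    }
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
}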
If smaller transactions are not an option, PRAGMA shrink_memory may be a choice.
Use sqlite3_status() with SQLITE_STATUS_MEMORY_USED to trace the dynamic memory allocation and locate the bottleneck.
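For example, a minimal sketch that samples the counters (call it around your batches to see where usage spikes):
#include <sqlite3.h>
#include <stdio.h>

void report_memory(void)
{
    int cur = 0, hiwater = 0;
    sqlite3_status(SQLITE_STATUS_MEMORY_USED, &cur, &hiwater, 0 /* don't reset high-water */);
    printf("memory used: %d bytes (high-water mark: %d)\n", cur, hiwater);
}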
I would:
PRAGMA locking_mode = EXCLUSIVE;
if you can. Also (I'm not sure if you know this), PRAGMA cache_size is in pages, not in MBs. Make sure you define your target memory as PRAGMA cache_size * PRAGMA page_size, or, in SQLite >= 3.7.10, you can also do PRAGMA cache_size = -kibibytes. Setting it to 1 M(illion) pages would result in 1 or 2 GB.
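For example, a sketch assuming a 4096-byte page size and a made-up 100 MiB budget:
#include <sqlite3.h>

void cap_cache(sqlite3 *db)
{
    sqlite3_exec(db, "PRAGMA cache_size = 25600;", 0, 0, 0);   /* 25600 pages * 4096 B = 100 MiB */
    /* or, with SQLite >= 3.7.10, give the budget directly in KiB: */
    sqlite3_exec(db, "PRAGMA cache_size = -102400;", 0, 0, 0); /* 102400 KiB = 100 MiB */
}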
I'm curious how cache_size helps with INSERTs, though...
You can also try to benchmark whether PRAGMA temp_store = FILE; makes a difference.
And of course, whenever your database is not being written to:
PRAGMA shrink_memory;
VACUUM;
Depending on what you're doing with the database, these might also help:
PRAGMA auto_vacuum = 1|2;
PRAGMA secure_delete = ON;
I ran some tests with the following pragmas:
busy_timeout=0;
cache_size=8192;
encoding="UTF-8";
foreign_keys=ON;
journal_mode=WAL;
legacy_file_format=OFF;
synchronous=NORMAL;
temp_store=MEMORY;
INSERT OR IGNORE INTO test (time) VALUES (?);
UPDATE test SET count = count + 1 WHERE time = ?;
Peaked at ~109k updates per second.
REPLACE INTO test (time, count) VALUES
(?, coalesce((SELECT count FROM test WHERE time = ? LIMIT 1) + 1, 1));
Peaked at ~120k updates per second.
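A sketch of the kind of prepared-statement loop such a test would use (not the exact harness; it assumes integer time values, error handling is omitted, and ?1 is bound once and reused for both placeholders):
#include <sqlite3.h>

void bump_counters(sqlite3 *db, const sqlite3_int64 *times, int n)
{
    sqlite3_stmt *stmt;
    sqlite3_prepare_v2(db,
        "REPLACE INTO test (time, count) VALUES "
        "(?1, coalesce((SELECT count FROM test WHERE time = ?1 LIMIT 1) + 1, 1));",
        -1, &stmt, 0);
    sqlite3_exec(db, "BEGIN", 0, 0, 0);
    for (int i = 0; i < n; i++) {
        sqlite3_bind_int64(stmt, 1, times[i]);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);   /* reuse the statement for the next row */
    }
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
    sqlite3_finalize(stmt);
}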
I also tried PRAGMA temp_store = FILE; and the updates dropped by ~1-2k per second.
For 7M updates in a transaction, journal_mode=WAL is slower than all the others.
I populated a database with 35,839,987 records; now my setup takes nearly 4 seconds per batch of 65521 updates, yet it doesn't even reach 16 MB of memory consumption.
Ok, here's another one:
Indexes on INTEGER PRIMARY KEY columns (don't do it)
When you create a column with INTEGER PRIMARY KEY, SQLite uses this column as the key for (index to) the table structure. This is a hidden index (as it isn't displayed in the sqlite_master table) on this column. Adding another index on the column is not needed and will never be used; in addition, it will slow down INSERT, DELETE and UPDATE operations.
You seem to be defining your PK as NOT NULL + UNIQUE; a PK is implicitly UNIQUE.
In the spirit of brainstorming I will venture an answer. I have not done any testing like this fellow:
Improve INSERT-per-second performance of SQLite?
My hypothesis is that the index on the text primary key might be more RAM-intensive than a couple of indexes on two integer columns (which is what you'd need to simulate a hashed table).
EDIT: Actually, you don't even need a primary key for this:
create table foo( slot integer, myval text, occurrences int);
create index ix_foo on foo(slot); -- not a unique index
An integer primary key (or a non-unique index on slot) would leave you with no quick way to determine if your text value were already on file. So to address that requirement, you might try implementing something I suggested to another poster, simulating a hashed-key:
SQLite Optimization for Millions of Entries?
A hash-key-function would allow you to determine where the text-value would be stored if it did exist.
http://www.cs.princeton.edu/courses/archive/fall08/cos521/hash.pdf
http://www.fearme.com/misc/alg/node28.html
http://cs.mwsu.edu/~griffin/courses/2133/downloads/Spring11/p677-pearson.pdf
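As a sketch, the slot could be computed with any stable string hash; FNV-1a is used here, and NUM_SLOTS is a made-up bucket count:
#include <stdint.h>

#define NUM_SLOTS 1000003  /* hypothetical bucket count (a prime) */

int slot_for(const char *text)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a 64-bit offset basis */
    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        h ^= *p;
        h *= 1099511628211ULL;              /* FNV-1a 64-bit prime */
    }
    return (int)(h % NUM_SLOTS);
}
A lookup then becomes SELECT ... FROM foo WHERE slot = ? AND myval = ?; which only has to scan the rows that hash to the same slot.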
Assuming that all the operations in one transaction are distributed all over the table so that all pages of the table need to be accessed, the size of the working set is the table's data, plus the index on the main column, plus a copy of all the table's pages changed in the transaction.
You could try to reduce the amount of data that gets changed for each operation by moving the count column into a separate table:
CREATE TABLE main_lookup(main TEXT NOT NULL UNIQUE, rowid INTEGER PRIMARY KEY);
CREATE TABLE counters(rowid INTEGER PRIMARY KEY, count INTEGER DEFAULT 0);
Then, for each operation:
SELECT rowid FROM main_lookup WHERE main = @SEQ;
if not exists:
INSERT INTO main_lookup(main) VALUES(@SEQ);
--read the inserted rowid
INSERT INTO counters VALUES(@rowid, 0);
UPDATE counters SET count=count+1 WHERE rowid = @rowid;
In C, the inserted rowid is read with sqlite3_last_insert_rowid().
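Putting the whole flow together, a sketch in C (in practice the statements would be prepared once and reused; error handling omitted):
#include <sqlite3.h>

void count_one(sqlite3 *db, const char *seq)
{
    sqlite3_stmt *stmt;
    sqlite3_int64 rowid = -1;

    /* look up the sequence */
    sqlite3_prepare_v2(db, "SELECT rowid FROM main_lookup WHERE main = ?;", -1, &stmt, 0);
    sqlite3_bind_text(stmt, 1, seq, -1, SQLITE_STATIC);
    if (sqlite3_step(stmt) == SQLITE_ROW)
        rowid = sqlite3_column_int64(stmt, 0);
    sqlite3_finalize(stmt);

    if (rowid < 0) {  /* not found: create the lookup row and its counter */
        sqlite3_prepare_v2(db, "INSERT INTO main_lookup(main) VALUES(?);", -1, &stmt, 0);
        sqlite3_bind_text(stmt, 1, seq, -1, SQLITE_STATIC);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);
        rowid = sqlite3_last_insert_rowid(db);  /* the freshly inserted rowid */

        sqlite3_prepare_v2(db, "INSERT INTO counters VALUES(?, 0);", -1, &stmt, 0);
        sqlite3_bind_int64(stmt, 1, rowid);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);
    }

    sqlite3_prepare_v2(db, "UPDATE counters SET count = count + 1 WHERE rowid = ?;", -1, &stmt, 0);
    sqlite3_bind_int64(stmt, 1, rowid);
    sqlite3_step(stmt);
    sqlite3_finalize(stmt);
}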
Doing a separate SELECT and INSERT is not any slower than INSERT OR IGNORE; SQLite does the same work in either case.
This optimization is useful only if most operations update a counter that already exists.