Unexpectedly large Realm file size

北海茫月 2020-12-11 06:31

This question is about using two different ways to insert objects into a Realm. I noticed that the first method is a lot faster, but the size result is huge comparing with t

3 Answers
  • 2020-12-11 06:32

    The large file size when adding all of the objects in a single transaction is due to an unfortunate interaction between Realm's transaction log subsystem and Realm's memory allocation algorithm for large blobs. Realm's memory layout algorithm requires that the file size be at least 8x the size of the largest single blob stored in the Realm file. Transaction log entries, summarizing the modifications made during a single transaction, are stored as blobs within the Realm file.

    When you add 40,000 objects in one transaction, you end up with a single transaction log entry that's around 5MB in size. This means that the file has to be at least 40MB in size in order to store it. (I'm not quite sure how it ends up being nearly twice that size again. It might be that the blob size is rounded up to a power of two somewhere along the line…)
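
    The arithmetic above can be sketched in a few lines of Swift. This is only an illustration of the 8x rule as stated here plus the power-of-two guess; the function names are made up for the example and this is not how Realm's allocator is actually implemented.

```swift
/// Minimum file size implied by the "8x the largest blob" rule described above.
func minimumFileSize(largestBlobBytes: Int) -> Int {
    return 8 * largestBlobBytes
}

/// Smallest power of two that is >= n (for n > 0).
/// Models the guessed rounding only; whether Realm does this is an assumption.
func roundedUpToPowerOfTwo(_ n: Int) -> Int {
    var power = 1
    while power < n { power *= 2 }
    return power
}

let mb = 1_048_576
let logEntryBytes = 5 * mb  // the ~5MB transaction log entry

// Plain 8x rule: 40MB.
print(minimumFileSize(largestBlobBytes: logEntryBytes) / mb)
// 8x rule after rounding the blob up to a power of two (5MB -> 8MB): 64MB.
print(minimumFileSize(largestBlobBytes: roundedUpToPowerOfTwo(logEntryBytes)) / mb)
```

    If the rounding guess were right, it would account for some of the gap above 40MB, though not necessarily all of it.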

    When you add one object in each of 40,000 transactions, you still end up with a single transaction log entry, only this time it's just a hundred or so bytes in size. This happens because when Realm commits a transaction, it first attempts to reclaim unused transaction log entries before allocating space for new entries. Since the Realm file is not open elsewhere, the previous entry can be reclaimed as each new commit is performed.

    realm/realm-core#2343 tracks improving how Realm stores transaction log entries to avoid the significant overallocation you're seeing.

    For now my suggestion would be to split the difference between the two approaches and add groups of objects per write transaction. This will trade off a little performance by increasing the number of commits but will reduce the impact of the memory layout algorithm by reducing the size of the largest transaction log entry you create. From a quick test, committing every 2,000 objects results in a file size of around 4MB, while being significantly quicker than adding each object in a separate write transaction.
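
    A minimal sketch of that batching approach, assuming the objects have already been created in memory. The helper names `batchRanges` and `addInBatches` are hypothetical, and the default batch size of 2,000 is simply the value from the quick test above, not a recommended constant.

```swift
import RealmSwift

/// Splits 0..<count into consecutive ranges of at most `batchSize` elements.
func batchRanges(count: Int, batchSize: Int) -> [Range<Int>] {
    precondition(batchSize > 0, "batchSize must be positive")
    return stride(from: 0, to: count, by: batchSize).map { start in
        start..<min(start + batchSize, count)
    }
}

/// Adds `objects` to `realm` using one write transaction per batch,
/// so no single transaction log entry covers more than `batchSize` objects.
func addInBatches<T: Object>(_ objects: [T], to realm: Realm, batchSize: Int = 2_000) throws {
    for range in batchRanges(count: objects.count, batchSize: batchSize) {
        try realm.write {
            realm.add(objects[range])
        }
    }
}
```

    With 40,000 objects and the default batch size this performs 20 commits instead of 1 or 40,000.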

  • 2020-12-11 06:47

    You should in most cases try to minimize the number of write transactions. A write transaction has a significant overhead, hence if you start a new write transaction for every object you want to add to realm, your code will be significantly slower than if you added all objects using a single write transaction.

    In my experience, the best way to add several elements to realm is to create the elements, add them to an array and then add the array as a whole to Realm using a single write transaction.

    So this is what you should be doing:

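    // Build the objects first; this needs no write transaction.
    var objects = [realmObj]()
    for _ in 1...40000 {
        // "g" takes any Date value; Date() is used here as a stand-in.
        let newRealmObj = realmObj(value: ["id": incrementID(), "a": "123", "b": 12.12, "c": 66, "d": 13.13, "e": 0.6, "f": "01100110", "g": Date(), "h": 3])
        objects.append(newRealmObj)
    }
    // Persist the whole array in a single write transaction.
    try! realm.write {
        realm.add(objects)
    }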
    

    As for the size issue, see the Limitations - File Size part of the Realm documentation. I am not 100% sure on the cause of the issue, but I would say that the issue is caused by writing code inside the write transaction that doesn't need to happen there and shouldn't happen inside the write transaction. I guess due to this, Realm creates a lot of intermediate versions of your objects and since releasing reserved storage capacity is quite an expensive operation, it doesn't happen by the time you are checking the file size.

    Keep in mind that creating objects doesn't need to happen inside a write transaction. You only need a write transaction for modifying persisted data in Realm (which includes adding new objects to Realm, deleting persisted objects and modifying persisted objects directly).

  • 2020-12-11 06:56

    Thanks everyone. I found an optimized way to do the task using your tips: I did the .write in batches instead of sending all the content in a single operation. Here is some data to compare:

    Batch Size (Objects) | File Size (MB)
    ---------------------|---------------
    10,000               | 23.1
    5,000                | 11.5
    2,500                | 5.8
    1,250                | 4.2
    625                  | 3.7
    300                  | 3.7
    100                  | 3.1
    50                   | 3.1
    10                   | 3.4
    5                    | 3.1

    So in my humble opinion, batches of about 1,000 objects give the best size / speed trade-off for this case.

    Here is the code I used for this test. The only thing that changed between runs was the upper bound of the for 1...XXX iteration.

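        let realm = try! Realm(fileURL: banco_url!)
    
        var objects = [realm_obj]()
        var ids = incrementID()
    
        while ids < 40000 {
    
            // Build one batch; the upper bound (5 here) is the batch size under test.
            for _ in 1...5 {
    
                let new_realm_obj = realm_obj(value: ["id": ids,
                                                    "a": "123",
                                                    "b": 12.12,
                                                    "c": 66,
                                                    "d": 13.13,
                                                    "e": 0.6,
                                                    "f": "01100110",
                                                    "g": someDateTime,
                                                    "h": 3])
                objects.append(new_realm_obj)
                ids += 1
            }
    
            // Commit this batch in its own write transaction.
            try! realm.write {
                realm.add(objects)
            }
            // Empty the array so the next write contains only new objects.
            objects.removeAll(keepingCapacity: true)
        }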
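
    For anyone repeating the comparison, here is one possible way to read the file size after each run. It is a sketch using Foundation's FileManager; the helper name `fileSizeInMB` is made up for the example, and treating 1 MB as 1,048,576 bytes is an assumption, since the table above does not say which unit convention was used.

```swift
import Foundation

/// Returns the size of the file at `url` in megabytes (1 MB = 1,048,576 bytes).
func fileSizeInMB(at url: URL) throws -> Double {
    let attributes = try FileManager.default.attributesOfItem(atPath: url.path)
    // FileAttributeKey.size is reported as an NSNumber holding the byte count.
    let bytes = (attributes[.size] as? NSNumber)?.doubleValue ?? 0
    return bytes / 1_048_576
}
```

    It could be called with the same URL that was passed to Realm(fileURL:), for example fileSizeInMB(at: banco_url!), once all batches have been written.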
    