Ideas to avoid hitting the memory limit when using dbWriteTable to save an R data frame inside a SQLite database


Question


A data frame that is small enough to be loaded into R can still occasionally hit the memory-limit ceiling during a dbWriteTable call if it sits near the maximum amount of RAM available. Is there any better solution than writing the table to the database in chunks, as in the code below?

I'm trying to write code that will work on older computers, so I'm using the Windows 32-bit version of R to re-create these memory errors.
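(For reference, on Windows builds of R prior to 4.2, memory.limit() reports the per-session memory ceiling in megabytes and can request a higher one; it is Windows-only and was removed in R 4.2.)

# windows-only, and defunct in R >= 4.2
memory.limit()                 # current per-session ceiling, in MB
memory.limit( size = 3000 )    # request a higher ceiling (up to what the OS allows)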

# this example will only work on a computer with at least 3GB of RAM
# because it intentionally maxes out the 32-bit limit

# create a data frame that barely fits inside 32-bit R's memory capacity
x <- mtcars[ rep( seq( nrow( mtcars ) ) , 400000 ) , ]

# check how many records this table contains
nrow( x )

# create a connection to a SQLite database stored on disk
# (rather than in memory)
library( RSQLite )
tf <- tempfile()
db <- dbConnect( SQLite() , tf )


# storing `x` in the database with dbWriteTable breaks.
# this line causes a memory error
# dbWriteTable( db , 'x' , x )

# but storing it in chunks works!
# number of chunks to split the table into
chunks <- 100

# compute 100 roughly evenly-spaced chunk boundaries across the rows of `x`
starts.stops <- floor( seq( 1 , nrow( x ) , length.out = chunks ) )


for ( i in 2:length( starts.stops ) ){

    if ( i == 2 ){
        # the first chunk includes its starting boundary (row 1)
        rows.to.add <- ( starts.stops[ i - 1 ] ):( starts.stops[ i ] )
    } else {
        # later chunks start one row past the previous boundary,
        # so no row gets written twice
        rows.to.add <- ( starts.stops[ i - 1 ] + 1 ):( starts.stops[ i ] )
    }

    # storing `x` in the database with dbWriteTable in chunks works.
    dbWriteTable( db , 'x' , x[ rows.to.add , ] , append = TRUE )
}


# and it's the correct number of lines.
dbGetQuery( db , "select count(*) from x" )
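
One refinement on the loop above, sketched as a hypothetical helper (the function name and chunk.size default are my own): wrap all of the chunked inserts in a single transaction with DBI's dbBegin()/dbCommit(), so SQLite commits once instead of once per chunk, and a failure part-way through leaves the database unchanged.

library( DBI )

# hypothetical helper: write `value` into table `name` in fixed-size pieces,
# all inside one transaction
dbWriteTableChunked <- function( conn , name , value , chunk.size = 100000 ){

    # the starting row of each chunk
    starts <- seq( 1 , nrow( value ) , by = chunk.size )

    dbBegin( conn )

    tryCatch( {

        for ( s in starts ){
            rows <- s:min( s + chunk.size - 1 , nrow( value ) )
            dbWriteTable( conn , name , value[ rows , ] , append = TRUE )
        }

        dbCommit( conn )

    } , error = function( e ){
        # undo the partial write, then re-throw the error
        dbRollback( conn )
        stop( e )
    } )

    invisible( TRUE )
}

# usage, same end result as the loop above:
# dbWriteTableChunked( db , 'x' , x )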

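And if the table originates in a text file anyway, a sketch of an approach that sidesteps the problem entirely by never loading the full table into RAM: read the file off an open connection in chunks and append each chunk as it arrives. Here 'big.csv' is a hypothetical on-disk copy of the data, and `db` is the connection opened above.

# stream a hypothetical 'big.csv' into the database, one chunk at a time
chunk.size <- 100000
infile <- file( 'big.csv' , open = 'r' )

# the first read consumes the header row and fixes the column names
# (in real use, also pass colClasses= so types stay consistent across chunks)
chunk <- read.csv( infile , nrows = chunk.size )
col.names <- names( chunk )

while ( nrow( chunk ) > 0 ){

    dbWriteTable( db , 'x2' , chunk , append = TRUE )

    # read.csv() errors once the connection is exhausted,
    # so treat that as an empty final chunk
    chunk <- tryCatch(
        read.csv( infile , nrows = chunk.size , header = FALSE , col.names = col.names ) ,
        error = function( e ) chunk[ 0 , ]
    )
}

close( infile )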
Source: https://stackoverflow.com/questions/14246683/ideas-to-avoid-hitting-memory-limit-when-using-dbwritetable-to-save-an-r-data-ta
