In Python, how can I load an SQLite database completely into memory before connecting to it?
Solution 1
apsw is an alternative Python wrapper for SQLite that lets you back up an on-disk database into memory before running any operations on it.
From the docs:
###
### Backup to memory
###
# We will copy the disk database into a memory database
memcon = apsw.Connection(":memory:")
# Copy into memory
with memcon.backup("main", connection, "main") as backup:
    backup.step()  # copy whole database in one go
# There will be no disk accesses for this query
for row in memcon.cursor().execute("select * from s"):
    pass
connection above is your on-disk db.
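If you would rather stay with the standard library, the sqlite3 module has offered a similar Connection.backup method since Python 3.7. The following is a minimal sketch of the same idea under that assumption; the function name load_into_memory is made up for illustration and is not part of any library.

```python
import sqlite3


def load_into_memory(path):
    """Copy the on-disk database at `path` into a new in-memory database.

    Hypothetical helper for illustration; requires Python 3.7+ for
    sqlite3.Connection.backup.
    """
    source = sqlite3.connect(path)        # the on-disk database
    memory = sqlite3.connect(":memory:")  # the in-memory target
    try:
        # With the default pages=-1 the whole database is copied in one step.
        source.backup(memory)
    finally:
        source.close()
    return memory
```

The returned object is an ordinary sqlite3 connection, so features such as row_factory remain available on it, for example memory.row_factory = sqlite3.Row.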
Solution 2
- Get an in-memory database running (standard stuff)
- Attach the disk database (file).
- Recreate tables / indexes and copy over contents.
- Detach the disk database (file)
Here's an example in Tcl (taken from elsewhere), which may be useful for getting the general idea:
proc loadDB {dbhandle filename} {
    if {$filename != ""} {
        #attach persistent DB to target DB
        $dbhandle eval "ATTACH DATABASE '$filename' AS loadfrom"
        #copy each table to the target DB
        foreach {tablename} [$dbhandle eval "SELECT name FROM loadfrom.sqlite_master WHERE type = 'table'"] {
            $dbhandle eval "CREATE TABLE '$tablename' AS SELECT * FROM loadfrom.'$tablename'"
        }
        #create indexes in loaded table
        foreach {sql_exp} [$dbhandle eval "SELECT sql FROM loadfrom.sqlite_master WHERE type = 'index'"] {
            $dbhandle eval $sql_exp
        }
        #detach the source DB
        $dbhandle eval {DETACH loadfrom}
    }
}
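The attach, copy, and detach steps can also be sketched in Python with the standard sqlite3 module. This is a rough translation, not the original author's code: the function name load_db is hypothetical, and unlike the Tcl version it replays each table's stored CREATE statement (so column types and constraints survive) instead of using CREATE TABLE ... AS SELECT. It skips SQLite's internal tables and automatically created indexes, and it does not copy views or triggers.

```python
import sqlite3


def load_db(mem_conn, filename):
    """Copy all tables and indexes from the file `filename` into mem_conn.

    Hypothetical helper; assumes mem_conn has no open transaction.
    """
    # Attach the persistent database to the target connection
    mem_conn.execute("ATTACH DATABASE ? AS loadfrom", (filename,))

    # Recreate each user table from its stored CREATE statement, then copy rows
    tables = mem_conn.execute(
        "SELECT name, sql FROM loadfrom.sqlite_master "
        "WHERE type = 'table' AND substr(name, 1, 7) != 'sqlite_'"
    ).fetchall()
    for name, create_sql in tables:
        quoted = '"' + name.replace('"', '""') + '"'
        mem_conn.execute(create_sql)
        mem_conn.execute(
            f"INSERT INTO main.{quoted} SELECT * FROM loadfrom.{quoted}"
        )

    # Recreate explicit indexes (automatic indexes have no stored SQL)
    indexes = mem_conn.execute(
        "SELECT sql FROM loadfrom.sqlite_master "
        "WHERE type = 'index' AND sql IS NOT NULL"
    ).fetchall()
    for (index_sql,) in indexes:
        mem_conn.execute(index_sql)

    # Commit first: DETACH fails while a transaction is still open
    mem_conn.commit()
    mem_conn.execute("DETACH DATABASE loadfrom")
```

A call would look like load_db(sqlite3.connect(":memory:"), "file.db"), where file.db stands in for your own database file.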
Solution 3
If you are using Linux, you can try tmpfs, which is a memory-based file system.
It's very easy to use:
- Mount tmpfs to a directory.
- Copy the sqlite db file to that directory.
- Open it as a normal sqlite db file.
Remember that anything in tmpfs is lost on reboot, so copy the db file back to disk if it has changed.
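The copy-in and copy-back steps might look like the following in Python. This is only a sketch: the function names are hypothetical, and tmpfs_dir is assumed to be a directory that is already mounted as tmpfs (on many Linux systems /dev/shm is one, but check your own setup). The code itself does not verify that the directory is memory-backed.

```python
import shutil
import sqlite3
from pathlib import Path


def open_from_tmpfs(db_path, tmpfs_dir):
    """Copy db_path into tmpfs_dir and open the copy.

    Returns the connection and the path of the copy.
    """
    target = Path(tmpfs_dir) / Path(db_path).name
    shutil.copy2(db_path, target)
    return sqlite3.connect(str(target)), target


def save_back(conn, target, db_path):
    """Commit, close, and copy the tmpfs copy back over the on-disk file."""
    conn.commit()
    conn.close()
    shutil.copy2(target, db_path)
```

Calling save_back is only needed if the database was modified; for read-only workloads the copy in tmpfs can simply be discarded.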
Solution 4
Note that you may not need to explicitly load the database into SQLite's memory at all. Simply prime your operating system's disk cache by copying the file to the null device.
Windows: copy file.db nul:
Unix/Mac: cp file.db /dev/null
This has the advantage of the operating system taking care of memory management, especially discarding it if something more important comes along.
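The same priming can be attempted from inside Python by reading the file once and throwing the data away. A minimal sketch, with a hypothetical function name; whether the operating system actually keeps the pages cached afterwards is up to the OS and is not guaranteed.

```python
def prime_cache(path, chunk_size=1024 * 1024):
    """Read the whole file once, discarding the data.

    Returns the number of bytes read. The read only gives the OS a
    chance to cache the file; it does not force it to.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

You would call prime_cache("file.db") once before opening the database as usual.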
Comments
-
relima almost 2 years
I have a 100 MB sqlite db file that I would like to load into memory before performing SQL queries. Is it possible to do that in Python?
Thanks
-
relima over 13 years
I like your solution but there is only one problem, I use a lot of row_factory feature of pysqlite; and it seems that apsw does not have this feature.
-
relima over 13 years
This has really solved my problem. My queries are MUCH faster now.
-
relima over 13 years
import apsw
mem_db_loader = apsw.Connection(file_sqlite_db)
connection = apsw.Connection(":memory:")
connection.backup("main", mem_db_loader, "main").step()
cursor = connection.cursor()
-
relima over 13 years
It may be only my computer, but this technique didn't really improve my performance. (Win 7 x64, 8gb ram).
-
Roger Binns over 13 years
It has worked for many other people on the SQLite mailing list in the past especially after a machine has just booted as it primes the file system cache. In your case it is most likely that file didn't end up in the file system cache. (Some copy tools tell the OS to bypass the cache so that they don't throw out existing "good" content in it.)
-
Peter Rust over 12 years
The "nul:" trick didn't work for me on Win7, but a real copy (to temp.db) does. It's a little annoying b/c I have to delete the temp file to prevent taking excessive space on the HD, but it gets the file into the disk cache (makes the 1st query just as fast as subsequent queries).
-
kxr almost 7 years
Since you are in a programming language anyway (Python), you could just dummy-read the whole file before doing work. Any experience with how this cache priming performs vs. the :memory: backup method?