
Small solutions for big data and python shelve concurrency

I am still on the lookout for a good database system: Movable, big, concurrent, fast, flexible and not necessarily requiring root access.

MySQL, good in many respects, lacks flexibility: An ALTER TABLE on a large table can take hours.

MongoDB has a 2GB data size limit on 32-bit systems.

For some reason I thought that SQLite was limited to 2GB on 32-bit systems (where on earth did I get that idea from?). But SQLite can potentially store 140 terabytes; in practice it may be limited by the OS/filesystem. So what is that limit? The ext3 file size limit is from 16GiB to 2TiB depending on block size, says Wikipedia. Apparently my block size is 4KiB (reported with $ sudo /sbin/dumpe2fs /dev/sda7 | grep "Block size"), so if we can trust this online encyclopedia that anyone can edit, it may be that I can have 2TiB SQLite databases.

SQLite still has the ALTER TABLE problem, but my first attempt used SQLite as a key-value store with the values as JSON, as sketched below. Wikipedia also reports that Mr. Hipp is working on the document-oriented UnQLite.
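
To make the key-value idea concrete, here is a minimal sketch of SQLite used as a JSON document store (the table name, the put/get helpers and the filename are just illustrative, not from the actual attempt):

import json
import sqlite3

# Open (or create) a database file; no server process or root access needed.
connection = sqlite3.connect('keyvalue.db')
connection.execute("CREATE TABLE IF NOT EXISTS store "
                   "(key TEXT PRIMARY KEY, value TEXT)")

def put(key, document):
    # Serialize the document to JSON and insert or overwrite the row.
    connection.execute("INSERT OR REPLACE INTO store VALUES (?, ?)",
                       (key, json.dumps(document)))
    connection.commit()

def get(key):
    # Return the JSON-decoded document for a key, or None if absent.
    row = connection.execute("SELECT value FROM store WHERE key = ?",
                             (key,)).fetchone()
    return json.loads(row[0]) if row is not None else None

put('article:1', {'title': 'Big data', 'tags': ['sqlite', 'json']})
print(get('article:1'))

A new 'column' is then just another field in the JSON document, so there is never an ALTER TABLE to wait for.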

I was also considering the Python key-value store 'shelve' and its underlying databases (e.g., bsddb). However, somewhere in the documentation you can read that "The shelve module does not support concurrent read/write access". I was slightly surprised by how wrong it went when I executed the code below.

https://gist.github.com/3904327
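
The gist holds the actual code; in outline, the failure can be provoked with something like the following sketch, where one process writes to a shelve file while another reads from it (the filename and loop counts are arbitrary):

import shelve
from multiprocessing import Process

FILENAME = 'test.shelve'

def writer():
    # Continuously write key-value pairs to the shelve file.
    db = shelve.open(FILENAME)
    for n in range(100000):
        db[str(n)] = {'number': n}
    db.close()

def reader():
    # Read from the same file at the same time. shelve does not support
    # concurrent read/write access, so expect errors from the underlying
    # dbm module or a corrupted database file.
    db = shelve.open(FILENAME)
    for n in range(100000):
        db.get(str(n))
    db.close()

if __name__ == '__main__':
    shelve.open(FILENAME).close()   # make sure the file exists first
    processes = [Process(target=writer), Process(target=reader)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

Depending on the underlying database module, the result ranges from exceptions to a silently corrupted file.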
