Tuesday, July 15, 2008

SQLite

If your standalone application requires a database, you may want to consider using SQLite instead of flat file or a full-fledge database like MySQL, PostgreSQL, MS-SQL, Oracle, etc. As what the official site mentioned, SQLite is a in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. So you don't need a database administrator to help you to install and configure.

So, who is using it anyway, some well-know users are deploying it in their applications. BTW, it is included in the Python 2.5 distribution. Simply "import sqlite3" in the shell to load in the SQLite module. Also, Solaris 10 is using it to manage their smf - service management facility.

In this blog, I will try to show you what SQLite can offer.

I am going to create a new database and populate it with a few records. After that I will query the database. As you can see, all the commands are typical SQL statements. You can even turn on the timer command to benchmark your SQL query. If you are a shell script guy like me, you can run "sqlite3" as command line to fetch the records.

$ ls

$ sqlite3 newdb.sqlite3
SQLite version 3.5.2
Enter ".help" for instructions
sqlite> .help
.bail ON|OFF           Stop after hitting an error.  Default OFF
.databases             List names and files of attached databases
.dump ?TABLE? ...      Dump the database in an SQL text format
.echo ON|OFF           Turn command echo on or off
.exit                  Exit this program
.explain ON|OFF        Turn output mode suitable for EXPLAIN on or off.
.header(s) ON|OFF      Turn display of headers on or off
.help                  Show this message
.import FILE TABLE     Import data from FILE into TABLE
.indices TABLE         Show names of all indices on TABLE
.mode MODE ?TABLE?     Set output mode where MODE is one of:
                         csv      Comma-separated values
                         column   Left-aligned columns.  (See .width)
                         html     HTML <table> code
                         insert   SQL insert statements for TABLE
                         line     One value per line
                         list     Values delimited by .separator string
                         tabs     Tab-separated values
                         tcl      TCL list elements
.nullvalue STRING      Print STRING in place of NULL values
.output FILENAME       Send output to FILENAME
.output stdout         Send output to the screen
.prompt MAIN CONTINUE  Replace the standard prompts
.quit                  Exit this program
.read FILENAME         Execute SQL in FILENAME
.schema ?TABLE?        Show the CREATE statements
.separator STRING      Change separator used by output mode and .import
.show                  Show the current values for various settings
.tables ?PATTERN?      List names of tables matching a LIKE pattern
.timeout MS            Try opening locked tables for MS milliseconds
.timer ON|OFF          Turn the CPU timer measurement on or off
.width NUM NUM ...     Set column widths for "column" mode

sqlite> CREATE TABLE staff (lastname TEXT, firstname TEXT, age INT);

sqlite> INSERT INTO staff VALUES ('Chan','Chi Hung',45);

sqlite> INSERT INTO staff VALUES ('Chi','Chan Hung',21);

sqlite> INSERT INTO staff VALUES ('Hung','Chi Chan',11);

sqlite> SELECT * FROM staff;
Chan|Chi Hung|45
Chi|Chan Hung|21
Hung|Chi Chan|11

sqlite> .schema
CREATE TABLE staff (lastname text, firstname text, age int);

sqlite> .timer ON

sqlite> SELECT * FROM staff WHERE lastname like 'C%' ORDER BY age ASC;
Chi|Chan Hung|21
Chan|Chi Hung|45
CPU Time: user 0.000225 sys 0.000116

sqlite> .quit

$ ls -l
total 4
-rw-r--r--   1 chihung  Chihung      2048 Jul 15 20:21 newdb.sqlite3

$ sqlite3 newdb.sqlite3 "SELECT * FROM staff where lastname='Chan'"
Chan|Chi Hung|45

If you have been reading my blog, you will realise that I started using SQLite when I try to tackle the Netflix Prize competition. Although I did not make it to the leaderboard, I learned how to make use of SQLite to work with massive dataset (100+ million). If you design your database schema properly with indices, your query (even inner join) can be extremely fast. Below shows you SQLite from Tcl shell. BTW, it has binding to most of the programming languages.

$ ls -lh movie-64bit.db
-r--r--r--   2 chihung  chihung       6.5G Nov 20  2007 movie-64bit.db

$ tclsh
% package require sqlite3
3.5.2

% sqlite3 dbHandler movie-64bit.db -readonly 1

% set movieN [dbHandler eval {SELECT COUNT(*) FROM movie}]
100480507

% set m1m2 [dbHandler eval {
SELECT t1.customer_id FROM movie t1
INNER JOIN movie t2 ON t1.customer_id=t2.customer_id WHERE
t1.movie_id=1 AND t2.movie_id=2}]
305344 387418 515436 636262 1374216 1398626 1664010 1806515 2118461 2439493

% time "dbHandler eval {
SELECT t1.customer_id FROM movie t1
INNER JOIN movie t2 ON t1.customer_id=t2.customer_id WHERE
t1.movie_id=1 AND t2.movie_id=2}"
4688 microseconds per iteration

Solaris 10 is using SQLite, if you are not aware of that.

$ uname -a
SunOS thumper 5.10 Generic_118855-36 i86pc i386 i86pc

$ cat /etc/release
                        Solaris 10 11/06 s10x_u3wos_10 X86
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 14 November 2006


$ /lib/svc/bin/sqlite /lib/svc/seed/global.db
SQLite version 2.8.15-repcached-Generic Patch
Enter ".help" for instructions
sqlite> .schema
CREATE TABLE id_tbl (id_name         STRING NOT NULL,id_next         INTEGER NOT NULL);
CREATE TABLE instance_tbl (instance_id     INTEGER PRIMARY KEY,instance_name   CHAR(256) NOT NULL,instance_svc    INTEGER NOT NULL);
CREATE TABLE pg_tbl (pg_id           INTEGER PRIMARY KEY,pg_parent_id    INTEGER NOT NULL,pg_name         CHAR(256) NOT NULL,pg_type         CHAR(256) NOT NULL,pg_flags        INTEGER NOT NULL,pg_gen_id       INTEGER NOT NULL);
CREATE TABLE prop_lnk_tbl (lnk_prop_id     INTEGER PRIMARY KEY,lnk_pg_id       INTEGER NOT NULL,lnk_gen_id      INTEGER NOT NULL,lnk_prop_name   CHAR(256) NOT NULL,lnk_prop_type   CHAR(2) NOT NULL,lnk_val_id      INTEGER);
CREATE TABLE schema_version (schema_version  INTEGER);
CREATE TABLE service_tbl (svc_id          INTEGER PRIMARY KEY,svc_name        CHAR(256) NOT NULL);
CREATE TABLE snaplevel_lnk_tbl (snaplvl_level_id INTEGER NOT NULL,snaplvl_pg_id    INTEGER NOT NULL,snaplvl_pg_name  CHAR(256) NOT NULL,snaplvl_pg_type  CHAR(256) NOT NULL,snaplvl_pg_flags INTEGER NOT NULL,snaplvl_gen_id   INTEGER NOT NULL);
CREATE TABLE snaplevel_tbl (snap_id                 INTEGER NOT NULL,snap_level_num          INTEGER NOT NULL,snap_level_id           INTEGER NOT NULL,snap_level_service_id   INTEGER NOT NULL,snap_level_service      CHAR(256) NOT NULL,snap_level_instance_id  INTEGER NULL,snap_level_instance     CHAR(256) NULL);
CREATE TABLE snapshot_lnk_tbl (lnk_id          INTEGER PRIMARY KEY,lnk_inst_id     INTEGER NOT NULL,lnk_snap_name   CHAR(256) NOT NULL,lnk_snap_id     INTEGER NOT NULL);
CREATE TABLE value_tbl (value_id        INTEGER NOT NULL,value_type      CHAR(1) NOT NULL,value_value     VARCHAR NOT NULL);
CREATE INDEX id_tbl_id ON id_tbl (id_name);
CREATE INDEX instance_tbl_name ON instance_tbl (instance_svc, instance_name);
CREATE INDEX pg_tbl_name ON pg_tbl (pg_parent_id, pg_name);
CREATE INDEX pg_tbl_parent ON pg_tbl (pg_parent_id);
CREATE INDEX pg_tbl_type ON pg_tbl (pg_parent_id, pg_type);
CREATE INDEX prop_lnk_tbl_base ON prop_lnk_tbl (lnk_pg_id, lnk_gen_id);
CREATE INDEX prop_lnk_tbl_val ON prop_lnk_tbl (lnk_val_id);
CREATE INDEX service_tbl_name ON service_tbl (svc_name);
CREATE INDEX snaplevel_lnk_tbl_id ON snaplevel_lnk_tbl (snaplvl_pg_id);
CREATE INDEX snaplevel_lnk_tbl_level ON snaplevel_lnk_tbl (snaplvl_level_id);
CREATE INDEX snaplevel_tbl_id ON snaplevel_tbl (snap_id);
CREATE INDEX snapshot_lnk_tbl_name ON snapshot_lnk_tbl (lnk_inst_id, lnk_snap_name);
CREATE INDEX snapshot_lnk_tbl_snapid ON snapshot_lnk_tbl (lnk_snap_id);
CREATE INDEX value_tbl_id ON value_tbl (value_id);

sqlite> select * from service_tbl;
2|system/boot-archive
6|system/device/local
10|milestone/devices
14|system/identity
20|system/filesystem/local
24|system/manifest-import
28|system/filesystem/minimal
32|milestone/multi-user
36|milestone/name-services
40|network/initial
44|network/loopback
48|network/physical
52|system/svc/restarter
56|system/filesystem/root
60|milestone/single-user
64|system/filesystem/usr
68|network/rpc/bind
72|system/console-login
76|milestone/multi-user-server
80|network/inetd-upgrade
84|system/utmp
88|system/metainit
92|network/pfil
93|system/sysidtool

sqlite> .q

If you want to know more about SQLite, our National Library carries a copy of The Definitive Guide to SQLite. Alternatively, watch An Introduction to SQLite by the author (Richard Hipp) who delivered the talked at Google TechTalks May 31, 2006.

SQLite is a small C library that implements a self-contained, embeddable, zero-configuration SQL database engine. SQLite implements a large subset of SQL-92 and stores a complete database in a single disk file. The library footprint is less than 250 KB making is suitable for use in embedded devices and applications where memory space is scarce. This talk provides a quick overview of SQLite, its history, its strengths and weaknesses, and describes situations where it is much more useful than a traditional client/server database. The talk concludes with a discussion of the lessons learned from the development of SQLite and how those lessons can be applied to other projects

Labels: ,

0 Comments:

Post a Comment

<< Home