Specification of QDBM Version 1

Copyright (C) 2000-2003 Mikio Hirabayashi
Last Update: Wed, 09 Apr 2003 21:59:41 +0900

Table of Contents

  1. Overview
  2. Features
  3. Installation
  4. Depot: Basic API
  5. Commands for Depot
  6. Curia: Extended API
  7. Commands for Curia
  8. Relic: NDBM-compatible API
  9. Commands for Relic
  10. Hovel: GDBM-compatible API
  11. Commands for Hovel
  12. File Format
  13. Bugs
  14. FAQ
  15. Copying

Overview

QDBM is a library of routines for managing a database. The database is a simple data file containing records, each of which is a pair of a key and a value. Every key and value is a sequence of bytes of variable length. Both binary data and character strings can be used as keys and values. There is no concept of data tables or data types. Each key must be unique within a database, so it is impossible to store two or more records with the same key.

The following access methods are provided for the database: storing a record with a key and a value, deleting a record by its key, and retrieving a record by its key. In addition, traversal access to every key is provided, although the order is arbitrary. These access methods are similar to those of the DBM library (and its compatibles, NDBM and GDBM) defined in the UNIX standard. QDBM is an alternative to DBM that offers higher performance.


Features

QDBM was developed with reference to GDBM, with three goals: higher processing speed, smaller database files, and a simpler API. All three have been achieved. Moreover, three restrictions of traditional DBM have been removed: that a process can handle only one database, that the size of a key and a value is bounded, and that a database file is sparse.

QDBM uses a hash algorithm to retrieve records. If the bucket array has a sufficient number of elements, the time complexity of retrieval is `O(1)'. That is, the time required to retrieve a record is constant, regardless of the scale of the database. The same holds for storing and deleting. Collisions of hash values are managed by separate chaining. The data structure of each chain is a binary search tree. Even if the bucket array has unusually few elements, the time complexity of retrieval is `O(log n)'.

QDBM improves retrieval speed by holding the whole bucket array in RAM. If the bucket array is in RAM, the region of a target record can be reached with about one file operation. The bucket array saved in a file is not read into RAM with the `read' call but mapped directly into RAM with the `mmap' call. Therefore, the preparation time for connecting to a database is very short, and two or more processes can share the same memory map.

If the number of elements of the bucket array is about half the number of records stored in a database, the probability of collision of hash values is about 56.7%, although it depends on the characteristics of the input (36.8% if the numbers are equal, 21.3% if the bucket array is twice as large, 11.5% if four times, 6.0% if eight times). In such a case, a record can be retrieved with two or fewer file operations. Taking this as a performance index, handling a database containing one million records requires a bucket array with half a million elements. The size of each element is 4 bytes. That is, if 2M bytes of RAM is available, a database containing one million records can be handled.

QDBM provides two kinds of connection to a database. A `reader' can retrieve records but can neither store nor delete them. A `writer' can use all access methods. Exclusion control between processes is performed by file locking when connecting to a database. While a writer is connected to a database, neither readers nor other writers can connect. While a reader is connected to a database, other readers can connect, but writers cannot. This mechanism guarantees consistency of simultaneous connections to a database in a multitasking environment.

Traditional DBM provides two modes of the storing operation, `insert' and `replace'. When a key matches an existing record, the insert mode keeps the existing value, while the replace mode replaces it with the specified value. In addition to these two modes, QDBM provides a `concatenate' mode. In this mode, the specified value is concatenated at the end of the existing value and stored. This feature is useful when adding an element to a value treated as an array. Moreover, whereas DBM can fetch a value from a database only by reading the whole region of a record, QDBM can also fetch a part of the region of a value. This feature is also useful when a value is treated as an array.

If data alignment is assigned to a database, each record is placed in the file with suitable padding bytes. When a value is overwritten with one larger than the existing value, the region of the existing record must be moved to another position in the file. However, if the increase in size fits within the padding of the existing record, moving the region is not necessary. The time complexity of moving a region depends on the size of the region. Nevertheless, the average performance of updating a database is robust against the size of records, because the probability of moving is kept low by enlarging the padding according to the size of records. Generally speaking, as updates continue, available regions become fragmented and the size of the database grows rapidly. QDBM deals with this problem by coalescing dispensable regions and reusing them.

QDBM provides four kinds of APIs: the basic API, the extended API, the NDBM-compatible API, and the GDBM-compatible API. In the basic API, a database is treated as a file. In the extended API, a database is treated as a directory containing one or more database files. If all data of a database is stored in one file, the file size may exceed the limit of the file system. In the extended API, a database can be divided into multiple files in a directory. Because the basic API and the extended API resemble each other, it is easy to port an application from one to the other. The NDBM-compatible API and the GDBM-compatible API are for porting applications from NDBM and GDBM to QDBM. Besides, QDBM has command line interfaces corresponding to the four APIs. They are useful for unit tests, debugging applications and so on. They also support prototyping of database applications with scripting languages.

APIs for C++ and Java are also provided. The C++ API encapsulates the database handling functions of the basic API and the extended API with the class mechanism of C++. The Java API has native methods that call the basic API and the extended API through the Java Native Interface. Both APIs are thread-safe.


Installation

To install QDBM from a source package, GCC, GNU make and GNU Binutils are required. That is, such commands as `gcc', `make', `cpp', `cc1', `as', `ld' and so on must appear on the command search path, ahead of any namesakes.

After extracting the archive file of QDBM, change the current working directory to the generated directory and perform installation.

Run the configuration script.

./configure

Build programs.

make

Perform self-diagnostic test.

make check

Install programs. This operation must be carried out by the root user.

make install

When the series of steps finishes, the header files `depot.h', `curia.h', `relic.h' and `hovel.h' will be installed in `/usr/local/include', the libraries `libqdbm.a', `libqdbm.so' and so on will be installed in `/usr/local/lib', and the executable commands `dpmgr', `dptest', `dptsv', `crmgr', `crtest', `rlmgr', `rltest', `hvmgr' and `hvtest' will be installed in `/usr/local/bin'.

To uninstall QDBM, execute the following command after `./configure'. This operation must be carried out by the root user.

make uninstall

If an old version of QDBM is installed on your system, uninstall it before installation of a new one.

The C++ API and the Java API are not installed by default. Refer to `plus/xspex.html' for how to install the C++ API. Refer to `java/jspex.html' for how to install the Java API.

To install QDBM from a binary package such as RPM, refer to the manual of the package manager. For example, if you use RPM, execute a command like the following as the root user.

rpm -ivh qdbm-1.x.x-x.i386.rpm

On Windows (Cygwin), you should follow the procedures below for installation.

Run the configuration script.

./configure

Build programs.

make win

Perform self-diagnostic test.

make check

Install programs. Likewise, run `make uninstall-win' to uninstall them.

make install-win

On Windows, the import library `libqdbm.dll.a' is created instead of the static library `libqdbm.a', and the dynamic linking library `qdbm.dll' is created instead of such shared libraries as `libqdbm.so'. `qdbm.dll' is installed into a system directory such as `C:\WINNT\SYSTEM32'.


Depot: Basic API

Depot is the basic API of QDBM. All features for managing a database provided by QDBM are implemented by Depot. The other APIs are no more than wrappers of Depot. Depot is the fastest of all APIs of QDBM.

In order to use Depot, you should include `depot.h' and `stdlib.h' in the source files. Usually, the following description will be near the beginning of a source file.

#include <depot.h>
#include <stdlib.h>

A pointer to `DEPOT' is used as a database handle, much as some file I/O routines of `stdio.h' use a pointer to `FILE'. A database handle is opened with the function `dpopen' and closed with `dpclose'. You should not refer directly to any member of the handle. If a fatal error occurs in a database, any access method via the handle except `dpclose' will not work and will return an error status. Although a process is allowed to use multiple database handles at the same time, multiple handles of the same database file should not be used at the same time.

The external variable `dpversion' is the string containing the version information.

extern const char *dpversion;

The external variable `dpdbgfd' is a file descriptor to output debugging information.

extern int dpdbgfd;
The initial value of this variable is -1. If the value is negative, debugging output is not performed.

The external variable `dpecode' is assigned the code of the last error that occurred. Refer to `depot.h' for details of the error codes.

extern int dpecode;
The initial value of this variable is `DP_NOERROR'.

The function `dperrmsg' is used in order to get a message string corresponding to an error code.

const char *dperrmsg(int ecode);
`ecode' specifies an error code. The return value is the message string of the error code. The region of the return value is not writable.

The function `dpopen' is used in order to get a database handle.

DEPOT *dpopen(const char *name, int omode, int bnum);
`name' specifies the name of a database file. `omode' specifies the connection mode: `DP_OWRITER' as a writer, `DP_OREADER' as a reader. If the mode is `DP_OWRITER', the following may be added by bitwise or: `DP_OCREAT', which means a new database is created if none exists, and `DP_OTRUNC', which means a new database is created regardless of whether one exists. `bnum' specifies the number of elements of the bucket array. If it is not more than 0, the default value is used. The size of the bucket array is determined when the database is created and cannot be changed except by optimization of the database. The suggested size of the bucket array is about 0.5 to 4 times the number of all records to be stored. The return value is the database handle, or `NULL' if it is not successful. While connecting as a writer, an exclusive lock is applied to the database file. While connecting as a reader, a shared lock is applied to the database file. The thread blocks until the lock is acquired.

The function `dpclose' is used in order to close a database handle.

int dpclose(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is true, else, it is false. Because the region of a closed handle is released, the handle can no longer be used. Updates to a database are assured to be written when the handle is closed. If a writer opens a database but does not close it properly, the database will be broken.

The function `dpput' is used in order to store a record.

int dpput(DEPOT *depot, const char *kbuf, int ksiz, const char *vbuf, int vsiz, int dmode);
`depot' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `vbuf' specifies the pointer to the region of a value. `vsiz' specifies the size of the region of the value. If it is negative, the size is assigned with `strlen(vbuf)'. `dmode' specifies behavior when the key overlaps, by the following values: `DP_DOVER', which means the specified value overwrites the existing one, `DP_DKEEP', which means the existing value is kept, `DP_DCAT', which means the specified value is concatenated at the end of the existing value. If successful, the return value is true, else, it is false.

The function `dpout' is used in order to delete a record.

int dpout(DEPOT *depot, const char *kbuf, int ksiz);
`depot' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is true, else, it is false. False is returned when no record corresponds to the specified key.

The function `dpget' is used in order to retrieve a record.

char *dpget(DEPOT *depot, const char *kbuf, int ksiz, int start, int max, int *sp);
`depot' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `start' specifies the offset of the beginning of the region of the value to be read. `max' specifies the maximum size to be read. If it is negative, the size to read is unlimited. `sp' specifies a pointer to the variable to which the size of the region of the return value is assigned. If it is `NULL', it is not used. If successful, the return value is the pointer to the region of the value of the corresponding record, else, it is `NULL'. `NULL' is returned when no record corresponds to the specified key or the size of the value of the corresponding record is less than `start'. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use.

The function `dpvsiz' is used in order to get the size of the value of a record.

int dpvsiz(DEPOT *depot, const char *kbuf, int ksiz);
`depot' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is the size of the value of the corresponding record, else, it is -1. Because this function does not read the entity of a record, it is faster than `dpget'.

The function `dpiterinit' is used in order to initialize the iterator of a database handle.

int dpiterinit(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is true, else, it is false. The iterator is used in order to access the key of every record stored in a database.

The function `dpiternext' is used in order to get the next key of the iterator.

char *dpiternext(DEPOT *depot, int *sp);
`depot' specifies a database handle. `sp' specifies a pointer to the variable to which the size of the region of the return value is assigned. If it is `NULL', it is not used. If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no more records remain in the iterator. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call when it is no longer in use. It is possible to access every record by calling this function repeatedly. However, the result is not assured if the database is updated during the iteration. Besides, the order of this traversal access is arbitrary, so it is not assured that the order of traversal matches the order of storing.

The function `dpsetalign' is used in order to set alignment of a database handle.

int dpsetalign(DEPOT *depot, int align);
`depot' specifies a database handle connected as a writer. `align' specifies the basic size of alignment. If successful, the return value is true, else, it is false. If alignment is set for a database, the efficiency of overwriting values is improved. The suggested basic size of alignment is the average size of the values of the records to be stored. When a record is stored, the size of the region reserved for its value is determined as a multiple of the alignment size. The alignment size is determined as a multiple of the basic size of alignment. Because the alignment setting is not saved in the database, you should specify alignment every time you open the database.

The function `dpsync' is used in order to synchronize contents of updating a database with the file and the device.

int dpsync(DEPOT *depot);
`depot' specifies a database handle connected as a writer. If successful, the return value is true, else, it is false. This function is useful when another process uses the connected database file.

The function `dpoptimize' is used in order to optimize a database.

int dpoptimize(DEPOT *depot, int bnum);
`depot' specifies a database handle connected as a writer. `bnum' specifies the number of the elements of the bucket array. If it is not more than 0, the default value is specified. If successful, the return value is true, else, it is false. In an alternating succession of deleting and storing with overwrite or concatenate, dispensable regions accumulate. This function is useful to do away with them.

The function `dpname' is used in order to get the name of a database.

char *dpname(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is the pointer to the region of the name of the database, else, it is `NULL'. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `dpfsiz' is used in order to get the size of a database file.

int dpfsiz(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is the size of the database file, else, it is -1.

The function `dpbnum' is used in order to get the number of the elements of the bucket array.

int dpbnum(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is the number of the elements of the bucket array, else, it is -1.

The function `dpbusenum' is used in order to get the number of the used elements of the bucket array.

int dpbusenum(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is the number of the used elements of the bucket array, else, it is -1. This function is inefficient because it accesses all elements of the bucket array.

The function `dprnum' is used in order to get the number of the records stored in a database.

int dprnum(DEPOT *depot);
`depot' specifies a database handle. If successful, the return value is the number of the records stored in the database, else, it is -1.

The function `dpwritable' is used in order to check whether a database handle is a writer or not.

int dpwritable(DEPOT *depot);
`depot' specifies a database handle. The return value is true if the handle is a writer, false if not.

The function `dpfatalerror' is used in order to check whether a database has a fatal error or not.

int dpfatalerror(DEPOT *depot);
`depot' specifies a database handle. The return value is true if the database has a fatal error, false if not.

The function `dpinode' is used in order to get the inode number of a database file.

int dpinode(DEPOT *depot);
`depot' specifies a database handle. The return value is the inode number of the database file.

The function `dpfdesc' is used in order to get the file descriptor of a database file.

int dpfdesc(DEPOT *depot);
`depot' specifies a database handle. The return value is the file descriptor of the database file. Handling the file descriptor of a database file directly is not suggested.

The function `dpremove' is used in order to remove a database file.

int dpremove(const char *name);
`name' specifies the name of a database file. If successful, the return value is true, else, it is false.

The function `dpinnerhash' is a hash function used inside of Depot.

int dpinnerhash(const char *kbuf, int ksiz);
`kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. The return value is the hash value of 31 bits length computed from the key. This function is useful when an application calculates the state of the inside bucket array.

The function `dpouterhash' is a hash function which is independent from the hash functions used inside of Depot.

int dpouterhash(const char *kbuf, int ksiz);
`kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. The return value is the hash value of 31 bits length computed from the key. This function is useful when an application uses its own hash algorithm outside of Depot.

The function `dpprimenum' is used in order to get a prime number not less than a number.

int dpprimenum(int num);
`num' specifies a positive number. The return value is a prime number not less than the specified number. This function is useful when an application determines the size of a bucket array of its own hash algorithm.

The following example stores and retrieves a phone number, using the name as the key.

#include <depot.h>
#include <stdlib.h>
#include <stdio.h>

#define NAME     "mikio"
#define NUMBER   "000-1234-5678"
#define DBNAME   "book"

int main(int argc, char **argv){
  DEPOT *depot;
  char *val;

  /* open the database */
  if(!(depot = dpopen(DBNAME, DP_OWRITER | DP_OCREAT, -1))){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
    return 1;
  }

  /* store the record */
  if(!dpput(depot, NAME, -1, NUMBER, -1, DP_DOVER)){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
  }

  /* retrieve the record */
  if(!(val = dpget(depot, NAME, -1, 0, -1, NULL))){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
  } else {
    printf("Name: %s\n", NAME);
    printf("Number: %s\n", val);
    free(val);
  }

  /* close the database */
  if(!dpclose(depot)){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
    return 1;
  }

  return 0;
}

For building a program using Depot, the program should be linked with a library file `libqdbm.a' or `libqdbm.so'. For example, the following command is executed to build `sample' from `sample.c'.

gcc -I/usr/local/include -o sample sample.c -L/usr/local/lib -lqdbm

Though each function of Depot is not reentrant, it does not use any static object internally. So, it can be used as a thread-safe function if each calling and reference to the external variable `dpecode' are under exclusion control, on the assumption that `errno', `malloc' and so on are thread-safe.


Commands for Depot

Depot has the following command line interfaces.

The command `dpmgr' is a utility for debugging Depot and its applications. It features editing and checking of a database. It can be used for database applications with shell scripts. This command is used in the following format. `name' specifies a database name. `key' specifies the key of a record. `val' specifies the value of a record.

Create a database file.
dpmgr create [-v] [-bnum num] name
Store a record with a key and a value.
dpmgr put [-v] [-kx] [-vx] [-vf] [-keep] [-cat] [-na] name key val
Delete a record with a key.
dpmgr out [-v] [-kx] name key
Retrieve a record with a key and output it to the standard output.
dpmgr get [-v] [-kx] [-start num] [-max num] [-ox] [-n] name key
List all keys delimited with line-feed to the standard output.
dpmgr list [-v] [-ox] name
Optimize a database.
dpmgr optimize [-v] [-bnum num] [-na] name
Output miscellaneous information to the standard output.
dpmgr inform [-v] name
Remove a database file.
dpmgr remove [-v] name
Output version information of QDBM to the standard output.
dpmgr version
Options feature the following.
-v : output debug information.
-bnum num : specifies the number of the elements of the bucket array.
-kx : treat `key' as a binary expression of hexadecimal notation.
-vx : treat `val' as a binary expression of hexadecimal notation.
-vf : read the value from a file specified with `val'.
-keep : specify the storing mode for `DP_DKEEP'.
-cat : specify the storing mode for `DP_DCAT'.
-na : do not set alignment.
-start : specify the beginning offset of a value to fetch.
-max : specify the max size of a value to fetch.
-ox : treat the output as a binary expression of hexadecimal notation.
-n : do not output the trailing newline.

This command returns 0 on success, another value on failure.

The command `dptest' is a utility for facility test and performance test. Check a database generated by the command or measure the execution time of the command. This command is used in the following format. `name' specifies a database name. `rnum' specifies the number of the records. `bnum' specifies the number of the elements of the bucket array.

Store records with keys of 8 bytes. They change as `00000001', `00000002'...
dptest write [-cat num] [-align num] name rnum bnum
Retrieve all records of the database above.
dptest read name
Perform combination test of various operations.
dptest combo name
Store records with random data with random size.
dptest rand [-cat num] [-align num] [-poor] name rnum bnum
Perform updating operations selected at random.
dptest wicked name rnum
Options feature the following.
-cat num : specify repeating times and storing mode for `DP_DCAT'.
-align num : specify the basic size of alignment.
-poor : reduce random patterns and cause duplication of keys.

This command returns 0 on success, another value on failure.

The command `dptsv' converts between a Depot database and TSV text in both directions. This command is used in the following format. `name' specifies a database name. The subcommand `import' reads TSV data from the standard input. If a key overlaps, the latter is adopted. `-bnum' specifies the number of the elements of the bucket array. The subcommand `export' writes TSV data to the standard output.

Create a database from TSV.
dptsv import [-bnum num] name
Write TSV data of a database.
dptsv export name

This command returns 0 on success, another value on failure.


Curia: Extended API

Curia is the extended API of QDBM. It provides routines for managing multiple database files in a directory. The restriction of some file systems on the size of each file is avoided by dividing a database into two or more files. If the database files are deployed on multiple devices, scalability is improved.

While Depot creates a database with a file name, Curia creates a database with a directory name. A database file named `depot' is placed in the specified directory. It keeps the attributes of the database but not the entities of the records. Besides, as many subdirectories as the number of divisions of the database are created, each named with 4 digits. A database file is placed in each subdirectory, and the entities of the records are stored in these database files. For example, if the database directory is named `casket' and the number of divisions is 3, `casket/depot', `casket/0001/depot', `casket/0002/depot' and `casket/0003/depot' are created. No error occurs even if a directory of the same name already exists when creating a database. So, if the subdirectories exist and devices are mounted on them, the database files are deployed on multiple devices. It is also possible to deploy the database files on multiple file servers using NFS and so on.

Curia features management of large objects. While usual records are stored in the database files, records of large objects are stored in individual files. Because the files of large objects are deployed in different directories named with hash values, the access speed is fairly robust, although it is slower than that of usual records. Large and infrequently accessed data should be set apart as large objects. By doing this, the access speed of usual records is improved. The directory hierarchies of large objects are placed in the directory named `lob' in the subdirectories of the database. Because the key spaces of usual records and large objects are separate, their operations do not interfere with each other.

In order to use Curia, you should include `depot.h', `curia.h' and `stdlib.h' in the source files. Usually, the following description will be near the beginning of a source file.

#include <depot.h>
#include <curia.h>
#include <stdlib.h>

A pointer to `CURIA' is used as a database handle, much as some file I/O routines of `stdio.h' use a pointer to `FILE'. A database handle is opened with the function `cropen' and closed with `crclose'. You should not refer directly to any member of the handle. If a fatal error occurs in a database, any access method via the handle except `crclose' will not work and will return an error status. Although a process is allowed to use multiple database handles at the same time, multiple handles of the same database directory should not be used at the same time.

Curia also assigns the error code to the external variable `dpecode'. The function `dperrmsg' is used in order to get the message of the error code.

The function `cropen' is used in order to get a database handle.

CURIA *cropen(const char *name, int omode, int bnum, int dnum);
`name' specifies the name of a database directory. `omode' specifies the connection mode: `CR_OWRITER' as a writer, `CR_OREADER' as a reader. If the mode is `CR_OWRITER', the following may be added by bitwise or: `CR_OCREAT', which means a new database is created if none exists, and `CR_OTRUNC', which means a new database is created regardless of whether one exists. `bnum' specifies the number of elements of each bucket array. If it is not more than 0, the default value is used. The size of each bucket array is determined when the database is created and cannot be changed except by optimization of the database. The suggested size of each bucket array is about 0.5 to 4 times the number of all records to be stored. `dnum' specifies the number of divisions of the database. If it is not more than 0, the default value is used. The number of divisions cannot be changed from the initial value. The return value is the database handle, or `NULL' if it is not successful. While connecting as a writer, an exclusive lock is applied to the database directory. While connecting as a reader, a shared lock is applied to the database directory. The thread blocks until the lock is acquired.

The function `crclose' is used in order to close a database handle.

int crclose(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is true, else, it is false. Because the region of a closed handle is released, the handle can no longer be used. Updates to a database are assured to be written when the handle is closed. If a writer opens a database but does not close it properly, the database will be broken.

The function `crput' is used in order to store a record.

int crput(CURIA *curia, const char *kbuf, int ksiz, const char *vbuf, int vsiz, int dmode);
`curia' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `vbuf' specifies the pointer to the region of a value. `vsiz' specifies the size of the region of the value. If it is negative, the size is assigned with `strlen(vbuf)'. `dmode' specifies behavior when the key overlaps, by the following values: `CR_DOVER', which means the specified value overwrites the existing one, `CR_DKEEP', which means the existing value is kept, `CR_DCAT', which means the specified value is concatenated at the end of the existing value. If successful, the return value is true, else, it is false.
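
The effect of the three values of `dmode' can be modeled without a database. The following is a minimal sketch, not code from Curia: `model_store' and the `MODEL_*' constants are hypothetical names, and the function only computes which value a key would hold after the store according to the description above. It says nothing about the status that `crput' itself returns.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for `CR_DOVER', `CR_DKEEP' and `CR_DCAT'. The real constants
   come from `curia.h'; these names and values are illustrative only. */
enum { MODEL_DOVER, MODEL_DKEEP, MODEL_DCAT };

/* Hypothetical helper: compute the value a key holds after a store.
   `old' is the existing value, or NULL if the key is absent.
   Returns a malloc'ed region with a zero code appended, and assigns its
   size to `*sp' unless `sp' is NULL. Returns NULL if `dmode' is unknown
   or memory runs out. */
char *model_store(const char *old, int osiz, const char *val, int vsiz,
                  int dmode, int *sp){
  char *res;
  int keep, add;
  if(!old) osiz = 0;
  switch(dmode){
  case MODEL_DOVER: keep = 0; add = vsiz; break;              /* replace */
  case MODEL_DKEEP: keep = osiz; add = old ? 0 : vsiz; break; /* keep old */
  case MODEL_DCAT: keep = osiz; add = vsiz; break;            /* append */
  default: return NULL;
  }
  if(!(res = malloc(keep + add + 1))) return NULL;
  if(keep > 0) memcpy(res, old, keep);
  if(add > 0) memcpy(res + keep, val, add);
  res[keep + add] = '\0';
  if(sp) *sp = keep + add;
  return res;
}
```

With an existing value "abc" and a new value "de", the model yields "de" for overwriting, "abc" for keeping, and "abcde" for concatenation.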

The function `crout' is used in order to delete a record.

int crout(CURIA *curia, const char *kbuf, int ksiz);
`curia' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is true, else, it is false. False is also returned when no record corresponds to the specified key.

The function `crget' is used in order to retrieve a record.

char *crget(CURIA *curia, const char *kbuf, int ksiz, int start, int max, int *sp);
`curia' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `start' specifies the offset address of the beginning of the region of the value to be read. `max' specifies the max size to be read. If it is negative, the size to read is unlimited. `sp' specifies a pointer to the variable to which the size of the region of the return value is assigned. If it is `NULL', it is not used. If successful, the return value is the pointer to the region of the value of the corresponding record, else, it is `NULL'. `NULL' is returned when no record corresponds to the specified key or the size of the value of the corresponding record is less than `start'. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.
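
The rules for `start', `max' and the appended zero code can be tried out on a value held in memory. The following is a minimal sketch, not the implementation of `crget': `slice_value' is a hypothetical name, and rejecting a negative `start' is an assumption, since that case is not described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper modeling the documented rules of `crget' for a
   value of `vsiz' bytes at `vbuf'. Returns a malloc'ed copy of at most
   `max' bytes beginning at offset `start', with a zero code appended.
   A negative `max' means no limit. Returns NULL if the value is shorter
   than `start', if `start' is negative (an assumption), or if memory
   runs out. The size is assigned to `*sp' unless `sp' is NULL. */
char *slice_value(const char *vbuf, int vsiz, int start, int max, int *sp){
  char *res;
  int len;
  if(start < 0 || vsiz < start) return NULL;
  len = vsiz - start;
  if(max >= 0 && len > max) len = max;
  if(!(res = malloc(len + 1))) return NULL;
  memcpy(res, vbuf + start, len);
  res[len] = '\0';
  if(sp) *sp = len;
  return res;
}
```

For the value "000-1234-5678", a `start' of 4 and a `max' of 4 select "1234". A `start' equal to the size of the value is still accepted and selects an empty region, because only a value shorter than `start' is rejected.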

The function `crvsiz' is used in order to get the size of the value of a record.

int crvsiz(CURIA *curia, const char *kbuf, int ksiz);
`curia' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is the size of the value of the corresponding record, else, it is -1. Because this function does not read the entity of a record, it is faster than `crget'.

The function `criterinit' is used in order to initialize the iterator of a database handle.

int criterinit(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is true, else, it is false. The iterator is used in order to access the key of every record stored in a database.

The function `criternext' is used in order to get the next key of the iterator.

char *criternext(CURIA *curia, int *sp);
`curia' specifies a database handle. `sp' specifies a pointer to the variable to which the size of the region of the return value is assigned. If it is `NULL', it is not used. If successful, the return value is the pointer to the region of the next key, else, it is `NULL'. `NULL' is returned when no record is left to be fetched from the iterator. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use. It is possible to access every record by calling this function repeatedly. However, that is not assured if the database is updated during the iteration. Besides, the order of this traversal access method is arbitrary, so it is not assured that the order of storing matches the order of the traversal access.

The function `crsetalign' is used in order to set alignment of a database handle.

int crsetalign(CURIA *curia, int align);
`curia' specifies a database handle connected as a writer. `align' specifies the basic size of alignment. If successful, the return value is true, else, it is false. If alignment is set for a database, the efficiency of overwriting values is improved. The basic size of alignment is suggested to be the average size of the values of the records to be stored. When a record is stored, the size of the region reserved for its value is determined as a multiple of the alignment size. The alignment size is determined as a multiple of the basic size of alignment. Because the alignment setting is not saved in a database, you should specify alignment every time you open a database.

The function `crsync' is used in order to synchronize updated contents of a database with the files and the devices.

int crsync(CURIA *curia);
`curia' specifies a database handle connected as a writer. If successful, the return value is true, else, it is false. This function is useful when another process uses the connected database directory.

The function `croptimize' is used in order to optimize a database.

int croptimize(CURIA *curia, int bnum);
`curia' specifies a database handle connected as a writer. `bnum' specifies the number of the elements of each bucket array. If it is not more than 0, the default value is used. If successful, the return value is true, else, it is false. In an alternating succession of deleting and storing with overwrite or concatenate, dispensable regions accumulate. This function is useful to do away with them.

The function `crname' is used in order to get the name of a database.

char *crname(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the pointer to the region of the name of the database, else, it is `NULL'. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `crfsiz' is used in order to get the total size of database files.

int crfsiz(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the total size of the database files, else, it is -1.

The function `crbnum' is used in order to get the total number of the elements of each bucket array.

int crbnum(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the total number of the elements of each bucket array, else, it is -1.

The function `crbusenum' is used in order to get the total number of the used elements of each bucket array.

int crbusenum(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the total number of the used elements of each bucket array, else, it is -1. This function is inefficient because it accesses all elements of each bucket array.

The function `crrnum' is used in order to get the number of the records stored in a database.

int crrnum(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the number of the records stored in the database, else, it is -1.

The function `crwritable' is used in order to check whether a database handle is a writer or not.

int crwritable(CURIA *curia);
`curia' specifies a database handle. The return value is true if the handle is a writer, false if not.

The function `crfatalerror' is used in order to check whether a database has a fatal error or not.

int crfatalerror(CURIA *curia);
`curia' specifies a database handle. The return value is true if the database has a fatal error, false if not.

The function `crinode' is used in order to get the inode number of a database directory.

int crinode(CURIA *curia);
`curia' specifies a database handle. The return value is the inode number of the database directory.

The function `crremove' is used in order to remove a database directory.

int crremove(const char *name);
`name' specifies the name of a database directory. If successful, the return value is true, else, it is false.

The function `crputlob' is used in order to store a large object.

int crputlob(CURIA *curia, const char *kbuf, int ksiz, const char *vbuf, int vsiz, int dmode);
`curia' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `vbuf' specifies the pointer to the region of a value. `vsiz' specifies the size of the region of the value. If it is negative, the size is assigned with `strlen(vbuf)'. `dmode' specifies behavior when the key overlaps, by the following values: `CR_DOVER', which means the specified value overwrites the existing one, `CR_DKEEP', which means the existing value is kept, `CR_DCAT', which means the specified value is concatenated at the end of the existing value. If successful, the return value is true, else, it is false.

The function `croutlob' is used in order to delete a large object.

int croutlob(CURIA *curia, const char *kbuf, int ksiz);
`curia' specifies a database handle connected as a writer. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is true, else, it is false. False is also returned when no large object corresponds to the specified key.

The function `crgetlob' is used in order to retrieve a large object.

char *crgetlob(CURIA *curia, const char *kbuf, int ksiz, int start, int max, int *sp);
`curia' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. `start' specifies the offset address of the beginning of the region of the value to be read. `max' specifies the max size to be read. If it is negative, the size to read is unlimited. `sp' specifies a pointer to the variable to which the size of the region of the return value is assigned. If it is `NULL', it is not used. If successful, the return value is the pointer to the region of the value of the corresponding large object, else, it is `NULL'. `NULL' is returned when no large object corresponds to the specified key or the size of the value of the corresponding large object is less than `start'. Because an additional zero code is appended at the end of the region of the return value, the return value can be treated as a character string. Because the region of the return value is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `crvsizlob' is used in order to get the size of the value of a large object.

int crvsizlob(CURIA *curia, const char *kbuf, int ksiz);
`curia' specifies a database handle. `kbuf' specifies the pointer to the region of a key. `ksiz' specifies the size of the region of the key. If it is negative, the size is assigned with `strlen(kbuf)'. If successful, the return value is the size of the value of the corresponding large object, else, it is -1. Because this function does not read the entity of a large object, it is faster than `crgetlob'.

The function `crrnumlob' is used in order to get the number of the large objects stored in a database.

int crrnumlob(CURIA *curia);
`curia' specifies a database handle. If successful, the return value is the number of the large objects stored in the database, else, it is -1.

The following example stores and retrieves a phone number, using the name as the key.

#include <depot.h>
#include <curia.h>
#include <stdlib.h>
#include <stdio.h>

#define NAME     "mikio"
#define NUMBER   "000-1234-5678"
#define DBNAME   "book"

int main(int argc, char **argv){
  CURIA *curia;
  char *val;

  /* open the database */
  if(!(curia = cropen(DBNAME, CR_OWRITER | CR_OCREAT, -1, -1))){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
    return 1;
  }

  /* store the record */
  if(!crput(curia, NAME, -1, NUMBER, -1, CR_DOVER)){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
  }

  /* retrieve the record */
  if(!(val = crget(curia, NAME, -1, 0, -1, NULL))){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
  } else {
    printf("Name: %s\n", NAME);
    printf("Number: %s\n", val);
    free(val);
  }

  /* close the database */
  if(!crclose(curia)){
    fprintf(stderr, "%s\n", dperrmsg(dpecode));
    return 1;
  }

  return 0;
}

Programs using Curia are built in the same way as programs using Depot.

Although the functions of Curia are not reentrant, they do not use any static object internally. So, they can be used in a thread-safe manner if every call and every reference to the external variable `dpecode' are under exclusion control, on the assumption that `errno', `malloc' and so on are thread-safe.


Commands for Curia

Curia has the following command line interfaces.

The command `crmgr' is a utility for debugging Curia and its applications. It features editing and checking of a database. It can be used for the database applications with shell scripts. This command is used in the following format. `name' specifies a database name. `key' specifies the key of a record. `val' specifies the value of a record.

Create a database file.
crmgr create [-v] [-bnum num] name
Store a record with a key and a value.
crmgr put [-v] [-kx] [-vx] [-vf] [-keep] [-cat] [-lob] [-na] name key val
Delete a record with a key.
crmgr out [-v] [-kx] [-lob] name key
Retrieve a record with a key and output it to the standard output.
crmgr get [-v] [-kx] [-start num] [-max num] [-ox] [-lob] [-n] name key
List all keys delimited with line-feed to the standard output.
crmgr list [-v] [-ox] name
Optimize a database.
crmgr optimize [-v] [-bnum num] [-na] name
Output miscellaneous information to the standard output.
crmgr inform [-v] name
Remove a database directory.
crmgr remove [-v] name
Output version information of QDBM to the standard output.
crmgr version
Options have the following meanings.
-v : output debug information.
-bnum num : specify the number of elements of each bucket array.
-kx : treat `key' as a binary expression of hexadecimal notation.
-vx : treat `val' as a binary expression of hexadecimal notation.
-vf : read the value from a file specified with `val'.
-keep : specify the storing mode as `CR_DKEEP'.
-cat : specify the storing mode as `CR_DCAT'.
-na : do not set alignment.
-start : specify the beginning offset of a value to fetch.
-max : specify the max size of a value to fetch.
-ox : treat the output as a binary expression of hexadecimal notation.
-lob : handle large objects.
-n : do not output the trailing newline.

This command returns 0 on success, and a nonzero value on failure.
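
The options `-kx' and `-vx' take their argument in hexadecimal notation. The exact parsing rules of `crmgr' are not given here, so the following is only a sketch under the assumption that every byte is written as exactly two hexadecimal digits without separators. `hex_decode' and `hex_digit' are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Return the value of one hexadecimal digit, or -1 if `c' is not one. */
static int hex_digit(int c){
  if(c >= '0' && c <= '9') return c - '0';
  if(c >= 'a' && c <= 'f') return c - 'a' + 10;
  if(c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode a string such as "6d696b696f" into bytes.
   Returns a malloc'ed region with a zero code appended and assigns the
   number of decoded bytes to `*sp' unless `sp' is NULL. Returns NULL if
   the length is odd, a character is not a hexadecimal digit, or memory
   runs out. */
char *hex_decode(const char *str, int *sp){
  size_t len = strlen(str), i;
  char *buf;
  if(len % 2 != 0) return NULL;
  if(!(buf = malloc(len / 2 + 1))) return NULL;
  for(i = 0; i < len; i += 2){
    int hi = hex_digit((unsigned char)str[i]);
    int lo = hex_digit((unsigned char)str[i + 1]);
    if(hi < 0 || lo < 0){
      free(buf);
      return NULL;
    }
    buf[i / 2] = (char)(hi * 16 + lo);
  }
  buf[len / 2] = '\0';
  if(sp) *sp = (int)(len / 2);
  return buf;
}
```

Under this assumption, the argument `6d696b696f' stands for the five bytes of the string "mikio".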

The command `crtest' is a utility for facility tests and performance tests. It is used to check a database generated by the command or to measure the execution time of the command. This command is used in the following format. `name' specifies a database name. `rnum' specifies the number of records. `bnum' specifies the number of elements of a bucket array. `dnum' specifies the number of divisions of a database.

Store records with keys of 8 bytes. The keys run as `00000001', `00000002' and so on.
crtest write [-cat num] [-align num] [-lob] name rnum bnum dnum
Retrieve all records of the database above.
crtest read name
Perform combination test of various operations.
crtest combo name
Options have the following meanings.
-cat num : specify the number of repetitions and the storing mode as `CR_DCAT'.
-align num : specify the basic size of alignment.
-lob : handle large objects.

This command returns 0 on success, and a nonzero value on failure.
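
The keys stored by `crtest write' are described as 8 bytes of decimal digits with leading zeros. A sketch of how such keys can be produced follows, assuming the pattern corresponds to the `%08d' conversion of `sprintf'. `make_key' is a hypothetical name and not a function of QDBM.

```c
#include <stdio.h>

/* Hypothetical helper: write the 8-byte key of record number `i' into
   `buf', which must have room for at least 9 bytes (8 digits and the
   terminating zero code). Returns the length of the key, or -1 if `i'
   is less than 1 or does not fit in 8 digits. */
int make_key(char *buf, int i){
  if(i < 1 || i > 99999999) return -1;
  return sprintf(buf, "%08d", i);
}
```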


Relic: NDBM-compatible API

Relic is an API compatible with NDBM. That is, Relic wraps the functions of Depot in the API of NDBM. It is easy to port an application from NDBM to QDBM. In most cases, you only have to replace the inclusion of `ndbm.h' with `relic.h' and replace the linking option `-lndbm' with `-lqdbm'.

The original NDBM treats a database as a pair of files. One, `a directory file', has a name with the suffix `.dir' and stores a bit map of keys. The other, `a data file', has a name with the suffix `.pag' and stores the entity of each record. Relic creates the directory file as a mere dummy file and creates the data file as a database. Relic has no restriction on the size of each record. Relic cannot handle database files made by the original NDBM.

In order to use Relic, you should include `relic.h', `stdlib.h', `sys/types.h', `sys/stat.h' and `fcntl.h' in the source files. Usually, the following description will be near the beginning of a source file.

#include <relic.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

A pointer to `DBM' is used as a database handle. A database handle is opened with the function `dbm_open' and closed with `dbm_close'. You should not refer directly to any member of a handle.

Structures of the `datum' type are used in order to pass the data of keys and values to and from the functions of Relic.

typedef struct { void *dptr; size_t dsize; } datum;
`dptr' specifies the pointer to the region of a key or a value. `dsize' specifies the size of the region.
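
Filling a `datum' from a character string is a common first step. The following is a minimal sketch: `str_datum' is a hypothetical helper, and the typedef is repeated only so that the fragment compiles without `relic.h'. Whether the terminating zero code is counted in `dsize' is a choice of the application; this sketch does not count it.

```c
#include <stddef.h>
#include <string.h>

/* Repeated from the definition above so that this sketch compiles on
   its own; a real program gets the type from `relic.h' instead. */
typedef struct { void *dptr; size_t dsize; } datum;

/* Hypothetical helper: wrap a zero-terminated string as a datum.
   The string is not copied, so it must stay valid as long as the datum
   is in use. The terminating zero code is not counted in `dsize'. */
datum str_datum(char *str){
  datum d;
  d.dptr = str;
  d.dsize = strlen(str);
  return d;
}
```

A key and a content prepared this way can then be handed to functions taking `datum' arguments, such as `dbm_store' and `dbm_fetch'.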

The function `dbm_open' is used in order to get a database handle.

DBM *dbm_open(char *name, int flags, int mode);
`name' specifies the name of a database. The names of the files are made by appending the suffixes to it. `flags' is the same as that of the `open' call, although `O_WRONLY' is treated as `O_RDWR' and additional flags except for `O_CREAT' and `O_TRUNC' have no effect. `mode' specifies the mode of the database file as that of the `open' call does. The return value is the database handle, or `NULL' if it is not successful.
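
The treatment of `flags' can be written down as a small function. This is a sketch of the rule as stated above and is not taken from the source of Relic: `effective_flags' is a hypothetical name, and keeping `O_CREAT' and `O_TRUNC' together with read-only access is an assumption, because that combination is not described.

```c
#include <fcntl.h>

/* Hypothetical helper: reduce `flags' to the parts that have an effect
   according to the description of `dbm_open'. `O_WRONLY' is promoted to
   `O_RDWR', and every additional flag except `O_CREAT' and `O_TRUNC'
   is dropped. */
int effective_flags(int flags){
  int acc = flags & O_ACCMODE;
  int res = (acc == O_RDONLY) ? O_RDONLY : O_RDWR;
  res |= flags & (O_CREAT | O_TRUNC);
  return res;
}
```

For example, `O_WRONLY | O_CREAT | O_APPEND' is reduced to `O_RDWR | O_CREAT'.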

The function `dbm_close' is used in order to close a database handle.

void dbm_close(DBM *db);
`db' specifies a database handle. Because the region of the closed handle is released, it becomes impossible to use the handle.

The function `dbm_store' is used in order to store a record.

int dbm_store(DBM *db, datum key, datum content, int flags);
`db' specifies a database handle. `key' specifies a structure of a key. `content' specifies a structure of a value. `flags' specifies behavior when the key overlaps, by the following values: `DBM_REPLACE', which means the specified value overwrites the existing one, `DBM_INSERT', which means the existing value is kept. The return value is 0 if it is successful, 1 if it gives up because the key overlaps, -1 if another error occurs.

The function `dbm_delete' is used in order to delete a record.

int dbm_delete(DBM *db, datum key);
`db' specifies a database handle. `key' specifies a structure of a key. The return value is 0 if it is successful, -1 if some errors occur.

The function `dbm_fetch' is used in order to retrieve a record.

datum dbm_fetch(DBM *db, datum key);
`db' specifies a database handle. `key' specifies a structure of a key. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the value. If no record corresponds or some errors occur, `dptr' is `NULL'. `dptr' points to the region related with the handle. The region is available until the next time of calling this function with the same handle.

The function `dbm_firstkey' is used in order to get the first key of a database.

datum dbm_firstkey(DBM *db);
`db' specifies a database handle. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the first key. If no record corresponds or some errors occur, `dptr' is `NULL'. `dptr' points to the region related with the handle. The region is available until the next time of calling this function or the function `dbm_nextkey' with the same handle.

The function `dbm_nextkey' is used in order to get the next key of a database.

datum dbm_nextkey(DBM *db);
`db' specifies a database handle. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the next key. If no record corresponds or some errors occur, `dptr' is `NULL'. `dptr' points to the region related with the handle. The region is available until the next time of calling this function or the function `dbm_firstkey' with the same handle.

The function `dbm_error' is used in order to check whether a database has a fatal error or not.

int dbm_error(DBM *db);
`db' specifies a database handle. The return value is true if the database has a fatal error, false if not.

The function `dbm_clearerr' has no effect.

int dbm_clearerr(DBM *db);
`db' specifies a database handle. The return value is 0. The function is only for compatibility.

The function `dbm_rdonly' is used in order to check whether a handle is read-only or not.

int dbm_rdonly(DBM *db);
`db' specifies a database handle. The return value is true if the handle is read-only, or false if not read-only.

The function `dbm_dirfno' is used in order to get the file descriptor of a directory file.

int dbm_dirfno(DBM *db);
`db' specifies a database handle. The return value is the file descriptor of the directory file.

The function `dbm_pagfno' is used in order to get the file descriptor of a data file.

int dbm_pagfno(DBM *db);
`db' specifies a database handle. The return value is the file descriptor of the data file.

Programs using Relic are built in the same way as programs using Depot. Note that the option to be given to the linker is not `-lndbm', but `-lqdbm'.

Each function of Relic is not reentrant, and not thread-safe.


Commands for Relic

Relic has the following command line interfaces.

The command `rlmgr' is a utility for debugging Relic and its applications. It features editing and checking of a database. It can be used for database applications with shell scripts. This command is used in the following format. `name' specifies a database name. `key' specifies the key of a record. `val' specifies the value of a record.

Create a database file.
rlmgr create name
Store a record with a key and a value.
rlmgr store [-kx] [-vx] [-vf] [-insert] name key val
Delete a record with a key.
rlmgr delete [-kx] name key
Retrieve a record with a key and output to the standard output.
rlmgr fetch [-kx] [-ox] [-n] name key
List all keys delimited with line-feed to the standard output.
rlmgr list [-v] [-ox] name
Options have the following meanings.
-kx : treat `key' as a binary expression of hexadecimal notation.
-vx : treat `val' as a binary expression of hexadecimal notation.
-vf : read the value from a file specified with `val'.
-insert : specify the storing mode as `DBM_INSERT'.
-ox : treat the output as a binary expression of hexadecimal notation.
-n : do not output the trailing newline.

This command returns 0 on success, and a nonzero value on failure.

The command `rltest' is a utility for facility tests and performance tests. It is used to check a database generated by the command or to measure the execution time of the command. This command is used in the following format. `name' specifies a database name. `rnum' specifies the number of records.

Store records with keys of 8 bytes. The keys run as `00000001', `00000002' and so on.
rltest write name rnum
Retrieve records of the database above.
rltest read name rnum

This command returns 0 on success, and a nonzero value on failure.


Hovel: GDBM-compatible API

Hovel is an API compatible with GDBM. That is, Hovel wraps the functions of Depot and Curia in the API of GDBM. It is easy to port an application from GDBM to QDBM. In most cases, you only have to replace the inclusion of `gdbm.h' with `hovel.h' and replace the linking option `-lgdbm' with `-lqdbm'. Hovel cannot handle database files made by the original GDBM.

In order to use Hovel, you should include `hovel.h', `stdlib.h', `sys/types.h' and `sys/stat.h' in the source files. Usually, the following description will be near the beginning of a source file.

#include <hovel.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

An object of `GDBM_FILE' is used as a database handle. A database handle is opened with the function `gdbm_open' and closed with `gdbm_close'. You should not refer directly to any member of a handle. Although Hovel works as a wrapper of Depot and handles a database file by default, if the function `gdbm_setcuria' is called before opening a database handle, Hovel works as a wrapper of Curia and handles a database directory.

Structures of the `datum' type are used in order to pass the data of keys and values to and from the functions of Hovel.

typedef struct { char *dptr; size_t dsize; } datum;
`dptr' specifies the pointer to the region of a key or a value. `dsize' specifies the size of the region.

The external variable `gdbm_version' is the string containing the version information.

extern char *gdbm_version;

The external variable `gdbm_errno' is assigned the code of the last error that happened. Refer to `hovel.h' for details of the error codes.

extern gdbm_error gdbm_errno;

The function `gdbm_strerror' is used in order to get a message string corresponding to an error code.

char *gdbm_strerror(gdbm_error gdbmerrno);
`gdbmerrno' specifies an error code. The return value is the message string of the error code. The region of the return value is not writable.

The function `gdbm_open' is used in order to get a database handle.

GDBM_FILE gdbm_open(char *name, int block_size, int read_write, int mode, void (*fatal_func)(void));
`name' specifies a name of a database. `block_size' is ignored. `read_write' specifies the connection mode: `GDBM_READER' as a reader, `GDBM_WRITER', `GDBM_WRCREAT' and `GDBM_NEWDB' as a writer. `GDBM_WRCREAT' makes a database file if it does not exist. `GDBM_NEWDB' makes a new database even if it exists. You can add the following to writer modes by bitwise or: `GDBM_SYNC', `GDBM_NOLOCK' and `GDBM_FAST'. `GDBM_SYNC' means a database is synchronized after every updating method. The other two are ignored. `mode' specifies mode of a database file as one of `open' call does. The return value is the database handle or `NULL' if it is not successful.

The function `gdbm_close' is used in order to close a database handle.

void gdbm_close(GDBM_FILE dbf);
`dbf' specifies a database handle. Because the region of the closed handle is released, it becomes impossible to use the handle.

The function `gdbm_store' is used in order to store a record.

int gdbm_store(GDBM_FILE dbf, datum key, datum content, int flag);
`dbf' specifies a database handle connected as a writer. `key' specifies a structure of a key. `content' specifies a structure of a value. `flag' specifies behavior when the key overlaps, by the following values: `GDBM_REPLACE', which means the specified value overwrites the existing one, `GDBM_INSERT', which means the existing value is kept. The return value is 0 if it is successful, 1 if it gives up because the key overlaps, -1 if another error occurs.

The function `gdbm_delete' is used in order to delete a record.

int gdbm_delete(GDBM_FILE dbf, datum key);
`dbf' specifies a database handle connected as a writer. `key' specifies a structure of a key. The return value is 0 if it is successful, -1 if some errors occur.

The function `gdbm_fetch' is used in order to retrieve a record.

datum gdbm_fetch(GDBM_FILE dbf, datum key);
`dbf' specifies a database handle. `key' specifies a structure of a key. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the value. If no record corresponds or some errors occur, `dptr' is `NULL'. Because the region pointed to by `dptr' is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `gdbm_exists' is used in order to check whether a record exists or not.

int gdbm_exists(GDBM_FILE dbf, datum key);
`dbf' specifies a database handle. `key' specifies a structure of a key. The return value is true if a record corresponds and no error occurs, else, it is false.

The function `gdbm_firstkey' is used in order to get the first key of a database.

datum gdbm_firstkey(GDBM_FILE dbf);
`dbf' specifies a database handle. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the first key. If no record corresponds or some errors occur, `dptr' is `NULL'. Because the region pointed to by `dptr' is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `gdbm_nextkey' is used in order to get the next key of a database.

datum gdbm_nextkey(GDBM_FILE dbf, datum key);
`dbf' specifies a database handle. The return value is a structure of the result. If a record corresponds, the member `dptr' of the structure is the pointer to the region of the next key. If no record corresponds or some errors occur, `dptr' is `NULL'. Because the region pointed to by `dptr' is allocated with the `malloc' call, it should be released with the `free' call if it is no longer in use.

The function `gdbm_sync' is used in order to synchronize updated contents of a database with the file and the device.

void gdbm_sync(GDBM_FILE dbf);
`dbf' specifies a database handle connected as a writer.

The function `gdbm_reorganize' is used in order to reorganize a database.

int gdbm_reorganize(GDBM_FILE dbf);
`dbf' specifies a database handle connected as a writer. If successful, the return value is 0, else -1.

The function `gdbm_fdesc' is used in order to get the file descriptor of a database file.

int gdbm_fdesc(GDBM_FILE dbf);
`dbf' specifies a database handle connected as a writer. The return value is the file descriptor of the database file.

The function `gdbm_setopt' has no effect.

int gdbm_setopt(GDBM_FILE dbf, int option, int *value, int size);
`dbf' specifies a database handle. `option' is ignored. `size' is ignored. The return value is 0. The function is only for compatibility.

The function `gdbm_setcuria' is used in order to switch behavior of each function of Hovel to wrapper of Curia.

void gdbm_setcuria(int bnum, int dnum, int align, int mode);
`bnum' specifies the number of elements of each bucket array. `dnum' specifies the number of divisions of the database. `align' specifies the basic size of alignment. `mode' specifies the mode of a database directory. This function should be called before opening a database. Because this setting is not saved in a database, you should call this function every time you open a database. The function `gdbm_fdesc' does not work after this function has been called.

Programs using Hovel are built in the same way as programs using Depot. Note that the option to be given to the linker is not `-lgdbm', but `-lqdbm'.

Each function of Hovel is not reentrant, and not thread-safe.


Commands for Hovel

Hovel has the following command line interfaces.

The command `hvmgr' is a utility for debugging Hovel and its applications. It features editing and checking of a database. It can be used for database applications with shell scripts. This command is used in the following format. `name' specifies a database name. `key' specifies the key of a record. `val' specifies the value of a record.

Create a database file.
hvmgr create [-dir] name
Store a record with a key and a value.
hvmgr store [-dir] [-kx] [-vx] [-vf] [-insert] name key val
Delete a record with a key.
hvmgr delete [-dir] [-kx] name key
Retrieve a record with a key and output to the standard output.
hvmgr fetch [-dir] [-kx] [-ox] [-n] name key
List all keys delimited with line-feed to the standard output.
hvmgr list [-dir] [-v] [-ox] name
Optimize a database.
hvmgr optimize [-dir] name
The options have the following meanings.
-dir : handle a database directory using Curia.
-kx : treat `key' as a binary expression of hexadecimal notation.
-vx : treat `val' as a binary expression of hexadecimal notation.
-vf : read the value from a file specified with `val'.
-insert : set the storing mode to `GDBM_INSERT'.
-ox : treat the output as a binary expression of hexadecimal notation.
-n : do not output the trailing newline.

This command returns 0 on success and another value on failure.

The command `hvtest' is a utility for facility tests and performance tests. You can check a database generated by the command or measure the execution time of the command. This command is used in the following format. `name' specifies a database name. `rnum' specifies the number of records.

Store records with keys of 8 bytes. The keys change as `00000001', `00000002'...
hvtest write name rnum
Retrieve records of the database above.
hvtest read name rnum
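
The key sequence described for `hvtest write' can be sketched in C as follows. This is only an illustration: the function name `make_key' is hypothetical, and the assumption that the keys are zero-padded decimal numbers is inferred from the examples `00000001' and `00000002', not taken from the source of `hvtest'.

```c
#include <stdio.h>

/* Write the key of the i-th record into `buf' as 8 decimal digits
   padded with zeros, followed by a terminating NUL.
   `buf' must have room for at least 9 bytes and `i' is assumed to be
   in the range 0 to 99999999.  The return value is the key length. */
int make_key(int i, char *buf){
  return sprintf(buf, "%08d", i);
}
```

A loop calling `make_key' with 1, 2, 3 and so on up to `rnum' would yield the keys in the order shown above.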

This command returns 0 on success and another value on failure.


File Format

The contents of a database file managed by Depot can be divided roughly into the following three sections: the header section, the bucket section and the record section.

The header section is placed at the beginning of the file and its length is a constant 48 bytes. The following information is stored in the header section.

  1. magic number: from offset 0, contains "[depot]\n\f"
  2. file size: from offset 16, type of `int'
  3. number of elements of the bucket array: from offset 24, type of `int'
  4. number of records: from offset 32, type of `int'
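
As an illustration of the header layout above, the following C sketch reads the three `int' fields from a buffer holding the first 48 bytes of a database file. The names `HeaderInfo' and `parse_header' are hypothetical and are not part of the Depot API. The sketch assumes that the buffer was written on a machine with the same byte order and the same size of `int' as the reader.

```c
#include <string.h>

/* Offsets and length taken from the list above; the names are hypothetical. */
enum {
  HEADER_SIZE = 48,
  OFF_FSIZ = 16,
  OFF_BNUM = 24,
  OFF_RNUM = 32
};

typedef struct {
  int fsiz;   /* file size */
  int bnum;   /* number of elements of the bucket array */
  int rnum;   /* number of records */
} HeaderInfo;

/* Parse the header section held in `buf', whose length is `len'.
   The return value is 1 on success, or 0 if the buffer is shorter
   than the header or does not begin with the magic number. */
int parse_header(const char *buf, size_t len, HeaderInfo *out){
  static const char magic[] = "[depot]\n\f";
  if(len < HEADER_SIZE) return 0;
  if(memcmp(buf, magic, sizeof(magic) - 1) != 0) return 0;
  /* memcpy avoids unaligned access; no byte order conversion is done */
  memcpy(&out->fsiz, buf + OFF_FSIZ, sizeof(int));
  memcpy(&out->bnum, buf + OFF_BNUM, sizeof(int));
  memcpy(&out->rnum, buf + OFF_RNUM, sizeof(int));
  return 1;
}
```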

The bucket section is placed after the header section and its length is determined by the number of elements of the bucket array. Each element of the bucket array stores the offset of the root node of a separate chain.
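
The position of each section can be computed from the header length and the number of elements of the bucket array. The following C sketch shows the arithmetic. It assumes that each element of the bucket array has the size of `int', which this section does not state explicitly, and the function names are hypothetical.

```c
/* Length of the header section, as stated in this section. */
enum { HEADER_SIZE = 48 };

/* Offset in the file of the i-th element of the bucket array.
   Assumption: each element has the size of `int'. */
long bucket_elem_offset(int i){
  return HEADER_SIZE + (long)i * (long)sizeof(int);
}

/* Offset in the file where the record section begins,
   for a bucket array of `bnum' elements. */
long record_section_offset(int bnum){
  return HEADER_SIZE + (long)bnum * (long)sizeof(int);
}
```

Under that assumption, a database with 10 elements in the bucket array and a 4-byte `int' would begin its record section at offset 88.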

The record section is placed after the bucket section and extends to the end of the file. Each element of the record section contains the following information.

  1. flag (for deleting): type of `int'
  2. second hash value: type of `int'
  3. size of the key: type of `int'
  4. size of the value: type of `int'
  5. size of the padding: type of `int'
  6. offset of the left child: type of `int'
  7. offset of the right child: type of `int'
  8. entity of the key: serial bytes with variable length
  9. entity of the value: serial bytes with variable length
  10. padding data: void serial bytes with variable length
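
The record layout above can be pictured as a fixed part of seven `int' fields followed by three areas of variable length. The following C sketch computes where the value begins and how long a whole record is. The structure and function names are hypothetical, and the sketch assumes that the fields are stored one after another with no gaps between them.

```c
#include <stddef.h>

/* The seven `int' fields at the beginning of a record, in the listed order.
   The field names are hypothetical. */
typedef struct {
  int flags;   /* flag (for deleting) */
  int hash2;   /* second hash value */
  int ksiz;    /* size of the key */
  int vsiz;    /* size of the value */
  int psiz;    /* size of the padding */
  int left;    /* offset of the left child */
  int right;   /* offset of the right child */
} RecordHead;

/* Offset of the key from the beginning of the record. */
size_t record_key_offset(void){
  return 7 * sizeof(int);
}

/* Offset of the value from the beginning of the record. */
size_t record_value_offset(const RecordHead *head){
  return record_key_offset() + (size_t)head->ksiz;
}

/* Whole length of the record: fixed part, key, value and padding. */
size_t record_total_size(const RecordHead *head){
  return record_key_offset() + (size_t)head->ksiz
    + (size_t)head->vsiz + (size_t)head->psiz;
}
```

If records are stored back to back, which is an assumption here, adding `record_total_size' to the offset of one record gives the offset of the next.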

Because the database file is not sparse, operations such as move, copy, unlink and ftp can be applied to the file. Because Depot reads and writes data without normalizing the byte order, it is impossible to share the same file between environments with different byte orders.

For the command `file' to recognize database files, append the following expressions to the `magic' file.
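
One way an application might guard against a file written with another byte order is to compare the file size field of the header with the actual size of the file. The following C sketch shows the idea. It is a heuristic and not a feature of QDBM: the function names are hypothetical, and it assumes that the file size field of a cleanly closed database equals the real file size.

```c
#include <string.h>

/* Return `v' with the order of its bytes reversed. */
int swap_int(int v){
  unsigned char b[sizeof(int)], t;
  size_t i;
  memcpy(b, &v, sizeof(int));
  for(i = 0; i < sizeof(int) / 2; i++){
    t = b[i];
    b[i] = b[sizeof(int) - 1 - i];
    b[sizeof(int) - 1 - i] = t;
  }
  memcpy(&v, b, sizeof(int));
  return v;
}

/* Compare the file size field at offset 16 of `header' with `actual_size'.
   The return value is 1 if they match in the byte order of this machine,
   -1 if they match only after reversing the bytes, or 0 otherwise. */
int check_byte_order(const char *header, int actual_size){
  int fsiz;
  memcpy(&fsiz, header + 16, sizeof(int));
  if(fsiz == actual_size) return 1;
  if(swap_int(fsiz) == actual_size) return -1;
  return 0;
}
```

A result of -1 suggests that the file was created in an environment with the opposite byte order and should not be opened with Depot on this machine.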

0       string          [depot]\n\f     database file of QDBM
>16     long            x               \b, filesize:%d
>24     long            x               \b, buckets:%d
>32     long            x               \b, records:%d
0       string          [depot]\0\v     dummy file of QDBM

Bugs

There are no bugs that have been found but not fixed, such as crashes by segmentation fault, unexpected loss of data, memory leaks and so on.

Database files are vulnerable to I/O errors. QDBM has no feature for rollback or backup. Applications are responsible for handling such conditions as a full disk and interruption by signals that kill the process.

If you find any bugs, report them to the author, with the version of QDBM, the operating system and the compiler.


FAQ

Q.: What platform does QDBM work on?
A.: QDBM works on operating systems compatible with POSIX. At least, QDBM is available on Linux 2.4 and SunOS 5.8. Cygwin is also supported. MinGW, however, is not supported.
Q.: Is there good sample code for applications?
A.: Read `dptsv.c', `dptest.c' and `dpmgr.c' in that order. The package of QDBM contains them.
Q.: What are the performance figures?
A.: The author measured the real time for storing and retrieving records with the commands `dptest' and `time'. The number of elements of the bucket array was twice the number of records. On a machine with a 2.53GHz Pentium 4 and 1GB of 333MHz RAM running Linux 2.4, it took 5.0 seconds to store 1,000,000 records and 4.5 seconds to retrieve all of them, and it took 50.9 seconds to store 10,000,000 records and 44.9 seconds to retrieve all of them. On a machine with a 500MHz Pentium 3 and 192MB of 133MHz RAM running Linux 2.4, it took 12.0 seconds to store 1,000,000 records and 11.1 seconds to retrieve all of them, and it took 495.0 seconds to store 10,000,000 records and 414.8 seconds to retrieve all of them.
Q.: Why don't you use B-tree?
A.: QDBM does not support access to records in sequence. Although B-tree excels at sequential access, hash is superior in the performance of retrieving and updating. A binary search tree is used to manage the separate chains because each node of a binary search tree is smaller than a node of a B-tree. This leaves room for more elements in the bucket array.
Q.: How should I use alignment?
A.: Use alignment if your application repeatedly writes in overwrite or concatenate mode. Alignment restrains the rapid growth of the size of the database file. Because the best size of alignment differs for each application, you should find it by experiment. For the time being, about 32 is suitable.
Q.: What does `QDBM' mean?
A.: `QDBM' stands for `Quick DataBase Manager'. It means that processing speed is high, and that you can write the applications quickly.

Copying

QDBM is written by Mikio Hirabayashi and distributed as free software. You can redistribute it and/or modify it under the terms of the GNU General Public License Version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You can contact the author by e-mail to <mikio1234@lycos.jp>. Any suggestion or bug report is welcome to the author.