B Tree Library History
The B Tree routines in this library were originally written in Fortran 77 during the late 1980’s for use in a gazetteer.
The routines were ported to C over a number years (as time permitted), but still very much organised as the original Fortran. This version was utilised in the anag and dict programs, for anagram solving.
The source was re-organised in early 2001 to adopt more closely a C organisation style (note the source code itself still has a Fortran flavour). This was known as "bt_new".
The original B Tree implementation was designed solely as an index handler. Data was expected to be managed by the client application. In addition, the routines were only able to manage exclusive access to the index file - shared access would result in corrupt index files. Lastly, only one index file could be open at a time - effectively preventing any application-mediated copying capability.
"bt_shared" was derived to resolve these three issues: to provide a combined index and data file (yes, something like CP-V/CP-6 keyed files), to allow shared access to a B Tree file by concurrent processes, and to permit a single process to open multiple index files concurrently.
These capabilities were provided in a 32 bit implementation, meaning the largest file that could be supported was 2GiB. Version 3.0 was developed to add support for larger files, i.e those requiring 64 bit addressing. This is dependent on kernel and gcc Large File Support.
Introduction
The B Tree (BT) library offers a set of C language functions which implement a generalised index file capability, based on the B tree indexing scheme. The B tree was originally described by Bayer and McCraight. A B tree is a multiway balanced tree: i.e. there is more than one key per node, and all leaf nodes are the same distance from the root node.
The BT functions implement a 'classical' B tree, not one of the later variants (B* or B+ tree).
The B Tree is stored in a UNIX disk file. The file can contain many B Trees, each of which is referred to by a name assigned by the user (or application). The system allows many such files to exist.
In order for BT to function efficiently on different hardware platforms, the important constants relating to disk block size, maximum number of keys per block, etc. are defined as constants, and may be modified prior to compilation.
System Description
Overview
The B Tree is stored in a standard UNIX file. To support efficient processing, the size of a B Tree node should be the same as the hardware’s disk block size. In the following description, the terms node and block are interchangeable.
A B Tree has a root, which acts as the starting point for all insertions, deletions and searches. A B Tree has a master root stored at file address 0 (zero), which is termed the "superroot". The superroot holds the names and root block addresses of all the B Trees in the file. The superroot also holds the free block list for the file and other administrative information.
Each block contains a number of keys, an associated integer value and pointers to other blocks. The maximum number of keys that can be stored in a block depends on the size of a block and the maximum number of bytes permitted for a key. The integer value associated with a key can be used as desired by the application program. If the record storage facilities of the B Tree are used, it will contain the block address of an associated data record.
Version 3.1 and above allows the definition of duplicate keys, which is enabled on a per-root basis. If not enabled, this version with behave as previous versions, that is, duplicate keys will be rejected.
When a B Tree file is created, the superroot is initialised with two named roots: itself and the default root. These are defined as "super" and "default" respectively. The application may create more roots as required. When a B Tree file is first opened, the default root ($$default) is always selected.
The maximum size of a B Tree file is governed by the implementation. For an implementation with a 32 bit word length, the maximum file size supported is 2GiB. If the B Tree library is built with Large File Support, that limit is removed.
Opening multiple B Tree files
An application program may have more than one B Tree file opened at any
one time. When a B Tree file is created or opened, a B Tree context
pointer is returned to the application. All B Tree functions must be
passed this context pointer to indicate which open B Tree file is to be
operated on. This parameter is identified as btact
in each function
description.
Shared Access
B Tree files may be created/opened in exclusive or shared mode. Application programs that use shared access should be prepared to handle a busy return from a read or update access to the B Tree file.
Large File Support
In order to support large files (i.e. those > 2GiB), a new type has been
introduced into the BT library, BTint, which can be 32 bits (i.e int)
when compiled without Large File Support, or 64 bits (i.e. long long),
when compiled with Large File Support. BTint is a typedef, which will
be declared as appropriate. BT library function arguments which must be
declared as BTint are described in the API, but version 2.x users of
bfndky
, bdbug
, binsky
, bnxtky
or bupdky
should be aware of the
need to change argument declarations from int to BTint.
Duplicate Keys
Version 3.1 added support for duplicate keys. By default, duplicate keys
are not permitted, so this (and later) versions behaves as previous
versions. Duplicate key support is enabled on a per-root basis, using
the btdups
API function.
Finding a duplicate key (via bfndky
) will leave the
index at the first instance of the key. bnxtky
may be
used to walk through the set of duplicate keys. The
bprvky
function has been added to allow reverse key
navigation.
To faciltate the management of duplicate keys, a number of BTree
functions have been modified to operate against the current key, as
selected by the bfndky
, bnxtky
or
bprvky
functions. These are: bupdky
,
bdelky
, btupd
, btdel
and
btrecs
. Passing these functions a key pointer of NULL
will invoke the desired operation against the currently selected key.
See individual function descriptions for further details.
Sample Program
A very simple use of the BTree API is shown below. This program creates a BTree file and inserts one key ("akey") with the value of 99. Error checking has been omitted for clarity.
#include "btree.h" int main(int argc, char *argv[]) { BTA *btfile; btinit(); btfile = btcrt("test_db",0,FALSE); binsky(btfile,"akey",99); btcls(btfile); return 0; }
If the program source resides in the bt directory, the command to compile and link the program will be (assuming no Large File Support):
gcc -o simple simple.c -Iinc -Llib -lbt
A couple of additional sample programs, using the BT Library API, can be
found the the samples
sub-directory. A Makefile
is provided to build
the sample programs. The makefile assumes it will be run in the
samples
directory.
Function Descriptions
Introduction
This chapter describes each of the functions offered by the BT API. Rather than present the functions in alphabetic order (as any sensible document would), the functions are described in order of probable usage by an application program. To make it even more difficult to use as a reference manual, the functions are titled by their functionality, not their names.
Initialising the B Tree library
#include "btree.h"
int btinit(void);
The btinit
function initialises the B Tree library. It must be invoked
before any other B Tree routine. Failure to do so will result in strange
errors.
Calling btinit
more than once in the execution lifetime of the B Tree
library will cause it to return an error (QINERR). btinit also checks
that the block size, in bytes, of the B Tree library is a power of two.
An error return will result for non-conformant block sizes. Successful
initialisation is indicated by a return value of zero.
Creating a B Tree File
#include "btree.h"
BTA* btcrt(char* fid, int nkeys, int shared);
The btcrt
function will create and initialise a new B Tree file. The
parameter fid
must be set to the name of the file to create. The
nkeys
defines the maximum number of keys that can be stored in the B
Tree. This parameter should always be set to 0 for those operating
systems, such as UNIX, that support dynamic file growth. The parameter
shared
should be set to 0 to disallow shared access to the newly
created B Tree, or non-zero to allow shared access.
btcrt
will return a pointer to the BT activation context for the newly
opened file (BTA*), or NULL in the case of an error. To determine the
cause of an error, invoke the btcerr
function.
If the B Tree index file has been successfully created, the default root is selected, the file becomes the in-use B Tree file and is ready for further operations.
WARNING: The btcrt
function will unconditionally create a new
file, even if a file of the same name already exists.
Opening a B Tree File
#include "btree.h"
BTA* btopn(char* fid, int mode, int shared);
The btopn
function will open an existing B Tree file. The parameter
fid
must be set to the name of the file to open. The mode
parameter
determines if the B Tree file can be updated. A value of zero indicates
that updates are allowed, a non-zero value will prohibit updates. The
parameter shared
should be set to zero to disallow shared access to
the B Tree file, or non-zero to allow shared access.
btopn
will return a pointer to the BT activation context for the newly
opened file (BTA*), or NULL in the case of an error. To determine the
cause of an error, invoke the btcerr
function.
If the B Tree index file has been successfully opened, the default root is selected, and the file is ready for further operations.
Closing a B Tree File
#include "btree.h"
int btcls(BTA* btact);
The btcls
function will close the file associated with the btact
context pointer.
A non-zero return code indicates an error occurred in closing the file.
Set/unset support for duplicate keys
#include "btree.h"
int btdups(BTA* btact, int dups);
The btdups
controls support for duplicate keys in the current root of
the index file associated with the btact
context pointer. Setting the
value of the dups
to non-zero (TRUE) will enable support for duplicate
keys in the current root. A value of zero (FALSE) will disable duplicate
key support for the current root. Enabling duplicate key support on the
superroot is not permitted.
Disabling duplicate key support on a root that previously permitted them merely prevents further duplicate keys from being inserted into the root BTree index. Existing duplicates will remain and must be managed by the application.
A non-zero return code indicates an error occurred.
Set write through threshold for index file blocks
#include "btree.h"
int btthresh(BTA* btact, int threshold);
The btthresh
function sets the write threshold for the btree index
file associated with the btact
context pointer. The threshold
defines the number of updates on a block that will cause it to be
written to disk. A value of zero (the default for a btree index) means
that a block is not written to disk until the memory it occupies is
required for a new block.
btthresh
offers finer-grained control over disk writes than in
previous versions of Btree, which was either only when necessary (in
exclusive mode), or after every API call (in shared mode). The intention
is to allow the application program to reduce the chance of lost data in
a btree index should a hardware or software falure interrupt the running
program before the indexes are closed and dirty blocks flushed to disk.
NB: If threshold
is set to a small value, it may reduce performance of
the BTree application.
A non-zero return code indicates an error occurred.
Inserting a key
#include "btree.h"
int binsky(BTA* btact, char* key, BTint value);
The binsky
function inserts a new key and associated integer value
into the current root of the file associated with the btact
context
pointer. The key, a character string, is passed in key
, while value
holds the associated integer value. value
is declared as a BTint,
which is normally a typedef for int, but with Large File Support will
be a typedef for long long.
If the key has been inserted successfully, binsky
returns zero,
otherwise an error code is returned.
Keys longer than the maximum key length (BT constant ZKYLEN) will be silently truncated to the maximum key length.
A non-zero return from binsky
indicates an error occurred during the
key insertion process.
Finding a key
#include "btree.h"
int bfndky(BTA* btact, char* key, BTint* value);
The bfndky
function searches for a key in the current root of the file
associated with the btact
context pointer. The key, a character
string, is passed as a pointer in key
. If the key is found, the
associated value will be returned in the integer location identified by
value
. value
is declared as a BTint, which is normally a typedef for
int, but with Large File Support will be a typedef for long long.
If the key is found, bfndky
returns zero. If the key is not found,
bfndky
will return an error code of QNOKEY
.
Whether or not the key is located, the B Tree context is left at the
next highest key within the B Tree file. A call to
bnxtky
will return this key. The function
bprvky
may be called to return the previous key.
If the current root supports duplicate keys (enabled by a call to
btdups
, and the target of the bfndky
function has
duplicates, the context of the B Tree index is positioned at the start
of the duplicate key set.
A non-zero return from bfndky
indicates an error occurred during the
key location process.
Finding a sequence of keys
#include "btree.h"
int bnxtky(BTA* btact, char* key, BTint* value);
The bnxtky
function returns the next key from the current root in the
file associated with the btact
context pointer. The key, a character
string, is returned via the pointer in key
. The value associated with
the key will be returned in the integer location identified by value
.
value
is declared as a BTint, which is normally a typedef for int,
but with Large File Support will be a typedef for long long.
bnxtky
returns zero to indicate the next key has been located. If no
next key exists, bnxtky
returns the error code QNOKEY
.
To initialise the B Tree position, a call to bfndky
or
btpos
must be made before the first call to bnxtky
.
Thereafter, repeated calls to bnxtky
may be made. Calls to
bprvky
may be freely intermingled with calls to
bnxtky
.
A non-zero return from bnxtky
indicates an error occurred during the
key location process.
Finding a reverse sequence of keys
#include "btree.h"
int bprvky(BTA* btact, char* key, BTint* value);
The bprvky
function returns the previous key from the current root in
the file associated with the btact
context pointer. The key, a
character string, is returned via the pointer in key
. The value
associated with the key will be returned in the integer location
identified by value
. value
is declared as a BTint, which is normally
a typedef for int, but with Large File Support will be a typedef for
long long.
bprvky
returns zero to indicate the previous key has been located. If
no previous key exists, bnxtky
returns the error code QNOKEY
.
To initialise the B Tree position, a call to bfndky
or
btpos
must be made before the first call to bprvky
.
Thereafter, repeated calls to bprvky
may be made. Calls to
bnxtky
may be freely intermingled with calls to
bprvky
.
A non-zero return from bprvky
indicates an error occurred during the
key location process.
Setting the position within a B Tree index
#include "btree.h"
int btpos(BTA* btact, int pos);
The btpos
function sets the position in the current root in the file
associated with the btact
context pointer. The desired position is
indicated by the pos
; a value of 1 positions before the first key in
the index, a value of 2 will position after the last key in the index.
These values correspond to the B Tree constants ZSTART and ZEND,
respectively.
Following a call to btpos
, calls to bnxtky
and
bprvky
may be made to return successive or previous
keys.
btpos
returns zero to indicate success, otherwise the error code if an
error was encountered.
Deleting a key
#include "btree.h"
int bdelky(BTA* btact, char* key);
The bdelky
function deletes a key from the current root in the file
associated with the btact
context pointer. The key, a character
string, is passed via the pointer in key
. If the key does not exist,
bdelky
returns the error code QNOKEY
. bdelky
returns zero on
successful deletion of a key.
If bdelky
is called with a key
value of NULL, the delete operation
will act against the current key, as selected by bfndky
, bnxtky
or
bprvky
operations. This capability is designed to allow deletion of a
duplicate key, presumably based on other, application managed,
attributes.
A non-zero return from bdelky
indicates an error occurred during the
key deletion process.
Updating the value of a key
#include "btree.h"
int bupdky(BTA* btact, char* key, BTint value);
The bupdky
function updates the value of an existing key in the
current root of the file associated with the btact
context pointer.
The key, a character string, is passed via the pointer in key
. The new
value is passed via value
. value
is declared as a BTint, which is
normally a typedef for int, but with Large File Support will be a
typedef for long long.
If the key does not exist, bupdky
returns the error code QNOKEY
.
If bupdky
is called with a key
value of NULL, the update operation
will act against the current key, as selected by bfndky
, bnxtky
or
bprvky
operations. This capability is designed to allow update of a
duplicate key, presumably based on other, application managed,
attributes.
bupdky
returns zero to indicate a successful update, error code
otherwise.
Creating a root
#include "btree.h"
int btcrtr(BTA* btact, char* root);
The btcrtr
function creates a new root within the file associated with
the btact
context pointer. The root name, a character string, is
passed via the pointer in root
. If the new root is created
successfully, btcrtr
returns zero.
On successful creation of a new root, on return from btcrtr
, the new
root will have been made current. If the root could not be created, the
current root is unchanged.
A non-zero return from btcrtr
indicates an error occurred during the
root creation process.
Changing the current root
#include "btree.h"
int btchgr(BTA* btact, char* root);
The btchgr
function changes the current root within the file
associated with the btact
context pointer. The target root name, a
character string, is passed via the pointer in root
. If the switch to
the target root is successful, btchgr
returns zero.
On successful change to the target root, on return from btchgr
, the
target root will have been made current. If the root could not be
switched, the current root is unchanged.
A non-zero return from btchgr
indicates an error occurred during the
root change process.
Deleting a root
#include "btree.h"
int btdelr(BTA* btact, char* root);
The btdelr
function deletes the named root within the file associated
with the btact
context pointer. The target root name for deletion, a
character string, is passed via the pointer in root
. If the deletion
of the target root is successful, btdelr
returns zero.
All blocks owned by the target root are deleted, and returned to the free list. Whether or not the target root is deleted, the current root is left unchanged.
A non-zero return from btdelr
indicates an error occurred during the
root delete process. It is considered an error to attempt to delete the
current root.
Gaining exclusive access to a B Tree file
#include "btree.h"
int btlock(BTA* btact);
The btlock
function enables a process to gain exclusive access to a B
Tree file, originally opened in shared mode. btlock
is passed btact
,
which holds the context pointer of the file for which exclusive access
is required.
btlock
will return zero on success, error code otherwise. Applications
should be ready to handle a QBUSY error return, indicating that
exclusive access could not be gained. btlock
waits for ZSLEEP seconds
before giving up the attempt to gain exclusive access. ZSLEEP is an
implementation defined constant. The default is five seconds.
Releasing exclusive access on a B Tree file
#include "btree.h"
int btunlock(BTA* btact);
The btunlock
function enables a process to relinquish exclusive access
to a file, originally gained from a call to btlock
. btunlock
is
passed btact
, which holds the context pointer of the B Tree file for
which exclusive access is no longer required.
If the B Tree file is not locked, or has been opened for exclusive
access, btunlock
has no effect.
A non-zero return from btunlock
indicates an error occurred.
Inserting a key and data
#include "btree.h"
int btins(BTA* btact, char* key, char* data, int dsize);
The btins
function inserts a key and data record into a file
associated with the btact
context pointer. Both key
and data
are
character pointers. Since the data may legitimately contain null (x'00')
characters, the length of the data, in bytes, is passed in dsize
.
dsize
must be zero or greater. If the key and data is successfully
stored in the B Tree file, btins
returns zero.
A non-zero return from btins
indicates an error occurred.
Updating data for an existing key
#include "btree.h"
int btupd(BTA* btact, char* key, char* data, int dsize);
The btupd
function updates the data record of an existing key in the
file associated with the btact
context pointer. Both key
and data
are character pointers. Since the data may legitimately contain null
(x'00) characters, the length of the data, in bytes, must be passed in
dsize
. If the replacement data is successfully stored in the B Tree
file, btupd
returns zero.
If btupd
is called with a key
value of NULL, the update operation
will act against the current key, as selected by btsel
, btseln
or
btselp
operations. This capability is designed to allow update of a
duplicate key, presumably based on other, application managed,
attributes.
A non-zero return from btupd
indicates an error occurred.
Locating data for an existing key
#include "btree.h"
int btsel(BTA* btact, char* key, char* data, int dsize, int* rsize);
The btsel
function locates and returns the data record of an existing
key in the file associated with the btact
context pointer. Both key
and data
are character pointers. The dsize
parameter must contain
the maximum number of bytes to be returned. The caller should ensure
that the data
pointer refers to an area of memory of at least dsize
bytes. The actual number of bytes returned is returned in rsize
. Even
if the data record contains more than dsize
bytes, only dsize
bytes
will be returned. If the data record is successfully retrieved (even
partially), btsel
returns zero.
An application program can determine the number of bytes occupied by a
data record through the btrecs
function.
A non-zero return from btsel
indicates an error occurred.
Deleting a key and associated data
#include "btree.h"
int btdel(BTA* btact, char* key);
The btdel
function deletes a key and data record in the file
associated with the btact
context pointer. key
is a character
pointer, identifying the key to delete. If deletion of the key and data
is successful, btdel
returns zero.
If btdel
is called with a key
value of NULL, the delete operation
will act against the current key, as selected by btsel
, btseln
or
btselp
operations. This capability is designed to allow deletion of a
duplicate key, presumably based on other, application managed,
attributes.
A non-zero return from btdel
indicates an error occurred.
Locating data for the next key in sequence
#include "btree.h"
int btseln(BTA* btact, char* key, char* data, int dsize, int* rsize);
The btseln
function locates and returns the next key and data record
in the file associated with the btact
context pointer. Before using
btseln
, a call to btsel
or btpos
must be
made to initialise the position within the B Tree. Calls to
btselp
may be freely intermingled with calls to
btseln
.
Both key
and data
are character pointers. The dsize
parameter must
contain the maximum number of bytes to be returned. The caller should
ensure that the data
pointer refers to an area of memory of at least
dsize
bytes. The actual number of bytes returned is returned in
rsize
. Even if the data record contains more than dsize
bytes, only
dsize
bytes will be returned. If the data record is successfully
retrieved (even partially), btseln
returns zero.
If no next key exists, btseln
will return the error code QNOKEY
.
An application program can determine the number of bytes occupied by a
data record through the btrecs
function.
A non-zero return from btseln
indicates an error occurred.
Locating data for the previous key in sequence
#include "btree.h"
int btselp(BTA* btact, char* key, char* data, int dsize, int* rsize);
The btselp
function locates and returns the previous key and data
record in the file associated with the btact
context pointer. Before
using btselp
, a call to btsel
or btpos
must be made to initialise the position within the B Tree. Calls to
btseln
may be freely intermingled with calls to
btselp
.
Both key
and data
are character pointers. The dsize
parameter must
contain the maximum number of bytes to be returned. The caller should
ensure that the data
pointer refers to an area of memory of at least
dsize
bytes. The actual number of bytes returned is returned in
rsize
. Even if the data record contains more than dsize
bytes, only
dsize
bytes will be returned. If the data record is successfully
retrieved (even partially), btselp
returns zero.
If no previous key exists, btselp
will return the error code QNOKEY
.
An application program can determine the number of bytes occupied by a
data record through the btrecs
function.
A non-zero return from btseln
indicates an error occurred.
Determine size of data record for specific key
#include "btree.h"
int btrecs(BTA* btact, char* key, int* rsize);
The btrecs
function returns the number of bytes occupied by the data
record of a key in the file associated with the btact
context pointer.
The key
parameter is a character pointer, identifying the key to
query. The number of bytes occupied by the data record is returned in
rsize
. If the key is located and the data size of the record returned
successfully, btrecs
returns zero.
If btrecs
is called with a key
value of NULL, the size operation
will act against the current key, as selected by btsel
, btseln
or
btselp
operations. This capability is designed to allow the
determination of the size of the data record of a duplicate key,
presumably based on other, application managed, attributes.
If btrecs
is invoked for a key without an associated data record,
the results are undefined.
A non-zero return from btrecs
indicates an error occurred.
#include "btree.h"
int bdbug(BTA* btact, char* opt, BTint blk);
The bdbug
function provides a debug capability for the B Tree package.
The following options can be passed via the opt
parameter:
control |
- |
displays the in-memory block information, together with the last key found details |
super |
- |
displays superroot information i.e. block usage, free list etc. |
stack |
- |
displays the tree stack (i.e. key context) |
space |
- |
displays occupancy statistics |
stats |
- |
displays B Tree operating statistics |
block |
- |
displays the contents of the block identified by |
structure |
- |
Performs a structure check of the currently active BTree
file. If |
A non-zero return from bdbug
indicates an error occurred during the
display of debugging information.
Retrieving error message text
#include "btree.h"
void btcerr(int* ierr, int* ioerr, char* srname, char* msg);
The btcerr
function returns the error code (in ierr
) and (if
relevant) the I/O error code (in ioerr
) of the last error encountered
by the B Tree system. In addition, it will return the name of the
function which detected the error (in srname
) and an error message in
msg
.
The maximum number of chars returned in srname
is BT constant
ZRNAMESZ. The maximum number of chars returned in msg
is BT constant
ZMSGSZ. Both char arrays will be zero-padded to ZRNAMESZ and
ZMSGSZ respectively. Declaring these arrays to be smaller than the BT
constants will ensure btcerr
acts as a very effective stack smasher.
Error Messages
This section lists the errors that may be encountered when using the B Tree system. The occurrence of most of these errors indicates a serious failure in the B Tree system, with the following exceptions:
- QNOKEY
-
The key given as a parameter to
bfndky
(or its brethren) does not exist. - QDUP
-
The key given as a parameter to
binsky
(or its brethren) already exists in the index. Duplicate keys are not permitted. - QBUSY
-
File busy, a normal hazard when using shared access mode in a multiuser environment.
- QNOWRT
-
The B Tree file was originally opened with read-only permission, and a write has subsequently been attempted. Probably an application program error.
- QNOBTF
-
Attempt to perform operation on B Tree file, but there is no file attached to the context pointer provided, likely an application error.
- QINERR
-
Attempt made to open the same file again, likely an application error.
- QDELCR
-
An attempt has been made to delete the current root, or worse, the super root. This is forbidden by the BT library.
- QBADOP
-
Unknown debug option passed to bdbug, likely an application error.
- QNOACT
-
Maximum number of concurrently open B Tree files reached - may be an application error.
- QBADAP
-
Illegal context pointer passed to a B Tree function - may be an application error.
- QDNEG
-
A negative length data record has been passed to a B Tree function.
- QBADVR
-
The B Tree index file was created using an older version of the B Tree library, and cannot be accessed safely with this version. Extract data using a program based on the previous version of the B Tree library, and import into a index file created with the new. Alternatively, it may be possible to use the
btr
recovery tool to migrate an older BTree index file to the latest version. - QDAOVR
-
A new data record cannot be entered as the maximum value of a data block pointer has been exceeded.
- QF2BIG
-
The index file has reached its maximum size for this implementation.
- QBADAL
-
Unable to set alarm for for file lock handling. This may be a problem with the underlying OS.
- QBADCTXT
-
Index context invalid for current key operation. An attempt was made to delete or update the current key, but the context is not valid. A valid context is set by bfndky, bnxtky, bprvky, btsel, btseln, or btselp.
- QNODUPS
-
Duplicates are not permitted in the superroot. Attempting to permit duplicate keys in the superroot, via btdups(..,TRUE), is prohibited.
- QNOT64BIT
-
Index file was created with a non-64 bit version (LFS=0) of the library. However, access is being attempted with a 64 bit version.
- Q64BIT
-
Index file was created with a 64 bit version (LFS=1) of the library. However, access is being attempted with a non-64 bit version.
1 |
QBLKNR |
Block %s is not a root block |
2 |
QCLSIO |
Unable to close index file: "file" |
3 |
QCRTIO |
Unable to create index file: "file" |
4 |
QCPBLK |
Unable to read source or destination block |
5 |
QWRBLK |
I/O error writing block |
6 |
QRDSUP |
I/O error reading super root |
7 |
QWRSUP |
I/O error writing super root |
8 |
QOPNIO |
I/O error opening index file: "file" |
9 |
QRDBLK |
I/O error reading block |
10 |
QIXOPN |
An index file is already open |
11 |
QSPLIT |
Can’t split full block |
12 |
QINFER |
Bad info block index used |
13 |
QNOMEM |
Unable to acquire a free memory block |
14 |
QSTKUF |
Stack underflow |
15 |
QSTKOF |
Stack overflow |
16 |
QBLKFL |
Can’t insert key at block: %s |
17 |
QLOCTB |
Replace location out of range |
18 |
QSPKEY |
Split: search for middle key failed |
19 |
QWRMEM |
Requested write block not in memory |
20 |
QBALSE |
Balance: search for key failed |
21 |
QDELEX |
Exact flag not set for delete |
22 |
QDELER |
Internal inconsistency in delete operation |
23 |
QDELRP |
Search for deleted key replacement failed (in block %s) |
24 |
QDEMSE |
Demote search failed |
25 |
QDEMSP |
Demote split failed |
26 |
QJNSE |
Join search failed |
27 |
QNODEF |
Cannot locate default root ($$default) |
28 |
QDELCR |
Deletion of the current or super root is forbidden |
29 |
QBADIX |
Negative in-memory index encountered |
30 |
QNOBTF |
No index file open for this operation |
31 |
QINERR |
Index file already in use |
32 |
QBADOP |
Debug option not recognised |
33 |
QNOACT |
No more index files may be opened (limit reached) |
34 |
QBADAP |
Invalid index file context pointer |
35 |
QBUSY |
File is busy |
37 |
QNOBLK |
No block available for data storage |
38 |
QNEGSZ |
Data block usage gone bad: %s |
39 |
QNOTDA |
Data segment header references a non-data block: %s |
40 |
QBADCTXT |
Index context invalid for current key operation |
41 |
QDLOOP |
Circular data segment pointer encountered |
42 |
QUNLCK |
Unlock operation failed |
43 |
QLRUER |
LRU queue corrupt - index not in list |
44 |
QDAERR |
Unable to insert data record |
45 |
QDNEG |
Data record cannot be negative |
46 |
QDUP |
Key "key" already exists in index |
47 |
QNOKEY |
Key "key" does not exist in index |
48 |
QNOWRT |
Write access to index prohibited |
49 |
QNOTFR |
Block on free list is not marked as free |
50 |
QBADVR |
Index file is incompatible with current version: "version" |
51 |
QDAOVR |
Data capacity exceeded at block: "block" |
52 |
QF2BIG |
Index file is at maximum size |
53 |
QBADAL |
Unable to set alarm for locking |
54 |
QDRANEG |
Data record address is negative: "address" |
55 |
QBLKSZERR |
Defined block size is not a power of two: "size" |
56 |
QNODUPS |
Duplicates keys are not allowed for the superroot |
57 |
QPOSERR |
Location search exceeds key count at block: %s |
58 |
QNOT64BIT |
Index file likely not LFS (64bit) enabled; doesn’t match library. |
59 |
Q64BIT |
Index file likely LFS (64bit) enabled; doesn’t match library. |
60 |
QNOTDUP |
Duplicate key address does not reference a duplicate block: %s. |
61 |
QDUPSZ |
Duplicate key entry has wrong size. |
62 |
QBADIR |
Bad direction parameter. |
B Tree Test Harness
The B Tree library is distributed with a test harness, bt
, which
exercises all of the functions supplied by the B Tree library.
Most bt
commands correspond directly to a matching B Tree library
function call. Additional commands are available to automate testing
scripts and manage concurrently open files. bt
reads from stdin and
writes normal output to stdout. Terminal error messages go to stderr. A
prompt of bt: is issued prior to reading from stdin. Long running
commands may be interrupted using cntrl-c, which will return to the
command prompt.
bt
is built with GNU readline support, if readline libraries and
include files are detected when building the BT library and supporting
tools. Readline enables bt
to offer command editing, command history
and file completion. More information on the capabilities provided by
readline can be found in the
full GNU documentation
.
A typical bt
session might look like:
$ bt bt: c test bt: d newkey 55 bt: f newkey Key: 'newkey' = 55 bt: dd datakey some_text_string bt: fd datakey Data returned: 'some_text_string' bt: fd datakey d some_text_string bt: b abuf 512 bt: dd bufkey *abuf bt: fd bufkey Data record: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaa' bt: sd bufkey Key 'bufkey' record size: 512 bytes bt: q $
bt
commands have a both a full and abbreviated versions. The
descriptions below show the full command first, followed by the
abbreviated version (comma separated). A command may optionally be
followed by an argument and a qualifier. The following table lists the
commands supported by bt
:
buffer,b
<bufname> <size> <filename>-
Buffer: Creates a data buffer called bufname. If the numeric size argument is given, the buffer is created with that number of bytes. The buffer is filled with the first character of the bufname. If the size argument is non-numeric, it is assumed to be a file name, and the contents of the file are read into the buffer. The data buffer can subsequently be specified as data for a
define-data
command. buffer-delete,bd
<bufname>-
Buffer Delete: Deletes an existing data buffer identified by bufname.
buffer-list,bl
-
Buffer List: Lists the names of the currently defined data buffers on stdout.
check-order,co
s c-
Check Order: Checks the lexicographic order of keys in the current root, starting from the current position within the BTree. If the s argument is given, the check is performed from the first key of the BTree index.
If a disordered index is discovered, the keys at fault are displayed.
Otherwise, check-order
is silent, unless the c argument is specified,
which causes the number of keys checked to be displayed.
create,c
<filename> s-
Create file: Creates a new B Tree file. If a file of the same name already exists, it will be silently overwritten. If the s qualifier is given, the file will be created in shared mode. The newly created B Tree file becomes the current file; use the
file-list
command to view the list of open files. change-root,cr
<rootname>-
Change Root: Switches the current root to the root named rootname in the in-use B Tree file. If switch is successful, rootname becomes the current root. All subsequent key and/or data operations will take place against rootname.
close,x
-
Close: Closes the in-use B Tree file. The next available open file, if one exists, is automatically made the in-use B Tree file. If there are no candidate B Tree files, a warning message is issued.
define,d
<key> <value>-
Define key: Defines a new key in the current root of the in-use B Tree index file. The new key name is defined by key, and is assigned value. If value is omitted, zero is assumed.
data-address,da
<key> i-
Data Address: Prints, in a decoded form, the data segment address associated with key. If the i qualifier is given, the key is interpreted as data segment address in integer form and decoded immediately.
define-data,dd
<key> <string> <*bufname>-
Define key with Data: Defines a new key with an associated data record in the current root of the in-use B Tree index file. key defines the key name. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *.
define-root,dr
<rootname>-
Define Root: Creates a new B Tree index root, named rootname in the currently selected B Tree file. If creation is successful, the current root becomes the new root. All subsequent key and/or data operations will take place against the new root.
duplicates,dups
on off-
Duplicates: Sets or unsets support for duplicate keys in the current root. When on is specified, duplicate keys are permitted. When off, duplicate keys are not permitted .
echo,ec
on off-
a Echo: When
echo
is on, commands read from anexecute
file are echoed to stdout. If off, no echo is performed.
If no argument is given to echo
, the current status of the echo
setting is displayed.
error,er
on off-
a Error: When
error
is on, an execution error while reading commands from anexecute
file will cause termination of the execute file. If off, command execution will continue when errors are encountered.
If no argument is given to error
, the current status of the error
setting is displayed.
execute,e
<filename>-
a Execute: Causes commands to be read and executed from the file denoted by filename.
execute
commands can be nested, currently up to five deep. No command prompts will be issued while reading commands from a file.
See also the echo
and error
command
descriptions for more information on execution control when reading
commands from a file.
find,f
<key>-
a Find: Attempts to locate key in the current root of the in-use B Tree index file. If found, the value associated with the key is printed.
If key is omitted, the index is positioned prior to the first key
(like position
s).
find-data,fd
<key> d-
Find Data: Attempts to locate key in the current root of the in-use B Tree index file. If found, the first 80 bytes of the data record associated with the key is displayed. If the d qualifier is given, the whole of the data record is displayed. Note that the data record is displayed as character data; control characters or escape sequences in the data record could cause strangenesses on display.
file-list,fl
-
File List: Lists the set of open B Tree index files. To change the current file, issue a
use
command. list,l
c-
a List: Displays all key and associated value, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command.
If the c argument is given, the count of keys listed will be displayed in addition.
list-data,ld
-
List Data: Displays all keys, and associated data records, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
previous
orprevious-data
command. list-data-prev,ldp
-
List Data Previous: Displays all keys, and associated data records, prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
previous
orprevious-data
command. list-prev,lp
c-
a List Previous: Displays all keys, and associated value, prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command.
If the c argument is given, the count of keys listed will be displayed in addition.
list-keys-only,lko
-
List Keys Only: Displays all keys, but not associated value, starting from the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. lock,lk
-
Lock: Acquires exclusive access to the in-use B Tree file which was originally opened in shared mode. If the file was opened in exclusive mode (the default),
lock
will have no effect. next,n
-
Next: Displays the key following the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. next-data,nd
-
Next Data: Displays the key and associated data record following the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. open,o
<filename> s-
Open: Opens the existing B Tree file identified by filename. If the optional s qualifier is given, the file will be opened in shared mode. More than one B Tree index file may be open currently; the newly opened file is made the in-use B Tree file. The in-use file may be changed by the
use
command, while the list of open files is displayed through thefile-list
command. open-readonly,or
<filename> s-
Open Readonly: Opens the existing B Tree file identified by filename in read-only mode. If the optional s qualifier is given, the file will be opened in shared mode. More than one B Tree index file may be open currently; the newly opened file is made the in-use B Tree file. The in-use file may be changed by the
use
command, while the list of open files is displayed through thefile-list
command. position,pos
s e-
Position: Sets the position in the current root. s will cause the position to be set prior to the first key in the index, e will cause the position to be set after the last key.
prompt,p
-
Prompt: Toggles the display of a command prompt when
bt
is ready for the next user command. previous,prv
-
Previous: Displays the key prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. previous-data,pd
-
Previous Data: Displays the key and associated data record prior to the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. quit,q
-
Quit: Terminates
bt
. Any open B Tree files will be closed automatically. remove,r
<key>-
Remove key: Removes a previously defined key in the current root of the in-use B Tree index file. The key name is specified by key.
remove-cur,rc
-
Remove Current key: Removes the current key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. remove-data,rd
<key>-
Removes key with Data: Removes a previously defined key and its associated data record in the current root of the in-use B Tree index file. key defines the key name.
remove-data-cur,rdc
-
Removes Current Data: Removes the current key and its associated data record in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. remove-root,rr
<rootname>-
Remove Root: Removes an existing B Tree index root, named rootname in the in-use B Tree file. If removal is successful, all blocks used by the root (both index and data) will be returned to the free list. It is not permitted to remove the current root.
show,s
control super stats space stack block <n> structure v-
a Show: Displays B Tree debug information. The option specified is passed directly to the
bdbug
function. See thebdebug
description for the information provided by each option.
For the structure option, specifying v will cause a detailed report on the structure to be displayed. Otherwise, only a summary is displayed.
size-data,sd
<key>-
a Size Data: Displays the number of bytes occupied by the data record associated with key. If key is omitted, the size of the data record associated with the current key is displayed. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command.
If key has no associated data record, results are undefined.
system,!
<command>-
System: Passes the text following the system command to the shell for execution.
use,u
<filename>-
Use: Changes the in-use B Tree file to filename. The file must have already been opened or created, using the
open
orcreate
command. update-data,ud
<key> <string> <*bufname>-
Update Data: Updates an existing key with a new associated data record in the current root of the in-use B Tree index file. key defines the key name. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *.
update-data-cur,udc
<string> <*bufname>-
Update Data Current: Updates the current key with a new associated data record in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. Data can be provided in one of two ways: either a plain string or the name of a previously defined buffer can be specified. If the latter, it should be indicated by a leading *. unlock,ulk
-
Unlock: Unlocks the in-use B Tree file, if it was locked with the
lk
command. If not locked, or the file was originally opened in exclusive mode,unlock
has no effect. update-value,uv
<key> <value>-
Update Value: Modifies the value associated with key in the current root of the in-use B Tree index file.
update-value,uv
<value>-
Update Value Current: Modifies the value associated with current key key in the current root of the in-use B Tree index file. The current key is set by the last
find
,find-data
,next
,next-data
,previous
orprevious-data
command. write-threshold,wt
<threshold>-
Write Threshold: Sets the number of index block updates beyond which the block contents must be writtent to disk. A threshold of zero means writes will not take place unless a block must be flushed to disk.
help,?
cmd-
Help: Displays a list of
bt
commands and a terse description of syntax and usage. If cmd is specified, only help on that command will be displayed. comment,#
-
Comment:
bt
will ignore any line starting with a#
. Note thatbt
will also ignore blank lines.
B Tree Recovery
btr
provides a B Tree recovery facility. Recovery of a B Tree index
file may be required if it is left in an inconsistent state due to a
hardware or software failure during update processing.
btr
may also be used to migrate a Btree index file created with an
earlier version of the Btree library to current version. Due to limited
recovery information in earlier versions of the Btree index file, such
migration is really only applicable to single-rooted B Tree index files.
Name
btr - B Tree index recovery
Synopsis
btr [-a | -d | -f | -k | -n cnt | -r | -v | — ] <old_file> <new_file>
Description
btr
will attempt to recover key (and optionally data) information from
the B Tree index file identified by <old_file>. The recovered contents
are written to the B Tree index file identified by <new_file>. btr
recovery is controlled by a number of arguments:
Command Arguments
-k | -d |
Specifies recovery mode; -k for keys, -d for keys and data. Default if omitted is -k. |
-n <cnt> |
Sets maximum number of io errors to ignore before terminating the recovery. Default is 0. |
-v |
Causes |
-a |
If specified, the B Tree index in <new_file> will allow duplicate keys. The default is not to allow duplicates. |
-f |
If specified, the B Tree index represented by <new_file> will be overwritten. Default is to preserve <new_file>, should it exist. |
-r |
Requests |
— |
Causes |
Recovery Processing
btr
will attempt to open the input btree file using the btree library
version of btopn
. If this fails, the btr
version of btopn
is used
instead, which bypasses the consistency checks.
An attempt to read the superroot is made. If successful, the root names and root blocks are recorded. Only the roots present in the superroot are retained. Root names in any child blocks are ignored; the index structure may be damaged and therefore no attempt to traverse it is made.
Each block, starting from 1, is read. If marked as ZROOT, ZINUSE or ZDUP
the keys and values are extracted directly from the in-memory array. If
-k specified, the key and value are written to the <new_btree> index
file. If -d specified, and the value is a valid disk record address, an
attempt is made to read the data record. Data record addresses are
stored in a supporting bt index file (.bt_da.db
), to enable detection
of circular references. If the data record is read OK, the key and data
record is written to the <new_file> btree file. If the data record
cannot be read, only the key is written.
In version 4 (and later) of the BTree index, each ZINUSE block contains
the root block it belongs to. This data allows btr
to partition keys
by their roots. Only those roots recovered from the superroot will be
named as in the <old_file>. Keys from other roots will be copied to new
roots, named after their root block number (e.g. root_19834).
btr
will display summary statistics about the recovery on stdout when
complete.
Exit Status
0 if OK.
Notes
btr
can be used to recover (or indeed migrate) data from earlier (i.e.
pre-version 4) versions of a Btree index file. Since the root block
numbers are not held in the ZINUSE blocks, keys can not be partitioned
by root. Therefore, this facility is only really applicable to
single-rooted B Tree index files.
Customisation
All compile time constants are defined in the header file bc.h
. The
following constants may be altered for different hardware environments.
The values used in the example given are from the original UNIX
implementation.
- ZBPW
-
The number of bytes in a word.
- ZBYTEW
-
The number of bits in a byte.
- ZMXBLK
-
Maximum number of in-memory disk blocks that can be stored. The minimum value for this parameter is 3; there is no maximum. The more in-memory blocks defined, the lower the disk I/O requirements will be.
- ZBLKSZ
-
The number of bytes allocated to a disk block. This value should be set to a multiple of the physical disk block size. This must be defined as a power of two.
- ZKEYSZ
-
The maximum size (in bytes) of a key.
- ZTHRES
-
Threshold for block joining. This value determines the number of free key slots that must exist before two blocks are considered candidates for joining.
- ZMXACT
-
The maximum number of B Tree files that may be open concurrently.
- ZSLEEP
-
The number of seconds to wait for a B Tree file to become unlocked, when in shared mode.
- ZRNAMESZ
-
The maximum number of bytes returned for the name of the function reporting a BT error (via btcerr)/
- ZMSGSZ
-
The maximum number of bytes returned for the error message of the corresponding to a BT library error (via btcerr).
These compile time constants are assigned the following values in the distributed version:
ZBPW = 4 ZBYTEW = 8 ZMXBLK = 3 (16 when LFS=1) ZBLKSZ = 1024 (8192 when LFS=1) ZKEYSZ = 32 ZTHRES = 3 ZMXACT = 5 ZSLEEP = 5 ZRNAMESZ = 16 ZMSGSZ = 123
These values result in the minimum memory usage. If memory is not a constraint, increasing the values for ZMXBLK and ZBLKSZ will make the B Tree implementation much faster, e.g try:
ZMXBLK = 100 ZBLKSZ = 8192
The number of keys that can be stored in a block is determined at compile time, using the following definition:
#define ZMXKEY ((ZBLKSZ-ZBPW-ZINFSZ*ZBPW)/(ZKYLEN+2*ZBPW))
N.B. ZINFSZ is the number of information words that a block must carry as overhead. This value is six in this implementation.
Building and installing the BT Library
The BT library is distributed as a tar file, which contains a set of C
source and header files, a Makefile
and a set of testcases.
First, unpack the tar file into a convenient directory. cd to the
directory containing the source files and issue the command
make clean;make
. This will compile each BT library file, and create
the UNIX static library libbt.a
in the lib
sub-directory. make
will also built the BT test harness bt
, a utility, kcp
, which
performs intelligent copies of BT index files, a BTree index recovery
tool btr
and two additional testing utilities, bigt
and bigtdel
,
for large file handling.
The default BT library build will create a 32-bit version, in which the
maximum size of the index file is 2GiB. A version of BT with Large File
Support can be built with the command make clean;make LFS=1
.
When compiling programs against a LFS version of BT, if you need to manipulate the BTint value associated with a key, ensure you set the compile-time flag _FILE_OFFSET_BITS=64, e.g:
$ gcc -o yak yak.c -Iinc -Llib -lbt -D_FILE_OFFSET_BITS=64
This is not necessary if you are just using the in-built data record
functions (btsel
etc).
In order to test the newly created BT library, you can use the bt
test
harness for ad-hoc testing. Alternatively, the Makefile
provides a
means of automated testing. make test_run
will run a set of testcases,
held in the Testcases
sub-directory. These testcases use bt
scripts
to test key components of the BT library, comparing the results against
known, good, output templates. The output templates distributed with BT
should suffice for most standard Linux and FreeBSD builds.
Revision History
Revision |
Date |
Comment |
5.0.2 |
3rd May, 2024 |
Documentation converted to AsciiDoc and placed in source code tree |
5.0.1 |
1st July, 2020 |
Minor bug fixes. Modifications to accomodate changes in the toolchain since 2012 (GNU Make and C compilers). |
5.0.0 |
26th November, 2012 |
Revised duplicate key handling to remove index navigation restrictions when in shared mode. |
4.0.0 |
24th June, 2011 |
Add support for btree index recovery. |
3.1.2 |
3rd January, 2011 |
Document revised handling of duplicate keys in shared mode. |
3.1.1 |
21st December, 2010 |
Bump rev to match library version. No other changes. |
3.1.0 |
10th December, 2010 |
Added support for previous key search, duplicate keys. Programs that link against the 3.0.x library will operate unchanged with 3.1.0. |
3.0.1 |
2nd July, 2010 |
Enhanced BT test harness; bug fixes. |
3.0.0 |
4th June, 2010 |
Added support for large files (> 2GiB) |
2.0.4 |
10th May, 2008 |
Shared access sleep time added as implementation constant |
2.0.3 |
27th December, 2005 |
Changes to reflect that zero-length data records are valid |
2.0.2 |
3rd October, 2004 |
New text on implementation constants and file sizes |
2.0 |
1st June, 2004 |
Reflect changes to API in version 2.0 |
1.0 |
13th April, 2003 |
First release |
Colophon
This manual was written in AsciiDoc using Emacs. Html is generated by Asciidoctor.