libmba
A library of generic C modules

The libmba package is a collection of mostly independent C modules potentially useful to any project. It provides the usual ADTs (linkedlist, hashmap, pool, stack, and varray), a flexible memory allocator, a CSV parser, a path canonicalization routine, an I18N text abstraction, a configuration file module, portable semaphores and condition variables, and more. The code is designed so that individual modules can be integrated into existing codebases rather than requiring the user to commit to the entire library. The code has no typedefs, few comments, and extensive man pages and HTML documentation.

Links

Download
API Reference
Browse The Source
Libmba Mailing List

Similar Projects

These projects look similar in purpose to libmba, although in most cases that has not been confirmed and their presence here is not necessarily an endorsement of quality. They are listed here (not in any particular order) only to help developers focus their search.

SGLIB
Libtc
netsw.org
OSSP
LibAST
skalibs
libslack
LIBH
Hackerlab
Kazlib
Matt's C Utility Library
ubiqx
gdsl
smbase

News

libmba-0.8.10 released
Sat Aug 28, 2004
Two bugs have been found and fixed in the csv module. First, if a non-ASCII character was read with csv_row_parse, parsing would stop prematurely due to a signedness error. The csv module now uses unsigned char throughout to properly support internationalized text. Note that csv_row_fread was unaffected by this bug. Second, if the character preceding EOF was a double quote (as opposed to a newline), an error would occur. The csv module will now correctly process the final element.
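The signedness problem described above can be illustrated with a minimal sketch. This is not the csv module's code; the reader functions and count_until_eof are hypothetical names. It assumes an in-memory reader that reports the end of input with EOF, and it uses an explicit signed char cast to emulate platforms where plain char is signed.

```c
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical reader: returns the next byte as an int, or EOF at the end.
 * The (signed char) cast emulates platforms where plain char is signed. */
static int next_signed(const char *buf, size_t len, size_t *pos)
{
    if (*pos >= len) return EOF;
    return (signed char)buf[(*pos)++];    /* byte 0xFF becomes -1 */
}

/* Same reader, but bytes are widened through unsigned char. */
static int next_unsigned(const char *buf, size_t len, size_t *pos)
{
    if (*pos >= len) return EOF;
    return (unsigned char)buf[(*pos)++];  /* always in the range 0..255 */
}

/* Count the bytes consumed before the chosen reader reports EOF. */
size_t count_until_eof(const char *buf, size_t len, int use_unsigned)
{
    size_t pos = 0, n = 0;
    for (;;) {
        int ch = use_unsigned ? next_unsigned(buf, len, &pos)
                              : next_signed(buf, len, &pos);
        if (ch == EOF) break;   /* a sign-extended 0xFF is mistaken for EOF */
        n++;
    }
    return n;
}
```

With the three-byte input "a\xFFz", the signed reader stops after the first byte wherever EOF is -1, while the unsigned reader consumes all three.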

Also related, a few example programs are now included with the distribution. One such example is the csvprint utility which prints data in a csv file using a format string.

examples$ ./csvprint data.csv "%2|%1|FOO(%2)\n"
three|two|FOO(three)
...

This is surprisingly useful for reordering fields, generating source code, etc.
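The substitution step behind such a format string could look roughly like the sketch below. It is not the csvprint source. The function name expand_row is hypothetical, and the zero-based, single-digit field numbers are an assumption inferred from the sample output above (a row such as one,two,three). Escape sequences such as \n are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy fmt into out, replacing "%N" (one digit,
 * assumed zero-based) with fields[N]. Returns 0 on success, or -1 if a
 * field index is out of range or the output buffer is too small. */
int expand_row(const char *fmt, const char **fields, int nfields,
               char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0) return -1;
    while (*fmt) {
        const char *src;
        size_t n;

        if (fmt[0] == '%' && fmt[1] >= '0' && fmt[1] <= '9') {
            int idx = fmt[1] - '0';
            if (idx >= nfields) return -1;
            src = fields[idx];
            n = strlen(src);
            fmt += 2;
        } else {
            src = fmt;          /* ordinary character, copied as-is */
            n = 1;
            fmt += 1;
        }
        if (n >= outsz - o) return -1;   /* keep room for the NUL */
        memcpy(out + o, src, n);
        o += n;
    }
    out[o] = '\0';
    return 0;
}
```

Under those assumptions, the fields one, two, three and the format "%2|%1|FOO(%2)" expand to three|two|FOO(three).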

The bitset_find_first function will now set errno to ENOENT if the target bit was not found.
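The error convention can be sketched as follows. This is a stand-in and not the library's bitset_find_first. The name find_first_bit, its parameters, and the bit ordering (least significant bit of each byte first) are assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in: return the index of the first bit equal to val
 * among the first nbits bits, or -1 with errno set to ENOENT if no such
 * bit exists. Bits are numbered from the least significant bit of byte 0. */
int find_first_bit(const unsigned char *bits, int nbits, int val)
{
    int i;

    for (i = 0; i < nbits; i++) {
        int bit = (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
        if (bit == (val != 0))
            return i;
    }
    errno = ENOENT;   /* lets the caller tell "not found" from other failures */
    return -1;
}
```

A caller would check for -1 and then inspect errno, which separates a missing bit from any other failure the function might report.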

Some issues regarding the initialization of svsem(3m) semaphores have been fixed. The module should now properly handle the initialization race outlined in Stevens' UNPv2 in addition to the scenario where a semaphore is removed during initialization.

Finally, the eval(3m) module now allows a context parameter to be specified that will be passed to the user-supplied symlook function. This is necessary for full reentrancy.

All documentation has been updated accordingly.

libmba-0.8.9 released
Fri May 21, 2004
The sho_loop function now accepts a pattern vector and timeout like sho_expect and the cfg module has been modified to more closely support Java Properties escape sequences for spaces and Unicode characters.

libmba-0.8.8 released
Thu May 6, 2004
The purpose of this project is to provide generic C implementations of concepts elemental to a wide variety of programming problems. The latest addition to libmba is the diff module and it is a fine example of a non-trivial algorithm that is crucial to the function and efficiency of many common applications such as spell checkers, version control systems, spam filters, speech recognition, and more. The code is generic such that anything that can be indexed and compared with user supplied callbacks can be used such as strings, linked lists, pointers to lines in files, etc.

The algorithm is perhaps best known for its use in the GNU diff(1) program for generating a "diff" of two files. Formally it is known as the shortest edit script (SES) problem and is solved efficiently using the dynamic programming algorithm described by Myers [1] and in linear space with the Hirschberg refinement. The objective is to compute the minimum set of edit operations necessary to transform sequence A of length N into B of length M. This can be performed in O((N+M)D) time in the worst case and O(N+M+D^2) expected time, where D is the edit distance (the number of elements deleted and inserted).

[1] E. Myers, ``An O(ND) Difference Algorithm and Its Variations,'' Algorithmica 1, 2 (1986), 251-266. http://www.cs.arizona.edu/people/gene/PAPERS/diff.ps
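The core of the greedy algorithm from [1] can be sketched in a few lines. This is not the diff module's code. It computes only the edit distance D, uses O(N+M) extra space for the diagonal table, and omits the linear-space refinement and the edit script itself. The idx_fn and cmp_fn callback types and the function names are hypothetical and merely mirror the idea of indexing and comparing through user-supplied callbacks.

```c
#include <stdlib.h>

/* Hypothetical callback types: fetch element i, and compare (0 if equal). */
typedef const void *(*idx_fn)(const void *seq, int i);
typedef int (*cmp_fn)(const void *a, const void *b);

/* Returns the length D of the shortest edit script, or -1 on failure. */
int ses_distance(const void *a, int n, const void *b, int m,
                 idx_fn idx, cmp_fn cmp)
{
    int max = n + m, d, k, x, y, result = -1;
    /* v[max + k] holds the furthest x reached so far on diagonal k = x - y */
    int *v = calloc((size_t)(2 * max + 2), sizeof *v);

    if (v == NULL) return -1;
    for (d = 0; d <= max && result < 0; d++) {
        for (k = -d; k <= d; k += 2) {
            if (k == -d || (k != d && v[max + k - 1] < v[max + k + 1]))
                x = v[max + k + 1];        /* move down: insert from b */
            else
                x = v[max + k - 1] + 1;    /* move right: delete from a */
            y = x - k;
            /* follow the "snake" of matching elements for free */
            while (x < n && y < m && cmp(idx(a, x), idx(b, y)) == 0) {
                x++;
                y++;
            }
            v[max + k] = x;
            if (x >= n && y >= m) {
                result = d;
                break;
            }
        }
    }
    free(v);
    return result;
}

/* Example callbacks for sequences of char. */
const void *char_idx(const void *seq, int i) { return (const char *)seq + i; }
int char_cmp(const void *a, const void *b)
{
    return *(const char *)a != *(const char *)b;
}
```

For the sequences ABCABBA and CBABAC used as the running example in [1], this returns 5.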

Also, in this release, the path module, which has been in libmba for some time, is now documented. This module provides a high quality filesystem path canonicalization routine. Path canonicalization is notoriously unforgiving because the parsing routine is complex and yet it is not uncommon for programs to be required to accept paths from potentially malicious sources. This implementation uses a state machine approach to reduce complexity and has been tested with a wide range of inputs (see tcase/test/data/PathCanonExamples.data). Certain conditions are enforced that minimize the potential for exploits. For example, only one input character is examined with each iteration of the outer loop so that it can be certain that the slim and dlim limit pointers are checked with the advance of every input character. A canonicalized path cannot begin with a path separator unless the input began with a path separator. Because of the state machine structure, if there is a flaw in the implementation the fix is more likely to be a local adjustment, which limits the potential for creating new flaws.
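To make the state machine idea concrete, here is a much simplified sketch. It is not the path module's implementation and does not reproduce its behavior. The function canon_sketch, its states, and its policies are assumptions: only '/' is treated as a separator, a ".." that would climb above the start of the path is rejected with -1, and trailing separators are dropped. It does keep the two properties described above: one input character is examined per loop iteration, and every write is checked against the destination limit.

```c
#include <stddef.h>

enum canon_state { ST_SEP, ST_DOT, ST_DOTDOT, ST_NAME };

/* Bounds-checked write of one character to the destination. */
#define PUT(c) do { if (d >= dlim) return -1; *d++ = (c); } while (0)

/* Remove the last "name/" from the output; NULL if nothing is left. */
static char *pop_component(char *base, char *d)
{
    if (d == base) return NULL;   /* ".." would climb past the start */
    d--;                          /* drop the trailing separator */
    while (d > base && d[-1] != '/') d--;
    return d;
}

/* Hypothetical canonicalizer: returns the output length, or -1 on error. */
int canon_sketch(const char *src, const char *slim, char *dst, char *dlim)
{
    enum canon_state st = ST_SEP;
    char *d = dst, *base = dst;

    if (src < slim && *src == '/') { PUT('/'); base = d; }
    for (; src < slim && *src != '\0'; src++) {  /* one character per pass */
        char c = *src;
        switch (st) {
        case ST_SEP:                  /* just after a separator */
            if (c == '/') break;      /* collapse repeated separators */
            if (c == '.') { st = ST_DOT; break; }
            PUT(c); st = ST_NAME; break;
        case ST_DOT:                  /* saw "." at component start */
            if (c == '/') { st = ST_SEP; break; }
            if (c == '.') { st = ST_DOTDOT; break; }
            PUT('.'); PUT(c); st = ST_NAME; break;
        case ST_DOTDOT:               /* saw ".." at component start */
            if (c == '/') {
                if ((d = pop_component(base, d)) == NULL) return -1;
                st = ST_SEP; break;
            }
            PUT('.'); PUT('.'); PUT(c); st = ST_NAME; break;
        case ST_NAME:                 /* inside an ordinary component */
            PUT(c);
            if (c == '/') st = ST_SEP;
            break;
        }
    }
    if (st == ST_DOTDOT && (d = pop_component(base, d)) == NULL) return -1;
    if (st != ST_NAME && d > base) d--;   /* strip the trailing separator */
    if (d >= dlim) return -1;
    *d = '\0';
    return (int)(d - dst);
}
```

Because output only ever starts with '/' when the first input character is '/', the sketch also preserves the rule that a canonicalized path cannot begin with a separator unless the input did.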

libmba-0.8.5 released
Wed Mar 10, 2004
This release includes several bug fixes. The hashmap_remove function could corrupt the integrity of the ADT resulting in lost elements. The hashmap_clear function was largely incorrect. These problems have been fixed. The svcond_wait function could return without reacquiring the specified lock if a signal was received. The function has been modified to ensure that the lock is reacquired before it returns when the wait is interrupted by a signal (EINTR).

There have also been some minor enhancements. The csv functions now accept a CSV_QUOTES flag to indicate quotes should be interpreted. To preserve the previous behavior it will be necessary to include this flag but the signature has not changed! Also, the signatures of the hash_fn and cmp_fn in the hashmap module have been modified to permit a context object. New text compare functions have been provided that should be used in favor of strcmp or similar. See HTML documentation for details.

Finally, the domnode module has been deprecated. It is still in the tree and should build fine but modules that are not actively used by the author will be removed. Expat is no longer required by default.

libmba-0.8.0 released
Thu Jan 3, 2004
There have been very pervasive changes throughout this library. There are new modules, changes that affect many of the modules, and other miscellaneous adjustments. Specifically, there are seven (7) new modules. These are:
  • allocator - The allocator module abstracts memory management in Libmba and programs that use it. Most modules that allocate memory have been modified to use it but there is no additional knowledge required to use Libmba due to the allocator module. Just supply NULL for any allocator parameter to indicate the standard library allocator should be used (i.e. malloc(3)).
  • suba - The suba module provides a lock-less allocator implemented using a simple circular list of "cells". Using this allocator has many benefits including measurably increasing the performance of an application.
  • bitset - Some macros and functions for manipulating an arbitrary pointer to memory as a collection of bits.
  • hashmap - The hashmap module has been completely replaced. The new implementation uses a very plain rehashing scheme with automatic resizing of the hash table. It is very space efficient and should be as fast as one could hope for from a general purpose hash map. This module replaces the previous chaining hashmap implementation.
  • svsem - The svsem module provides a POSIX-like semaphore implementation that uses the more common System V semaphores interface.
  • svcond - The svcond module provides a POSIX-like condition variables implementation that uses only System V semaphores. This is useful on Linux where process shared semaphores and condition variables are not supported.
  • time - Currently the time module only provides the time_current_millis function for retrieving the current time in milliseconds since Jan 1, 1970. The implementation works equally well on Linux and the Win32 environment at least.
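As an illustration of the kind of helper the time module describes, the sketch below returns milliseconds since Jan 1, 1970. It is not the library's time_current_millis. The name current_millis_sketch is hypothetical, and it relies on the C11 timespec_get function rather than whatever platform calls the library itself uses.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in: milliseconds since the Unix epoch, or 0 if the
 * clock cannot be read. Assumes a C11 library providing timespec_get. */
uint64_t current_millis_sketch(void)
{
    struct timespec ts;

    if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
        return 0;
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}
```

A 64-bit return type is used here because a millisecond count since 1970 does not fit in 32 bits.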
Changes that affect multiple modules include:
  • Many Allocator Changes - With the addition of the allocator module most modules that allocate memory have been modified to accept the specification of an allocator. See the suba documentation for a description of the benefits of using a custom allocation scheme with Libmba ADTs.
  • Destruction and Function Parameters - One implication of factoring in the allocator module is that many functions that accepted function pointers to create or destroy objects have been modified to accept functions with signatures more suited to reentrant code and specifically functions of the allocator module. For example, previously the linkedlist_del function would accept a void del_fn(void *) parameter because this matched the signature of free(3). This has been changed to match the signature int (*del_fn)(void *context, void *object) (typedef'd as simply del_fn) which, with a cast, matches the signature of allocator_free. Similar changes have been made to function parameters that create objects such as pool_new.
  • New Clean Functions and Automatically Reclaiming Memory - Many modules have been modified with clean functions (not to be confused with clear). These functions will free any memory not explicitly being used by the module. For example the pool_clean function will destroy any unused objects. These functions are specifically designed to be called from a reclaim_fn specified using the allocator_set_reclaim function.
  • New Initializers - Many ADTs can now be initialized into memory provided by the user. This has the benefit of reducing the number of objects created in programs and simplifying their management. It can be beneficial to know that initializers that have init in their name do not allocate memory. Initializers that have create in their name do allocate memory and must be explicitly destroyed even if they have not been used. Additionally the new functions have been modified to accept an allocator but otherwise their behavior has not fundamentally changed.
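The context-carrying destructor signature described in the list above can be sketched independently of the library. Only the shape of del_fn comes from the text. The node structure, list_del_sketch, and counting_free are hypothetical and do not reflect how linkedlist_del is implemented.

```c
#include <stdlib.h>

/* Destructor callback with a context argument, as described above. */
typedef int (*del_fn)(void *context, void *object);

/* Hypothetical singly linked node, for illustration only. */
struct node {
    struct node *next;
    void *data;
};

/* Free every node, passing each data pointer to data_del with context.
 * Returns 0, or -1 if any callback reported failure. */
int list_del_sketch(struct node *head, del_fn data_del, void *context)
{
    int ret = 0;

    while (head) {
        struct node *next = head->next;
        if (data_del && data_del(context, head->data) == -1)
            ret = -1;
        free(head);
        head = next;
    }
    return ret;
}

/* Example del_fn: frees the object and counts calls through the context. */
int counting_free(void *context, void *object)
{
    int *count = context;

    free(object);
    if (count) (*count)++;
    return 0;
}
```

Because the state travels in the context argument rather than in a global, two lists can be torn down with different contexts without interfering, which is the reentrancy benefit the change is after.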
Other adjustments include the following:
  • The cfg_write function has been changed to cfg_fwrite. The csv_row_read function has been changed to csv_row_fread. The domnode_read function has been changed to domnode_fread. The domnode_write function has been changed to domnode_fwrite.
  • The parameter name this, which is reserved in C++, has been removed entirely from the library.
  • The str_copy_new and wcs_copy_new functions have been modified to accept an allocator (again, use NULL for stdlib malloc).

libmba-0.7.0 released
Wed Oct 15, 2003
Microsoft Windows support has been improved. The Win32 debug build now properly creates DLLs with PDB information for listing source code after a memory fault. The standard __cplusplus macro guards have been added. Macros for prefixing __declspec(dllexport) directives have been added in favor of an explicit DEF file. These changes have been performed during the development of a non-trivial MFC application so this release should work smoothly in a Win32 or MFC environment. The text module appears to work as advertised although a few adjustments have been made.

The csv module has been converted to support the text module's text handling. The multibyte function is now csv_row_parse_str, the wide character function is csv_row_parse_wcs, and the csv_row_parse function is now a macro that accepts tchar parameters. The prototypes of these functions have also been changed to accept the specification of the separator that is used (e.g. '\t' rather than ',').

A new eval module has been added that will "calculate" the value of an expression such as '(5 + 3) * N', '(10+10-((10*10/11)|(10&10)))^0xFF78', etc.
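A small evaluator for expressions of this shape can be sketched with precedence climbing. It is not the eval module and makes several assumptions: C-like operator precedence, unsigned long arithmetic, numbers parsed with strtoul in base 0 (so a leading 0 means octal), and a hypothetical symlook_fn callback that receives a caller-supplied context. All names here are illustrative.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol lookup: store the value and return 0, or return -1. */
typedef int (*symlook_fn)(const char *name, unsigned long *val, void *context);

struct parser { const char *p; symlook_fn look; void *ctx; int err; };

static unsigned long parse_expr(struct parser *ps, int min_prec);

static void skip_ws(struct parser *ps)
{
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

static unsigned long parse_primary(struct parser *ps)
{
    unsigned long v = 0;

    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        v = parse_expr(ps, 1);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++; else ps->err = 1;
    } else if (isdigit((unsigned char)*ps->p)) {
        char *end;
        v = strtoul(ps->p, &end, 0);      /* base 0 also accepts 0x... */
        ps->p = end;
    } else if (isalpha((unsigned char)*ps->p) || *ps->p == '_') {
        char name[32];
        size_t n = 0;
        while ((isalnum((unsigned char)*ps->p) || *ps->p == '_')
                && n < sizeof name - 1)
            name[n++] = *ps->p++;
        name[n] = '\0';
        if (ps->look == NULL || ps->look(name, &v, ps->ctx) != 0) ps->err = 1;
    } else {
        ps->err = 1;
    }
    return v;
}

/* Binding strength of each binary operator; 0 means "not an operator". */
static int prec(char op)
{
    switch (op) {
    case '|': return 1;
    case '^': return 2;
    case '&': return 3;
    case '+': case '-': return 4;
    case '*': case '/': return 5;
    default:  return 0;
    }
}

static unsigned long parse_expr(struct parser *ps, int min_prec)
{
    unsigned long lhs = parse_primary(ps);

    for (;;) {
        unsigned long rhs;
        char op;
        int p;

        skip_ws(ps);
        op = *ps->p;
        p = prec(op);
        if (p == 0 || p < min_prec) return lhs;
        ps->p++;
        rhs = parse_expr(ps, p + 1);      /* p + 1 makes operators left-associative */
        switch (op) {
        case '|': lhs |= rhs; break;
        case '^': lhs ^= rhs; break;
        case '&': lhs &= rhs; break;
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/':
            if (rhs == 0) { ps->err = 1; return 0; }
            lhs /= rhs;
            break;
        }
    }
}

/* Evaluate expr; returns 0 and stores the value, or -1 on a syntax error. */
int eval_sketch(const char *expr, symlook_fn look, void *ctx,
                unsigned long *result)
{
    struct parser ps = { expr, look, ctx, 0 };
    unsigned long v = parse_expr(&ps, 1);

    skip_ws(&ps);
    if (ps.err || *ps.p != '\0') return -1;
    *result = v;
    return 0;
}

/* Example lookup: the context points at the value bound to "N". */
int lookup_n(const char *name, unsigned long *val, void *context)
{
    if (strcmp(name, "N") != 0) return -1;
    *val = *(unsigned long *)context;
    return 0;
}
```

With N bound to 4 through the context, '(5 + 3) * N' evaluates to 32, and '(10+10-((10*10/11)|(10&10)))^0xFF78' evaluates to 65393 under these precedence rules.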

The msgno functions have been adjusted to perform better in environments where variadic macros are not supported (e.g. MSVC).

libmba-0.6.15 released
Sat Aug 23, 2003
There have been significant and pervasive changes. However, to emphasize that all of these changes are binary compatible, I have not incremented the major version number. All code that uses the published interfaces of libmba should work without modification.

The most significant change is the addition of text.h, which contains a tchar typedef and many macros that abstract wide and multi-byte string functions. Depending on whether or not USE_WCHAR is defined, these string functions will accept wide or multi-byte strings. This will permit programs to run using wide or locale dependent multi-byte text handling. Some of the libmba modules such as cfg have been modified to support both wide and locale dependent multi-byte text using this abstraction. Do not be alarmed that these prototypes have changed. Because the tchar typedef is defined as either unsigned char or wchar_t, users can continue to use these modules as before without using tchar at all. It is also easy to globally search and replace tchar with the desired type in the source of interest. If you choose to take advantage of this new I18N functionality please read the following document for important information:

http://www.ioplex.com/~miallen/libmba/dl/docs/ref/text_details.html

One big advantage of this new text abstraction is that libmba will soon support Unicode on the Microsoft Windows platform (cfg and domnode modules already do).
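The general shape of such a text abstraction can be sketched as follows. This is not the contents of text.h. The names tchar_sketch, TS, ts_len, and count_sep are hypothetical, and the narrow build uses plain char instead of unsigned char so that string literals need no casts.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Select wide or multi-byte text at compile time. */
#ifdef USE_WCHAR
typedef wchar_t tchar_sketch;
#define TS(s)      L##s          /* wide literal */
#define ts_len(s)  wcslen(s)
#else
typedef char tchar_sketch;       /* assumption: plain char for simplicity */
#define TS(s)      s             /* ordinary literal */
#define ts_len(s)  strlen(s)
#endif

/* Written once against tchar_sketch; compiles for either text type. */
size_t count_sep(const tchar_sketch *s, tchar_sketch sep)
{
    size_t n = 0;

    for (; *s; s++)
        if (*s == sep)
            n++;
    return n;
}
```

Code written against the typedef and macros, such as count_sep(TS("a,b,c"), TS(',')), compiles unchanged whether or not USE_WCHAR is defined.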

The test suite has been cleaned up considerably. Just run make followed by the generated tmba program in the tcase directory to run all tests.

The build process has been formalized further. The code is now compiled using -D_XOPEN_SOURCE=500, meaning an SUSv2/UNIX98 environment is desired, but most of the code does not require this standards level. In fact, #ifdefs have been added to accommodate lesser environments.

Finally, a path module has been introduced. Currently this module contains one function, path_canon, which canonicalizes a filesystem pathname. The state machine design is very safe when given the full range of possible inputs (see tcase/tests/data/PathCanonExamples.data).