Flatfile databases are plain text files in a defined format, such as those released by EMBL, Swiss-Prot and so on. The EMBOSS program DBIFLAT generates EMBLCD indices, which can be used for all types of database access. DBIFLAT can process databases in EMBL, SWISSPROT and GENBANK format. Pseudo-EMBL format databases that do not have unique ID and AC entries may cause DBIFLAT to do mysterious things and should be avoided.
DBIFLAT (and the EMBLCD access method) requires the databases to be uncompressed. The examples given here will not probe the deeper secrets of DBIFLAT (for which the reader is referred to the documentation, or failing that the source code) but will show a typical installation for a common database.
We assume that EMBOSS has been installed and works. This can be tested with the command wossname -auto, which should list all the available programs.
In this example we will index and configure the EMBL database for use with EMBOSS.
First download and unpack the EMBL database. This will require a considerable amount of disk space. If you do not have sufficient space available then just download a subset of the database.
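Since DBIFLAT needs uncompressed files and the database is large, a quick check of free space and of any files left compressed can save a failed indexing run. The following is a minimal sketch, not part of EMBOSS: the function name find_compressed and the list of suffixes are assumptions made for illustration.

```shell
# find_compressed DIR  (hypothetical helper, not an EMBOSS program)
# Lists files directly inside DIR whose names suggest they are still
# compressed. The suffix list is an assumption; extend it as needed.
find_compressed() {
    find "$1" -maxdepth 1 -type f \( -name '*.gz' -o -name '*.Z' -o -name '*.bz2' \)
}

dbdir=${1:-.}

# Free space on the file system that holds the database directory.
df -h "$dbdir"

leftover=$(find_compressed "$dbdir")
if [ -n "$leftover" ]; then
    echo "Still compressed (uncompress these before running DBIFLAT):"
    echo "$leftover"
fi
```

Run it from the directory in which EMBL was unpacked, or pass that directory as the first argument.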
Use cd to move to the directory in which you have unpacked EMBL. Running ls there should show something like this:
    % ls
    est_fun.dat
    est_hum1.dat
    est_hum10.dat
    .
    Output truncated
    .
    syn.dat
    unc.dat
    vrl.dat
    vrt.dat

Run DBIFLAT to create the EMBLCD indices.
    % dbiflat
    Index a flat file database
          EMBL : EMBL
         SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
            GB : Genbank, DDBJ
    Entry format [SWISS]: EMBL
    Database name: embl
    Database directory [.]:
    Wildcard database filename [*.dat]:
    Release number [0.0]: 63.0
    Index date [00/00/00]: 31/07/00

DBIFLAT should happily chug away for some considerable time (up to a few hours depending on the speed of your machine) and will generate (eventually) the following index files:
    % ls
    acnum.hit
    acnum.trg
    division.lkp
    entrynam.idx

Now we create an entry in the EMBOSS configuration files to access the database. It is probably a good idea to try new database definitions in your local configuration file first.
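Before editing the configuration, the interactive indexing session shown earlier can also be scripted and its result checked. The sketch below is only an illustration. The qualifier names (-idformat, -dbname, -directory, -filenames, -release, -date) are an assumption based on later EMBOSS releases and should be confirmed with dbiflat -help on your installation. The wrapper names run_dbiflat and check_indices are hypothetical.

```shell
# Sketch only. The qualifier names below are an assumption based on later
# EMBOSS releases; confirm them with "dbiflat -help" before relying on this.

# run_dbiflat NAME DIR RELEASE DATE  (hypothetical wrapper)
# Runs dbiflat non-interactively, or prints the command if dbiflat
# is not on the PATH.
run_dbiflat() {
    set -- dbiflat -idformat EMBL -dbname "$1" -directory "$2" \
        -filenames '*.dat' -release "$3" -date "$4" -auto
    if command -v dbiflat >/dev/null 2>&1; then
        "$@"
    else
        echo "dbiflat not found; would run: $*"
    fi
}

# check_indices DIR  (hypothetical helper)
# Succeeds only if all four index files exist in DIR and are non-empty.
check_indices() {
    missing=0
    for f in acnum.hit acnum.trg division.lkp entrynam.idx; do
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $1/$f"
            missing=1
        fi
    done
    return "$missing"
}

run_dbiflat embl . 63.0 31/07/00
check_indices . || echo "indexing does not appear to have completed"
```

The values passed to run_dbiflat are the same answers given at the prompts in the interactive session.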
Put the following entry in your .embossrc
    DB embl [
        type: N
        method: emblcd
        format: embl
        dir: $emboss_db_dir/embl
        file: "*.dat"
        release: "63.0"
        comment: "EMBL release 63.0"
    ]

For this to work, $emboss_db_dir must already be defined, using a directive such as

    set emboss_db_dir /path_to_databases

somewhere in your emboss.default or .embossrc.
Save .embossrc and try SHOWDB. You should see a line that looks like:
    % showdb
    .. output deleted
    embl  N  OK  OK  OK  EMBL release 63.0
    .. output deleted
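The same check can be made without reading the SHOWDB listing by eye. The following is a minimal sketch that assumes the column layout of the example line above (name, type, then three access columns); the function name db_is_ok is hypothetical and not part of EMBOSS.

```shell
# db_is_ok NAME  (hypothetical helper)
# Reads showdb output on standard input and succeeds only if the line
# for database NAME has OK in the third, fourth and fifth columns, as in
# the example line above. Fails if NAME is not listed at all.
db_is_ok() {
    awk -v db="$1" '
        $1 == db { found = 1; ok = ($3 == "OK" && $4 == "OK" && $5 == "OK") }
        END { exit !(found && ok) }
    '
}

# Only call showdb if it is actually installed.
if command -v showdb >/dev/null 2>&1; then
    if showdb -auto | db_is_ok embl; then
        echo "embl is configured and accessible"
    else
        echo "embl is missing or not fully accessible; check .embossrc"
    fi
fi
```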