tfscan
The SITE data from TRANSFAC contains information on individual (putatively) regulatory protein binding sites. It has been divided into the following taxonomic groups: fungi, insect, plant, vertebrate and other.
The program tfscan takes a sequence and the name of one of these taxonomic groups and does a fast match of the TRANSFAC sequences against the input sequence (optionally allowing mismatches).
The result is a list of the positions which match the binding sites in the TRANSFAC SITE database.
Because the binding sites are so small, there will be many spurious (false positive) matches.
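The matching described above can be illustrated with a minimal sketch. It is not tfscan's actual algorithm, which this page does not describe; it is a naive sliding-window comparison, written under the assumption that a "mismatch" means a substituted base at one position. The function name `find_sites` and the toy sequence are hypothetical, and the sketch scans one strand only.

```python
def find_sites(sequence, site, max_mismatches=0):
    """Return (start, end, matched_text) for every window of `sequence`
    that differs from `site` in at most `max_mismatches` positions.

    Positions are 1-based and inclusive. Only substitutions are counted;
    insertions and deletions are not modelled.
    """
    sequence = sequence.lower()
    site = site.lower()
    if not site:
        return []
    width = len(site)
    hits = []
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        # Count the positions where the window and the site disagree.
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((offset + 1, offset + width, window))
    return hits


# Toy input for illustration only; it is not taken from TRANSFAC or HSFOS.
toy_sequence = "aacacgtgttcacctgaa"
exact_hits = find_sites(toy_sequence, "cacgtg")
fuzzy_hits = find_sites(toy_sequence, "cacgtg", max_mismatches=1)
```

Allowing one mismatch already adds a second hit in this short toy sequence, which shows why short sites combined with mismatches produce many spurious matches.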
% tfscan
Input sequence(s): embl:hsfos
Transcription Factor Class
         F : fungi
         I : insect
         P : plant
         V : vertebrate
         O : other
Select class [V]: v
Number of mismatches [0]:
Output file [hsfos.tfscan]:
   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
   -menu               menu       Select class
   -mismatch           integer    Number of mismatches
  [-outfile]           outfile    Output file name

   Optional qualifiers: (none)
   Advanced qualifiers: (none)
   General qualifiers:
  -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
Mandatory qualifiers | Description | Allowed values | Default
---|---|---|---
[-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required
-menu | Select class | F (fungi), I (insect), P (plant), V (vertebrate), O (other) | V
-mismatch | Number of mismatches | Integer 0 or more | 0
[-outfile] (Parameter 2) | Output file name | Output file | <sequence>.tfscan
Optional qualifiers | Description | Allowed values | Default
(none) | | |
Advanced qualifiers | Description | Allowed values | Default
(none) | | |
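The qualifiers above can also be supplied on the command line instead of answering the prompts. The sketch below only assembles the argument list; the helper name `build_tfscan_command` is hypothetical, and it uses only the qualifiers documented on this page (`-sequence`, `-menu`, `-mismatch`, `-outfile`).

```python
# Class codes accepted by the -menu qualifier, as shown in the prompt.
TFSCAN_CLASSES = {"F": "fungi", "I": "insect", "P": "plant",
                  "V": "vertebrate", "O": "other"}


def build_tfscan_command(sequence, outfile, menu="V", mismatch=0):
    """Assemble a tfscan argument list from the documented qualifiers.

    This helper only builds the list; it does not run the program.
    """
    code = menu.upper()
    if code not in TFSCAN_CLASSES:
        raise ValueError(f"unknown class {menu!r}; expected one of "
                         f"{sorted(TFSCAN_CLASSES)}")
    if mismatch < 0:
        raise ValueError("mismatch must be an integer 0 or more")
    return ["tfscan",
            "-sequence", sequence,
            "-menu", code,
            "-mismatch", str(mismatch),
            "-outfile", outfile]


command = build_tfscan_command("embl:hsfos", "hsfos.tfscan", menu="v")
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a machine where EMBOSS is installed and the TRANSFAC data has been set up.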
TFSCAN of HSFOS from 1 to 6210
HS$CFOS_20 R08485 384 396 agttcccgtcaat
DOG$ATP1A_01 R08484 3057 3063 gacatgg
HS$CEBPA_01 R08471 4535 4540 cacgtg
HS$GPB_05 R08210 3716 3721 gtatct
HS$GPB_02 R08207 5837 5842 ggtggg
HS$GPB_02 R08207 2399 2404 ggtggg
HS$GPB_02 R08207 2077 2082 ggtggg
MOUSE$TIMP1_02 R08168 2362 2368 caggaag
MOUSE$BMG_11 R08167 960 965 ggatag
RAT$RPK_02 R08166 6136 6140 tgtgc
RAT$RPK_02 R08166 5953 5957 tgtgc
RAT$RPK_02 R08166 4433 4437 tgtgc
RAT$RPK_02 R08166 4143 4147 tgtgc
RAT$RPK_02 R08166 3450 3454 tgtgc
RAT$RPK_02 R08166 3246 3250 tgtgc
RAT$RPK_02 R08166 3154 3158 tgtgc
RAT$RPK_02 R08166 1128 1132 tgtgc
HS$TIMP1_02 R08152 2361 2368 ccaggaag
HS$TIMP1_01 R08151 2642 2648 tgagtaa
HS$IL3_08 R05028 4376 4381 tgtggg
HS$IL3_08 R05028 3471 3476 tgtggg
HS$IL3_08 R05028 2584 2589 tgtggg
HS$IL3_08 R05028 2066 2071 tgtggg
HS$CATHD_01 R04883 1430 1435 ggcggg
HS$CATHD_01 R04883 1092 1097 ggcggg
HS$CATHD_01 R04883 569 574 ggcggg
RAT$IGFBP2_02 R04793 5123 5128 gggcgg
RAT$IGFBP2_02 R04793 1429 1434 gggcgg
RAT$IGFBP2_02 R04793 1091 1096 gggcgg
RAT$IGFBP2_02 R04793 607 612 gggcgg
RAT$IGFBP2_01 R04792 5123 5128 gggcgg
RAT$IGFBP2_01 R04792 1429 1434 gggcgg
RAT$IGFBP2_01 R04792 1091 1096 gggcgg
RAT$IGFBP2_01 R04792 607 612 gggcgg
HS$A14COL_01 R04791 5123 5128 gggcgg
HS$A14COL_01 R04791 1429 1434 gggcgg
HS$A14COL_01 R04791 1091 1096 gggcgg
HS$A14COL_01 R04791 607 612 gggcgg
HS$A24COL_03 R04790 5123 5128 gggcgg
etc......
The output consists of a title line followed by five columns separated by whitespace.
The first column is the identifier of the entry.
The second column is the Accession Number of the entry.
The third and fourth columns are the start and end positions of the match in your input sequence.
The fifth column is the sequence of the region where a match has been found.
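The column layout described above is simple enough to read with a few lines of code. The following is a minimal sketch, assuming one match per line and whitespace-separated fields; the names `TfscanHit` and `parse_tfscan` are hypothetical, and the sample rows are copied from the example output on this page.

```python
from typing import NamedTuple


class TfscanHit(NamedTuple):
    identifier: str   # column 1: identifier of the entry
    accession: str    # column 2: Accession Number of the entry
    start: int        # column 3: start of the match in the input sequence
    end: int          # column 4: end of the match in the input sequence
    matched: str      # column 5: sequence of the matched region


def parse_tfscan(text):
    """Return (title, hits) parsed from tfscan output text.

    The first non-blank line is taken as the title. Any later line that
    is not a five-column row with numeric positions is skipped.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "", []
    title, rows = lines[0].strip(), lines[1:]
    hits = []
    for line in rows:
        fields = line.split()
        if len(fields) != 5 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        hits.append(TfscanHit(fields[0], fields[1],
                              int(fields[2]), int(fields[3]), fields[4]))
    return title, hits


sample = """TFSCAN of HSFOS from 1 to 6210
HS$CFOS_20 R08485 384 396 agttcccgtcaat
DOG$ATP1A_01 R08484 3057 3063 gacatgg
HS$CEBPA_01 R08471 4535 4540 cacgtg
etc......
"""
title, hits = parse_tfscan(sample)
```

In these rows the end position minus the start position plus one equals the length of the matched sequence, so the positions behave as inclusive coordinates. Filtering `hits` by `len(hit.matched)` is one simple way to set aside the shortest, most spurious-prone sites.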
For further details on an entry from the TRANSFAC database, see:
http://transfac.gbf.de/cgi-bin/qt/search.pl
Your EMBOSS administrator will have to run the EMBOSS program tfextract to set up the tfscan data files from the TRANSFAC distribution files.
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
To see the available EMBOSS data files, run:
% embossdata -showall
To fetch one of the data files (for example 'Exxx.dat') into your current directory for you to inspect or modify, run:
% embossdata -fetch -file Exxx.dat
Users can provide their own data files in their own directories. Project specific files can be put in the current directory, or for tidier directory listings in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory, or again in a subdirectory called ".embossdata".
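The directories are searched in the following order: the current directory, ".embossdata" under the current directory, the user's home directory, and ".embossdata" under the home directory.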
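A lookup of this kind can be sketched as follows. This is not EMBOSS code: the function names are hypothetical, and the order used (current directory, its ".embossdata" subdirectory, home directory, its ".embossdata" subdirectory) is an assumption based on the directories named in the preceding paragraph. Appending the standard EMBOSS_DATA directory as a final fallback is likewise an assumption.

```python
from pathlib import Path


def candidate_directories(cwd=None, home=None, emboss_data=None):
    """List the directories to search, most specific first.

    The ordering is an assumption for illustration, not taken from the
    EMBOSS source code.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    directories = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    if emboss_data is not None:
        # Assumed fallback: the standard EMBOSS data directory.
        directories.append(Path(emboss_data))
    return directories


def find_data_file(name, directories):
    """Return the first existing file called `name`, or None."""
    for directory in directories:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Because the first match wins, a copy of a data file in the project directory would take precedence over one in the home directory under this ordering.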
If tfscan reports that its TRANSFAC data is unavailable, contact your EMBOSS administrator and ask them to run the tfextract program to set up the TRANSFAC data for EMBOSS.