siggen
Each position in the alignment is scored using a single scoring scheme or any combination of up to 3 scoring schemes. For example, a signature of 10% sparsity includes data from the highest-scoring 10% of alignment positions.
The resulting protein signature file is used by the application sigscan to find examples of the signature in other proteins.
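The sparsity selection described above can be illustrated with a minimal Python sketch. It is not siggen's implementation: the function name `select_signature_positions`, the rounding-up of the position count, and the tie-breaking rule are all assumptions made for illustration, and the per-position scores are taken as given.

```python
def select_signature_positions(scores, sparsity_percent=10):
    """Return the indices of the highest-scoring alignment positions.

    scores: one score per alignment position (how they are computed is
    outside the scope of this sketch).
    sparsity_percent: integer percentage of positions to keep.
    """
    if not 0 < sparsity_percent <= 100:
        raise ValueError("sparsity_percent must be between 1 and 100")
    # Number of positions to keep. Rounding up is an assumption of this
    # sketch, not documented siggen behaviour.
    keep = (len(scores) * sparsity_percent + 99) // 100
    # Rank positions by score, highest first. Python's sort is stable, so
    # tied positions stay in alignment order (also an assumption).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Report the selected positions in alignment order.
    return sorted(ranked[:keep])


if __name__ == "__main__":
    example_scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6, 0.5, 0.0]
    print(select_signature_positions(example_scores, 30))
```

With ten positions and a sparsity of 30, the sketch keeps the three positions with the highest scores.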
% siggen
Generates a sparse protein signature
Location of alignment files for input [./]: ./jontest
Extension of alignment files for input [.align]:
Location of contact files for input [./]: ./jontest
Extension of contact files [.con]:
% sparsity of signature [10]:
Generate a randomized signature [N]:
Substitution matrix to be used [./EBLOSUM62]:
Score alignment on basis of residue conservation [Y]:
Score alignment on basis of number of contacts [Y]:
Score alignment on basis of conservation of contacts [Y]: N
Score alignment on a combined measure of number and conservation of contacts [N]:
Ignore alignment postitions with post_similar value of 0 [Y]:
Name of signature file for output [sig.sig]:
Mandatory qualifiers (* if not always prompted):
  [-algpath]     string    Location of scop structure-based sequence alignment files (input)
  [-algextn]     string    Extension of alignment files
   -sparsity     integer   % sparsity of signature
*  -seqoption    menu      Select number
*  -datafile     matrixf   This is the scoring matrix file used when comparing sequences.
*  -conoption    menu      Select number
*  -filtercon    boolean   Ignore alignment positions making less than a threshold number of contacts
*  -conthresh    integer   Threshold contact number
*  -conpath      string    Location of contact files (input)
*  -conextn      string    Extension of contact files
*  -cpdbpath     string    Location of domain coordinate files (embl format input)
*  -cpdbextn     string    Extension of coordinate files
*  -filterpsim   boolean   Ignore alignment postitions with post_similar value of 0
  [-sigpath]     string    Location of signature files (output)
  [-sigextn]     string    Extension of signature files
Optional qualifiers: (none)
Advanced qualifiers:
   -randomise    boolean   Generate a randomised signature
General qualifiers:
   -help         boolean   Report command line options. More information on associated and general qualifiers can be found with -help -verbose
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Mandatory qualifiers | | | |
[-algpath] (Parameter 1) | Location of scop structure-based sequence alignment files (input) | Any string is accepted | ./ |
[-algextn] (Parameter 2) | Extension of alignment files | Any string is accepted | .salign |
-sparsity | % sparsity of signature | Any integer value | 10 |
-seqoption | Select number | | 3 |
-datafile | This is the scoring matrix file used when comparing sequences. | Comparison matrix file in EMBOSS data path | EBLOSUM62 |
-conoption | Select number | | 4 |
-filtercon | Ignore alignment positions making less than a threshold number of contacts | Yes/No | No |
-conthresh | Threshold contact number | Any integer value | 10 |
-conpath | Location of contact files (input) | Any string is accepted | ./ |
-conextn | Extension of contact files | Any string is accepted | .con |
-cpdbpath | Location of domain coordinate files (embl format input) | Any string is accepted | ./ |
-cpdbextn | Extension of coordinate files | Any string is accepted | .pxyz |
-filterpsim | Ignore alignment postitions with post_similar value of 0 | Yes/No | No |
[-sigpath] (Parameter 3) | Location of signature files (output) | Any string is accepted | ./ |
[-sigextn] (Parameter 4) | Extension of signature files | Any string is accepted | .sig |
Optional qualifiers | (none) | | |
Advanced qualifiers | | | |
-randomise | Generate a randomised signature | Yes/No | No |
Example excerpt from an output signature file:
CL All beta proteins
XX
FO Lipocalins
XX
SF Lipocalins
XX
FA Fatty acid binding protein-like
XX
NP 2
XX
NN [1]
XX
IN NRES 3 ; NGAP 2 ; WSIZ 2
XX
AA A ; 2
AA V ; 1
AA L ; 4
XX
GA 1 ; 5
GA 2 ; 2
XX
NN [2]
XX
IN NRES 2 ; NGAP 2 ; WSIZ 5
XX
AA F ; 1
AA Y ; 5
XX
GA 12 ; 3
GA 10 ; 2
XX
//
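As a reading aid, the record layout of the excerpt can be walked through with a small Python sketch. This is not part of EMBOSS: the function name `parse_signature` and the dictionary layout are hypothetical, and the parser relies only on what the excerpt shows (two-letter record codes, `XX` spacer lines, `NN [n]` opening each signature position, `;`-separated fields, and `//` as terminator). The meaning attached to each code is not defined here and is not assumed.

```python
def parse_signature(text):
    """Parse signature-file text into a dict (layout is hypothetical)."""
    signature = {"header": {}, "positions": []}
    current = None  # the signature position currently being filled
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "XX":   # blank lines and XX spacers
            continue
        if line == "//":               # end-of-record marker
            break
        code, _, value = line.partition(" ")
        value = value.strip()
        if code == "NN":               # e.g. "NN [1]" opens a new position
            current = {"number": int(value.strip("[]")),
                       "info": {}, "residues": [], "gaps": []}
            signature["positions"].append(current)
        elif code == "IN" and current is not None:
            # e.g. "IN NRES 3 ; NGAP 2 ; WSIZ 2"
            for field in value.split(";"):
                name, number = field.split()
                current["info"][name] = int(number)
        elif code == "AA" and current is not None:
            # e.g. "AA A ; 2" gives the pair ("A", 2)
            residue, count = (part.strip() for part in value.split(";"))
            current["residues"].append((residue, int(count)))
        elif code == "GA" and current is not None:
            # e.g. "GA 12 ; 3" gives the pair (12, 3)
            gap, count = (part.strip() for part in value.split(";"))
            current["gaps"].append((int(gap), int(count)))
        elif code == "NP":
            signature["header"]["NP"] = int(value)
        else:                          # CL, FO, SF, FA and any other header line
            signature["header"][code] = value
    return signature


SAMPLE = """\
FA Fatty acid binding protein-like
XX
NP 1
XX
NN [1]
XX
IN NRES 3 ; NGAP 2 ; WSIZ 2
XX
AA A ; 2
AA V ; 1
AA L ; 4
XX
GA 1 ; 5
GA 2 ; 2
XX
//
"""

if __name__ == "__main__":
    print(parse_signature(SAMPLE))
```

Keeping the `AA` and `GA` records as lists of pairs preserves the order in which they appear in the file.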
Important
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
To see the available EMBOSS data files, run:
% embossdata -showall
To fetch one of the data files (for example 'Exxx.dat') into your current directory for you to inspect or modify, run:
% embossdata -fetch -file Exxx.dat
Users can provide their own data files in their own directories. Project specific files can be put in the current directory, or for tidier directory listings in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory, or again in a subdirectory called ".embossdata".
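The directories are searched in the following order: the current directory, the ".embossdata" subdirectory of the current directory, the user's home directory, then the ".embossdata" subdirectory of the home directory.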
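The lookup of user-supplied data files can be sketched in Python. This is an illustration rather than EMBOSS code: the function name `find_data_file` is hypothetical, the search order (current directory, its ".embossdata" subdirectory, home directory, its ".embossdata" subdirectory) is assumed from the order in which the locations are named above, and the fallback to the standard EMBOSS data directory is deliberately left out.

```python
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first user-supplied copy of a data file, or None.

    cwd and home default to the real current and home directories; they
    are parameters only so the sketch can be tried on scratch directories.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    # Assumed search order; see the note above.
    candidates = [
        cwd,
        cwd / ".embossdata",
        home,
        home / ".embossdata",
    ]
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    # Not found in any user directory. A real run would then use the
    # distributed copy in the EMBOSS data directory (not modelled here).
    return None


if __name__ == "__main__":
    print(find_data_file("EBLOSUM62"))
```

Under this assumed order, a project-specific copy of a file such as EBLOSUM62 in the current directory takes precedence over a copy kept in the home directory.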
Program name | Description |
---|---|
contacts | Reads coordinate files and writes files of intra-chain residue-residue contact data |
dichet | Parse dictionary of heterogen groups |
hmmgen | Generates a hidden Markov model for each alignment in a directory |
interface | Reads coordinate files and writes files of inter-chain residue-residue contact data |
profgen | Generates various profiles for each alignment in a directory |
psiblasts | Runs PSI-BLAST given scopalign alignments |
scopalign | Generate alignments for families in a scop classification file by using STAMP |
scoprep | Reorder scop classification file so that the representative structure of each family is given first |
scopreso | Removes low resolution domains from a scop classification file |
seqalign | Generate extended alignments for families in a scop families file by using CLUSTALW with seed alignments |
seqsearch | Generate files of hits for families in a scop classification file by using PSI-BLAST with seed alignments |
seqsort | Reads multiple files of hits and writes a non-ambiguous file of hits (scop families file) plus a validation file |
seqwords | Generate file of hits for scop families by searching swissprot with keywords |
sigscan | Scans a signature against swissprot and writes a signature hits file |