helixturnhelix

 

Function

Report nucleic acid binding motifs

Description

helixturnhelix finds helix-turn-helix nucleic acid binding motifs in protein sequences, using the method of Dodd and Egan (see References).

The helix-turn-helix motif was originally identified as the DNA-binding domain of phage repressors. One alpha-helix lies in the major (wide) groove of the DNA; the other lies at an angle across the DNA.
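
As a rough illustration of a position-specific scan of this kind, the sketch below slides a fixed-width window along a sequence and sums one weight per position. It is not the program's implementation: the function name scan_windows, the 3-position toy weights and the toy sequence are all hypothetical, and the real motif is 22 residues wide with weights derived from the data file shown under "Data files".

```python
def scan_windows(sequence, weights, width):
    """Yield (start, score) for every window; start is 1-based."""
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        # Sum the weight of each residue at its position in the window;
        # residues with no entry at that position contribute 0.
        score = sum(weights[pos].get(res, 0.0) for pos, res in enumerate(window))
        yield i + 1, score


# Hypothetical 3-position weight table, for illustration only.
toy_weights = [
    {"A": 1.0, "G": -1.0},
    {"G": 2.0},
    {"V": 1.5, "A": -0.5},
]

hits = list(scan_windows("AGVAG", toy_weights, 3))
best = max(hits, key=lambda hit: hit[1])
print(hits)
print("best window starts at residue", best[0])
```

Each window gets a single score, so the highest-scoring start position can be reported directly, which is how the sample output later in this page presents its hit.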

Usage

Here is a sample session with helixturnhelix.

% helixturnhelix
Input sequence: sw:laci_ecoli
Output file [laci_ecoli.hth]: 

Command line arguments

   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
  [-outfile]           report     Output report file name

   Optional qualifiers:
   -mean               float      Mean value
   -sd                 float      Standard Deviation value
   -minsd              float      Minimum SD
   -eightyseven        boolean    Use the old (1987) weight data

   Advanced qualifiers: (none)
   General qualifiers:
  -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose


Mandatory qualifiers  Description                   Allowed values                   Default
[-sequence]           Sequence database USA         Readable sequence(s)             Required
(Parameter 1)
[-outfile]            Output report file name       Report file
(Parameter 2)

Optional qualifiers   Description                   Allowed values                   Default
-mean                 Mean value                    Number from 1.000 to 10000.000   238.71
-sd                   Standard Deviation value      Number from 1.000 to 10000.000   293.61
-minsd                Minimum SD                    Number from 0.000 to 100.000     2.5
-eightyseven          Use the old (1987) weight     Yes/No                           No
                      data

Advanced qualifiers   Description                   Allowed values                   Default
(none)
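
For use from a script, the qualifiers listed above can be assembled into a command line. The following is a minimal Python sketch, not part of EMBOSS: build_command and run are hypothetical helper names, and actually running the command assumes helixturnhelix is installed on the PATH and that the sw database is configured.

```python
import subprocess


def build_command(sequence, outfile, mean=None, sd=None, minsd=None,
                  eightyseven=False):
    """Return the helixturnhelix argument list for the given settings."""
    cmd = ["helixturnhelix", "-sequence", sequence, "-outfile", outfile]
    # Optional numeric qualifiers are added only when a value is supplied,
    # so the program's own defaults apply otherwise.
    for name, value in (("-mean", mean), ("-sd", sd), ("-minsd", minsd)):
        if value is not None:
            cmd += [name, str(value)]
    if eightyseven:
        cmd.append("-eightyseven")
    return cmd


def run(cmd):
    """Run the command; needs an EMBOSS installation (not called here)."""
    return subprocess.run(cmd, check=True)


cmd = build_command("sw:laci_ecoli", "laci_ecoli.hth", minsd=3.0)
print(" ".join(cmd))
```

Keeping the command construction separate from the call that executes it makes the argument list easy to inspect or log before anything is run.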

Input file format

The input sequence can be one or more protein sequences.

Output file format

The output is a standard EMBOSS report file.

The results can be output in one of several styles by using the command-line qualifier -rformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel, feattable, motif, regions, seqtable, simple, srs, table, tagseq.

See: http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for further information on report formats.

By default helixturnhelix writes a 'motif' report file.

Here is a sample output:


########################################
# Program: helixturnhelix
# Rundate: Mon Feb 11 13:45:02 2002
# Report_file: laci_ecoli.hth
########################################

#=======================================
#
# Sequence: LACI_ECOLI     from: 1   to: 360
# HitCount: 1
#
# Hits above +2.50 SD (972.73)
#
#=======================================

Maximum_score_at at "*"

(1) Score 2160.000 length 22 at residues 4->25
            *
 Sequence:  VTLYDVAEYAGVSYQTVSRVVN
            |                    |
            4                    25
Standard_deviations: 6.54

#---------------------------------------
#---------------------------------------
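
The figures in this sample are consistent with a simple relationship between the raw score and the default -mean and -sd values: (2160 - 238.71) / 293.61 is about 6.54, and 238.71 + 2.5 * 293.61 is about 972.73. The Python sketch below reads the hit line and repeats that arithmetic. It is an illustration inferred from the sample, not the program's code; parse_hit, sd_units and threshold are hypothetical names.

```python
import re

# Pattern for the hit line of the 'motif' report shown in the sample.
HIT_LINE = re.compile(
    r"\((\d+)\) Score ([\d.]+) length (\d+) at residues (\d+)->(\d+)"
)


def parse_hit(line):
    """Return the fields of a hit line as a dict, or None if no match."""
    m = HIT_LINE.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group(1)),
        "score": float(m.group(2)),
        "length": int(m.group(3)),
        "start": int(m.group(4)),
        "end": int(m.group(5)),
    }


def sd_units(score, mean=238.71, sd=293.61):
    """Express a raw score in standard deviations above the mean."""
    return (score - mean) / sd


def threshold(minsd=2.5, mean=238.71, sd=293.61):
    """Raw score corresponding to the -minsd cut-off."""
    return mean + minsd * sd


hit = parse_hit("(1) Score 2160.000 length 22 at residues 4->25")
z = sd_units(hit["score"])
print(f"{z:.2f} SD above the mean")
```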

Data files

The data files are stored in the standard EMBOSS data directory. With care, these can be replaced to suit your data sets. Replacement files placed in a directory that EMBOSS searches before its distribution data directory are used in preference to the distributed files.

Here is the default file:

# Amino acid counts for 91 Helix-turn-helix (presumed) protein motifs
# from Dodd IB and Egan JB (1990) Nucl. Acids. Res. 18:5019-5026.
#
Sample: 91 aligned sequences
#
# R  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Total Exp
# - -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ----- ---
  A  2  1  3 14 10 12 75  6 15  9  1  1  4  3  8 15  4  4  4 11  0 10   212 995
  C  0  0  1  1  0  0  0  0  0  3  3  1  1  0  0  0  0  0  0  1  0  3    14 106
  D  0  1  0  1 14  0  0 14  1  0  5  0  1  2  0  0  0  0  1  1  0  2    43 556
  E  4  5  0 11 26  0  0 16  9  3  3  0  3 12 13  0  0  2  0  1 13  6   127 669
  F  4  0  4  0  0  4  0  1  0 10  0  0  0  0  1  0  0  1  1  1 22  0    49 358
  G  9  7  1  4  0  0  8  0  0  0 50  0  6  0  7  1  0  3  1  1  0  4   102 761
  H  4  3  1  1  2  0  0  3  2  0  5  0  3  3  0  2  0  2  4  5  0  2    42 225
  I 10  0 13  3  2 15  0  4  9  4  0 17  0  2  0  1 31  1  4  8 16  1   141 583
  K  4  4  6 11 12  1  1 14 11  0  5  2  2  7  2  1  0  5  8  4  5 15   120 516
  L 16  1 17  0  1 35  0  3 12 31  0 22  0  2  1  1 22  1  1 12 20  0   198 954
  M  7  0  2  1  1  1  0  0  5  7  1 10  0  0  2  0  2  0  0  2  0  1    42 275
  N  0  8  0  1  0  0  0  2  1  1 14  0  8  1  4  2  0  4  9  0  0 11    66 383
  P  1  6  0  1  0  0  0  0  0  0  0  0  3 13  7  0  0  0  0  0  0  3    34 403
  Q  2  1 21  9 11  0  0  9  8  0  0  2  1 17  7 12  0  3 12  5  3  9   132 437
  R  9 10 14  9  5  0  1 16 10  0  1  0  1 17  8  7  0 17 28  3  0 16   172 609
  S  2 17  0  8  4  1  6  1  2  2  3  0 37  1 25  5  0 29  3  0  1  5   152 552
  T  6 24  3 12  1  5  0  2  2  4  0  5 20  4  3 39  0  4  1  0  4  3   142 512
  V  7  3  1  1  2 16  0  0  2 12  0 29  0  5  3  3 32  0  7  8  7  0   138 724
  W  2  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  2 21  0  0    27 105
  Y  2  0  4  3  0  1  0  0  2  4  0  1  1  2  0  2  0 15  5  7  0  0    49 267
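
When editing a replacement file it is easy to break the arithmetic of a row. The Python sketch below parses rows in the layout shown above and reports any residue whose 22 position counts do not add up to its Total column. Whether this is the same check that triggers the program's own warning is an assumption; parse_counts and inconsistent_rows are hypothetical names, and only three rows of the file are embedded to keep the example short.

```python
SAMPLE_ROWS = """\
Sample: 91 aligned sequences
# R  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Total Exp
  A  2  1  3 14 10 12 75  6 15  9  1  1  4  3  8 15  4  4  4 11  0 10   212 995
  C  0  0  1  1  0  0  0  0  0  3  3  1  1  0  0  0  0  0  0  1  0  3    14 106
  W  2  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  2 21  0  0    27 105
"""


def parse_counts(text, width=22):
    """Map residue -> (position counts, total, expected) for each data row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines and the 'Sample:' header line.
        if not fields or fields[0].startswith("#") or fields[0] == "Sample:":
            continue
        numbers = [int(value) for value in fields[1:]]
        rows[fields[0]] = (numbers[:width], numbers[width], numbers[width + 1])
    return rows


def inconsistent_rows(rows):
    """Return residues whose position counts do not sum to the Total column."""
    return [res for res, (counts, total, _) in rows.items()
            if sum(counts) != total]


rows = parse_counts(SAMPLE_ROWS)
print("rows with mismatched totals:", inconsistent_rows(rows))
```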

Notes

None.

References

  1. Dodd I.B., Egan J.B. (1987) "Systematic method for the detection of potential lambda cro-like DNA-binding regions in proteins." J. Mol. Biol. 194: 557-564.
  2. Dodd I.B., Egan J.B. (1990) "Improved detection of helix-turn-helix DNA-binding motifs in protein sequences." Nucleic Acids Res. 18: 5019-5026.

Warnings

The program will warn you if the data file is not mathematically accurate.

Diagnostic Error Messages

None.

Exit status

It exits with status 0 unless an error is reported.

Known bugs

None.

See also

Program name    Description
antigenic       Finds antigenic sites in proteins
digest          Protein proteolytic enzyme or reagent cleavage digest
fuzzpro         Protein pattern search
fuzztran        Protein pattern search after translation
garnier         Predicts protein secondary structure
hmoment         Hydrophobic moment calculation
oddcomp         Finds protein sequence regions with a biased composition
patmatdb        Search a protein sequence with a motif
patmatmotifs    Search a PROSITE motif database with a protein sequence
pepcoil         Predicts coiled coil regions
pepnet          Displays proteins as a helical net
pepwheel        Shows protein sequences as helices
preg            Regular expression search of a protein sequence
pscan           Scans proteins using PRINTS
sigcleave       Reports protein signal cleavage sites
tmap            Displays membrane spanning regions

Author(s)

This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk)

Original program "HELIXTURNHELIX" by Peter Rice (EGCG 1990)

History

Completed 11th March 1999

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments