prettyplot

 

Function

Displays aligned sequences, with colouring and boxing

Description

prettyplot reads in a set of aligned DNA or protein sequences. It displays them graphically, with conserved regions highlighted in various ways.

Usage

Here are two sample sessions with prettyplot.

% prettyplot -resbreak=10 -boxcol -consensus -plurality=3
Displays aligned sequences, with colouring and boxing
Input sequence set: globin.msf
Graph type [x11]:

% prettyplot globin.msf -plurality=3 -docolour
Displays aligned sequences, with colouring and boxing
Graph type [x11]:
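
When prettyplot is driven from a script rather than typed interactively, the command line has to be assembled programmatically. The following is a minimal Python sketch of that step only. The helper name prettyplot_argv is hypothetical, nothing is executed, and the sketch assumes that boolean qualifiers are switched off with the "no" prefix shown in the qualifier list (for example -noname) and that -graph accepts the same -name=value form as the other qualifiers in the sample sessions.

```python
from typing import Dict, List, Optional


def prettyplot_argv(
    alignment: str,
    graph: str = "x11",
    flags: Optional[Dict[str, bool]] = None,
    values: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Build an argument list for prettyplot (hypothetical helper).

    Nothing is run here; the list can be passed to subprocess.run().
    """
    argv = ["prettyplot", alignment, f"-graph={graph}"]
    # Boolean qualifiers: True gives -name, False gives -noname.
    for name, enabled in (flags or {}).items():
        argv.append(f"-{name}" if enabled else f"-no{name}")
    # Valued qualifiers use the -name=value form from the sample sessions.
    for name, value in (values or {}).items():
        argv.append(f"-{name}={value}")
    return argv


if __name__ == "__main__":
    # Rebuilds the first sample session with the input file given up front.
    print(" ".join(prettyplot_argv(
        "globin.msf",
        flags={"boxcol": True, "consensus": True},
        values={"resbreak": 10, "plurality": 3},
    )))
```

Returning a list instead of a single string avoids shell quoting problems when file names contain spaces.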

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-msf]               seqset     File containing a sequence alignment
*  -graph              graph      Graph type

   Optional qualifiers:
   -residuesperline    integer    The number of residues to be displayed on
                                  each line
   -resbreak           integer    Residues before a space
   -[no]ccolours       boolean    Colour residues by their consensus value.
   -cidentity          string     Colour to display identical residues (RED)
   -csimilarity        string     Colour to display similar residues (GREEN)
   -cother             string     Colour to display other residues (BLACK)
   -docolour           boolean    Colour residues by table oily, amide etc.
   -[no]title          boolean    Display the title
   -shade              string     Set to BPLW for normal shading. For example,
                                  with pair = 1.5,1.0,0.5 and shade = BPLW:
                                  Residue score      Colour
                                  1.5 or over ...... BLACK (B)
                                  1.0 to 1.5 ....... BROWN (P)
                                  0.5 to 1.0 ....... WHEAT (L)
                                  under 0.5 ........ WHITE (W)
                                  The only four letters allowed are BPLW, in
                                  any order.
   -pair               string     Values to represent identical, similar and
                                  related
   -identity           integer    Only match those which are identical in all
                                  sequences.
   -[no]box            boolean    Display prettyboxes
   -boxcol             boolean    Colour the background in the boxes
   -boxcolval          string     Colour to be used for background. (GREY)
   -[no]name           boolean    Display the sequence names
   -maxnamelen         integer    Margin size for the sequence name.
   -[no]number         boolean    Display the residue number
   -[no]listoptions    boolean    Display the date and options used
   -plurality          float      Plurality check value (totweight/2)
   -consensus          boolean    Display the consensus
   -[no]collision      boolean    Allow collisions in calculating consensus
   -alternative        integer    Use alternative collisions routine
                                  0) Normal collision check. (default)
                                  1) checks identical scores with the max
                                  score found. So if any other residue matches
                                  the identical score then a collision has
                                  occurred.
                                  2) If another residue has a greater than or
                                  equal to matching score and these do not
                                  match then a collision has occurred.
                                  3) Checks all those not in the current
                                  consensus. If any of these give a top score
                                  for matching or identical scores then a
                                  collision has occurred.
   -matrixfile         matrix     This is the scoring matrix file used when
                                  comparing sequences. By default it is the
                                  file 'EBLOSUM62' (for proteins) or the file
                                  'EDNAFULL' (for nucleic sequences). These
                                  files are found in the 'data' directory of
                                  the EMBOSS installation.
   -showscore          integer    Print residue scores
   -portrait           boolean    Set page to Portrait

   Advanced qualifiers:
   -data               boolean    (no help text) boolean value

   General qualifiers:
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
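
The interaction between -pair and -shade described above can be illustrated with a short Python sketch. It is not taken from the prettyplot source. It assumes that the three -pair values are given in descending order, that each band includes its lower bound (the help text only states this explicitly for "1.5 or over"), and that the position of a letter in the -shade string selects the band it colours. The names shade_for_score and COLOURS are hypothetical.

```python
# Colour names for the four shade letters, as listed in the -shade help text.
COLOURS = {"B": "BLACK", "P": "BROWN", "L": "WHEAT", "W": "WHITE"}


def shade_for_score(score: float, pair: str = "1.5,1.0,0.5", shade: str = "BPLW") -> str:
    """Return the shade letter for one residue score (illustrative sketch)."""
    # The shade string must be a permutation of the four allowed letters.
    if sorted(shade) != sorted("BPLW"):
        raise ValueError("shade must contain exactly the letters B, P, L and W")
    thresholds = [float(value) for value in pair.split(",")]
    if len(thresholds) != 3:
        raise ValueError("pair must hold three comma-separated values")
    # Walk the bands from the highest threshold down; the first match wins.
    for letter, threshold in zip(shade, thresholds):
        if score >= threshold:
            return letter
    # Anything below the lowest threshold falls into the fourth band.
    return shade[3]


if __name__ == "__main__":
    for score in (1.7, 1.2, 0.6, 0.1):
        letter = shade_for_score(score)
        print(score, letter, COLOURS[letter])
```

Reordering the letters, for example shade="WLPB", reverses the colour scale without changing the thresholds.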


Mandatory qualifiers

| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| [-msf] (Parameter 1) | File containing a sequence alignment | Readable sequences | Required |
| -graph | Graph type | EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png | EMBOSS_GRAPHICS value, or x11 |

Optional qualifiers

| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| -residuesperline | The number of residues to be displayed on each line | Any integer value | 50 |
| -resbreak | Residues before a space | Integer 1 or more | Same as -residuesperline to give no breaks |
| -[no]ccolours | Colour residues by their consensus value. | Yes/No | Yes |
| -cidentity | Colour to display identical residues (RED) | Any string is accepted | RED |
| -csimilarity | Colour to display similar residues (GREEN) | Any string is accepted | GREEN |
| -cother | Colour to display other residues (BLACK) | Any string is accepted | BLACK |
| -docolour | Colour residues by table oily, amide etc. | Yes/No | No |
| -[no]title | Display the title | Yes/No | Yes |
| -shade | Set to BPLW for normal shading. For pair = 1.5,1.0,0.5 and shade = BPLW, residue scores map to colours as follows: 1.5 or over: BLACK (B); 1.0 to 1.5: BROWN (P); 0.5 to 1.0: WHEAT (L); under 0.5: WHITE (W). The only four letters allowed are BPLW, in any order. | Any string is accepted | An empty string is accepted |
| -pair | Values to represent identical, similar and related | Any string is accepted | 1.5,1.0,0.5 |
| -identity | Only match those which are identical in all sequences. | Integer 0 or more | 0 |
| -[no]box | Display prettyboxes | Yes/No | Yes |
| -boxcol | Colour the background in the boxes | Yes/No | No |
| -boxcolval | Colour to be used for background. (GREY) | Any string is accepted | GREY |
| -[no]name | Display the sequence names | Yes/No | Yes |
| -maxnamelen | Margin size for the sequence name. | Any integer value | 10 |
| -[no]number | Display the residue number | Yes/No | Yes |
| -[no]listoptions | Display the date and options used | Yes/No | Yes |
| -plurality | Plurality check value (totweight/2) | Any numeric value | Half the total sequence weighting |
| -consensus | Display the consensus | Yes/No | No |
| -[no]collision | Allow collisions in calculating consensus | Yes/No | Yes |
| -alternative | Use alternative collisions routine. 0) Normal collision check. (default) 1) Checks identical scores with the max score found. So if any other residue matches the identical score then a collision has occurred. 2) If another residue has a greater than or equal to matching score and these do not match then a collision has occurred. 3) Checks all those not in the current consensus. If any of these give a top score for matching or identical scores then a collision has occurred. | Integer from 0 to 3 | 0 |
| -matrixfile | This is the scoring matrix file used when comparing sequences. By default it is the file 'EBLOSUM62' (for proteins) or the file 'EDNAFULL' (for nucleic sequences). These files are found in the 'data' directory of the EMBOSS installation. | Comparison matrix file in EMBOSS data path | EBLOSUM62 for protein, EDNAFULL for DNA |
| -showscore | Print residue scores | Any integer value | -1 |
| -portrait | Set page to Portrait | Yes/No | No |

Advanced qualifiers

| Qualifier | Description | Allowed values | Default |
|---|---|---|---|
| -data | (no help text) boolean value | Yes/No | No |
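
The effect of -residuesperline and -resbreak on the layout can be illustrated with a small Python sketch. It only mimics the text layout of a single sequence and is not how prettyplot renders its graphics. The function name layout_rows is hypothetical, and the sketch assumes that a space is inserted after every -resbreak residues within a row, with -resbreak defaulting to -residuesperline so that no breaks appear.

```python
from typing import List, Optional


def layout_rows(
    sequence: str,
    residuesperline: int = 50,
    resbreak: Optional[int] = None,
) -> List[str]:
    """Split one aligned sequence into display rows (illustrative sketch)."""
    if residuesperline < 1:
        raise ValueError("residuesperline must be 1 or more in this sketch")
    # Default: same as residuesperline, which gives rows without breaks.
    if resbreak is None:
        resbreak = residuesperline
    if resbreak < 1:
        raise ValueError("resbreak must be 1 or more")
    rows = []
    for start in range(0, len(sequence), residuesperline):
        row = sequence[start:start + residuesperline]
        # Cut the row into groups of resbreak residues, joined by one space.
        groups = [row[i:i + resbreak] for i in range(0, len(row), resbreak)]
        rows.append(" ".join(groups))
    return rows


if __name__ == "__main__":
    for line in layout_rows("ACDEFGHIKLMNPQRSTVWY" * 3, residuesperline=30, resbreak=10):
        print(line)
```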

Input file format

Any set of aligned sequences, specified by a USA (Uniform Sequence Address).

Output file format

An image of the alignment is displayed.

Data files

Prettyplot uses a comparison matrix file to calculate similarity to the consensus.

For protein sequences, EBLOSUM62 is used as the substitution matrix. For nucleotide sequences, EDNAFULL is used.

EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.

Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".

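The directories are searched in the following order:

. (your current directory)
.embossdata (under your current directory)
~/ (your home directory)
~/.embossdata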
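
The lookup described in this section can be sketched in Python. This is a minimal sketch, not the EMBOSS implementation. It assumes that the user directories are tried in the order current directory, .embossdata under the current directory, home directory, .embossdata under the home directory, and that the standard EMBOSS_DATA directory is used as a final fallback. The function name find_data_file is hypothetical.

```python
import os
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def find_data_file(
    name: str,
    cwd: PathLike,
    home: PathLike,
    emboss_data: Optional[PathLike] = None,
) -> Optional[Path]:
    """Return the first existing copy of a data file, or None (sketch)."""
    candidates = [
        Path(cwd),                   # project-specific files
        Path(cwd) / ".embossdata",   # tidier project-specific location
        Path(home),                  # files for all EMBOSS runs
        Path(home) / ".embossdata",  # tidier per-user location
    ]
    # Assumption: the standard data directory is consulted last.
    if emboss_data:
        candidates.append(Path(emboss_data))
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_data_file(
        "EBLOSUM62",
        cwd=os.getcwd(),
        home=Path.home(),
        emboss_data=os.environ.get("EMBOSS_DATA"),
    )
    print(found if found else "EBLOSUM62 not found in the searched directories")
```

Passing the directories in as arguments, rather than reading them inside the function, keeps the search order easy to exercise against temporary directories.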

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It exits with status 0 unless an error is reported.

Known bugs

Portrait mode does not cover the whole page. This is a known "feature" of the underlying plplot graphics library.

See also

Program name   Description
abiview        Reads ABI file and display the trace
cirdna         Draws circular maps of DNA constructs
emma           Multiple alignment program - interface to ClustalW program
infoalign      Information on a multiple sequence alignment
lindna         Draws linear maps of DNA constructs
pepnet         Displays proteins as a helical net
pepwheel       Shows protein sequences as helices
plotcon        Plots the quality of conservation of a sequence alignment
prettyseq      Output sequence with translated ranges
remap          Display a sequence with restriction cut sites, translation etc
seealso        Finds programs sharing group names
showalign      Displays a multiple sequence alignment
showdb         Displays information on the currently available databases
showfeat       Show features of a sequence
showseq        Display a sequence with features, translation etc
textsearch     Search sequence documentation text. SRS and Entrez are faster!
tranalign      Align nucleic coding regions given the aligned proteins

Author(s)

This application was written by Ian Longden (il@sanger.ac.uk) Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

Many features were first implemented in the EGCG program "prettyplot" by Peter Rice.

The original suggestions for the PrettyPlot program were from Denis Duboule and Sigfried Labeit at EMBL. Gert Vriend added the star marking. Rita Grandori suggested the -NOCOLLISION option.

History

Completed 5th May 1999.

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments