GeneMark.hmm Manual
(eukaryotic)










The Eukaryotic GeneMark.hmm Output gmhmmp
Interpreting the Eukaryotic GeneMark.hmm Output

GeneMark.hmm output contains the predicted protein-coding exon boundaries and the predicted proteins. The output file is divided into three sections, as follows:

The Output Header

Each output generated by GeneMark.hmm has a header describing the parameters and matrix used in the analysis. This information is purely for recordkeeping purposes. Here's a sample header:
GeneMark.hmm (Version 1.0.0)
Sequence name: /test-human/humhbb/sequence-humhbb.txt
Sequence length: 73308 bp
G+C content: 39.46%
Matrices file: /test-sequences/human.mtx (Homo sapiens)
Fri May 14 16:27:47 1999
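If the header needs to be kept alongside other run records, it can be read programmatically. The following is a minimal Python sketch, assuming the header always uses the "Key: value" layout of the sample above; the function name parse_header is hypothetical and not part of GeneMark.hmm.

```python
# Sample header copied from the manual text above.
SAMPLE_HEADER = """\
GeneMark.hmm (Version 1.0.0)
Sequence name: /test-human/humhbb/sequence-humhbb.txt
Sequence length: 73308 bp
G+C content: 39.46%
Matrices file: /test-sequences/human.mtx (Homo sapiens)
Fri May 14 16:27:47 1999
"""


def parse_header(text):
    """Collect the 'Key: value' lines of a header into a dict.

    Lines without a colon followed by a space (the version line and
    the timestamp) are skipped. This layout is an assumption based on
    the single sample header.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


header = parse_header(SAMPLE_HEADER)
seq_length = int(header["Sequence length"].split()[0])   # drop the "bp" unit
gc_percent = float(header["G+C content"].rstrip("%"))    # drop the "%" sign
print(seq_length, gc_percent)
```

The numeric fields are converted separately because the raw values carry units ("bp", "%").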

The Predicted Exon Boundaries

This section describes the predicted exons. The 'Gene #' column is the sequential gene number. 'Exon #' is the position of the exon within the current gene. 'Strand' indicates the strand on which the gene was found: '+' is the direct strand, '-' the complementary strand. 'Exon Type' is one of four values: initial, internal, terminal, or single. The 'Exon Range' columns give the exon boundaries relative to the beginning of the sequence (the 5' end of the direct strand). 'Exon Length' is the exon length in nucleotides. The 'Start/End Frame' columns give the codon positions (1, 2, or 3) of the exon boundaries.

Predicted genes/exons

Gene #  Exon #  Strand  Exon Type  Exon Range (start end)  Exon Length  Start/End Frame
               
1 1 + Terminal 6168 6449 282 1 3
               
2 3 - Terminal 13450 13528 79 3 3
2 2 - Internal 16097 16311 215 1 2
2 1 - Initial 16436 16468 33 1 3
               
3 1 + Initial 19541 19632 92 1 2
3 2 + Terminal 19755 20169 415 3 3
               
4 1 + Initial 34531 34622 92 1 2
4 2 + Internal 34745 34967 223 3 3
4 3 + Terminal 35854 35982 129 1 3
               
5 1 + Initial 39467 39558 92 1 2
5 2 + Internal 39681 39903 223 3 3
5 3 + Terminal 40770 40898 129 1 3
               
6 1 + Initial 45995 46144 150 1 3
6 2 + Internal 47314 47417 104 1 2
6 3 + Terminal 50485 50572 88 3 3
               
7 1 + Initial 54790 54881 92 1 2
7 2 + Internal 55010 55232 223 3 3
7 3 + Terminal 60474 60557 84 1 3
               
8 1 + Initial 62187 62278 92 1 2
8 2 + Internal 62409 62631 223 3 3
8 3 + Terminal 63482 63610 129 1 3
               
9 1 + Initial 68183 68396 214 1 1
9 2 + Terminal 68586 68746 161 2 3
               
10 1 + Single 68770 69078 309 1 3
               
11 1 + Single 70355 70819 465 1 3
               
12 1 + Initial 72905 73053 149 1 2
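
The columns of the table can be cross-checked against each other. The Python sketch below parses a few rows copied from the sample and tests two relations: that the exon length equals end - start + 1, and that the end frame follows from the start frame and the length. The second relation is inferred from the sample rows, not stated by the program's documentation, so treat it as an assumption. The names Exon, parse_exons, length_ok and frame_ok are hypothetical.

```python
from collections import namedtuple

Exon = namedtuple(
    "Exon",
    "gene exon strand exon_type start end length start_frame end_frame",
)

# A few rows copied from the sample table above.
SAMPLE_ROWS = """\
1 1 + Terminal 6168 6449 282 1 3
2 3 - Terminal 13450 13528 79 3 3
2 2 - Internal 16097 16311 215 1 2
2 1 - Initial 16436 16468 33 1 3
3 1 + Initial 19541 19632 92 1 2
3 2 + Terminal 19755 20169 415 3 3
"""


def parse_exons(text):
    """Parse whitespace-separated exon rows; skip blank or header lines."""
    exons = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 9 or not parts[0].isdigit():
            continue
        start, end, length, f_start, f_end = map(int, parts[4:])
        exons.append(Exon(int(parts[0]), int(parts[1]), parts[2], parts[3],
                          start, end, length, f_start, f_end))
    return exons


def length_ok(e):
    """Exon length should cover both boundary positions inclusively."""
    return e.length == e.end - e.start + 1


def frame_ok(e):
    """Assumed relation: codon position advances by one per nucleotide."""
    return e.end_frame == (e.start_frame - 1 + e.length - 1) % 3 + 1


exons = parse_exons(SAMPLE_ROWS)
for e in exons:
    print(e.gene, e.exon, length_ok(e), frame_ok(e))

# Total coding length per gene, in nucleotides.
coding_length = {}
for e in exons:
    coding_length[e.gene] = coding_length.get(e.gene, 0) + e.length
```

For gene 3 the exon lengths sum to 507 nucleotides, or 169 codons, one more than the 168 amino acids reported for that gene in the protein section. This is consistent with the stop codon being included in the terminal exon range, although the manual does not say so explicitly.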
               
Predicted Protein Sequences:

Each sequence has a header line that contains the sequence file name, the program name, the gene number, and the number of amino acids, separated by '|' characters.

>sequence-humhbb.txt|GeneMark.hmm|gene 1|93_aa
NHQVVRLGCRPSSATSEDSVFSTAKHKLRYCGCEKLEVDIPALWPLLLTFTSWRLEVVVQ
ATVADHTSSTIIAFLQESLREKKVKKNLETTSE

>sequence-humhbb.txt|GeneMark.hmm|gene 2|108_aa
MKAVALPQNLNSMDTSLLLDSEYGVDSLLLPRSQLQSPHFLLSLLPMVPDLACIQGSDPF
HVSLWLKVVGRSFKKGYSIERPRLGMVIAGQRQKLCVDIDKSSDYAEL

>sequence-humhbb.txt|GeneMark.hmm|gene 3|168_aa
MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKVSSGAGDVIFWLYIL
TLIEAHNLIGKTNKDLRNHGSSLMLEQQTSSEHNQNLHDSELVTVKDY

>sequence-humhbb.txt|GeneMark.hmm|gene 4|147_aa
MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
VKAHGKKVLTSLGDAIKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
KEFTPEVQASWQKMVTGVASALSSRYH

>sequence-humhbb.txt|GeneMark.hmm|gene 5|147_aa
MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
VKAHGKKVLTSLGDAIKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
KEFTPEVQASWQKMVTAVASALSSRYH

>sequence-humhbb.txt|GeneMark.hmm|gene 6|113_aa
MGNPKVKAHGKKVLISFGKAVMLTDDLKGTFATLSDLHCNKLHVDPENFLPKGRTISDGN
ENVGEWEFKDREDTFLQSCKKRENSQCLPLQNVHATERVRKPGKCQFLKYREH

>sequence-humhbb.txt|GeneMark.hmm|gene 7|132_aa
MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRIAIEEPNTFCVCENN
QSEIFSQVPDEG

>sequence-humhbb.txt|GeneMark.hmm|gene 8|147_aa
MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK
VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG
KEFTPPVQAAYQKVVAGVANALAHKYH

>sequence-humhbb.txt|GeneMark.hmm|gene 9|124_aa
MEQSWAENDFDELREEGFRRSNYSKLKEEVRTNGKEVKNFEKKLDEWITRITNAQKSLKD
LMELKTKAGELPESDGENGTKLENTLQDIIQENFPNLARQPKFTFRKYRERHKDTPREKQ
LQDT

>sequence-humhbb.txt|GeneMark.hmm|gene 10|102_aa
MKEKMLRAAREKGRVTHKGKPIRLTADLSAETLQARRKWGPIFNIVKEKNFRPRISYPAK
LSFISIGEIKSFTDKQMLRDFVTTRPALQELLKEALNMERNN

>sequence-humhbb.txt|GeneMark.hmm|gene 11|154_aa
MTRGITTDPTEIQTTVREYYKHLYANKLENLEEMDKFLDTYTLPRLNQEEVVSLNRPITG
SEIEAIINSLSTKKSPGPVGFIAEFYQRYKEELVPFLLKLFQSIEKEGILPNSFYEASII
LIPKPDRDTTKKENVTPISLMNIDAKILNKILAN

>sequence-humhbb.txt|GeneMark.hmm|gene 12|49_aa
MDEAGNYHSQQTITRTINQTPHVLTHRWELNNENTWTHEEEHHTLGTVM
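
The protein section can be read like a FASTA file. The Python sketch below splits each header line on '|' and compares the declared amino acid count with the actual sequence length, using two records copied from the sample above. It assumes every header has exactly the four fields seen in the sample; the function name parse_proteins is hypothetical.

```python
# Two records copied from the sample output above.
SAMPLE_FASTA = """\
>sequence-humhbb.txt|GeneMark.hmm|gene 1|93_aa
NHQVVRLGCRPSSATSEDSVFSTAKHKLRYCGCEKLEVDIPALWPLLLTFTSWRLEVVVQ
ATVADHTSSTIIAFLQESLREKKVKKNLETTSE

>sequence-humhbb.txt|GeneMark.hmm|gene 12|49_aa
MDEAGNYHSQQTITRTINQTPHVLTHRWELNNENTWTHEEEHHTLGTVM
"""


def parse_proteins(text):
    """Return one dict per protein record.

    Assumes headers of the form file|program|gene N|M_aa, as in the
    sample; other layouts would raise a ValueError on unpacking.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name, program, gene_field, aa_field = line[1:].split("|")
            records.append({
                "file": name,
                "program": program,
                "gene": int(gene_field.split()[1]),      # "gene 12" -> 12
                "declared_aa": int(aa_field.split("_")[0]),  # "49_aa" -> 49
                "seq": "",
            })
        elif records:
            # Sequence lines are wrapped; join them back together.
            records[-1]["seq"] += line
    return records


proteins = parse_proteins(SAMPLE_FASTA)
for p in proteins:
    print(p["gene"], p["declared_aa"], len(p["seq"]) == p["declared_aa"])
```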





Gene Probe, Inc.
1106 Wrights Mill Court
Atlanta, GA 30324


PH: +1 (404) 579 - 2975
