The GeneMark and GeneMark.hmm programs use tables of oligonucleotide usage
statistics as the basis for their coding potential functions. These matrices are
species-specific. On this system, the matrix files have been installed in the directory
There may be more than one matrix available for an organism, often created to model
classes of genes with distinct characteristics. The matrix you choose for gene
prediction will greatly affect your results, so here are some things to keep in mind:
GeneMark Matrices are Species Specific
It is sometimes possible to use matrices of one organism to predict coding regions
in another. This works best if the organisms are phylogenetically very close.
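To illustrate why closeness matters, the following is a minimal sketch that tabulates oligonucleotide usage for two sequences and measures how far apart the tables are. It is not GeneMark's matrix format or scoring method; the function names `oligo_frequencies` and `usage_distance`, the choice of k = 3, and the distance measure are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def oligo_frequencies(seq, k=3):
    """Return relative frequencies of all overlapping k-mers in seq.

    Windows containing anything other than A, C, G or T are skipped.
    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Report every possible k-mer so that two tables share the same keys.
    return {
        "".join(kmer): (counts["".join(kmer)] / total if total else 0.0)
        for kmer in product("ACGT", repeat=k)
    }


def usage_distance(freq_a, freq_b):
    """Sum of absolute frequency differences between two usage tables.

    0 means identical usage; 2 means the tables share no k-mers.
    """
    return sum(abs(freq_a[kmer] - freq_b[kmer]) for kmer in freq_a)
```

A small distance between tables built from known coding sequences of two organisms would suggest, but not guarantee, that a matrix from one may transfer to the other.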
If There Are Several Matrices for a Single Organism...
... then more than one class of coding region is recognized in that organism.
Often the coding regions are simply classified by coding potential, but classifications
may also be based on statistical clustering. If you are not sure which matrix is
appropriate, try each of them and compare the results.
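When trying more than one matrix, it can help to see which predictions agree. The sketch below assumes you have already extracted the predicted genes from each run into lists of (start, end, strand) tuples; that representation and the function name `compare_predictions` are hypothetical, and the coordinates in the usage lines are made up.

```python
def compare_predictions(genes_a, genes_b):
    """Split two sets of predicted genes into shared and matrix-specific calls.

    Each gene is a (start, end, strand) tuple. Two calls count as the same
    gene only when all three fields match exactly.
    """
    set_a, set_b = set(genes_a), set(genes_b)
    return {
        "shared": sorted(set_a & set_b),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
    }


# Made-up coordinates standing in for the output of two runs.
run_a = [(100, 400, "+"), (900, 1500, "-")]
run_b = [(100, 400, "+"), (2000, 2600, "+")]
result = compare_predictions(run_a, run_b)
```

Exact matching is a deliberately strict choice; genes predicted with slightly different start positions will appear as matrix-specific calls and are worth inspecting by hand.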
If There Is No Appropriate Matrix for Your Organism...