Plant Genome I Conference
Town & Country Conference Center, San Diego, CA, November, 1992.
PG-I: GENE MAPPING ALGORITHM FOR GENE DISTANCES MEASURED WITH ERRORS
Fan H. Kung, Department of Forestry, Southern Illinois
University, Carbondale, IL 62901-4411.
Given N genes on a chromosome, there are N^2 distances
between the N genes, of which N*(N-1)/2 are absolute distances
between different genes. Conversely, given N*(N-1)/2
absolute distances, each with some unknown measurement error,
how can one arrange the N genes in the correct order
and obtain additive distances estimated with
least squared errors? The problem can be solved by first
copying all initial N^2 distances to an N by N matrix, then by
identifying which gene is located at the center or at the edge of
a progressively reduced gene file. The steps of the procedure are
as follows:
(1) For each gene, sum the distances to all other genes. The sum
is obtained as the row sum in the matrix. (2) If the current
number of genes is even, identify the gene with the maximum
sum of distances from step (1) as an end; otherwise, identify the
gene with the minimum sum from step (1) as the center. (3) Transfer
that gene to a waiting list, delete the row and column
associated with that gene, and downsize the matrix to N-1 by
N-1. Then follow steps (1) and (2) to search for the next center or
end in the diminished N-1 gene file. (4) Repeat the process of
transferring and downsizing until only two genes are left.
Then, in the reverse order of removal (i.e., first
out, last in), the removed genes are inserted back, one by
one, into the gene lineup as follows: (5) Transfer the newest
center gene to the middle when the current number of genes in the
lineup is even; otherwise, transfer the newest end gene to one of
the tails, left or right. (6) Verify and discard duplicated
arrangements. (7) Repeat transferring genes from the bottom of
the waiting list to the gene lineup until all genes are ordered
in the correct sequence. (8) Assemble a new N by N directional
distance matrix. The square matrix will have zeros on the
diagonal, positive distances above the diagonal and
negative distances below. (9) For each pair of consecutive rows,
compute the average difference between their paired cells (cells
in the same column) as the map distance between the two
neighboring genes. The distance is additive and has the property
of least squared errors. For illustration, consider the following
map distances by locus pair from an electrophoretic analysis of
genetic linkage in Scots pine (see Biochem. Genet. 25:803-814,
1987):
- ADH-A:GOT-B = 47.3 ADH-A:LAP-B = 29.8 ADH-A:PGK-B = 5.0
- GOT-B:LAP-B = 13.2 GOT-B:PGK-B = 52.5 LAP-B:PGK-B = 29.1
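
Steps (1) to (7) can be sketched on these six distances as follows.
This is only an illustration, not the author's program. It assumes
that the gene with the largest row sum is an end gene and the gene
with the smallest row sum is the center gene, and it assumes a simple
tail rule that the abstract does not specify: an end gene is attached
to whichever tail of the lineup it is closer to. The names PAIRS,
dist, and order_genes are hypothetical.

```python
# Pairwise map distances quoted in the abstract.
PAIRS = {
    ("ADH-A", "GOT-B"): 47.3, ("ADH-A", "LAP-B"): 29.8, ("ADH-A", "PGK-B"): 5.0,
    ("GOT-B", "LAP-B"): 13.2, ("GOT-B", "PGK-B"): 52.5, ("LAP-B", "PGK-B"): 29.1,
}


def dist(a, b):
    """Absolute distance between two genes (0 for a gene and itself)."""
    if a == b:
        return 0.0
    return PAIRS[(a, b)] if (a, b) in PAIRS else PAIRS[(b, a)]


def order_genes(genes):
    """Order genes by removing ends/centers, then re-inserting them."""
    remaining = list(genes)
    waiting = []  # (gene, role) pairs in order of removal

    # Steps (1)-(4): shrink the gene file until two genes are left.
    while len(remaining) > 2:
        # Step (1): row sums over the genes still in the file.
        sums = {g: sum(dist(g, h) for h in remaining) for g in remaining}
        # Step (2): even count -> largest sum is an end (assumption);
        # odd count -> smallest sum is the center (assumption).
        if len(remaining) % 2 == 0:
            gene, role = max(remaining, key=sums.get), "end"
        else:
            gene, role = min(remaining, key=sums.get), "center"
        # Step (3): move the gene to the waiting list.
        remaining.remove(gene)
        waiting.append((gene, role))

    # Steps (5)-(7): re-insert in reverse order of removal.
    lineup = remaining
    for gene, role in reversed(waiting):
        if role == "center":
            lineup.insert(len(lineup) // 2, gene)
        elif dist(gene, lineup[0]) < dist(gene, lineup[-1]):
            lineup.insert(0, gene)   # assumed tail rule: nearer tail wins
        else:
            lineup.append(gene)
    return lineup


print(order_genes(["ADH-A", "GOT-B", "LAP-B", "PGK-B"]))
```

A lineup and its mirror image describe the same map, so only one of
the two needs to be kept.
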
The gene mapping algorithm shows that the order of genes is
ADH-A:PGK-B:LAP-B:GOT-B and that the distances between successive
genes are 1.375, 30.575, and 16.825. The computer program will be
demonstrated during the Poster Session and will be available upon
request.
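
Steps (8) and (9) can be checked against the example with a short
sketch. It is not the author's program. It takes the gene order
reported above as given, builds the directional distance matrix, and
averages the differences between paired cells of consecutive rows.
The names ORDER, PAIRS, directional_matrix, and neighbor_distances
are hypothetical.

```python
# Gene order and pairwise distances as given in the abstract.
ORDER = ["ADH-A", "PGK-B", "LAP-B", "GOT-B"]
PAIRS = {
    ("ADH-A", "GOT-B"): 47.3, ("ADH-A", "LAP-B"): 29.8, ("ADH-A", "PGK-B"): 5.0,
    ("GOT-B", "LAP-B"): 13.2, ("GOT-B", "PGK-B"): 52.5, ("LAP-B", "PGK-B"): 29.1,
}


def directional_matrix(order, pairs):
    """Step (8): zeros on the diagonal, +d above it, -d below it."""
    n = len(order)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = order[i], order[j]
            d = pairs[(a, b)] if (a, b) in pairs else pairs[(b, a)]
            m[i][j] = d
            m[j][i] = -d
    return m


def neighbor_distances(matrix):
    """Step (9): mean column-wise difference between consecutive rows."""
    n = len(matrix)
    return [
        sum(matrix[i][k] - matrix[i + 1][k] for k in range(n)) / n
        for i in range(n - 1)
    ]


matrix = directional_matrix(ORDER, PAIRS)
distances = [round(d, 3) for d in neighbor_distances(matrix)]
print(distances)
```

For the first pair of rows (ADH-A and PGK-B), the four cell
differences are 5.0, 5.0, 0.7, and -5.2, whose average is 1.375, in
agreement with the first distance reported above.
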