Parameters
This chapter discusses some of the parameters that are common to a large
number of the routines.
General parameters for functions are shown
where anal is the name of the routine, and parameters is a list of one or more items to be used in the analysis. The descriptions of the function parameters use the convention:
type code | meaning |
---|---|
compulsory | this parameter must always be present |
optional | this parameter is optional and may be used to modify the behaviour of the routine or request additional information |
graphics | output as postscript graphics is available |
The output from routines is written to the file specified by
where each value must be in the range
The map functions implemented in gas are
will show all the parameters available with the sibdes routine.
Where the options are:
type | parameter | description |
---|---|---|
optional | pedigree | analyze properties of whole dataset |
optional | family | examine individual families |
optional | locus | analyze loci singly and in pairs |
graphical | psgraphics | display of statistical results |
If no families are named, then every family in the pedigree is analysed sequentially.
To run the analysis for the locus height include the following line in your gasfile program:
The type of analysis performed depends on the locus (see below). If several loci of the same type are listed within the brackets, gas performs a pairwise analysis to show correspondences between their distributions.
performs an IBD sib-pair analysis for the affection locus dis1 versus the marker locus mk1. Any multiple sibships are given a strict weighting as described above.
The routine lists the various types of matings, the degree of allele sharing between sibs in each (and parental source), the 2-1-0 t2 and chi2 scores and associated probabilities, together with the exact 1-0 binomial probabilities.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | alltypes | show sharing for not-affected, concordant and discordant * pairs |
optional | halfsib | show common-parent sharing for half-siblings |
optional | summary | only a short summary of the results is given |
optional | weight | options are strict and hodge |
If you have allele data for more than one named locus on a chromosome
then, provided the recombination fraction between adjacent loci is less
than 0.3, you will benefit from using the interval map version
(
The maximum is located in a two-phase search, using simulated
annealing to explore the function domain, then Powell's algorithm
(using Brent for the 1-dimensional sub-stages)
to refine convergence about the highest point found.
For instance, the command
performs an IBS sib-pair analysis for the affection locus dis1
versus the marker locus mk1.
Note that even if parental information is available on some of the
pairs, it will not be used in the analyses.
Two methods are used to calculate p-values. The first uses a 2-sided
chi2
test to compare the overall observed IBS sharing distribution
with that predicted from the allele frequencies - note that this can
produce spurious significant results when an excess of 0 sharers is
present. The second method
uses Lange's Z-statistic (which automatically takes into account
multiple sibships) and produces a 1-sided p-value.
Note that the weight parameter only affects the
chi2 results
by reducing the effective contribution of multiple sibships - the
Z-statistic does not require weighting.
For instance, the command
gives a map of the allele sharing in maternally-derived chromosomes
for the marker loci mk1, mk2, mk3 and mk4,
sorted to show pairs with the most sharing at the left end of
the chromosome first.
bestorder can take a numeric parameter n in
which case the first n equivalently good orders (as produced
by the triplet permutations described above) are listed in full.
The dataset contains 20 nuclear families with each locus
simulated as having 6 equally
frequent alleles and a recombination fraction of 0.04 between adjacent loci.
The results show that there are two equally good orders in which
loci 8 and 9 are interchanged
(20 sib-pairs is too small a dataset to expect sufficient crossovers between
each locus to produce a unique best ordering).
For instance, the command
assesses whether sibling pairs sharing more alleles
at named locus `mker' are
significantly more similar at quantitative locus `humour'
than pairs sharing fewer alleles at `mker'.
The exact parameter controls the threshold above which
the U-statistic is calculated approximately.
For more details see the entry on the
assmwu
routine.
sibhe implements 3 versions of the Haseman-Elston algorithm. The
default is to use all pairs for which there is definite sharing
information for either the paternal or maternal alleles (or their sum).
The knownonly parameter causes gas to use only the pairs for
which there is definite sharing information for both paternal and
maternal alleles
(this was the algorithm used by sibdreg in gas1.4).
The useall parameter means that all pairs in a dataset
with known quantitative values are
used and if no sharing information is available for a pair then their
expected IBD sharing is taken to
For instance, the command
gives a table of the allele sharing in paternally-derived
chromosomes for the marker loci
mk1, mk2, mk3 and mk4.
Note that when the sharing at a particular locus (for a particular pair)
cannot be assigned due to missing parental data, the algorithm in
gas calculates the expected sharing purely
from the known sharing at adjacent loci rather than attempting to infer
parental genotypes. This strategy was adopted to prevent incorrect
results being caused by wrongly specified allele frequencies,
which is a particular problem with highly polymorphic markers.
To illustrate consider 3 consecutive loci
X, I and Y with recombination fractions
thetaXI
writing
$V_{12s} = \theta_{12s}^2 + (1-\theta_{12s})^2$,
where $\theta_{12s}$ is the recombination fraction between
loci 1 and 2 along the chromatid of
where Vij is the sex-specific `V' value between
Sib-pairs for which there is no IBD sharing information at any locus are
not used by the interval mapping routines.
N.B. In most references the symbols V and S are generally
denoted by Greek `psi' and `pi', however it wasn't possible to duplicate
this using transportable html.
will generate extra points between the loci so that there is no region
larger than theta=0.03 without such an interpolated value.
Since recombination fractions cannot be added linearly (for
instance twice 0.2 is 0.32) the steps taken will be smaller
than the value specified after interval.
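The arithmetic behind the remark that twice 0.2 is 0.32 can be checked with a short sketch. It assumes no interference between crossovers (Haldane's map function), which is the assumption that reproduces the figure quoted above. The function names are illustrative and are not part of gas.

```python
import math

def combine_theta(theta1, theta2):
    """Recombination fraction across two adjacent intervals, assuming
    no interference (Haldane): recombination in exactly one interval."""
    return theta1 + theta2 - 2.0 * theta1 * theta2

def haldane_distance(theta):
    """Map distance in Morgans for a recombination fraction (Haldane)."""
    return -0.5 * math.log(1.0 - 2.0 * theta)

def haldane_theta(distance):
    """Inverse of haldane_distance."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance))

double = combine_theta(0.2, 0.2)
# Map distances, unlike recombination fractions, do add linearly.
via_distance = haldane_theta(2 * haldane_distance(0.2))
print(round(double, 4), round(via_distance, 4))
```

Both routes give 0.32, which is why evenly spaced interpolation points are closer together, in recombination terms, than a simple division of the interval would suggest.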
The empiricalpv option computes empirical p-values for each
dataset, and may be given a numeric parameter to control the number
of simulations used to estimate these. Hence
will compute 10 thousand replicates - if no number is given then the default
value of 5 (giving 5000 replicates per calculation) is assumed.
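As a rough illustration of what an empirical p-value is, the sketch below counts how often a simulated statistic reaches the observed one. This is not the procedure used inside gas, which is not described here: the null model (`toy_null`), the function names and the +1 correction are all assumptions made for the example. Only the convention that the numeric parameter counts thousands of replicates, with a default of 5, is taken from the text.

```python
import random

def empirical_pvalue(observed, simulate_statistic, thousands=5, seed=0):
    """Estimate a 1-sided p-value as the fraction of simulated replicates
    whose statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    replicates = 1000 * thousands
    hits = sum(1 for _ in range(replicates)
               if simulate_statistic(rng) >= observed)
    # Adding 1 to both counts keeps the estimate away from exactly zero
    # (an assumption of this sketch, not a documented gas behaviour).
    return (hits + 1) / (replicates + 1)

# Toy null model: the statistic is the mean of 10 standard normal draws.
def toy_null(rng):
    return sum(rng.gauss(0.0, 1.0) for _ in range(10)) / 10

p_default = empirical_pvalue(0.5, toy_null)             # 5000 replicates
p_ten = empirical_pvalue(0.5, toy_null, thousands=10)   # 10000 replicates
print(p_default, p_ten)
```

More replicates reduce the sampling error of the estimate at a proportional cost in run time.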
All of the `lik' routines use the Vitesse likelihood engine, which was
devised and implemented by Jeff O'Connell.
Vitesse is the fastest
likelihood program currently extant (1996), capable of computing multipoint
lodscores with highly polymorphic markers - see below for further details.
Vitesse is undergoing continuous improvement, and while we believe that
all the results produced are correct, there are restrictions on the types
of data it can currently handle. These are:
Condition [4] means that there can only be one mating in any
family in which all four grandparents are unknown (ie. not listed
in the pedigree).
Datasets which violate these conditions will cause the program to exit.
Vitesse will eventually be available as a stand-alone program
with a `Linkage-like' interface via anonymous ftp.
The data and control formats are compatible with version 5.1/5.2
of LINKAGE and version 2.3P of the FASTLINK program.
Email
jeff@sherlock.hgen.pitt.edu
for more details on this.
If no value is supplied, a default of 1 is assumed.
If no value is supplied, a default of -2 is assumed, so
that any region with lodscore of -2 or lower is marked as being excluded.
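The exclusion rule can be pictured with a small sketch that scans a list of map positions and lodscores for runs at or below the threshold. The data, the function name and the way adjacent points are merged into regions are assumptions for illustration. Only the default level of -2 comes from the text.

```python
def excluded_regions(positions, lodscores, exclude=-2.0):
    """Return (start, end) position pairs covering runs of consecutive
    map points whose lodscore is at or below the exclusion level."""
    regions = []
    start = None
    previous = None
    for pos, lod in zip(positions, lodscores):
        if lod <= exclude:
            if start is None:
                start = pos          # a new excluded run begins here
        elif start is not None:
            regions.append((start, previous))
            start = None
        previous = pos
    if start is not None:            # run extends to the last map point
        regions.append((start, previous))
    return regions

# Hypothetical map positions and lodscores.
positions = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25]
lods = [-3.1, -2.0, -0.4, 1.2, -2.5, -4.0]
print(excluded_regions(positions, lods))
```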
Under the vast majority of circumstances the default options will produce
good results; however, for `difficult' datasets you may try
increasing initstep and maxiter.
The maximization is carried out using Brent's algorithm, taking as starting
point the highest value found whilst constructing the map (the resolution
of which may be changed using the step parameter).
The generalized lodscore compares the likelihood against the value
when all the recombination fractions are set
To perform association tests it is essential that the names of alleles
be the same in different families (eg. named allele `1' must
represent the same physical marker in the whole population).
This means that global binning (preferably with fixed bin sizes)
must be used if data is read using the alsize option.
Some authors suggest that only one child should be used from each mating,
and that this child be selected according to fixed ascertainment criteria.
To employ this strategy you need to remove the other
children from the pedigree file before running asstdt.
If there are several children within a family which satisfy the
ascertainment criteria equally well (so that selecting a particular one
would be arbitrary), then the weight option
will calculate the average contribution from each
of these `equivalent' children and treat this as being
the contribution due to a single child.
Since the weight option may result in non-integer totals,
the chi2
distribution (with 1 degree of freedom) is used to calculate
the significance.
The second test categorizes subjects according to whether they do or
do not have a particular allele. The ranks of the subjects (according
to the quantitative trait) who have
each allele are compared with those who do not have the allele to
indicate if the allele tends to be associated with subjects who
are biased in a particular direction away from the mean. Subjects with
half-known genotypes are not used.
Note that exact calculation of p-values requires a large amount of time
and memory
(RAM is approximately proportional to
$N^2M^2/4$ where N and M
are the sizes of the datasets being compared)
and the optimal values for exact will depend on
your computer. If gas halts with an out-of-memory message, reduce one or
both of the exact values.
For example, the command
performs the Mann-Whitney U-test on the quantitative locus weight
against the marker mar1. P-values are calculated by a
Gaussian approximation unless there are fewer than 20 instances of
a particular allele versus a set of 50 instances of other alleles.
For example, the command
performs the RPE analysis on the affection locus spotty
in terms of the alleles of the marker locus mar1.
If the total p-value is less than the significance criteria (which may
be altered with the signif parameter) then the allele with the
smallest p-value is removed from the dataset and the expected frequencies
are re-calculated as though that allele did not exist. This
procedure is then repeated until the total p-value becomes
non-significant.
If no affection loci are listed, then the whole of the pedigree is compared
to the input allele frequencies, and the risks computed refer to the
probability of a random member of the population being selected to form
part of the dataset (for optimum performance the members of the
pedigree should not be related).
The parameters inpairs, incommon, and
allother may be combined in a single command.
calculates the relative risk of subjects having genotype
calculates (separately) the relative risks of subjects having genotype
calculates (separately) the relative risks of subjects having genotype
At least three loci (one of which must be an affection status)
must be listed. If named loci are listed then all their alleles are
tested in a pairwise fashion.
References
Routine: SIBMLS
The sibmls routine calculates
the maximum-likelihood 2-1-0 IBD sharing distribution
of markers (ie. named loci). In addition to the data used by sibdes
it also utilizes partial information from cases in which the sharing
cannot be unambiguously determined.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | alltypes | show non-affected and concordant pairs |
Algorithm Notes
The maximum-likelihood estimate is restricted so that
References
Routine: SIB2MLS
The sib2mls routine calculates the
joint maximum-likelihood 2-1-0 IBD sharing
distribution of pairs across two named loci simultaneously.
In addition to the data used by sibdes
it also utilizes partial information from cases in which the sharing
cannot be unambiguously determined.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | alltypes | show non-affected and concordant pairs |
optional | compare | show MLS sharing for alternative sub-models |
optional | showraw | display raw sharing data |
Algorithm Notes
The region of maximization is restricted according to the type of
model being considered.
The present version of gas compares mls values for the single-locus,
multiplicative and general models.
References
Routine: SIBSTATE
The sibstate routine performs Identity By State analysis
on sib-pair data.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | alltypes | show non-affected and concordant pairs |
optional | showraw | display raw statistical information |
optional | weight | options are strict and hodge |
Algorithm Notes
For the sibstate analysis (and any other IBS technique)
it is absolutely essential that the allele frequencies
of the marker loci are correctly set, otherwise the computed
probabilities will be meaningless.
This means that global binning (preferably using fixed bin sizes)
must be used if data is read using the
alsize
option.
References
Routine: SIBMAP
This routine gives a graphical display of how sharing between siblings
varies along the length of a chromosome,
with options to estimate recombination fraction and named-locus
order. The syntax is:
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | bestorder | attempt to order loci using sharing data |
optional | halfsib | show half-siblings with paternal/maternal options |
optional | mapfunc | select map-function for distance estimates |
optional | maternal | show map for maternally-derived chromosomes |
optional | maxprob | show pairs having crossover probability above this threshold |
optional | minchanges | show pairs in which there are at least n changes of sharing |
optional | mindefinite | show pairs in which at least n loci can be categorised definitely |
optional | paternal | show map for paternally-derived chromosomes |
optional | sortleft | sort pairs by first recombination position from left |
optional | sortright | sort pairs by first recombination position from right |
optional | theta | recombination values to use with maxprob |
Algorithm Notes
The problem of computing a metric for all possible arrangements of loci
is combinatorially hard: the number of possible orders, and hence the time
required, grows with the factorial of the number of loci. For modest numbers
References
None.
* Example *
The gasfile sib.gas reads g-format locus data from sib.loc,
and g-format pedigree data from sib.ped.
The sibdes routine is used to perform an IBD analysis,
with results sent to sibp.out.
The sibstate routine is used to perform an IBS analysis,
with results sent to sibs.out.
The sibmap routine is used twice, firstly to
show the sharing of paternal chromosomes for the sib-pairs in which at least
three of the named markers are unambiguously determined (results
in sibm1.out), and then
to show only those pairs in which there are at least two changes in
sharing status (results in sibm2.out).
The latter analysis may be used to indicate the possibility of
double recombinants.
* Example *
The gasfile bo.gas reads g-format locus data from bo.loc,
and g-format pedigree data from bo.ped.
The sibmap routine is used to determine the most probable order
of the named loci 1-10 (which should be 1,2,3,...,10) with
the results being written out to the file bo.out.
Routine: SIBMWU
The sibmwu routine performs a non-parametric IBD analysis on a trait
which is specified in terms of a quantitative locus.
type parameter description
compulsory
locus list the affection and marker loci to be analyzed
exact sizes of dataset below which p-values calculated exactly
signif significance level for linkage
Algorithm Notes
The sibmwu routine first ranks all sibling pairs according to
the absolute difference in their value at a quantitative locus,
then uses the Mann-Whitney U-test to compare the distributions of
these values within subsets of the sib-pair population, categorized
according to the amount of IBD sharing at a named locus. A result may
indicate linkage if the average rank of pairs decreases as the number
of alleles shared IBD increases, and the p-values are 1-sided towards
this direction.
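The ranking idea can be sketched as follows. The sib-pair data are invented, the function name is illustrative, and no p-value is computed. How gas handles ties, weighting and the exact versus approximate calculation is not shown here.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: the number of (a, b) pairs with a < b,
    counting ties as one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a < b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical sib pairs: (alleles shared IBD, trait of sib 1, trait of sib 2).
pairs = [(2, 10.1, 10.4), (2, 8.0, 8.9), (1, 9.5, 11.0),
         (1, 7.2, 9.9), (0, 6.0, 10.5), (0, 12.3, 8.1)]

# Absolute trait difference for each pair, grouped by IBD sharing.
diffs = {0: [], 1: [], 2: []}
for shared, t1, t2 in pairs:
    diffs[shared].append(abs(t1 - t2))

# Compare pairs sharing 2 alleles against pairs sharing 0.
u = mann_whitney_u(diffs[2], diffs[0])
u_max = len(diffs[2]) * len(diffs[0])
print(u, u_max)
```

A value of U close to its maximum means that pairs sharing more alleles tend to have smaller trait differences, which is the direction that suggests linkage.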
References
None.
Routine: SIBHE
This routine implements the Haseman-Elston algorithm
for analyzing a quantitative trait using IBD sib-pair information.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | absolute | use absolute difference of values rather than square |
optional | dfweight | compensate for multi-pair sibships |
optional | empiricalpv | compute empirical p-values |
optional | graph | draw graphs of regression plots |
optional | knownonly | only pairs with unambiguous sharing are used |
optional | sexual | do separate analyses for paternal and maternal sharing |
optional | showallp | show p-values for +ve slope regressions |
optional | signif | the value at which significant results are marked |
optional | useall | pairs with no genetic sharing information are used |
graphical | psgraphics | regression plots with graph option |
Algorithm Notes
The basic assumption of this method is that siblings sharing marker alleles
near the quantitative trait locus will be more likely to have similar
quantitative values than non-sharing siblings. Thus the mean value of
the difference between siblings should decrease as the fraction of alleles
shared increases. The sibhe routine performs a least-squares fit
using allele sharing as the independent variable, and trait difference
as the dependent variable. A significantly negative slope may be taken
to indicate linkage.
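A minimal sketch of the regression step, using invented sib-pair values, shows the direction of the effect. The squared difference is used as the dependent variable, matching the default implied by the absolute option. Significance testing and the weighting options are omitted, and the names are illustrative only.

```python
def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical sib pairs: (fraction of alleles shared IBD, trait 1, trait 2).
pairs = [(0.0, 4.0, 9.0), (0.0, 6.5, 2.5), (0.5, 5.0, 8.0),
         (0.5, 7.0, 5.0), (1.0, 6.0, 7.0), (1.0, 3.0, 3.5)]

sharing = [p[0] for p in pairs]
squared_diff = [(p[1] - p[2]) ** 2 for p in pairs]

slope, intercept = least_squares(sharing, squared_diff)
print(slope, intercept)
```

With these values the fitted slope is negative, which is the pattern the routine looks for.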
If you have allele data for more than one named locus on a chromosome
then, provided the recombination fraction between adjacent loci is less
than 0.2, you will benefit from using the interval map version
(sibihe) of this routine.
References
* Example *
The gasfile qsib.gas reads locus data from qt_mk1.loc
and qt_level.loc with pedigree data from qtrait.ped.
It calls sibhe and writes the results to qsibhe.out,
then calls sibmwu and writes these results to qsibmwu.out.
Plots of the points and best-fit lines for the sibhe regression
are written to the file qsibhe.ps.
Note the use of fprintf to add comments to the screen and output files.
Routine: SIBTABLE
This routine displays simultaneously the sib-pair sharing across a
number of affection and marker loci.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
optional | halfsib | show half-sibling data |
Algorithm Notes
sibtable has no analytic functions; its only purpose is to display
the observed sharing in the pedigree data.
References
None.
Sib-Pair Interval Mapping
Sib-pair interval mapping is a multi-point method in which information
from adjacent markers is used to infer missing or ambiguous allele sharing.
Calculation of Sharing Probabilities
The calculation of sharing probabilities is carried out in 4 steps:
Ambiguous Intercrosses
With some intercrosses it can be observed that, while the actual
sharing is unobservable,
either
or
The expected sharing at such a locus is calculated using Bayes'
formula by conditioning on the nearest adjacent loci at which sharing can be
definitely assigned.
$$
= \frac{V_{XIm}\,(1-V_{IYm})\,V_{XIf}\,V_{IYf}}
       {V_{XIm}\,(1-V_{IYm})\,V_{XIf}\,V_{IYf} + (1-V_{XIm})\,V_{IYm}\,(1-V_{XIf})\,(1-V_{IYf})}
$$
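The ratio above can be evaluated numerically with a short sketch. It assumes the definition V = theta^2 + (1-theta)^2 given earlier in this chapter, and treats the four sex-specific recombination fractions as inputs. The function names and the example values are hypothetical.

```python
def v_value(theta):
    """V = theta^2 + (1 - theta)^2 for a recombination fraction theta."""
    return theta ** 2 + (1.0 - theta) ** 2

def intercross_posterior(theta_xi_m, theta_iy_m, theta_xi_f, theta_iy_f):
    """Posterior weight of the first of the two candidate sharing
    configurations at locus I, using the ratio given above."""
    v_xi_m, v_iy_m = v_value(theta_xi_m), v_value(theta_iy_m)
    v_xi_f, v_iy_f = v_value(theta_xi_f), v_value(theta_iy_f)
    first = v_xi_m * (1.0 - v_iy_m) * v_xi_f * v_iy_f
    second = (1.0 - v_xi_m) * v_iy_m * (1.0 - v_xi_f) * (1.0 - v_iy_f)
    return first / (first + second)

# Hypothetical sex-specific recombination fractions.
print(intercross_posterior(0.05, 0.10, 0.04, 0.08))
# Unlinked loci (theta = 0.5) give V = 0.5, so both cases are equally likely.
print(intercross_posterior(0.5, 0.5, 0.5, 0.5))
```

Tightly linked flanking loci push the posterior towards one configuration, while unlinked loci carry no information and leave it at one half.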
Interpolation
Once any ambiguous intercrosses have been resolved, the paternal and
maternal sharing calculations are effectively decoupled. Suppose that
a, b and c are 3 adjacent loci, and
that the sharing (Sa,
Sc)
is definitely known at the outer loci a and c, but not
at b in the centre, then Sb is calculated
using the formula
Parameters
Interval
By default the interval mapping routines infer sharing only at the
actual loci listed. The interval parameter may be used to
request that the expected sharing is calculated at points between
the loci, thus adding the command option
Showraw
The showraw parameter will display the results on a pair-by-pair basis
after stage [3] of the sharing calculation described above.
References
Routine: SIBIDES
This combines interval mapping with the t2
analysis in the
sibdes
routine.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
compulsory | theta | list of recombination fractions between marker loci |
optional | alltypes | evaluate for non-affecteds also |
optional | interval | use interval map with specified recombination |
optional | mapfunc | select distance mapping function |
optional | sexual | do separate analyses for paternal and maternal sharing |
optional | showraw | display raw sharing data |
optional | weight | options are strict and hodge |
graphical | psgraphics | displays t2 and -log10(pvalue) along chromosome |
Algorithm Notes
The expected sharing is calculated as described earlier in this
chapter, and the 1-sided t2 test applied to the results.
Do not confuse the graph of -log10(pvalue) with a lodscore
statistic - the use of the logarithm is purely to allow the
full range of data to be displayed on a sensible scale.
References
None.
Routine: SIBIHE
This routine combines interval mapping with the Haseman-Elston algorithm
described in
sibhe.
type | parameter | description |
---|---|---|
compulsory | locus | list the affection and marker loci to be analyzed |
compulsory | theta | list of recombination fractions between marker loci |
optional | absolute | use absolute difference of values rather than square |
optional | dfweight | compensate for multi-pair sibships |
optional | empiricalpv | compute empirical p-values |
optional | graph | draw graphs of regression plots |
optional | interval | use interval map with specified recombination |
optional | mapfunc | select distance mapping function |
optional | sexual | do separate analyses for paternal and maternal sharing |
optional | showraw | display raw sharing data |
optional | signif | the value at which significant results are marked |
graphical | psgraphics | displays -log10(pvalue), also regression plots with graph option |
Algorithm Notes
The expected sharing is calculated as described earlier in this
chapter, and the Haseman-Elston test is then applied as described
in routine sibhe.
Do not confuse the graph of -log10(pvalue) with a lodscore
statistic - the use of the logarithm is purely to allow the
full range of data to be displayed on a sensible scale.
References
None.
* Example *
The gasfile iplot.gas performs the sibides test on
the affection locus a1 and the sibihe test on the
quantitative locus `q1' using genotype data from 8 named
loci labelled 1,2,...,8. The recombination values are
different along the male and female chromatids.
Results are written to the
files iplot.out and iplot.ps.
Likelihood Calculations
The routines in this chapter are designed to perform `traditional' linkage
analysis in which alternate hypotheses about genotype/phenotype interactions
are tested by computing lodscores.
Vitesse
Jeff O'Connell's Vitesse program incorporates many new computational
techniques which enable it to perform calculations impossible for
other programs. In particular it is able to handle up to 8 loci
simultaneously in multi-point lodscores and
isn't slowed by highly polymorphic marker alleles. A more optimized
(ie. faster) version of the likelihood engine is under construction
and will be incorporated into gas as soon as it is fully tested.
Parameters
Several of the routines use a common syntax for performing particular
tasks, and some of this is described below. Refer to the individual
routine descriptions to see which features are available for each of them.
Support
Some functions are able to calculate support intervals about the location
of a maximum lodscore (ie. the adjacent region where
the lodscore is within a certain
amount of its highest value). Hence if a lodscore has a peak of 6.3
at
Exclusion
An exclusion map shows regions of a chromosome where linkage is unlikely
because the lodscore is significantly below zero. The exclude
parameter is used to scan for such areas:
References
Routine: LIK2POINT
The lik2point routine performs a series of two-locus optimizations
to determine the most probable recombination fractions between pairs
of adjacent loci.
type | parameter | description |
---|---|---|
compulsory | locus | list of loci to be analyzed |
optional | allorders | all possible pairs of loci are examined |
optional | exclude | identify exclusion region |
optional | findmax | find maximum lodscores in each interval between fixed loci |
optional | mapfunc | mapping function to use |
optional | signif | level to declare linkage to be significantly probable |
optional | support | calculate support interval |
graphical | psgraphics | plots of lodscores over range |
Algorithm Notes
The maximization (for the findmax option) is carried out using
Brent's algorithm, taking as starting point the highest value found
during the initial scan of the range
type | parameter | description |
---|---|---|
optional | initstep | number of steps in initial scan of interval |
optional | maxtol | maximum tolerance in optimization, |
optional | maxiter | maximum optimization iterations to attempt, |
References
None.
Routine: LIKMAP
The likmap routine generates a series of likelihoods giving the
probabilities that a particular locus (called `movable') lies in various
locations with respect to one or more other loci whose positions
are specified (called `fixed').
type | parameter | description |
---|---|---|
compulsory | locfix | list of ordered fixed loci to analyze |
compulsory | locmov | list of movable loci to analyze |
compulsory | theta | list of recombination fractions between fixed loci |
optional | doall | calculate all values in subsets |
optional | dosets | use subsets of fixed loci of this size |
optional | exclude | level to indicate linkage is excluded |
optional | findmax | find maximum likelihood position in each interval |
optional | mapfunc | mapping function to use |
optional | margin | the minimum distance between fixed and movable loci |
optional | showraw | display `actual' likelihoods |
optional | signif | level to declare linkage to be significantly probable |
optional | step | the number of steps to take between adjacent fixed loci |
graphics | psgraphics | lodscore map across the interval |
Algorithm Notes
It is essential to set the dosets parameter if more than 8 fixed
loci are used - otherwise the computation time and space are likely
to be prohibitive. For optimal performance dosets should be an
even number, with a value of 4 (the default) or 6 generally
producing good results (also, graphical output will be messy for
odd numbers of fixed loci since many points will have two values
plotted).
The text output displays only the two recombination fractions to either
side of the movable locus, since for each order the others
are fixed by the input parameters.
References
None.
* Example *
The gasfile twop.gas uses lik2point to calculate
the most-probable recombination
fractions between a series
* Example *
The gasfile map.gas demonstrates the use of likmap to
create a table showing the likelihood of
the loci try1 and try2 being at various locations along a
chromosome on which the five markers
Routine: LIKSINGLE
The liksingle routine performs a single likelihood calculation
for a fixed set of loci and recombination fractions.
type | parameter | description |
---|---|---|
compulsory | locus | list of ordered loci to be analyzed |
compulsory | theta | list of recombination fractions between fixed loci |
optional | genlod | computes Ott's generalized lodscore |
Algorithm Notes
Because of internal variations in algorithms, the likelihoods calculated
by two programs for the same dataset may vary enormously.
However the ratio of two likelihoods (as computed by the same program)
should be invariant between programs, and thus the lodscores
produced by such programs should be very similar.
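The point about ratios can be illustrated with invented numbers. In the sketch below two hypothetical programs report likelihoods that differ by a constant factor, yet the lodscore, taken as the log10 of the ratio of a likelihood to its value under no linkage, is the same for both. The likelihood values and names are assumptions for the example.

```python
import math

def lodscore(likelihood, likelihood_null):
    """log10 of the ratio of a likelihood to its value under the null."""
    return math.log10(likelihood / likelihood_null)

# Hypothetical likelihoods from two programs that scale their internal
# calculations differently (here by a constant factor of 1e20).
program_a = {"linked": 2.0e-30, "unlinked": 1.0e-33}
program_b = {"linked": 2.0e-10, "unlinked": 1.0e-13}

lod_a = lodscore(program_a["linked"], program_a["unlinked"])
lod_b = lodscore(program_b["linked"], program_b["unlinked"])
print(round(lod_a, 3), round(lod_b, 3))
```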
References
None.
* Example *
The gasfile sin.gas demonstrates the use of liksingle to
show the likelihood of mka, mkb and mkc lying on the
same chromatid separated by recombination fractions
theta=0.35
Association Analysis
The routines for association analysis look for correspondences
between the occurrences of particular alleles of named loci
and the values of traits in the population.
Routine: ASSTDT
The asstdt routine performs association analysis between a
marker and an affection locus using the Transmission Disequilibrium
Test.
type | parameter | description |
---|---|---|
compulsory | locus | list of affection and marker loci to analyze |
optional | sexual | show separate analysis for paternal and maternal alleles |
optional | signif | set significance criteria |
optional | weight | reduce contribution of multiple sibships |
Algorithm Notes
The standard algorithm follows Spielman's advice of treating all children
as independent observations, summing their transmitted and non-transmitted
alleles, and calculating the significance using the exact 1-sided binomial
distribution.
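The two significance calculations can be sketched for a single allele. The counts are invented and the function names are illustrative. The chi-squared form shown is the usual McNemar-style TDT statistic, included here because it still works when weighting produces non-integer totals. How gas tabulates transmissions internally is not shown.

```python
from math import comb

def tdt_binomial_pvalue(transmitted, not_transmitted):
    """Exact 1-sided binomial p-value: the chance of at least
    `transmitted` transmissions out of n when each has probability 1/2."""
    n = transmitted + not_transmitted
    tail = sum(comb(n, k) for k in range(transmitted, n + 1))
    return tail / 2 ** n

def tdt_chi2(transmitted, not_transmitted):
    """McNemar-style statistic (1 degree of freedom), which also accepts
    the non-integer totals that weighting can produce."""
    total = transmitted + not_transmitted
    return (transmitted - not_transmitted) ** 2 / total

# Hypothetical counts for one allele from heterozygous parents.
print(tdt_binomial_pvalue(15, 5))
print(tdt_chi2(15, 5))
print(tdt_chi2(12.5, 5.5))
```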
References
Routine: ASSCOMPARE
The asscompare routine compares the allele frequencies between
two groups of subjects denoted by y/n values at an
affection locus.
type | parameter | description |
---|---|---|
compulsory | locus | list of affection and marker loci to analyze |
optional | sexual | show males and females separately |
optional | signif | set significance criteria |
optional | useall | use half-known genotypes |
Algorithm Notes
For a locus with n alleles, gas constructs
a 2xn contingency
table showing how often each allele occurs in the two populations
(ie. the sets of people labelled y
References
None.
Routine: ASSMWU
The assmwu routine performs association analysis between a
marker and a quantitative locus, using the Mann-Whitney
U-Test (equivalent to the Wilcoxon Rank-Sum test).
type | parameter | description |
---|---|---|
compulsory | locus | list of quantitative and marker loci to analyze |
optional | allinfo | give extra information |
optional | cutoff | cutoff for displaying p-values |
optional | exact | sizes of dataset below which p-values calculated exactly |
optional | sexual | show separate results by subject sex |
optional | signif | set significance criteria |
Algorithm Notes
Two tests are performed. The first treats each allele as a separate
observation, so that a subject with genotype
References
None.
* Example *
The gasfile assoc.gas reads pedigree data from the file
assoc.ped. The asstdt
routine is used on the loci disease and marker1,
and the assmwu test is used on response
and marker1.
The results are written to the files
tdt.out and mwu.out respectively.
Routine: ASSRELPREF
The assrelpref routine performs association analysis between a
marker and an affection locus, using the Relative Predispositional
Effect technique.
type | parameter | description |
---|---|---|
compulsory | locus | list of affection and marker loci to analyze |
optional | alltypes | results are shown for non-affected subjects |
optional | signif | set significance criteria |
optional | sexual | males and females are analyzed separately |
Algorithm Notes
The RPE method calculates a p-value for each allele individually according
to the formula
$\chi^2 = (O_i - E_i)^2 / E_i$
where $O_i$ is the observed number of occurrences of
References
Routine: ASSGENORR
The assgenorr routine performs association analysis between a
marker and a condition using the Genotype Relative Risk method,
in which the observed distribution of named alleles in a subset of
the pedigree is compared to that predicted from the input allele
frequencies (entered earlier using
the
type | parameter | description |
---|---|---|
compulsory | locus | list of marker (and optionally affection) loci to analyze |
compulsory | allele | list of alleles of the marker loci to analyze |
optional | inpairs | compare two single genotypes |
optional | incommon | compare genotype against others with allele in common |
optional | allother | compare genotype against all others |
optional | signif | set significance criteria |
Inpairs
The inpairs option tests listed pairs of alleles against each other.
For example, the command
Incommon
The incommon option tests specific allele pairs against all the
haplotypes sharing a particular allele in common with them.
The command
Allother
The allother option tests specific allele pairs against all
other allele pairs simultaneously.
The command
Algorithm Notes
The genotype relative risk (RAB) of genotype
set A individuals compared to
genotype set B individuals is calculated according to the formula
References
* Example *
The gasfile grr.gas reads pedigree data from assoc.ped
and uses the assgenorr method to compare the
Routine: ASSEMPLOG
The assemplog routine performs association analysis between a
pair of named or affection loci
and a condition using the Empirical Logistic method.
type | parameter | description |
---|---|---|
compulsory | locus | list of marker and affection loci to analyze |
optional | showraw | display raw Z terms and variances |
optional | signif | set significance criteria |
Algorithm Notes
In cases where there are no subjects possessing a particular
genotype/phenotype combination, the variance of the statistic becomes
infinite and a p-value of 1 is returned.
References
* Example *
The file elm.gas reads data from elm.loc and
elm.ped and performs the assemplog analysis.
(the subjects are all `singletons' and gas will generate
warnings about this - press `c' to continue at each stage
(if you
Routine: ASSHAPRR
The asshaprr routine performs association analysis between a
marker and an affection locus using the Haplotype Relative Risk
Test.
type | parameter | description |
---|---|---|
compulsory | locus | list of marker and affection loci to analyze |
Algorithm Notes
None.
References
None.
Haplotyping
Haplotyping is the process of determining which alleles in an
un-ordered genotype are descended from each of a subject's parents,
and thus (when this is done for several linked loci)
re-constructing segments of the chromatids within each subject.
Routine: HAPCHILD
The hapchild routine determines the allelic phase of the genotypes
of children within the input population. For those children for which
the phase can definitely be decided for all of their alleles, the
observed haplotypes are ordered in decreasing frequency. The syntax is:
type | parameter | description |
---|---|---|
compulsory | locus | list of affection and marker loci to analyze |
optional | sexual | show separate analysis for paternal and maternal chromatids |
Algorithm Notes
The hapchild routine only uses the alleles for which the parental
origin can be definitely determined - there is no attempt to assign
probabilities to ambiguous cases (which are marked x and ignored
when counting haplotype frequencies).
The haplotypes are listed in order of decreasing frequency.
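The counting rule described above can be sketched in a few lines. The input representation (one tuple of allele names per phased chromatid, with x marking an allele of undetermined origin) is an assumption made for the example and is not the gas file format.

```python
from collections import Counter

def haplotype_frequencies(haplotypes):
    """Count fully phased haplotypes, skipping any that contain the
    ambiguity marker 'x', and list them by decreasing frequency."""
    counts = Counter(h for h in haplotypes if "x" not in h)
    return counts.most_common()

# Hypothetical phased haplotypes across three loci, one tuple per
# parental chromatid observed in a child.
observed = [("1", "3", "2"), ("1", "3", "2"), ("2", "x", "2"),
            ("4", "1", "1"), ("1", "3", "2"), ("4", "1", "1"),
            ("x", "x", "5")]

for haplotype, count in haplotype_frequencies(observed):
    print("-".join(haplotype), count)
```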
References
None.
* Example *
The gasfile chap.gas loads pedigree data from chap.ped
and uses hapchild to calculate the most frequently occurring
haplotypes in the children. Results are sent to the
file chap.out.
End of Gas Manual v2.3