/usr/lib/R/site-library/haplo.stats/NEWS is in r-cran-haplo.stats 1.4.4-1.
=================================================================
# HAPLO STATS News File
#
# This file documents software changes up to version 1.4.1
#
# format is as follows
# ---------------------------
# [change/add]: function.name title for issue
# explanation of issue, status, and recommendations
===================================================
### changes made between releases 1.3.8 and 1.4.1
===================================================
--------------------
change: seqhap --use precision threshold for permutation p-values
Adapt the permutation rules used in haplo.score's sim.control
parameter to ensure accuracy and precision thresholds for permutation
p-values. The permutations are carried out in seqhap.c, so the
parameters p.threshold, min.sim, and max.sim are passed to the C code
to permute the response until the precision criteria are met. The
n.sim parameter is no longer used; sim.control=score.sim.control()
now controls the permutations.
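A minimal sketch of the new calling convention, assuming a binary
trait y, a genotype matrix geno, and locus positions pos (placeholder
objects, not shipped with the package; the threshold values shown are
illustrative, not the defaults):

```r
library(haplo.stats)

## y: binary trait; geno: 2-column-per-locus allele matrix; pos: positions
fit <- seqhap(y = y, geno = geno, pos = pos,
              sim.control = score.sim.control(p.threshold = 0.25,
                                              min.sim = 1000,
                                              max.sim = 10000))
```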
-------------------
update: user manual
The user manual has been updated from version 1.3.1 to reflect all the
updates since then, and will be placed on Dan Schaid's software page, in
addition to its current location within the package.
-------------------
update: help files
Help files for the example datasets contained an \item keyword with
no text, which failed R CMD check on R 2.8.1. They now pass.
===================================================
### changes made between releases 1.3.6 and 1.3.8
===================================================
---------------------------------
change: plot.seqhap --handle small p-values
Handle very small p-values better by enforcing a minimum allowable
asymptotic p-value of .Machine$double.eps and a minimum permutation
p-value of 1/(n.sim+1). The function also honors a ylim value if
passed, and gives more useful warning messages when p-values are
adjusted for plotting.
----------------------------------
change: haplo.score --add eps.svd
In some association tests from haplo.score, we have observed extremely
significant values for the global association test statistic. The
degrees of freedom for the global test are the rank of the score
vector's variance matrix. We found the source of the problem was
too low a cutoff (epsilon) on the singular values used to determine
the rank of the variance matrix. We increased the default epsilon
from 1e-6 to 1e-5 and allow it to be changed by the user via the
eps.svd parameter in any function that uses haplo.score
(haplo.score.slide, haplo.cc).
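For reference, a sketch of overriding the tolerance explicitly (y and
geno are placeholder data objects; 1e-5 is the new default, written
out only for illustration):

```r
library(haplo.stats)

score.fit <- haplo.score(y, geno, trait.type = "binomial",
                         eps.svd = 1e-5)
```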
---------------------------------------
change: haplo.cc parameters
We removed haplo.min.count as a top-level parameter; it can now only
be used via the control() function, just as in haplo.glm.
Note that haplo.freq.min can also be used.
The eps.svd parameter is also added, as noted for haplo.score.
===================================================
### changes made between releases 1.3.0 and 1.3.6
===================================================
----------------------------------
add: haplo.power.qt and haplo.power.cc:
Power and sample size calculations for haplotype association studies.
Calculations are performed given a set of haplotypes, their
frequencies, and their beta coefficients, which can be derived from
log(OR) for case-control (cc) studies or, for a quantitative trait
(qt), from the R-squared variance explained by the gene association.
For qt, use find.haplo.beta.qt to get these beta coefficients.
-------------------------
added: dataset hapPower.demo
An example data set hapPower.demo is included in the package for
demonstrating the haplo.power.qt/cc functions in example() and in the manual.
------------------------------
change: haplo.em
In past versions, a change was made to pre-calculate how much memory
would be needed for all haplotype pairs and to issue a warning if that
memory could not be allocated. This stopped calculations that could
have been completed by the progressive insertion and trimming steps,
because rare haplotypes are trimmed off and memory use rarely reaches
the pre-calculated maximum. The warning has been removed.
------------------------------
change: haplo.em.control: min.posterior
The old default for min.posterior was 1e-7. In rare cases, on
datasets with low LD and 10 or more markers, the trimming steps
trimmed away all haplotypes for a given person, and that person was
removed. We have changed min.posterior to 1e-9 and added warnings and
a check for this occurring. Note, we have only observed this in
simulated data on very rare occasions.
-------------------------------
change: haplo.glm remove allele.lev and miss.val parameters
We used to require allele.lev as a parameter for haplo.glm, and
allowed miss.val to specify codes for missing alleles in the
genotype matrix. However, we now require using setupGeno to prepare
the genotype matrix before it is added to the data.frame passed to
haplo.glm. miss.val is handled completely there, and allele.lev is
assigned as an attribute of geno. We have re-worked the formula
handling and na.geno.keep to recognize these values when geno is
found in the formula; therefore, these parameters are no longer
required in haplo.glm.
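The intended workflow can be sketched as follows (geno.mat, y, and
age are placeholder objects, and the miss.val codes are illustrative):

```r
library(haplo.stats)

## geno.mat: matrix with 2 columns per locus; 0 codes a missing allele
geno <- setupGeno(geno.mat, miss.val = c(0, NA))

## add the prepared geno object to the model data.frame
dat <- data.frame(geno = geno, y = y, age = age)
fit <- haplo.glm(y ~ geno + age, family = binomial, data = dat)
```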
----------------------------
change: na.geno.keep
We used to keep all subjects who were missing any number of
alleles. However, subjects missing all their alleles both slow the
calculations down and add no information to the analysis. This
function still removes subjects missing y or covariate values, and
now also removes subjects missing all their alleles. After the
removal, the attributes of the genotype matrix are re-calculated and
retained for use in haplo.model.frame.
-------------------------
changes: haplo.model.frame
Get allele.lev from geno in m[[]], not as a passed parameter from haplo.glm.
---------------------------
change: haplo.glm.control
Enforce the default settings for haplo.min.count and haplo.freq.min
in the function declaration. In the declaration they were NA, but a
default min.count of 5 was enforced. We have changed the default
haplo.freq.min of .01 to be the one enforced, and the declaration now
reflects the enforced default.
-------------------------
changes: Ginv.q and Ginv.R
Nothing has changed for R. Splus version 8.0.1 has a problem in its
use of the svd fortran function, as called by svd.Matrix. We
contacted Insightful and they fixed it for version 8.0.4. We include
the svd.Matrix function from version 7 and 8.0.4 in the Ginv.q file,
but only load it if the Splus version matches 8.0.1.
-------------------------------
change: louis.info.c
Prior efforts to convert all long integer values to int were not
completed for this function, so the package didn't work on 64-bit
Linux machines. The function no longer uses long, and it should work
on most platforms.
--------------------------
change: louis.info.q
When the variance of a quantitative trait is so high that the
information matrix becomes ill-conditioned, Ginv deems the
information matrix singular and the standard errors are incorrect.
Change the epsilon parameter for the generalized inverse to about
1e-8, versus the old default in Ginv of 1e-6.
=========================================================
#### changes made between release 1.2.5 and 1.3.0 #####
=========================================================
--------------------------
seqhap: sequential haplotype selection in a set of loci
For choosing loci for haplotype associations, as
described in Yu and Schaid, 2007. The method performs three tests
for association of a binary trait over a set of bi-allelic loci.
When evaluating each locus, loci close to it are added in a sequential
manner based on the Mantel-Haenszel test.
--------------------------
geno1to2: convert geno from 1- to 2-column
convert 1-column minor-allele-count matrix to two-column
allele codes
---------------------
plot.haplo.score.slide: handle near-zero p-values
For asymptotic p-values near zero, set them to epsilon. For simulated
p-values, set them to 0.5 divided by the number of simulations
performed.
-----------------------
haplo.design: create design matrix for haplotypes
In response to many requests made for getting columns for haplotype
effects to use in glm, survival, or other regression models, we
created a function to set up this kind of design matrix. There are
issues surrounding the use of these effect columns, as outlined in the
user manual.
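A sketch of the intended use, assuming a haplo.em fit on a placeholder
genotype matrix geno and a placeholder response y:

```r
library(haplo.stats)

em.fit <- haplo.em(geno)
## one column per haplotype passing min.count, coded additively
X <- haplo.design(em.fit, haplo.effect = "additive", min.count = 5)
fit <- glm(y ~ ., family = binomial, data = cbind.data.frame(y = y, X))
```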
----------------------
Ginv: svd problems continue
The Matrix library svd function has changed for Splus 8.0.1.
Therefore, revert back to the default svd function in getting the
generalized inverse.
=========================================================
#### changes made between release 1.2.0 and 1.2.5 #####
=========================================================
-----------------------------------
haplo.glm: Iterative steps efficiency
In consecutive iterations of the IRWLS procedure in haplo.glm, the
starting values for re-fitting the glm model were not updated to the
most recently computed values. Fixing this saves about 20% of run
time in haplo.glm.
-----------------------------------
haplo.score: haplo.effect allow additive, dominant, recessive
A new option to make haplo.score more flexible. Previously the scores
for haplotypes were computed assuming an additive effect for all
haplotypes. A new parameter, haplo.effect, is in place to allow
either additive, dominant, or recessive effects.
-----------------------------------
haplo.score: min.count parameter
The cut-off for selecting haplotypes to score is either by a minimum
frequency, skip.haplo, or a new option, min.count. The min.count is
based on the same idea as that used in haplo.glm, where the minimum
expected count of haplotypes in the population is enough such that
accurate estimates of parameters and standard errors are computed. The
min.count became needed when haplo.effect was added because under
the dominant or recessive models, the number of persons actually
having a haplotype effect could be fewer than the expected count
over the population (i.e., haplotype pair h1/h2 is coded as 0 for
both under recessive model, and h1/h1 is coded as 1 under dominant).
---------------------------------------
haplo.em: improved reliability of C routines
Previously, problems had been observed when running haplo.em and
haplo.glm on 64-bit Linux machines, because of differences in how
integers are stored on 64-bit and 32-bit machines. We get around this
problem by using the int type for all integers, which are used only
as indices of other data structures. We determine the maximum integer
value on the system and, if the indices would exceed that maximum,
issue a warning from C.
--------------------------------------
haplo.glm and Ginv: improvement of standard error calculations
Under some extreme circumstances, such as haplo.glm modeling
haplotypes with rare frequencies, or a high amount of variance in the
response, the standard error estimates were unreliable.
The issue was traced to the Ginv function in haplo.stats, which needed a
smaller epsilon to decide on the rank of the information matrix.
=========================================================
#### changes made between release 1.1.1 and 1.2.0 #####
=========================================================
---------------------
haplo.em: fixed memory leak
Versions up to 1.1.1 had either one or two memory leaks in haplo.em.
They are fixed.
---------------------
All .C functions: Long Integers warning for 64-bit machine
Due to problems with long integers between 32-bit and 64-bit machines
using R, all integers used in C functions will use unsigned integers.
---------------------
haplo.glm: haplo.effect="recessive"
The estimation stops if no columns are left in the model.matrix for
homozygotes of the haplotype. For haplotypes that do not have any
subjects with a posterior probability of being homozygous for the
haplotype, those subjects are grouped into the baseline effect.
Guidelines for rare haplotypes are explained further in the manual.
---------------------
haplo.glm: na.action default
When not specified, na.action got set to something other than the
intended 'na.geno.keep'. Now the default setting works.
---------------------
haplo.cc: New Function for Case-Control Analysis
New function added to combine methods of haplo.score,
haplo.group and haplo.glm into one set of output for Case-Control
data. Choose haplotypes for analysis by haplo.min.count only, not a
frequency cut-off.
------------------
haplo.score: skip.haplo new default
Default for skip.haplo is now 5/(nrow(geno)*2)
------------------
haplo.glm: haplo.freq.min and haplo.min.count control parameters
Haplotypes used in the glm are still chosen by haplo.freq.min, but
the default is based on a minimum expected count of 5 in the
sample. The better choice for selecting haplotypes is
haplo.min.count. The issue is documented in the manual and help files.
-----------------
haplo.score: max-stat simulated p-value
A better description of this is included in the manual and help file.
------------------
haplo.em.control and haplo.em: defaults for control parameters changed
The defaults for control parameters:
max.iter = 5000, changed from 500
insert.batch.size = 6, changed from 4
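The new defaults correspond to a call like the following (geno is a
placeholder genotype matrix; both values shown are now the defaults,
written out only for illustration):

```r
library(haplo.stats)

em.fit <- haplo.em(geno,
                   control = haplo.em.control(max.iter = 5000,
                                              insert.batch.size = 6))
```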
------------------
locus
The genetics package for R has a function named locus which does not
agree with locus from haplo.stats. We do not plan to change it, so be
aware of the possible clash if you use both packages.
------------------
haplo.scan: new function
For analyzing a genome region with case-control
data. Search for a trait-locus by sliding a fixed-width window over
each marker locus and scanning all possible haplotype lengths within
the window.
=================================================================
### changes made prior to release 1.1.1 #####
=================================================================
----------------------
haplo.glm: Warnings for non-integer weights
glm.fit for R does not allow non-integer weights for subjects, whereas
S-PLUS does. Use a glm.fit.nowarn function for R to ignore warnings.
---------------------
haplo.glm: Character Alleles
Local settings for strings as factors cause confusion for keeping
original character allele values. To ensure consistency of allele
codes, use setupGeno() and then, in the haplo.glm call, use allele.lev
as documented in the manual and help files.
---------------------
haplo.score.slide: add to package
Run haplo.score on all contiguous subsets of size n.slide from the
loci in a genotype matrix (geno).
---------------------
haplo.score: simulations controlled for precision
Employ simulation precision criteria for p-values, adopted from
Besag and Clifford [1991]. Control simulations with
score.sim.control.