Animal Improvement Programs : Software / findhap.f90

findhap.f90 Find haplotypes and impute genotypes using multiple chip sets.
DOWNLOAD: Version 2 program, example files, and executable
INPUTS:  
genotypes.txt Format: animal# chip# #SNPs genotypes
Sort by animal#. Genotype codes are 0, 1, 2, and 5 = missing
For fixed length input, set chip# to 1 and missing genotypes to 5
For variable length input, #SNPs and order must match chromosome.data
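As a rough illustration of the record layout described above, the following Python sketch parses one line of genotypes.txt. It assumes whitespace-separated fields with the genotypes written as one unbroken string of single-digit codes; the actual column widths read by findhap.f90 are not stated here, and the names `GenotypeRecord` and `parse_genotype_line` are hypothetical.

```python
from typing import List, NamedTuple


class GenotypeRecord(NamedTuple):
    animal: int
    chip: int
    n_snps: int
    genotypes: List[int]


# Codes listed for genotypes.txt: 0, 1, 2, and 5 = missing.
VALID_CODES = {0, 1, 2, 5}


def parse_genotype_line(line: str) -> GenotypeRecord:
    """Parse 'animal# chip# #SNPs genotypes' (assumed whitespace-separated)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, found {len(fields)}")
    animal, chip, n_snps, codes = fields
    genotypes = [int(c) for c in codes]
    if len(genotypes) != int(n_snps):
        raise ValueError("#SNPs does not match the number of genotype codes")
    unexpected = set(genotypes) - VALID_CODES
    if unexpected:
        raise ValueError(f"unexpected genotype codes: {sorted(unexpected)}")
    return GenotypeRecord(int(animal), int(chip), int(n_snps), genotypes)
```

A check like this can catch a mismatch between #SNPs and the genotype string before the file is passed to the program.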
chromosome.data List of all SNPs used and which SNPs are on each chip
Sort by chromosome number and position within chromosome
X-specific chromosome last, after pseudo-autosomal 'chromosome'
Y-specific SNPs are not supported yet
pedigree.file Format: sex  animal#  sire#  dam#  birthdate  animal ID  animal name
Sort in ascending birth date order
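The sort requirement above can be verified before running the program. The sketch below is only an illustration: it assumes whitespace-separated fields, integer animal, sire, and dam numbers, and a birth date written as a sortable integer such as YYYYMMDD, none of which is confirmed here. The names `PedigreeRecord`, `parse_pedigree_line`, and `first_out_of_order` are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class PedigreeRecord(NamedTuple):
    sex: str
    animal: int
    sire: int
    dam: int
    birthdate: int  # assumed sortable integer, e.g. YYYYMMDD
    id_and_name: str  # animal ID and name kept together; either may contain spaces


def parse_pedigree_line(line: str) -> PedigreeRecord:
    """Parse 'sex animal# sire# dam# birthdate animal ID animal name'."""
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError("expected at least 5 fields")
    sex, animal, sire, dam, birthdate = parts[:5]
    rest = parts[5].strip() if len(parts) > 5 else ""
    return PedigreeRecord(sex, int(animal), int(sire), int(dam), int(birthdate), rest)


def first_out_of_order(lines: Iterable[str]) -> Optional[int]:
    """Return the 1-based index of the first record born before its predecessor."""
    previous = None
    for index, line in enumerate(lines, start=1):
        record = parse_pedigree_line(line)
        if previous is not None and record.birthdate < previous:
            return index
        previous = record.birthdate
    return None
```

A return value of `None` means the records are already in ascending birth date order under these assumptions.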
findhap.options Program control file with user-defined options
OUTPUTS:  
hap.list List of all haplotypes found in each segment
hap.found Each animal's paternal and maternal haplotypes (2 lines / animal)
hap.inherit Tracks inheritance and crossovers for each parental chromosome
hap.filled Summarizes imputation quality for each animal
cross.overs Lists exact location of all detected crossovers
allele.frequency Estimated allele frequencies and missing rates for each SNP
genotypes.filled Imputed genotypes with codes: 0 = BB, 1 = AB, 2 = AA, 3 = B_, 4 = A_, 5 = __
Number of animals output may exceed input because of imputed dams
Remaining missing alleles in code 3, 4, and 5 can be set using allele frequencies
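One possible way to set the remaining missing alleles is to replace each unknown allele with its expected value. The Python sketch below does this under stated assumptions: the frequency supplied is that of allele A (the convention used in allele.frequency is not given here), genotype codes 0 to 2 count A alleles, and the unknown alleles are treated as independent draws from the population. The function name `expected_genotype` is hypothetical and this is not necessarily how findhap.f90 or downstream programs handle these codes.

```python
def expected_genotype(code: int, freq_a: float) -> float:
    """Expected number of A alleles for a genotypes.filled code.

    Codes: 0 = BB, 1 = AB, 2 = AA, 3 = B_, 4 = A_, 5 = __
    freq_a: assumed frequency of allele A for this SNP.
    """
    if not 0.0 <= freq_a <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    if code in (0, 1, 2):
        return float(code)      # fully known genotype
    if code == 3:
        return freq_a           # known B plus one unknown allele
    if code == 4:
        return 1.0 + freq_a     # known A plus one unknown allele
    if code == 5:
        return 2.0 * freq_a     # both alleles unknown
    raise ValueError(f"unknown genotype code: {code}")
```

For example, with an assumed A frequency of 0.25, code 4 (A_) becomes 1.25 and code 5 (__) becomes 0.5.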
haplotypes.txt Imputed haplotypes: SNP1 paternal maternal, SNP2 paternal maternal, etc., for each animal
No missing alleles, allowing genotypes to be formed simply as (pat + mat - 2)
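The (pat + mat - 2) rule can be sketched in Python as follows. The formula implies that each allele is coded 1 or 2, so that two copies of allele 2 give genotype 2; that coding is inferred from the formula rather than stated, and the function name `genotypes_from_haplotypes` is hypothetical. The input is assumed to be one animal's allele codes in the interleaved order described above (SNP1 paternal, SNP1 maternal, SNP2 paternal, SNP2 maternal, and so on).

```python
from typing import List, Sequence


def genotypes_from_haplotypes(alleles: Sequence[int]) -> List[int]:
    """Convert interleaved paternal/maternal allele codes to genotype codes."""
    if len(alleles) % 2 != 0:
        raise ValueError("expected one paternal and one maternal allele per SNP")
    if any(a not in (1, 2) for a in alleles):
        raise ValueError("allele codes are assumed to be 1 or 2")
    paternal = alleles[0::2]  # every other value, starting with SNP1 paternal
    maternal = alleles[1::2]  # every other value, starting with SNP1 maternal
    return [p + m - 2 for p, m in zip(paternal, maternal)]
```

Under this assumed coding, the allele pairs (2, 2), (1, 2), and (1, 1) map to genotypes 2, 1, and 0.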
 
VERSION 1 REFERENCES: 2010 Interbull Bulletin 42, 4 pages
2010 9th World Congress Genetics Appl. Livest. Prod., Comm. #27
2011 Genetics Selection Evolution (submitted)
VERSION 2 DIFFERENCES: Options file uses maxlen, minlen, and steps to divide long segment into shorter segments.
Computing time increases with the number of steps used to get from maxlen to minlen.
Population and pedigree haplotyping in one loop vs. 2 separate loops.
Searches for great grandparent haplotypes, not just genotyped parents and grandparents.
Higher accuracy and / or fewer high density genotypes required.
LICENSE: Fortran program findhap.f90 is public domain and was developed with US taxpayer funding. Accurate results are not guaranteed. Please report any bugs to Paul.VanRaden@ars.usda.gov. You may modify, improve, use, and redistribute the code to anyone for any purpose. Or, you can ask Paul to make changes that could benefit US evaluations and other users.


Paul VanRaden
USDA Animal Improvement Programs Laboratory

Last Modified: 02/23/2011