AIPL RESEARCH REPORT
GENOMIC1 (4-10)

Imputation of Cow Genotypes and Adjustment of PTAs

T.A. Cooper, M.E. Tooker, P.M. VanRaden, G.R. Wiggans, and J.B. Cole
Animal Improvement Programs Laboratory, ARS-USDA, Beltsville, MD 20705-2350
301-504-8334 (voice) ~ 301-504-8092 (fax) ~ http://aipl.arsusda.gov

Two new techniques were introduced in April 2010 to incorporate all available information in the evaluations: imputation of cow genotypes and adjustment of cow evaluations. The use of imputed genotypes has added over 1600 cows to the genomic database, and adjusting cow evaluations has increased accuracy. No other country that produces genomic evaluations uses cow information in its genomic predictions. The US has decided to use cow information when it is available rather than ignore the female contribution to the genomic equations. To do this, cow evaluations need to be similar to those of the bulls. Adjusting the cows increases the accuracy of the evaluations and allows all available data to be used in the genomic equations.

The largest changes occur in genotyped or imputed cows that have high reliability: their PTAs are less influenced by the parent average (PA), so they receive larger adjustments. Big changes were also observed in popular bulls such as Shottle. These changes were due in part to how productive life is now calculated and in part to the impact of the adjustment on bulls with a high number of genotyped daughters. Several cows that had extreme drops in their PTAs have been inspected by hand, and their new PTAs agree more closely with the performance of their offspring.

Imputation

Which cows have imputed genotypes?

A non-genotyped cow receives an imputed genotype if at least 90% of her single nucleotide polymorphism (SNP) genotype can be determined. This usually requires 4 or 5 genotyped progeny. Some animals reach that threshold because daughters with imputed genotypes are included. The first offspring contributes ~50% of the imputed SNP information, the second contributes ~25% more, the third ~12.5%, the fourth ~6.25%, and so on until the 90% minimum is reached. Each additional offspring increases the accuracy.
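The halving pattern described above can be sketched in a few lines of Python. This is only an illustration of the approximate percentages quoted in the text, not the imputation procedure itself; the function names are hypothetical, and the sketch assumes each additional offspring resolves half of the genotype that is still unknown, ignoring any information from ancestors.

```python
def imputed_fraction(n_progeny: int) -> float:
    """Approximate share of a dam's SNP genotype recovered from n genotyped progeny.

    Assumption for illustration: each offspring resolves half of what is
    still unknown (50%, then 25% more, then 12.5% more, ...).
    """
    return 1.0 - 0.5 ** n_progeny


def progeny_needed(threshold: float = 0.90) -> int:
    """Smallest number of progeny whose cumulative share reaches the threshold."""
    n = 0
    while imputed_fraction(n) < threshold:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} progeny: {imputed_fraction(n):.2%}")
    print("progeny needed for 90%:", progeny_needed(0.90))
```

Under this simplification, three offspring give 87.5% and the fourth gives 93.75%, which agrees with the statement that 4 or 5 genotyped progeny are usually required.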

Is phenotypic information used to determine an imputed genotype?

No, the genotype of the animal is determined only by the genotype of her progeny and/or her ancestors, and is completely independent of any phenotypic information.

How accurate are the imputed genotypes?

As reported by [VanRaden (2010)], the imputed genotypes are, on average, between 96.6% and 97.9% accurate depending on breed. These values were tested using simulated data. The procedure was recently tested using a cow that was imputed for research purposes, and which was subsequently genotyped using the BovineSNP50 chip (Illumina, San Diego, CA). The imputed genotype and the genotype received from the chip were 98% similar.

Breed         Number   % Correct Calls
Holstein      1308     97.4
Jersey        141      97.9
Brown Swiss   56       96.6

Cow adjustment

Why did the genotyped cows need an adjustment?

Genotyped cows usually are the most valuable cows in a dairy herd. These cows needed an adjustment because they were being over-evaluated. The high PTA values were causing the genomic predictions to suffer in accuracy because the marker effects were trying to explain a phenotype that was inflated compared to the bulls that carried those same genes. A study completed over a year ago showed that the added information from genotyped cows was not increasing reliability, indicating that some type of adjustment was necessary. By adjusting the over-evaluated cow PTA, it was possible to increase the accuracy of the genomic predictions. The following table shows the gains in reliability over PA when all genotyped animals were included in the predictor population, and when cows were excluded. Negative differences indicate traits where a greater gain in reliability is achieved when the females are not included in the predictor population.

Gains in Reliability as of 2008

                         Holstein                                     Jersey
                         Reliability gain¹                            Reliability gain¹
                         (percentage units)                           (percentage units)
Trait                    Bulls only  Bulls and cows  Gain difference  Bulls only  Bulls and cows  Gain difference
Net merit                23.4        23.6            0.2              7.4         8.9             1.5
Milk yield               26.7        26.3            −0.4             9.0         6.8             −2.2
Fat yield                33.4        31.9            −1.5             13.8        12.5            −1.3
Protein yield            24.3        24.1            −0.2             3.1         1.4             −1.7
Fat percentage           49.8        50.3            0.5              39.0        37.8            −1.2
Protein percentage       37.2        37.5            0.3              26.3        27.4            1.1
Productive life          31.3        32.5            1.2              7.7         10.7            3.0
Somatic cell score       22.0        23.1            1.1              5.9         7.5             1.6
Daughter pregnancy rate  27.0        27.7            0.7              10.4        10.6            0.2

¹Reliability gain based on 2004 information to predict 2008 bull performance.

2010 Gains including imputed animals and adjustment.

What animals were adjusted and how were they adjusted?

All genotyped cows, heifers, and imputed cows were adjusted by reducing the mean and variance of the traditional PTA. All genotyped animals, including bulls, were affected by the adjustment made to the maternal portion of the parent average, regardless of the genotype status of the dam. The adjustments were made only to milk, fat, and protein and to the percentage traits. Brown Swiss adjustments were not implemented because of the low number of genotyped cows.

Genomic evaluations are calculated by using deregressed values from traditional PTA to estimate SNP effects. The direct genomic value (DGV) is the sum of an animal’s SNP effects and should be consistent with the traditional PTA. This is true for bulls; for cows, however, the traditional PTA is higher. This upward bias of traditional cow PTA may be the reason that adding genotyped cows did not increase reliability.

To make the cow PTA more like those of the bulls for the yield traits (milk, fat, and protein), mean and variance adjustments were calculated. Evaluations were stratified by reliability so that cow PTA could be adjusted to be similar to those of bulls with the same reliability. The variance adjustment was the SD of deregressed Mendelian sampling within reliability group for bulls divided by the SD of deregressed Mendelian sampling within reliability group for cows. The mean adjustment was the difference between bull and cow evaluations after variance adjustment. Corrected PTA values were calculated from the adjusted deregressed Mendelian sampling values after reversing the deregression.

Adjusted deregressed values were then used to calculate SNP effects. A SNP effect is the contribution of that SNP to the total PTA for a given trait; it can be positive or negative. The SNP effects are then applied to animals based on their genotype, in combination with polygenic effects and the adjusted traditional values. This combined value is known as the genomic PTA. The following table shows the variance and mean adjustments applied to the deregressed values of the traditional cow PTA and the maternal portion of the PA.

           Milk                    Fat                     Protein
Breed      Std. Deviation  Mean    Std. Deviation  Mean    Std. Deviation  Mean
Holstein   0.84            −784    0.72            −27.5   0.77            −23.0
Jersey     0.72            −643    0.67            −31.4   0.67            −24.2
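The variance and mean adjustments described above can be illustrated with a minimal Python sketch for a single reliability group. This is one reading of the description, not the production procedure: the function name is hypothetical, the input values are made up for illustration, and the mean adjustment is assumed to be the bull mean minus the cow mean after the cow values have been rescaled.

```python
from statistics import mean, stdev


def adjust_cow_values(bull_ms, cow_ms):
    """Rescale cow deregressed Mendelian sampling values toward the bulls.

    Returns (sd_ratio, mean_shift, adjusted_cow_values) for one reliability group.
    """
    # Variance adjustment: SD for bulls divided by SD for cows.
    sd_ratio = stdev(bull_ms) / stdev(cow_ms)
    scaled = [value * sd_ratio for value in cow_ms]
    # Mean adjustment (assumed form): bull mean minus rescaled cow mean.
    mean_shift = mean(bull_ms) - mean(scaled)
    adjusted = [value + mean_shift for value in scaled]
    return sd_ratio, mean_shift, adjusted


if __name__ == "__main__":
    # Made-up values for one reliability group, not real evaluations.
    bulls = [-400.0, -200.0, 0.0, 200.0, 400.0]
    cows = [300.0, 550.0, 800.0, 1050.0, 1300.0]
    ratio, shift, adjusted = adjust_cow_values(bulls, cows)
    print(f"SD ratio = {ratio:.2f}, mean shift = {shift:.0f}")
    print("adjusted cow values:", [round(v) for v in adjusted])
```

With these invented inputs the cow values end up with the same mean and SD as the bull values in the group. The adjustments actually applied are the ones published in the table above.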

If the adjustment of cows is needed, why was it only done to genomically tested cows?

The undesirable consequence of the adjustment is that non-genotyped cows are not fully comparable with genotyped ones. It may be useful to think of this as though genotyped cows have been converted to have evaluations comparable with those of bulls, and non-genotyped cows have evaluations that are the same as they have always been. There is a new link on the AIPL website that ranks only genotyped cows by $NM. A research project showed that a similar result can be achieved for all cows by lowering the heritability of the yield traits. The industry did not approve the change to heritability, so the focus moved to fixing those animals that are contributing to the genomic equations. However, not being able to compare genotyped and non-genotyped cows is leading us to reconsider implementing the change in heritability.

What are the effects of the cow adjustment?

This table shows the gain in reliability for predicting the performance of young bulls when bulls and adjusted cows are used to predict SNP (marker) effects versus when bulls and unadjusted cows are used. These gains in reliability are also observed in cows and heifers. The full paper is available in [Wiggans et al. (2010)].

            Gain in Reliability
Trait       Holstein   Jersey
Milk        3.0        8.8
Fat         3.1        8.8
Protein     2.1        8.6
Fat %       2.6        7.7
Protein %   2.7        9.4

Resulting genomic evaluations were more accurate when the rescaled cow PTAs were included in estimation of marker effects. Genomic evaluations for the top cows, top young bulls, and top heifers decreased by about 250 pounds for milk, 8 pounds for fat, and 5 pounds for protein, whereas genomic evaluations for the top bulls with daughters decreased only by about 70 pounds for milk, 3 pounds for fat, and 2 pounds for protein. Adjustments were largest for foreign bulls with a high proportion of genotyped daughters. The population average of all bulls with daughters decreased slightly by 40 pounds for milk, 1.5 pounds for fat, and 1.5 pounds for protein; standard deviations also decreased slightly by about 1%. Correlations between genomic evaluations before and after the adjustment were 0.997 for bulls with daughters and 0.990 for cows and young animals.

Example:

The PTA milk for Holstein cow A dropped from 1982 pounds in the January release to 381 pounds in the April release. She has several offspring; two of them are full sisters whose sire is a high-reliability (99%) bull, and they are used here to illustrate the changes.

Evaluation from January 2010 (all values are PTA milk pounds):

  • Cow A = 1982
  • Daughter X = 1124
  • Daughter Y = 1636
  • PA for both daughters = 2048

Both offspring are performing below parent average.

Evaluation from April 2010:

  • Cow A = 381
  • Daughter X = 996
  • Daughter Y = 1460
  • PA for both daughters = 1178

This illustrates the inflation of the maternal portion of the PA. After the milk PTA of the dam, Cow A, is adjusted, daughter X is below the PA by 182 pounds and daughter Y is above it by 282 pounds. Cow A's milk PTA of 381 pounds more accurately predicts the performance of her offspring.
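The comparison in this example can be checked with a short Python sketch that uses only the PTA milk values listed above. The dictionary layout and function name are hypothetical and serve only to show the arithmetic.

```python
# PTA milk (pounds) from the two releases in the example above.
evaluations = {
    "January 2010": {"pa": 2048, "daughters": {"X": 1124, "Y": 1636}},
    "April 2010": {"pa": 1178, "daughters": {"X": 996, "Y": 1460}},
}


def deviations_from_pa(release):
    """Daughter PTA minus parent average, per daughter."""
    return {name: pta - release["pa"] for name, pta in release["daughters"].items()}


if __name__ == "__main__":
    for label, release in evaluations.items():
        devs = deviations_from_pa(release)
        average = sum(devs.values()) / len(devs)
        print(f"{label}: {devs}, mean deviation = {average:+.0f}")
```

In January both daughters fall well below the PA (by 924 and 412 pounds), whereas in April the deviations are −182 and +282 pounds, so the PA is much closer to the daughters' own evaluations.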

How can I compare genotyped cows to non-genotyped cows?

To allow genotyped and non-genotyped cows to be compared, the regression of genomic PTA on traditional PTA was calculated. The resulting intercept (a) and slope (b) values in the tables below can be used to estimate how non-genotyped PTAs compare with genotyped PTAs and vice versa.

Holstein

             Milk           Fat            Protein
Birth Year   a      b       a      b       a      b
< 2000       −294   0.76    −10    0.70    −8     0.70
2001         −278   0.76    −10    0.70    −8     0.70
2002         −243   0.76    −7     0.70    −6     0.70
2003         −196   0.76    −6     0.70    −5     0.70
2004         −176   0.76    −4     0.70    −3     0.70
2005         −153   0.76    −2     0.70    −2     0.70
2006         −113   0.76    −0     0.70    1      0.70
> 2007       −26    0.76    3      0.70    1      0.70

Jersey

             Milk           Fat             Protein
Birth Year   a      b       a       b       a       b
< 2004       −433   0.87    −21.2   0.84    −15.6   0.85
> 2005       −176   0.80    −6.2    0.73    −3.3    0.71


Genomic PTA = a + (b * Traditional PTA)


To see the effect that changes in milk, fat, and protein have on $NM due to genotyping and the cow adjustment, use the following values ($ per PTA unit): milk, $0.0001; fat, $2.89; protein, $3.41. Net merit is affected by changes in many traits; the calculation below shows only the changes due to milk, fat, and protein.

Genomic (Gen) $NM = Traditional (Trad) $NM - (Trad Milk PTA - Gen Milk PTA)*.0001 - (Trad Fat PTA - Gen Fat PTA)*2.89 - (Trad Protein PTA - Gen Protein PTA)*3.41


Example:

Holstein Cow Born in 2006 Traditional Evaluation:

Milk PTA = 1000 pounds

Fat PTA = 25 pounds

Protein PTA = 15 pounds

$NM = 400

Comparable Genomic PTA:

Genomic Milk PTA = -113 + (.76 * 1000) = 647

Genomic Fat PTA = 0 + (.70 * 25) = 17.5

Genomic Protein PTA = 1 + (.70 * 15) = 11.5

$NM = 400 - (1000 - 647)*.0001 - (25 - 17.5)*2.89 - (15 - 11.5)*3.41 = 366
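The worked example above can be reproduced with a short Python sketch. The coefficients are copied from the 2006 row of the Holstein table and the dollar values from the text; the names `HOLSTEIN_2006`, `NM_VALUE`, `comparable_genomic_pta`, and `comparable_genomic_nm` are hypothetical and chosen only for this illustration.

```python
# Intercept (a) and slope (b) for Holstein cows born in 2006, from the table above.
HOLSTEIN_2006 = {"milk": (-113, 0.76), "fat": (0, 0.70), "protein": (1, 0.70)}

# Dollar value per PTA unit used in the $NM comparison.
NM_VALUE = {"milk": 0.0001, "fat": 2.89, "protein": 3.41}


def comparable_genomic_pta(traditional, coefficients):
    """Genomic PTA = a + (b * Traditional PTA), trait by trait."""
    return {trait: a + b * traditional[trait] for trait, (a, b) in coefficients.items()}


def comparable_genomic_nm(traditional_nm, traditional, genomic):
    """Traditional $NM minus the value of the drop in milk, fat, and protein PTA."""
    loss = sum((traditional[t] - genomic[t]) * NM_VALUE[t] for t in NM_VALUE)
    return traditional_nm - loss


if __name__ == "__main__":
    traditional = {"milk": 1000, "fat": 25, "protein": 15}
    genomic = comparable_genomic_pta(traditional, HOLSTEIN_2006)
    nm = comparable_genomic_nm(400, traditional, genomic)
    print({trait: round(value, 1) for trait, value in genomic.items()})
    print(f"Comparable genomic $NM = {nm:.0f}")
```

For other birth years or for Jerseys, the coefficients would be replaced with the corresponding row of the tables; as in the text, only the milk, fat, and protein contributions to $NM are considered.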

How do descendants of genotyped animals (genomic indicator code 2) compare?

          (Genomic PTA - Traditional PTA)
Trait     Holstein   Jersey
Milk      −57        −67
Fat       −2.5       −2.1
Protein   −2.1       −2.6

Animals with genotyped ancestors are more closely related to the genotyped population. The table above shows how much the PTA of a cow with genotyped ancestors (genomic indicator code 2) can be expected to change when she is genotyped.