Invited review: Unknown-parent groups and metafounders in single-step genomic BLUP

Y. Masuda,1* P.M. VanRaden,2 S. Tsuruta,1 D.A.L. Lourenco,1 and I. Misztal1

1Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
2USDA, Agricultural Research Service, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705-2350

*Corresponding author


2021 J. Dairy Sci. (?)
© American Dairy Science Association, 2021. All rights reserved.

ABSTRACT

Single-step genomic BLUP (ssGBLUP) is a method for genomic prediction that integrates the pedigree relationship matrix (A) and the genomic relationship matrix (G) into a linear mixed model through the inverse of a unified relationship matrix. In dairy cattle, pedigree information is often incomplete. Missing pedigree can cause bias and inflation of genomic estimated breeding values (GEBV) in ssGBLUP. Three major issues are associated with missing pedigree: failure to account for selection, missing inbreeding in pedigree relationships, and incompatibility between G and A in level and scale. These issues can be addressed with a proper model for unknown-parent groups (UPG). The theory of UPG is well established for pedigree BLUP but remains unclear for ssGBLUP. This study reviews the development of the UPG model in pedigree BLUP, the properties of UPG models in ssGBLUP, and their effect on genetic trends and genomic predictions. The similarities and differences between UPG and metafounders (MF), a generalized UPG model, are also reviewed. One UPG model (QP) is derived from a transformation of the mixed-model equations; it shows good convergence behavior when solving the equations, but with insufficient data it may produce biased genetic trends and underestimated UPG effects because of confounding among GEBV, UPG effects, and the general mean for genotyped animals. The QP model can be altered by removing the genomic relationships that link GEBV and UPG effects; the altered model results in less bias in genetic trends and less inflation in genomic predictions than the QP model, especially for large data sets. A newer model encapsulates the UPG equations into the pedigree relationships for genotyped animals; it performs well in simulations of purebred populations. The MF model is a comprehensive solution to the missing-pedigree issues; it is a suitable choice for multibreed or crossbred evaluations if the data set allows a reasonable relationship matrix for MF to be estimated.
Missing pedigree influences genetic trends, but its effect on the predictability of genotyped animals should be negligible when many proven bulls are genotyped. In that situation, SNP effects can be back-solved from the GEBV of older genotyped animals, and indirect predictions based on those SNP effects can be used to calculate GEBV for young genotyped animals with missing parents.

Key Words: bias, genomic selection, inflation, pedigree, relationship matrix
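
The back-solving step mentioned at the end of the abstract can be illustrated with a minimal numerical sketch. It is not the authors' implementation. It assumes a genomic relationship matrix of the form G = ZZ'/k, where Z holds genotypes centered by base allele frequencies and k = 2Σp(1 − p), and it assumes G is used without blending or tuning. The allele frequencies, genotypes, and GEBV below are simulated placeholders, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_old, n_young, n_snp = 20, 5, 200   # toy dimensions (assumption)

# Assumed base allele frequencies and simulated genotypes coded 0/1/2.
p = rng.uniform(0.1, 0.9, n_snp)
M = rng.binomial(2, p, size=(n_old + n_young, n_snp)).astype(float)

# Center by 2p and compute the scaling constant k = 2 * sum(p * (1 - p)).
Z = M - 2.0 * p
k = 2.0 * np.sum(p * (1.0 - p))

Z_old, Z_young = Z[:n_old], Z[n_old:]
G_old = Z_old @ Z_old.T / k          # genomic relationships among older animals

# Placeholder GEBV for the older genotyped animals; in practice these would
# come from the single-step evaluation.
gebv_old = rng.normal(size=n_old)

# Back-solve SNP effects: a_hat = Z' G^{-1} u_hat / k.
snp_effects = Z_old.T @ np.linalg.solve(G_old, gebv_old) / k

# Indirect prediction for young genotyped animals: u_young = Z_young a_hat.
gebv_young = Z_young @ snp_effects
```

With an unblended, invertible G, the back-solved SNP effects reproduce the GEBV of the older animals (Z_old @ snp_effects equals gebv_old up to rounding), so the indirect predictions for young animals depend only on their genotypes and not on their missing pedigree. In routine evaluations G is usually blended with pedigree relationships and the GEBV may include UPG or MF contributions, so the formula would need the corresponding adjustments.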