Alternative SNP weighting for SSGBLUP evaluation of stature in US Holstein in the presence of selected sequence variants1

B.O. Fragomeni*2, D.A.L. Lourenco†, A. Legarra‡, P.M. VanRaden§, I. Misztal†

*Department of Animal Science, University of Connecticut, Storrs-Mansfield, CT 06269
†Department of Animal and Dairy Science, University of Georgia, Athens, GA 30602
‡Institut National de la Recherche Agronomique, UMR1388 GenPhySE, Castanet Tolosan, France 31326
§Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD, USA

1This study was partially funded by Agriculture and Food Research Initiative Competitive Grants no. 2015-67015-22936 from the US Department of Agriculture’s National Institute of Food and Agriculture.
2Corresponding author


2019 J. Dairy Sci. (?)
© American Dairy Science Association, 2019. All rights reserved.

ABSTRACT

Causal variants inferred from sequence data analysis are expected to increase the accuracy of genomic selection. In this work we evaluate the gain in accuracy from selected sequence variants, for US Holstein data and the trait stature, with three prediction methods: GBLUP using de-regressed proofs assuming either homogeneous residual variances (HOMVAR, which ignores the number of equivalent daughters per bull) or heterogeneous residual variances (HETVAR), and single-step GBLUP (ssGBLUP) using actual phenotypes.

Phenotypic data included 3,999,631 records for stature on 3,027,304 Holstein cows. Genotypes on 54,087 SNP markers (54k) were available for 26,877 bulls. An additional 16,648 selected sequence variants were available, for a total of 70,735 markers (70k).

In all methods, SNPs in the genomic relationship matrix (GRM) were unweighted or weighted iteratively, with weights derived either by quadratic functions of SNP effects or by Nonlinear A. Adjusted reliabilities of predictions were obtained by cross validation. With unweighted GRM derived from 54k markers, the reliabilities (×100) were 68.8 for GBLUP HOMVAR, 72.4 for GBLUP HETVAR, and 75.3 for ssGBLUP. With unweighted GRM derived from 70k markers, the reliabilities were 69.5, 73.4, and 76.0, respectively. Weighting by Nonlinear A changed reliabilities to 70.9, 73.3, and 75.9, respectively. Addition of selected sequence variants increased accuracy only slightly. Weighting by quadratic functions reduced reliabilities. Weighting by Nonlinear A increased accuracy in GBLUP HOMVAR but had only a small effect in ssGBLUP. Reliabilities of direct genomic values (DGV) extracted from ssGBLUP using unweighted GRM with 54k markers were higher than reliabilities from any GBLUP. Thus, ssGBLUP seems to capture more information than GBLUP, leaving less room for extra accuracy. Improvements with weighting may be partly due to deficiencies in the model, such as incorrect modeling of residuals or imperfect pseudo-observations.

Keywords: causative variants, BayesA, genomic prediction, sequence data, variable selection
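
The abstract contrasts an unweighted GRM with GRMs whose SNPs are weighted by quadratic functions of SNP effects or by Nonlinear A, but it does not give the formulas. The sketch below illustrates one plausible reading and is not the authors' implementation. It assumes a GRM of the form G = ZDZ'/Σ2p(1-p) with centered genotypes Z, quadratic weights of the form a²·2p(1-p), and Nonlinear A weights of the form 1.125^(|a|/sd(a) - 2). The function names, the base 1.125, the optional cap, the rescaling of weights to mean 1, and the toy genotypes and effects are all assumptions made for illustration.

```python
from statistics import pstdev


def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP from 0/1/2 genotype codes."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def quadratic_weights(effects, freqs):
    """Assumed quadratic weight per SNP: a_j^2 * 2 p_j (1 - p_j)."""
    return [a * a * 2.0 * p * (1.0 - p) for a, p in zip(effects, freqs)]


def nonlinear_a_weights(effects, base=1.125, cap=None):
    """Assumed Nonlinear A-style weight: base ** (|a_j| / sd(a) - 2)."""
    sd = pstdev(effects)
    weights = [base ** (abs(a) / sd - 2.0) for a in effects]
    if cap is not None:
        # Optional upper bound to keep extreme SNPs from dominating.
        weights = [min(w, cap) for w in weights]
    return weights


def weighted_grm(genotypes, weights=None):
    """G = Z D Z' / sum_j 2 p_j (1 - p_j); weights are rescaled to mean 1."""
    freqs = allele_frequencies(genotypes)
    m = len(freqs)
    if weights is None:
        weights = [1.0] * m  # unweighted GRM
    mean_w = sum(weights) / m
    d = [w / mean_w for w in weights]
    # Center each genotype by twice the allele frequency.
    z = [[g - 2.0 * p for g, p in zip(row, freqs)] for row in genotypes]
    denom = sum(2.0 * p * (1.0 - p) for p in freqs)
    n = len(z)
    return [
        [sum(z[i][k] * d[k] * z[j][k] for k in range(m)) / denom for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): 4 animals x 4 SNPs and made-up SNP effects.
toy_genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1], [1, 2, 1, 0]]
toy_effects = [0.5, -0.1, 0.05, 0.2]

toy_freqs = allele_frequencies(toy_genotypes)
g_unweighted = weighted_grm(toy_genotypes)
g_quadratic = weighted_grm(toy_genotypes, quadratic_weights(toy_effects, toy_freqs))
g_nonlinear = weighted_grm(toy_genotypes, nonlinear_a_weights(toy_effects))
```

In an iterative scheme such as the one the abstract mentions, SNP effects would be re-estimated from predictions obtained with the current weighted GRM, the weights recomputed, and the GRM rebuilt. That loop is omitted here because it requires a full GBLUP or ssGBLUP solver.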