What health care providers need to know about polygenic risk scores

Jeanette McCarthy, MPH, PhD

The genetic basis of common diseases like diabetes, cancer, heart disease, asthma and Alzheimer’s disease is complex. A small subset of cases may be due to a single gene variant (i.e. monogenic, or Mendelian, forms) that imparts a greatly increased risk of disease on individuals who carry it. These forms are rare, but genetic tests for these variants are highly predictive of disease.

However, the genetic underpinning of most cases of common diseases is not monogenic, but rather, polygenic. In other words, there are genetic variants in dozens of different genes, each with a tiny effect on disease risk. Individually, these variants are poor predictors of disease and are not routinely offered as genetic tests.

But what if we could combine all of these individual variants with small effects into one test and use that to predict disease risk? That is the basic idea behind polygenic risk scores (PRS).

Where do polygenic risk scores come from?

To understand PRS, we need to step back and explain how we find the individual variants associated with disease in the first place. Genetic variants underlying common diseases are identified through genome-wide association studies (GWAS). These large-scale studies compare the frequency of several hundred thousand pre-selected, common genetic variants in a group of people with a given disease/trait (cases) with the frequency in a group of people without it (controls). Variants showing a significant and reproducible difference in frequency between cases and controls are said to be associated with the disease. To date, dozens to hundreds of genetic variants have been robustly associated with numerous diseases/traits.
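The case-control comparison for a single variant can be sketched as follows. This is a minimal illustration only: the allele counts are invented, the function name `allele_association` is hypothetical, and real GWAS typically use regression models that adjust for ancestry and other covariates rather than a bare 2x2 table.

```python
import math

def allele_association(case_risk, case_other, control_risk, control_other):
    """Compare risk-allele frequency between cases and controls for one variant.

    Arguments are allele counts (hypothetical data), not numbers of people.
    """
    freq_cases = case_risk / (case_risk + case_other)
    freq_controls = control_risk / (control_risk + control_other)

    # Odds ratio: odds of carrying the risk allele in cases vs. controls.
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)

    # Pearson chi-square statistic for a 2x2 table (1 degree of freedom).
    n = case_risk + case_other + control_risk + control_other
    numerator = n * (case_risk * control_other - case_other * control_risk) ** 2
    denominator = (
        (case_risk + case_other)
        * (control_risk + control_other)
        * (case_risk + control_risk)
        * (case_other + control_other)
    )
    chi2 = numerator / denominator

    # Upper-tail p-value of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "freq_cases": freq_cases,
        "freq_controls": freq_controls,
        "odds_ratio": odds_ratio,
        "chi2": chi2,
        "p_value": p_value,
    }

# Invented counts: 1,000 cases and 1,000 controls, two alleles per person.
result = allele_association(case_risk=600, case_other=1400,
                            control_risk=500, control_other=1500)
print(result)
```

In this made-up example the risk allele is more common in cases (30%) than in controls (25%), giving an odds ratio of about 1.29. Because hundreds of thousands of variants are tested at once, GWAS conventionally require a far smaller p-value (commonly 5 x 10^-8) than this single comparison would reach.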

After identifying individual associated variants comes the task of incorporating the variants into a single, quantitative PRS. There are many different ways to do this and every decision made about which variants to include and how to combine them will impact the performance of the PRS.

Figure 1. Concordance of results between first generation PRS from three companies across several diseases (adapted from Kalf R, et al.)

Early efforts to develop polygenic risk scores

In the late 2000s, after GWAS started churning out associated genes, several companies, including Decode, Navigenics, and 23andMe, commercialized PRS for different traits (note that these companies either no longer exist or no longer offer PRS tests). The problem with these commercial tests, outlined in a study by Kalf R, et al.[1], was wide variability in risk prediction between PRS products from different companies (Figure 1). Each company used different genetic variants and combined them in different ways in its PRS.

The simplest way to combine the variants into a single predictor is to add up the number of risk variants a person has. The more risk variants they have, the higher their risk of disease. A more complex method is not simply to count the variants, but to weight each one according to how strongly it is associated with disease. An even more sophisticated approach would be to account for how different variants interact with each other.
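The first two approaches can be sketched in a few lines. This is an illustration under stated assumptions, not any company's actual method: the variant names and odds ratios are invented, genotypes are coded as the number of risk alleles carried (0, 1 or 2), and the weight for each variant is assumed to be the natural log of its per-allele odds ratio, which is one common choice. The interaction-based approach is not shown.

```python
import math

# Hypothetical variants and per-allele odds ratios, invented for illustration.
ODDS_RATIOS = {"variant_A": 1.10, "variant_B": 1.25, "variant_C": 1.05}

def unweighted_score(genotype):
    """Simple count of risk alleles across all variants in the score."""
    return sum(genotype.get(variant, 0) for variant in ODDS_RATIOS)

def weighted_score(genotype):
    """Sum of risk-allele counts, each weighted by log(odds ratio)."""
    return sum(
        genotype.get(variant, 0) * math.log(odds_ratio)
        for variant, odds_ratio in ODDS_RATIOS.items()
    )

# Two hypothetical people, each carrying three risk alleles in total.
person_1 = {"variant_A": 2, "variant_B": 0, "variant_C": 1}
person_2 = {"variant_A": 0, "variant_B": 2, "variant_C": 1}

for name, genotype in [("person_1", person_1), ("person_2", person_2)]:
    print(name, unweighted_score(genotype), round(weighted_score(genotype), 3))
```

Both people get the same unweighted score because they carry the same number of risk alleles, but person_2 gets a higher weighted score because their alleles sit at the variant with the larger effect. This is one way two PRS built from the same variants can rank the same people differently.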

Polygenic risk scores make a comeback

Fast-forward to 2017-19 and we find polygenic risk scores making a comeback. So, what has changed? More disease-associated variants have been discovered, and more accurate algorithms are being used to find the most predictive combination of variants. Moreover, with sizable populations amassed, it is now feasible to develop a score in one large cohort and evaluate its performance in another equally large, independent cohort, providing much-needed validation.

Recent publications describe the development of PRS across different disease areas, including Alzheimer’s, coronary artery disease, diabetes, breast cancer and other common disorders[2]. Different PRS are based on anywhere from several dozen variants to over a million variants (i.e. genome-wide).

Some companies have begun commercializing PRS tests as well, including reputable genetic testing companies Myriad Genetics, Ambry Genetics, Color Genomics and Phosphorus. Other less well-known players have also entered the market, including Allelica, Genomic Prediction and KarmaGenes, some offering dubious applications. Interestingly, 23andMe, which was an early adopter of PRS, currently does NOT offer these tests.

Controversies over polygenic risk scores

Not everyone is buying the hype around these tests. Questions remain about whether they are predictive or broadly relevant or useful.

Predictive value of PRS

Some argue that as predictive tests go, PRS fall short in terms of sensitivity and specificity[3]. These tests seem to be able to identify a small percentage (generally <1%) of the population at very high risk of developing disease, similar to the risk imposed by Mendelian forms of the same disease (Figure 2).

Figure 2. PRS will identify people at both lower and higher than average risk of disease. Only a small percent of the population will have the highest level of risk.

But what if you’re not in the 1%? The vast majority of people will be told that they have a nominally increased or decreased risk of disease. In these cases, the PRS has little positive or negative predictive value and even less utility. Some argue that PRS are no worse than current risk factors, such as cholesterol levels for assessing heart disease risk. But for high cholesterol, the message is clear: lower your cholesterol levels through diet, exercise or medication to reduce your risk of heart disease. For a high PRS the message is less clear because the biology underlying the risk is not simple and your genetic risk itself is unmodifiable. At best, you get generic advice that everyone should follow to reduce risk.
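To see why a modestly elevated score carries little predictive value, it helps to work through the arithmetic linking sensitivity, specificity and disease prevalence to positive and negative predictive value. The sketch below uses the standard formulas; the input numbers are invented for illustration and do not describe any real PRS.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test, given its accuracy and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence

    ppv = true_pos / (true_pos + false_pos)  # P(disease | test positive)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | test negative)
    return ppv, npv

# Hypothetical test: flags 10% of future cases, 5% false-positive rate,
# for a disease that affects 5% of the population.
ppv, npv = predictive_values(sensitivity=0.10, specificity=0.95, prevalence=0.05)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these assumed inputs, fewer than 1 in 10 people who test positive would go on to develop the disease, and the chance of staying disease-free after a negative result (about 95%) is essentially the same as it was before testing.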

Relevance of PRS across ethnic groups

An even bigger issue stems from the fact that most PRS developed to date were optimized for use in populations of primarily European ancestry, and their performance in other ethnic groups is diminished. This bias can result in inaccurate risk estimates in non-European populations[4]. Lack of general relevance to populations of non-European ancestry can lead to health disparities and confusion about the use of these tests. This major shortcoming is well recognized by test developers and there are ongoing efforts to remedy it.

Utility of polygenic risk scores

As with many Mendelian forms of disease, a high PRS may be useful for identifying individuals who would benefit from increased surveillance, prophylactic treatment or risk-reducing procedures, regardless of which genes are involved.

But unlike its Mendelian counterparts, polygenic risk is not due to a single pathogenic variant in a specific gene, and so we lose some potentially valuable knowledge that might inform treatment. For Mendelian disease genes, patients may be candidates for targeted therapies. Moreover, cascade screening can help identify family members who have the same pathogenic variant and are also at high risk of disease. In contrast, for a high-risk PRS, the underlying genetic risk is generic and has less utility for informing treatment or cascade screening.

Validity of commercial polygenic risk scores

Besides the general concerns noted above, there are other reasons to be cautious about individual PRS tests on the market. With Mendelian genetic tests, testing companies state exactly which genes they are evaluating and the user can independently verify the clinical validity of each gene. PRS, by contrast, are a black box. PRS may differ from one another based on the specific variants included in the test, the weights assigned to each variant and how they are combined to generate the PRS. Moreover, none of that underlying information may be available for the user to independently verify.

An important step in PRS development is validation of the test in a study population that is independent of the population used to derive the PRS. In a largely unregulated field, nobody is policing this step, so we have to rely upon the test providers to share information about validation studies on their websites. It’s very possible that a PRS test being offered by a company lacks any type of rigorous validation.
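One common way to summarize how well a score separates cases from controls is the area under the ROC curve (AUC), computed separately in the derivation cohort and in an independent validation cohort. The sketch below is a minimal illustration with invented scores; the function name `auc` and the cohort data are hypothetical, and a real validation study would also examine calibration and performance across ancestry groups.

```python
def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as half. This equals the area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Invented PRS values. In the derivation cohort the score was tuned to fit.
derivation_cases = [0.8, 0.9, 0.7]
derivation_controls = [0.1, 0.2, 0.3, 0.4]

# In an independent cohort the same score separates the groups less cleanly.
validation_cases = [0.9, 0.4, 0.6]
validation_controls = [0.3, 0.5, 0.2, 0.7]

auc_derivation = auc(derivation_cases, derivation_controls)
auc_validation = auc(validation_cases, validation_controls)
print(auc_derivation, auc_validation)
```

An AUC of 0.5 means the score does no better than chance and 1.0 means perfect separation. A score that looks excellent in the cohort used to build it but drops noticeably in an independent cohort, as in this made-up example, is the kind of overfitting that independent validation is meant to expose.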

Proceed with caution

The value of using a PRS in clinical care is really up to the health care provider. But, here’s what health care providers should know:

  1. Not all PRS are relevant to non-European populations; a PRS may provide biased risk estimates if used in an ethnic group other than the one in which it was developed

  2. Not all PRS are the same and PRS tests from different companies could give conflicting results

  3. Not all PRS tests have been rigorously validated, so sticking with PRS tests offered by reputable genetic testing labs is recommended

[1] Kalf RR, et al. Genet Med 2014;16:85–91.

[2] A selection of the literature: Khera AV, et al. Nat Genet 2018;50(9):1219-1224. Mavaddat N, et al. J Natl Cancer Inst 2015;107(5):djv036. Tosto G, et al. Neurology 2017;88(12):1180-1186.

[3] Wald NJ and Old R. Genet Med 2019.

[4] De La Vega FM and Bustamante C. Genome Med 2018;10(1):100.