At the request of the gnomAD team at the Broad Institute, the name 'gnomAD' has been removed from this resource. The revised name of our resource, the world's largest collection of tandem repeat variation, is the Tandem Repeat Aggregation Atlas (abbreviated TR-Atlas).

A genome-wide spectrum of tandem repeat expansions in 338,963 humans


Tandem repeats (TRs), which constitute ~6% of the human genome, have profound impacts on evolution and human disease. TRs have mutation rates orders of magnitude higher than those of single-nucleotide variants (SNVs) and structural variants (SVs). Importantly, TR expansions are implicated in more than 50 human diseases, including amyotrophic lateral sclerosis (ALS), Huntington's disease, and multiple cancers. Furthermore, TR expansions captured by whole genome sequencing (WGS) have been widely used in rare disease diagnosis. However, current biobank resources utilizing WGS have largely overlooked TR expansions, and there is no biobank-scale reference map for TR expansions. Here we introduce the first phase of the Tandem Repeat Aggregation Atlas (TR-Atlas) at the University of California, Irvine, a biobank-scale reference of 0.86 million TRs derived from 338,963 WGS samples of diverse ancestries (39.5% non-European samples). TR-Atlas enables users to ascertain the prevalence or rarity of a specific TR expansion within its respective ancestry. Importantly, this resource can differentiate common, presumably benign TR expansions, which are prevalent in TR-Atlas, from potentially pathogenic TR expansions, which are found more frequently in disease groups than in TR-Atlas. The freely available TR-Atlas is a resource for researchers, physicians, and genetic counselors interpreting TR expansions in individuals with genetic diseases.