Frequently Asked Questions

NIAGADS and Partners

What is NIAGADS?

The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) is the designated data repository for genetic and genomic data generated from NIA grants. NIAGADS also accepts AD genetics and genomics data from studies not funded by the NIA.

NIAGADS is a U24 cooperative agreement between the NIA and the University of Pennsylvania and has been in existence since 2013. In 2017, NIAGADS developed the Data Sharing Service (DSS), a FISMA-compliant framework that makes it easier for users and administrators to apply for, access, and store AD genetics and genomics data.

Currently, the NIAGADS team is harmonizing, standardizing, and moving legacy datasets from the original NIAGADS database to the DSS, in addition to processing and depositing new datasets for the research community. While the move is taking place, legacy datasets will be available to researchers on the NIAGADS site. All new datasets are available on the DSS site.

How does this differ from NACC, NCRAD, or the AD Knowledge Portal?

NIAGADS stores genetics and genomics data.

NCRAD stores and distributes biospecimens.

NACC stores and distributes standardized clinical and neuropathological research data for the ADRCs.

The AD Knowledge Portal stores multi-omic and drug development data.

How do I map subjects across repositories?

The NIAGADS team is happy to support ID mapping to NACC, NCRAD, and AD Knowledge Portal datasets. If you need help mapping subject IDs, please reach out to help@niagads.org.

Accessing Data in NIAGADS and DSS

Who is eligible to submit an access request?

To be eligible to access the data, investigators must meet the following criteria:

  1. Be permanent employees of their institution at a level equivalent to a full-time assistant, associate, or full professor. Lab staff, trainees (e.g., graduate students), and postdocs are not permitted to submit (NIAGADS and DSS).

  2. Have an eRA Commons ID. The signing official must also have one (DSS only).

How do I get an eRA Commons ID?

This is done through the NIH. Go to http://www.era.nih.gov and click on the accounts tab to learn more.

How do I register my institution?

Only individuals with legal signing authority can complete this process. See Registering Institutions (nih.gov).

Can institutions outside the US apply for access?

Yes

Can for-profit institutions apply for access?

Yes

How do I access the data?

To access data in NIAGADS or the DSS you need to complete a Data Access Request (DAR). Each database requires its own DAR, but the information required for both overlaps substantially.

| Required Documentation | NIAGADS | DSS |
|---|---|---|
| Research Use Statement | X | X |
| Other Project Information (PI, Institution, IT Director, Signing Officer, Staff and Collaborators, etc.) | X | X |
| IRB Approval | X | X |
| IRB Protocol | | X |
| Data Use Certification | X | X |
| NIAGADS Data Distribution Agreement | X | X |
| NIA Genomic Data Sharing Policy | X | X |
| Derived/Secondary Data Return Plan | X | X |
| PI Biosketch | X | |
| Data Transfer Agreement | | X |
| Cloud Use Statement and Cloud Server Provider Information (if applicable) | X (in supplementary information doc) | X |
| Non-Technical Summary | X (in supplementary information doc) | X |

Templates and instructions for filling out each form can be found on each site. 

NIAGADS: Documents and guidelines | NIAGADS 

DSS: Application Instructions - DSS NIAGADS 

How long does this process take?

Requests are reviewed by multiple administrators as well as two review committees. Unfortunately, this can make it difficult to estimate total review times. To minimize review times, it is important to submit documents as described in the application instructions on the NIAGADS and DSS Sites.

If requesting data from DSS, be sure to send your signed Data Transfer Agreement (DTA) to niagads@pennmedicine.upenn.edu in a timely manner. Data Access Requests cannot be approved until a fully executed DTA is received from Penn’s contracts department.

What is the application success rate?

If the submission instructions are followed carefully, most applications are approved. Please reach out to the NIAGADS team at niagads@pennmedicine.upenn.edu should you have any questions.

I am having trouble logging in to the system with my eRA Commons ID. What should I do?

Please make sure you are logging into DARM with your eRA Commons ID username and not with your Login.gov credentials. Also be sure you are logged out of Login.gov before using your eRA Commons ID to log into DARM. If you are still having trouble, you may need to change your eRA Commons password. More information on that can be found here: https://www.era.nih.gov/erahelp/Commons/Commons/access/reset_pswd.htm.

Is there an embargo on datasets distributed by NIAGADS?

No data made available through NIAGADS DSS has an embargo.

Data Types

What type of data do you have?

How can I find datasets of interest?

Click on the dataset tab at the top of the home page. Here you can either use the search bar at the top to enter your query or you can use the filter menu on the side to narrow down results. The lock next to the dataset name tells you if the dataset is publicly available (no application) or restricted access (requires application). Some datasets have publicly available files and restricted files. Within each dataset page you can find details about the study it came from, a breakdown of the participant population, and any related publications.

The Alzheimer’s Genomics Database is another great resource for this. It lets you browse the publicly available GWAS summary statistics through an easy-to-use interface.

If you are still unsure, you can always email the NIAGADS team at help@niagads.org for assistance.

Do I need to know what dataset I am interested in before I apply for access?

Yes, you need to specify the dataset in your application.

Downloading Data

How do I download the data?

NIAGADS covers the cost for investigators to download most of the available data, including joint genotype-called project-level VCFs, phenotypes, and associated metadata. All files smaller than 5 GB can be downloaded directly through the portal, while files larger than 5 GB must be downloaded directly from Amazon. To download files larger than 5 GB, please ensure that you have an Amazon Web Services account. Additional information about using Amazon can be found on our Amazon Instructions page.

What are my options to download the CRAMs and VCFs?

CRAMs, gVCFs, and SV VCFs can be downloaded using the Amazon Requester Pays option, which means that the requesting institution incurs the cost of downloading the data.

Options for AWS download using the Requester Pays option:

AWS resource - You will not be charged if you download to an AWS resource in the same region as our S3 bucket, US-East (N. Virginia).

Local download - An affordable transfer option is an Amazon Snowball. DSS would send the data to your S3 bucket, and you can then create an AWS Snowball export. The device costs $250 to transfer 80 TB of data (plus additional fees).
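
As a rough illustration of the Requester Pays option, the sketch below builds an AWS CLI `aws s3 cp` command with the `--request-payer requester` flag and pins the region to `us-east-1` (US East, N. Virginia). The bucket name, object key, and function names are hypothetical placeholders, not actual DSS locations; the Amazon Instructions page remains the authoritative guide.

```python
import shlex
import subprocess


def requester_pays_cp_command(bucket, key, dest, region="us-east-1"):
    """Build an AWS CLI command that copies one object from a Requester Pays bucket."""
    return [
        "aws", "s3", "cp",
        f"s3://{bucket}/{key}",
        dest,
        # Acknowledges that the requester's AWS account is billed for the request.
        "--request-payer", "requester",
        "--region", region,
    ]


def download(bucket, key, dest):
    """Run the copy. Assumes the AWS CLI is installed and credentials are configured."""
    subprocess.run(requester_pays_cp_command(bucket, key, dest), check=True)


# Hypothetical bucket and key, shown only to illustrate the command shape.
example = requester_pays_cp_command(
    "example-dss-bucket", "crams/sample1.cram", "sample1.cram"
)
print(shlex.join(example))
```

Building the command as a list, rather than a single shell string, avoids quoting problems when keys contain spaces or other special characters.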

Submitting Data

How do I know if my dataset is appropriate to deposit in NIAGADS?

NIAGADS accepts most types of genetics and genomics data. We currently house genotype, GWAS, WGS, WES, RNA-seq, microarray, epigenetic (ChIP-seq, etc.), QTL, methylation, and proteomic data, as well as single-cell data for RNA-seq, microarray, and epigenetic studies. Please see the data submission instructions for a list of data types and documentation requirements here: Data Submission Instructions - DSS NIAGADS

We are always open to taking in new types of genetics and genomics data. Contact help@niagads.org if you have questions about data type submission for existing or new data types.

How do I deposit my dataset?

To deposit data, please email help@niagads.org and review the data submission instructions page for the additional documentation needed to support deposition of your dataset.

What documentation do I need?

Data submission form

NIH genomic sharing plan

Institutional certification

Please see the data submission instructions page for the additional documentation needed to support deposition of your dataset.

How can we transfer the data to you?

The NIAGADS team will work with you to transfer your data over an FTP link or other cloud service. Please contact help@niagads.org to coordinate.

What does it cost to store our data with you? What does it cost to transfer our data to you?

NIAGADS will cover the costs to transfer and store smaller datasets. If you plan to submit a dataset larger than 50 TB or with more than 1,000 short-read raw WGS samples, please reach out to help@niagads.org to discuss storage and transfer options.

How long does it take until others can apply for access to my data?

This depends on timely receipt of all the documentation needed for submission. Once we have complete documentation, it takes about 2 months to deposit the dataset and make it available.

Can I update my data in NIAGADS if I want/need to? What does this process entail?

Yes, you can always update your dataset. To do so, please contact help@niagads.org with the existing accession number and a mapping that shows which files replace existing files, which are new, and which should no longer be shared.
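
One possible way to prepare such a file mapping is sketched below as a small CSV manifest. The column names, action labels, and the `NG00000` accession are illustrative assumptions only; NIAGADS does not prescribe this format in this FAQ, so confirm the preferred layout with help@niagads.org.

```python
import csv
import io

# Hypothetical action labels mirroring the three cases in the answer above.
VALID_ACTIONS = {"replace", "add", "remove"}


def build_update_manifest(accession, changes):
    """Return CSV text listing, per file, what should happen in the update."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["accession", "action", "existing_file", "new_file"])
    for change in changes:
        action = change["action"]
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        writer.writerow([
            accession,
            action,
            change.get("existing_file", ""),  # empty for brand-new files
            change.get("new_file", ""),       # empty for files to withdraw
        ])
    return buffer.getvalue()


# Placeholder accession and file names, for illustration only.
manifest = build_update_manifest("NG00000", [
    {"action": "replace", "existing_file": "calls.vcf.gz", "new_file": "calls_v2.vcf.gz"},
    {"action": "add", "new_file": "readme_v2.txt"},
    {"action": "remove", "existing_file": "old_phenotypes.txt"},
])
print(manifest)
```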

Who will be able to access my data?

Datasets made available via controlled access can be requested by qualified investigators with an eRA Commons account. Investigators must have IRB approval, and their research use must be reviewed and approved by the NIAGADS ADRD Data Access Committee (NADAC), which is made up of program officials from the NIH. Yearly renewal of an approved data access request is required to maintain access to the data.

Datasets that do not require controlled access (e.g., p-values) are made freely and openly available through our Open Access Data Portal.

Alzheimer’s Disease Sequencing Project (ADSP)

Where can I find this dataset to apply?
What data are available?

For the latest information about the data available from the ADSP visit https://adsp.niagads.org/data/data-summary/

When will the R5 WGS data be released?

We currently anticipate that the R5 data will be released in the first half of 2024. 

Support and Resources

What type of support do you offer to users?

NIAGADS currently offers extensive documentation on applying for, accessing, and submitting data here: Data Application and Submission - DSS NIAGADS

Video tutorials can be found on our YouTube site here: NIAGADS - YouTube

Lastly, we hold office hours throughout the year, and anyone can join. Subscribe to our mailing list to be notified when the next office hour will occur.

Where can I find video tutorials?

They can be found here: NIAGADS - YouTube

Where can I ask questions if I have any?

You can always reach out to help@niagads.org if you have any questions and a member of our team will assist you.

We also hold office hours throughout the year with different subject matter experts from our team to answer questions live. If you would like to be kept informed of future office hours, please opt into our mailing list here

What type of resources do you offer users?

NIAGADS currently offers four tools for users.

| | ADVP | FILER | GenomicsDB | VariXam |
|---|---|---|---|---|
| Description | Contains curated ADRD variants from the literature | Contains functional genomics and annotation data for GRCh37/hg19 and GRCh38/hg38 | Browse NIAGADS GWAS summary statistics and ADSP variant annotations or view summaries in the context of genes and variants | Variant browser. Browse for ADSP variants instead of downloading the VCFs |
| What does it cover? | Variants specific to AD | Whole genome | Whole genome with emphasis on contextualizing AD | Specific to ADSP dataset (R1-R4) |
| Data Types | Alzheimer’s disease variants and supporting literature | Functional genomic data such as tissue/cell-type specific epigenetic marks, QTL, chromatin interaction, TF binding | Genes and variants across the whole genome; NIAGADS publicly available GWAS summary statistics | ADSP variants of significance (R1-R4) |
| Use it for | Look up and review supporting information for high-confidence AD variants | Use the harmonized genomic and annotation data collections for downstream genetic and genomic analyses such as GWAS variant analysis, genomic regions, and other analyses | View gene and variant reports to quickly discover existing evidence for AD association or use the NIAGADS genome browser to explore AD GWAS summary statistics in a broader genomic context | Open access, querying variants in the ADSP data without downloading VCFs |

See the tools section below for additional frequently asked questions.

Tools

Alzheimer’s Disease Variant Portal (ADVP)

What is ADVP?

The Alzheimer’s Disease Variant Portal is a database of literature-derived, population-specific variants associated with AD, collated from published GWAS papers from the Alzheimer’s Disease Genetics Consortium (ADGC) and other consortia.

What is the process used to curate variants?

Publications on genome-wide significant and suggestive loci from AD genetic studies (ADGC and other AD GWAS studies in the NHGRI/EBI GWAS Catalog) were selected if they reported GWAS findings. From these selected publications, only main/major findings were extracted, and a systematic data extraction and curation procedure was then applied to each publication. Data were then harmonized to ensure consistency and reduce variability when reporting information (e.g., CSF AB1-42 and CSF Abeta are treated as the same and reported as CSF ab1-42). A second curator validated the work of the first curator, and lastly the extracted data were parsed by customized scripts to validate, annotate, and store the publication, variant, and association data in the relational database. For detailed information on the curation process, see the associated publication here.
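
The harmonization step can be pictured with a minimal sketch like the one below, which maps label variants onto a single canonical form. The synonym table holds only the CSF example given above, and the function name and matching rules (trim, collapse whitespace, ignore case) are assumptions for illustration, not the ADVP curation scripts.

```python
import re

# Illustrative synonym table; only the CSF example comes from the FAQ text.
CANONICAL_LABELS = {
    "csf ab1-42": "CSF ab1-42",
    "csf abeta": "CSF ab1-42",
}


def harmonize_label(raw):
    """Map a reported phenotype label to its canonical form when one is known."""
    cleaned = re.sub(r"\s+", " ", raw.strip())
    # Unknown labels pass through unchanged (apart from whitespace cleanup).
    return CANONICAL_LABELS.get(cleaned.lower(), cleaned)


for label in ["CSF AB1-42", "  CSF   Abeta ", "Hippocampal volume"]:
    print(repr(label), "->", harmonize_label(label))
```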

What data is included?

V1.0 (February 2021) contains >900 loci, >1,800 variants, >80 cohorts, and 8 populations, cataloging 6,990 associations related to disease risk, expression quantitative traits, endophenotypes, and neuropathology. Data primarily come from studies conducted by the Alzheimer’s Disease Genetics Consortium (ADGC).

How would I apply this to my research?

Investigators can use the ADVP to quickly and systematically explore high-confidence AD genetics findings and review insights into population-specific AD genetic architecture.

Alzheimer’s Genomics Database

What is the Alzheimer's Genomics Database?

The Alzheimer’s Genomics Database is a user-friendly web knowledgebase for searching annotated AD-relevant GWAS summary statistics datasets. It also facilitates real-time data mining and provides summaries of association records in the context of functionally annotated genes and variants. Interactive visualizations, such as the NIAGADS Genome Browser, allow users to explore these data tracks and annotations in the broader genomic context and compare them against functional genomics tracks from FILER.

What data are available?

56 publicly available summary statistics datasets deposited in NIAGADS are available, as well as ~263 million annotated variants (including 232M from the ADSP and ~296K with significant AD/ADRD-risk association) and ~20K annotated genes. For more details about the data available, visit the documentation page: GDB | About (niagads.org).

How would I apply this to my research?

Quickly search a gene or variant to see supporting evidence from the publicly available GWAS summary statistics deposited in NIAGADS, and link to other resources, like the AD Knowledge Portal and UniProtKB, to explore additional evidence. You can also examine the datasets with publicly available summary statistics to help determine if you would like to submit a Data Access Request (DAR) for a particular dataset.

How can I navigate the site?

See our YouTube tutorials for navigating the site here: Navigating the NIAGADS Alzheimer's Genomics Database - YouTube

FILER

What is FILER?

FILER is a functional genomics database with a large curated and integrated collection of harmonized, extensible, indexed, and searchable human functional genomics data from >20 data sources.

What data are available?

>50,000 harmonized, annotated genomic datasets across >20 integrated data sources, >1,100 tissues/cell types, and >20 experimental assays, spanning 17 billion genomic records on GRCh37/hg19 and GRCh38/hg38. To learn more about the data types available, click here.

What steps are involved in the data processing?

Primary annotated and functional genomics datasets from existing data sources are collected and compiled into a unified catalogue. Individual genomic datasets are then curated, processed, and imported into FILER using the FILER data harmonization and annotation pipeline. Data type-specific metadata are then harmonized to generate standardized, consistent metadata descriptions of each of the FILER data tracks (sets of genomic/annotation records). To learn more about the curation process, please check out the information here.

How would I apply this to my research?

Use the harmonized genomic and annotation data collection for downstream genetic and genomic analysis, and for further interpretation, characterization, and discovery for GWAS results and biological experiments.

Can I load my own data?

Yes. You can use FILER to find overlaps with your data (in the form of genomic intervals) from other types of user experiments like ChIP-seq peaks, small RNA peaks, or ATAC-seq open chromatin regions.
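
To make the idea of interval overlap concrete, here is a small standalone sketch that checks user intervals against annotation intervals. It assumes BED-style coordinates (0-based start, exclusive end) and uses a simple all-pairs comparison; it does not call FILER and is not how FILER itself is implemented. The coordinates are made up for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED-style, assumed)
    end: int    # exclusive


def overlaps(a, b):
    """True when two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlaps(queries, tracks):
    """Return every (query, track) pair that overlaps. Quadratic, fine for small inputs."""
    return [(q, t) for q in queries for t in tracks if overlaps(q, t)]


# Made-up peaks (for example ChIP-seq or ATAC-seq) and annotation records.
peaks = [Interval("chr1", 100, 200), Interval("chr2", 50, 80)]
annotations = [
    Interval("chr1", 150, 300),  # overlaps the first peak
    Interval("chr1", 200, 250),  # only touches the first peak's end, no overlap
    Interval("chr2", 10, 40),    # ends before the second peak starts
]
hits = find_overlaps(peaks, annotations)
print(hits)
```

For genome-scale inputs, a sorted sweep or an interval tree would scale better than the all-pairs loop used here.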

How do I run a search query?

FILER can be deployed two ways:

  1. Locally on a server or cluster

  2. Through the webserver

To learn more about how to use FILER, including its API feature, visit the about page, where there is a user tutorial as well as other support information.

VariXam

What is VariXam?

VariXam is an aggregated database of genomic variants detected via whole-genome/whole-exome sequencing in data from the Alzheimer’s Disease Sequencing Project (ADSP). VariXam also includes a variant browser to visualize these data.

The database can also be accessed through the command line. Visit the VariXam website to learn more.

What data is available?

Currently the database includes variants from the R1 5K WGS, R2 20K WES, R3 17K WGS, and R4 36K WGS. The human reference genome used is GRCh38.

What steps are involved in the data processing?

Variants were processed by the Variant Calling Pipeline and data management tool (VCPA). The VCPA consists of two independent but linkable components.

  1. The pipeline – details about the pipeline can be found in the link above.

  2. The tracking database

How would I apply this to my research?

Quickly search for variants of interest in the ADSP to see variants of significance for AD and related dementias without the need to apply for the full dataset or download and process VCFs.

How do I run a search query?

Type your gene, variant, position, or refSNP ID into the search bar and press Enter. Variant records from the ADSP dataset will be returned in a table below the search bar.

There is also an API which can be used to query the variables above. For more details about the API, visit the VariXam website.

NIAGADS (API)

What is the NIAGADS API?

The NIAGADS Application Programming Interface (API) is a public resource allowing programmatic access to the publicly available data stored in NIAGADS knowledgebases. An alpha version is now available, allowing basic queries against the ADVP, GenomicsDB, and FILER to retrieve data track metadata, track hits within a genomic span of interest, and genomic evidence for AD-risk association for genes and variants. It is a user-friendly framework for extracting information and integrating AD-relevant gene, variant, and functional genomic annotations into analysis pipelines.

The NIAGADS API is an intuitive, templated RESTful interface for programmatic querying that meets OpenAPI standards. Endpoints are simple URLs that specify the knowledgebase, feature type (gene, variant), feature identifier(s), and filter criteria (experiment or bio-sample properties).

How do I use the NIAGADS API?

Use of the API requires only an HTTPS library and a JSON parser, permitting large-scale programmatic access in most programming languages.

Check out the API at http://api.niagads.org/ to learn more.
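
The sketch below shows the general pattern of composing an endpoint URL and parsing a JSON response using only the Python standard library. The path layout (`/knowledgebase/feature_type/feature_id`), the segment and filter names, and the `https` scheme are hypothetical placeholders chosen to mirror the description above; the actual routes and parameters are defined in the API documentation.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host taken from the FAQ; the https scheme is an assumption.
BASE_URL = "https://api.niagads.org"


def build_endpoint(knowledgebase, feature_type, feature_id, filters=None):
    """Compose a URL from its parts. The path layout here is hypothetical."""
    parts = (knowledgebase, feature_type, feature_id)
    path = "/".join(quote(part, safe="") for part in parts)
    url = f"{BASE_URL}/{path}"
    if filters:
        url += "?" + urlencode(filters)
    return url


def fetch_json(url, timeout=30):
    """Issue a GET request and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Placeholder segment names; consult the API documentation for real routes.
url = build_endpoint("genomics", "gene", "APOE", {"assay": "ChIP-seq"})
print(url)
```

Calling `fetch_json(url)` would perform the request; it is left out of the example run because the URL above is only a placeholder.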
