GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel
Summary
The GEM-J WGA panel is a variant frequency dataset for the Japanese general population, obtained by joint variant calling of whole genome sequence (WGS) data collected from 7,609 individuals across Japan. The WGS data are also available under controlled access. These data are the result of joint research by Tohoku Medical Megabank Organization (ToMMo), Iwate Tohoku Medical Megabank Organization, RIKEN, and the Institute of Medical Science of the University of Tokyo, as part of the GEnome Medical alliance Japan (GEM Japan) project promoted by the Japan Agency for Medical Research and Development (AMED).
Note that the GRCh37-based data was lifted over to GRCh38 by using transanno.
- GRCh37 version/last update: 2020/07/27
- Sample size: 7,609
- Number of alternative alleles after the liftover from GRCh37 to GRCh38: 94,961,154
Terms of use
The GEM-J Whole Genome Aggregation (WGA) panel by GEnome Medical alliance Japan (GEM-J) is licensed under a Creative Commons Attribution 4.0 International License. As an additional term, identifying or contacting research participants is prohibited. No warranty is given and no liability is assumed for the data, in accordance with Article 5 (No Warranty and Limitation of Liability) of the Creative Commons Attribution 4.0 International License.
How to credit in your works
"GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel" by GEnome Medical alliance Japan Project (GEM-J) is licensed under CC BY 4.0. See also additional terms of use.
How to cite in your publications
"GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel". Japan: GEnome Medical alliance Japan Project (GEM-J). Available from: https://grch38.togovar.org/doc/datasets/gem_j_wga.
Download VCF files created by the data provider [Unrestricted access]
Click here to download the VCF files. The files have been converted from GRCh37 to GRCh38 using CrossMap (last updated on 2021-11-29).
WGS datasets used for joint variant calling [Controlled access]
To use these datasets, apply for data use to the NBDC Human Database for datasets whose IDs begin with "JGAD", and to the AMED group sharing database for datasets whose IDs begin with "AGDS".
Dataset ID (NBDC research ID) | Study title | Participants | Sample size | Data provider |
---|---|---|---|---|
Total | | | 7,609 | |
JGAD000220 (hum0014) | The Tailor-made Medical Treatment Program (BioBank Japan: BBJ) | The cohort participants registered in the BBJ from 2003 to 2007 | 768 | BioBank Japan |
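AGDS_00000000005 (agd0008) | Operation and management of BioBank Japan and exploration of disease biomarkers toward the realization of personalized medicine (English page is under construction) | Myocardial infarction, gastric cancer (non-tumor tissue), dementia | 2,089 | BioBank Japan |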
JGAD000117 (hum0103) | To investigate genomic alterations of Japanese biliary tract cancers | Biliary tract cancer (non-tumor tissue) | 17 | RIKEN Center for Integrative Medical Sciences |
JGAD000228 (hum0158) | To investigate genomic alterations of Japanese liver cancers | Liver cancer (non-tumor tissue) | 220 | RIKEN Center for Integrative Medical Sciences |
JGAD000233 (hum0160) | To investigate genomic alterations of Japanese esophageal squamous cell carcinomas | Esophageal squamous cell carcinoma (non-tumor tissue) | 20 | RIKEN Center for Integrative Medical Sciences |
JGAD000338 JGAD000339 (hum0184) | Construction of Japanese Whole-Genome database | General residents | 4,495 | Tohoku Medical Megabank Organization |
Note: The datasets above provide data in FASTQ/BAM format. The result data will be made available in our database soon. The sample size listed for each dataset is the number of samples remaining after quality control in this study.
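As a rough illustration of the application rule described above, the following Python sketch routes each dataset ID to the database that handles its data-use application and checks that the per-dataset sample sizes add up to the panel total of 7,609. The IDs and sample sizes are copied from the table. The function names `application_target` and `group_by_target` are hypothetical helpers introduced only for this example and are not part of any NBDC or AMED tool.

```python
# Dataset IDs and post-QC sample sizes, copied from the table above.
# JGAD000338 and JGAD000339 share one row (4,495 samples in total).
DATASETS = [
    (["JGAD000220"], 768),
    (["AGDS_00000000005"], 2089),
    (["JGAD000117"], 17),
    (["JGAD000228"], 220),
    (["JGAD000233"], 20),
    (["JGAD000338", "JGAD000339"], 4495),
]


def application_target(dataset_id: str) -> str:
    """Return where to apply for data use, based on the ID prefix."""
    if dataset_id.startswith("JGAD"):
        return "NBDC Human Database"
    if dataset_id.startswith("AGDS"):
        return "AMED group sharing database"
    raise ValueError(f"Unrecognized dataset ID prefix: {dataset_id}")


def group_by_target(datasets):
    """Group all dataset IDs by the database that handles applications."""
    targets = {}
    for ids, _sample_size in datasets:
        for dataset_id in ids:
            targets.setdefault(application_target(dataset_id), []).append(dataset_id)
    return targets


# The per-dataset sample sizes should add up to the panel total.
total = sum(sample_size for _ids, sample_size in DATASETS)

if __name__ == "__main__":
    for target, ids in group_by_target(DATASETS).items():
        print(f"{target}: {', '.join(ids)}")
    print(f"Total sample size: {total:,}")
```

Grouping the IDs this way shows at a glance that only one dataset in this panel (AGDS_00000000005) requires an application to the AMED group sharing database; the rest go through the NBDC Human Database.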