GEM-J WGA

Summary

The GEM-J WGA panel is a variant frequency dataset of a Japanese general population, obtained by joint variant calling of whole genome sequencing (WGS) data collected from 7,609 individuals across Japan. The WGS data are also available under controlled access. Both are the result of joint research by the Tohoku Medical Megabank Organization (ToMMo), the Iwate Tohoku Medical Megabank Organization, RIKEN, and the Institute of Medical Science of the University of Tokyo, as part of the GEnome Medical alliance Japan (GEM Japan) project promoted by the Agency for Medical Research and Development (AMED).
Note that the GRCh37-based data were lifted over to GRCh38 using transanno.

GRCh37 version/last update: 2020/07/27
Sample size: 7,609
Number of alternative alleles after the liftover from GRCh37 to GRCh38: 94,961,154

Terms of use

The GEM-J Whole Genome Aggregation (WGA) panel by GEnome Medical alliance Japan (GEM-J) is licensed under a Creative Commons Attribution 4.0 International License. As additional terms, identifying or contacting research participants is prohibited. No warranty is given and no liability is assumed for the data, in accordance with Article 5 (No Warranty and Limitation of Liability) of the Creative Commons Attribution 4.0 International License.

How to credit in your works

"GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel" by GEnome Medical alliance Japan Project (GEM-J) is licensed under CC BY 4.0. See also additional terms of use.

How to cite in your publications

"GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel". Japan: GEnome Medical alliance Japan Project (GEM-J). Available from: https://grch38.togovar.org/doc/datasets/gem_j_wga.

Download VCF files created by the data provider [Unrestricted access]

Click here to download the VCF files. The files have been converted from GRCh37 to GRCh38 using CrossMap (last updated on 2021-11-29).
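
As a rough illustration of working with the downloaded files, the sketch below counts alternative alleles in a VCF by reading the ALT column of each record. It assumes a standard tab-delimited VCF, optionally gzip-compressed, and counts each comma-separated ALT entry as one allele. The function names and the in-memory example records are hypothetical, and this is not necessarily how the allele count reported in the Summary was derived.

```python
import gzip


def count_alt_alleles(lines):
    """Count ALT alleles over an iterable of VCF text lines.

    Header lines (starting with '#') and blank lines are skipped.
    A multi-allelic record such as ALT = 'A,T' contributes two alleles.
    """
    total = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        alt = fields[4]  # fixed VCF columns: CHROM, POS, ID, REF, ALT, ...
        if alt == ".":
            continue  # no alternative allele recorded
        total += len(alt.split(","))
    return total


def count_alt_alleles_in_file(path):
    """Open a plain or gzip-compressed VCF (path is an assumption) and count ALT alleles."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_alt_alleles(handle)


# Made-up records for illustration only; they are not taken from the panel.
example_records = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tG\tA\t.\tPASS\t.\n",
    "chr1\t200\t.\tC\tG,T\t.\tPASS\t.\n",
    "chr1\t300\t.\tT\t.\t.\tPASS\t.\n",
]
example_count = count_alt_alleles(example_records)
```

Reading line by line keeps memory use flat, which matters for files holding tens of millions of records.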

WGS datasets used for joint variant calling [Controlled access]

To use these datasets, apply for data use to the NBDC Human database for datasets whose IDs begin with "JGAD", and to the AMED group sharing database for datasets whose IDs begin with "AGDS".

Dataset ID (NBDC research ID)	Study title	Participants	Sample size	Data provider
Total			7,609
JGAD000220 (hum0014)	The Tailor-made Medical Treatment Program (BioBank Japan: BBJ)	The cohort participants registered in the BBJ from 2003 to 2007	768	BioBank Japan
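AGDS_00000000005 (agd0008)	Operation and management of BioBank Japan and search for disease biomarkers toward the realization of personalized medicine (English page is under construction)	Myocardial infarction, gastric cancer (non-tumor tissue), dementia	2,089	BioBank Japan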
JGAD000117 (hum0103)	To investigate genomic alterations of Japanese biliary tract cancers	Biliary tract cancer (non-tumor tissue)	17	RIKEN Center for Integrative Medical Sciences
JGAD000228 (hum0158)	To investigate genomic alterations of Japanese liver cancers	Liver cancer (non-tumor tissue)	220	RIKEN Center for Integrative Medical Sciences
JGAD000233 (hum0160)	To investigate genomic alterations of Japanese esophageal squamous cell carcinomas	Esophageal squamous cell carcinoma (non-tumor tissue)	20	RIKEN Center for Integrative Medical Sciences
JGAD000338 JGAD000339 (hum0184)	Construction of Japanese Whole-Genome database	General residents	4,495	Tohoku Medical Megabank Organization

Note: The datasets above provide data in FASTQ/BAM format. The result data will be shown in our database soon. The sample size listed for each dataset is the number of samples remaining after quality control in this study.
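
The table and the application rule above can be restated as a small consistency check. The following is a minimal sketch: the dataset IDs and sample sizes are copied from the table, while the variable and function names are hypothetical. It confirms that the per-dataset sample sizes add up to the panel total and maps a dataset ID prefix to the database where the application should be filed.

```python
# Dataset IDs and post-QC sample sizes, as listed in the table above.
DATASETS = [
    ("JGAD000220", 768),
    ("AGDS_00000000005", 2089),
    ("JGAD000117", 17),
    ("JGAD000228", 220),
    ("JGAD000233", 20),
    ("JGAD000338 JGAD000339", 4495),
]

PANEL_SAMPLE_SIZE = 7609  # "Total" row of the table


def application_target(dataset_id):
    """Return the database to apply to, based on the dataset ID prefix."""
    if dataset_id.startswith("JGAD"):
        return "NBDC Human database"
    if dataset_id.startswith("AGDS"):
        return "AMED group sharing database"
    raise ValueError(f"Unrecognized dataset ID prefix: {dataset_id!r}")


# Sum of the per-dataset sample sizes, to compare against the panel total.
summed_sample_size = sum(size for _, size in DATASETS)
sizes_match_total = summed_sample_size == PANEL_SAMPLE_SIZE

# Where each dataset's data-use application would go.
targets = {dataset_id: application_target(dataset_id) for dataset_id, _ in DATASETS}
```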