BGENIE is a program for efficient GWAS of multiple continuous traits and for PHEWAS. It has many features designed and optimized for large-scale analysis:

  • BGENIE is built upon the BGEN library. It takes BGEN files as input and avoids repeated decompression and conversion of these files when analyzing multiple continuous phenotypes.
  • It was written for the analysis of the UK Biobank dataset (which is stored in the BGEN v1.2 file format). This dataset consists of genetic data on ~500,000 individuals, ~93 million autosomal variants and thousands of phenotypes.
  • It works with indexed BGEN files, which give fast access to any SNP or group of SNPs. This feature facilitates very fast PHEWAS.
  • BGENIE uses the Eigen matrix library and OpenMP to carry out as many of the linear algebra operations in parallel as possible. For example, estimation of effect sizes of large numbers of SNPs can be carried out in parallel using matrix operations, and indexing of missing data values is used to allow for fast estimation of standard errors.
  • It has built-in functionality to apply PCA or ICA (using the fastICA algorithm) to multiple phenotypes and to use the resulting transformed phenotypes for testing via GWAS.


If you use BGENIE in your research, please cite the following publication:

Bycroft et al. (2017) Genome-wide genetic data on ~500,000 UK Biobank participants.

Studies using BGENIE:

Several studies have used BGENIE to carry out genome-wide association studies:

Elliott et al. (2017) The genetic basis of human brain structure and function: 1,262 genome-wide associations found from 3,144 GWAS of multimodal brain imaging phenotypes from 9,707 UK Biobank participants.

Luciano et al. (2017) 16 independent genetic variants influence the neuroticism personality trait in over 329,000 UK Biobank individuals.

Davies et al. (2017) Ninety-nine independent genetic loci influencing general cognitive function include genes associated with brain health and structure (N = 280,360).


28 July (v1.2) : Added features --include_rsids, --scale_phenotypes, --scale_genotypes, --dosage flag, --dump_phenotypes

10 July (v1.1) : Improvements to performance when using threading

14 June (v1.0) : First release


BGENIE performs a linear association test between SNP/phenotype pairs in the provided data. A basic command to run GWAS on all the phenotypes is:

bgenie --bgen example.bgen --pheno example.pheno --out example.out

To restrict the analysis to a range of SNPs by position (useful for splitting the genome across multiple jobs), use the --range option, for example:

bgenie --bgen example.bgen --pheno example.pheno --out example.out --range 22 20000000 21000000
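
As a rough sketch of how the --range option could be used to split a region into separate jobs, the shell function below prints one bgenie command per fixed-width window. It only prints the commands, so they can be inspected or handed to a job scheduler. The function name, the window size, the region bounds and the output file naming are all illustrative assumptions, not part of BGENIE. The sketch also assumes that both ends of a range are inclusive, so each window starts one base after the previous one ends; check the argument documentation before relying on this.

```shell
#!/bin/sh
# Print one bgenie command per window of a chromosome region.
# Arguments: chromosome, region start, region end, window size (all hypothetical values).
# Assumption: --range treats both bounds as inclusive, so windows do not share a position.
bgenie_range_jobs() {
    chrom=$1
    start=$2
    end=$3
    step=$4
    pos=$start
    while [ "$pos" -le "$end" ]; do
        stop=$((pos + step - 1))
        # Clip the last window to the end of the region
        if [ "$stop" -gt "$end" ]; then
            stop=$end
        fi
        # Output file name is an illustrative convention, one file per window
        echo "bgenie --bgen example.bgen --pheno example.pheno --out example.chr${chrom}.${pos}-${stop}.out --range $chrom $pos $stop"
        pos=$((stop + 1))
    done
}

# Example: three 1 Mb windows on chromosome 22
bgenie_range_jobs 22 20000000 22999999 1000000
```

Each printed line is an independent command, so the windows can be run in any order or in parallel.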

To analyse a single SNP, select it with the --rsid option, for example:

bgenie --bgen example.bgen --pheno example.pheno --out example.out --rsid rs573069994
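
For a PHEWAS over a handful of SNPs, one possible approach is to repeat the single-SNP command for each rsID in a list. The sketch below reads rsIDs from standard input, one per line, and prints the matching bgenie command for each. The function name and the per-SNP output file naming are illustrative assumptions. The sketch relies only on the single-ID form of --rsid shown above; whether several IDs can be passed in one call is not covered here.

```shell
#!/bin/sh
# Print one bgenie command per rsID read from standard input (one ID per line).
# The output file naming (example.<rsid>.out) is an illustrative convention.
bgenie_rsid_jobs() {
    while IFS= read -r rsid; do
        # Skip blank lines in the input list
        [ -n "$rsid" ] || continue
        echo "bgenie --bgen example.bgen --pheno example.pheno --out example.${rsid}.out --rsid $rsid"
    done
}

# Example: a one-entry list using the rsID from the command above
printf '%s\n' rs573069994 | bgenie_rsid_jobs
```

In practice the list would usually come from a file, for example by redirecting it into the function with `bgenie_rsid_jobs < snps.txt`, where `snps.txt` is a hypothetical file name.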

A full list of arguments and details of the file formats are available here.

Software registration and license:

BGENIE is freely available for academic use only. For the rules on non-academic use, see the LICENCE file (also included with each software download). Please register for access to the software here.


Please join the OXSTATGEN mailing list and post any questions there.

BGENIE was written by Lloyd T. Elliott and Jonathan Marchini. We are grateful to Dr Gavin Band for support and advice on using the BGEN library.