D: 7 Jul 2025

Order of operations

If --zst-decompress present, decompress file to stdout and QUIT
Load additional commands from --script
Apply --rerun
If --help present, print requested help entries and QUIT
If --version present, print version and QUIT
Apply --silent
Apply --out, start logging
Define chromosome set (--chr-set, --cow...; human if unspecified)
Parse remaining command line flags in lexicographic order
Note chromosome filter (--chr, --not-chr, --autosome, --autosome-par)
Handle nonstandard input:
  --adjust-file, --gwas-ssf standalone jobs
  If --pgen-info with no .pvar file provided, scan header, print basic info, and QUIT
  Convert VCF (--vcf), BCF (--bcf), Oxford dosage (--bgen, --data/--gen), Oxford haplotype (--haps), EIGENSOFT (--eigfile/--eiggeno), PLINK 1 text (--pedmap/--ped, --tfile/--tped), or PLINK 1 dosage (--import-dosage) data to PLINK 2 binary, then QUIT if no other commands
  Generate random dataset (--dummy)
Merge filesets (--pmerge, --pmerge-list; --set-all-var-ids/--set-missing-var-ids applied to all inputs at the start if necessary)
Read main sample-info file, if necessary:
  Check for duplicate sample IDs
Read main variant-info file, if necessary:
  Exclude variants with multi-character allele codes (--snps-only)
  Apply #-of-distinct-alleles filters (--min-alleles, --max-alleles)
  Apply QUAL/FILTER/INFO variant filters (--var-min-qual, --var-filter, --extract-if-info, --exclude-if-info, --require-info, --require-no-info)
  Assign chromosome-and-position-based names to variants (--set-all-var-ids, --set-missing-var-ids)
  Split or merge pseudoautosomal region (--split-par, --merge-par, --merge-x)
Read main genotype file's header (--[b]pfile, --bfile, or freshly autoconverted)
Transpose PLINK 1 sample-major .bed, if necessary
Validate genotype file (--validate), then QUIT if no other commands
Print basic information about .pgen (--pgen-info), then QUIT if no other commands
Apply "--make-founders first"
Load/create additional phenotypes (--pheno, --within, --family)
Ignore phenotypes (--not-pheno)
Check for duplicate allele codes, if necessary
Select single variant range by ID (--from, --to, --snp, --exclude-snp, --window, --from-bp...)
Select multiple variant ranges by ID (--snps, --exclude-snps)
Update variant information (--recover-var-ids, --update-map, --update-name)
Update allele information (--update-alleles)
Extract/exclude variants by ID list(s) or intervals (--extract, --exclude, --extract-intersect)
Extract variants based on text column string/substring match or range condition (--extract-col-cond, formerly --extract-fcol)
Deduplicate variants (--rm-dup)
Filter variants by position (--from-bp, --to-bp, --extract bed0, ...)
Recode alleles to 1234/ACGT (--allele1234, --alleleACGT)
Random thinning of variant set (--thin, --thin-count)
Update sample information (--update-ids, --update-parents, --update-sex)
Keep/remove samples by ID or ID list(s) (--keep, --remove, --keep-fam, --remove-fam, --indv)
Filter samples based on text column string match (--keep-col-match, formerly --keep-fcol)
Filter samples based on phenotype existence (--require-pheno)
Filter based on sex and/or founder status (--keep-males...)
Random thinning of sample set (--thin-indiv, --thin-indiv-count)
Calculate per-sample genotyping rate, remove samples below threshold (--mind)
Select one sample out of each same-FID-and-IID group (--select-sid-representatives)
Set founder status for samples with missing parent(s) (--make-founders)
Load covariates (--covar)
Ignore covariates (--not-covar)
Filter samples based on covariate existence (--require-covar)
Filter samples based on phenotype/covariate conditions (--keep-if, --remove-if, --keep-cats, ...)
Report remaining sample/sex/founder counts
Split categorical phenotypes/covariates (--split-cat-pheno)
Quantile-normalize phenotypes/covariates (--quantile-normalize, --pheno-quantile-normalize, --covar-quantile-normalize)
Variance-standardize phenotypes/covariates (--variance-standardize, --covar-variance-standardize)
Generate low-rank approximation of input phenotype matrix (--pheno-svd)
Loop over categories (--loop-cats)
Write sample IDs (--write-samples), then QUIT (or advance to next --loop-cats category) if no other commands
Main variant filters:
  Load allele frequencies (--read-freq)
  Calculate needed allele/genotype frequencies
  Report overall genotyping rate (--genotyping-rate)
  Write allele/genotype frequencies to file (--freq, --geno-counts), then QUIT if no other commands
  Generate missing data reports (--missing), then QUIT if no other commands
  Remove variants below genotyping rate threshold (--geno)
  Hardy-Weinberg equilibrium report and/or exact test (--hardy, --hwe), then QUIT if no other commands
  Apply minor allele frequency and count filters (--maf, --max-maf, --mac, --max-mac)
  Apply imputation-quality filter (--mach-r2-filter)
  Enforce minimum spacing (--bp-space)
  Report remaining variant count
Scan for and/or filter on Mendel errors (--mendel, --me)
Report sample variant-counts by type (--sample-counts)
Report sample-pair discordances (--sample-diff)
Calculate kinship matrix, if necessary
Perform kinship-based pruning of samples (--king-cutoff)
Write kinship matrix/table to disk (--make-king, --make-king-table)
Calculate variance-standardized relationship matrix, if necessary
Write relationship matrix to disk (--make-rel, --make-grm-list, --make-grm-bin)
Extract principal components (--pca)
Write .snplist file (--write-snplist)
Change REF/ALT alleles (--maj-ref, --ref-allele, --alt-allele, --alt1-allele, --ref-from-fa)
Left-normalize alleles (--normalize)
Write .cov file (--write-covar; also induced by --make-pgen, --export, and similar commands)
Write PLINK 1 or 2 binary fileset, first updating chromosome information if necessary (--make-[b]pgen, --make-bed, --make-just-pvar, --make-just-psam, --make-just-bim, --make-just-fam, --update-chr)
Export genotype data to other formats (--export)
Compare two filesets (--pgen-diff)
Perform LD-based pruning (--indep-pairwise)
Display LD statistics for a single pair of variants (--ld)
Write LD-statistic matrix/table to disk (--r[2]-[un]phased)
F inbreeding coefficient report (--het)
F_ST fixation index report (--fst)
Apply linear scoring system(s) to each sample (--score, --score-list)
Apply linear scoring system(s) to each variant (--variant-score)
Multi-covariate association test (--glm)
Reformat new association test results (--gwas-ssf)
If --loop-cats, select next category and jump back to "Loop over categories" step, if any categories left
Organize association reports into LD-based clumps (--clump)
Definitely QUIT
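
Because the steps above run in a fixed sequence, the position of a flag on the command line does not change when it is applied. The following Python sketch illustrates that idea for a small, hand-picked subset of flags whose relative positions are copied from the list above. The names DOCUMENTED_ORDER, application_order, and split_into_passes are hypothetical helpers invented for this illustration; they are not part of PLINK, and the table is deliberately incomplete.

```python
# Illustrative subset of flags, in the relative order given by the list above.
# This table is an assumption-laden excerpt, not a complete or official mapping.
DOCUMENTED_ORDER = [
    "--snps-only",
    "--extract",
    "--rm-dup",
    "--thin",
    "--keep",
    "--mind",
    "--geno",
    "--hwe",
    "--maf",
    "--king-cutoff",
    "--pca",
    "--indep-pairwise",
    "--glm",
]
RANK = {flag: i for i, flag in enumerate(DOCUMENTED_ORDER)}


def _check_known(flags):
    """Reject flags that are outside the illustrative subset."""
    unknown = [f for f in flags if f not in RANK]
    if unknown:
        raise ValueError(f"not in the illustrative subset: {unknown}")


def application_order(flags):
    """Return the flags in the order a single run would apply them."""
    _check_known(flags)
    return sorted(flags, key=RANK.__getitem__)


def split_into_passes(desired):
    """Group flags into consecutive runs so that, within each run,
    the fixed order agrees with the caller's desired order."""
    _check_known(desired)
    passes = []
    for flag in desired:
        # Extend the current run only while the fixed order keeps increasing.
        if passes and RANK[passes[-1][-1]] < RANK[flag]:
            passes[-1].append(flag)
        else:
            passes.append([flag])
    return passes


if __name__ == "__main__":
    # Command-line order is irrelevant within one run.
    print(application_order(["--maf", "--mind", "--geno"]))
    # Applying --geno before --mind needs two runs under this model.
    print(split_into_passes(["--geno", "--mind", "--maf"]))
```

Under this model, a request to apply --geno before --mind splits into two runs, [--geno] followed by [--mind, --maf]. In practice each run would also need an output command such as --make-pgen so that the next run can read the filtered fileset.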
