D: 6 Dec 2024

Order of operations

  • If --zst-decompress present, decompress file to stdout and QUIT
  • Load additional commands from --script
  • Apply --rerun
  • If --help present, print requested help entries and QUIT
  • If --version present, print version and QUIT
  • Apply --silent
  • Apply --out, start logging
  • Define chromosome set (--chr-set, --cow...; human if unspecified)
  • Parse remaining command-line flags in lexicographic order
  • Note chromosome filter (--chr, --not-chr, --autosome, --autosome-par)
  • Handle nonstandard input:
    • --adjust-file, --gwas-ssf standalone jobs
    • If --pgen-info with no .pvar file provided, scan header, print basic info, and QUIT
    • Convert VCF (--vcf), BCF (--bcf), Oxford dosage (--bgen, --data/--gen), Oxford haplotype (--haps), or PLINK 1 dosage (--import-dosage) data to PLINK 2 binary, then QUIT if no other commands
    • Generate random dataset (--dummy)
  • Merge filesets (--pmerge, --pmerge-list; --set-all-var-ids/--set-missing-var-ids applied to all inputs at the start if necessary)
  • Read main sample-info file, if necessary:
    • Check for duplicate sample IDs
  • Read main variant-info file, if necessary:
    • Exclude variants with multi-character allele codes (--snps-only)
    • Apply #-of-distinct-alleles filters (--min-alleles, --max-alleles)
    • Apply QUAL/FILTER/INFO variant filters (--var-min-qual, --var-filter, --extract-if-info, --exclude-if-info, --require-info, --require-no-info)
    • Assign chromosome-and-position-based names to variants (--set-all-var-ids, --set-missing-var-ids)
    • Split or merge pseudoautosomal region (--split-par, --merge-par, --merge-x)
  • Read main genotype file's header (--[b]pfile, --bfile, or freshly autoconverted)
  • Transpose PLINK 1 sample-major .bed, if necessary
  • Validate genotype file (--validate), then QUIT if no other commands
  • Print basic information about .pgen (--pgen-info), then QUIT if no other commands
  • Apply "--make-founders first"
  • Load/create additional phenotypes (--pheno, --within, --family)
  • Ignore phenotypes (--not-pheno)
  • Check for duplicate allele codes, if necessary
  • Select single variant range by ID (--from, --to, --snp, --exclude-snp, --window, --from-bp...)
  • Select multiple variant ranges by ID (--snps, --exclude-snps)
  • Update variant information (--recover-var-ids, --update-map, --update-name)
  • Update allele information (--update-alleles)
  • Extract/exclude variants by ID list(s) or intervals (--extract, --exclude, --extract-intersect)
  • Extract variants based on text column string/substring match or range condition (--extract-col-cond; older name --extract-fcol)
  • Deduplicate variants (--rm-dup)
  • Filter variants by position (--from-bp, --to-bp, --extract bed0, ...)
  • Random thinning of variant set (--thin, --thin-count)
  • Update sample information (--update-ids, --update-parents, --update-sex)
  • Keep/remove samples by ID or ID list(s) (--keep, --remove, --keep-fam, --remove-fam, --indv)
  • Filter samples based on text column string match (--keep-col-match; older name --keep-fcol)
  • Filter samples based on phenotype existence (--require-pheno)
  • Filter based on sex and/or founder status (--keep-males...)
  • Random thinning of sample set (--thin-indiv, --thin-indiv-count)
  • Calculate per-sample genotyping rate, remove samples below threshold (--mind)
  • Set founder status for samples with missing parent(s) (--make-founders)
  • Load covariates (--covar)
  • Ignore covariates (--not-covar)
  • Filter samples based on covariate existence (--require-covar)
  • Filter samples based on phenotype/covariate conditions (--keep-if, --remove-if, --keep-cats, ...)
  • Report remaining sample/sex/founder counts
  • Split categorical phenotypes/covariates (--split-cat-pheno)
  • Quantile-normalize phenotypes/covariates (--quantile-normalize, --pheno-quantile-normalize, --covar-quantile-normalize)
  • Variance-standardize phenotypes/covariates (--variance-standardize, --covar-variance-standardize)
  • Generate low-rank approximation of input phenotype matrix (--pheno-svd)
  • Loop over categories (--loop-cats)
  • Write sample IDs (--write-samples), then QUIT (or advance to next --loop-cats category) if no other commands
  • Main variant filters:
    • Load allele frequencies (--read-freq)
    • Calculate needed allele/genotype frequencies
    • Report overall genotyping rate (--genotyping-rate)
    • Write allele/genotype frequencies to file (--freq, --geno-counts), then QUIT if no other commands
    • Generate missing data reports (--missing), then QUIT if no other commands
    • Remove variants below genotyping rate threshold (--geno)
    • Hardy-Weinberg equilibrium report and/or exact test (--hardy, --hwe), then QUIT if no other commands
    • Apply minor allele frequency and count filters (--maf, --max-maf, --mac, --max-mac)
    • Apply imputation-quality filter (--mach-r2-filter)
    • Enforce minimum spacing (--bp-space)
    • Report remaining variant count
  • Report sample variant-counts by type (--sample-counts)
  • Report sample-pair discordances (--sample-diff)
  • Calculate kinship matrix, if necessary
  • Perform kinship-based pruning of samples (--king-cutoff)
  • Write kinship matrix/table to disk (--make-king, --make-king-table)
  • Calculate variance-standardized relationship matrix, if necessary
  • Write relationship matrix to disk (--make-rel, --make-grm-list, --make-grm-bin)
  • Extract principal components (--pca)
  • Write .snplist file (--write-snplist)
  • Change REF/ALT alleles (--maj-ref, --ref-allele, --alt-allele, --alt1-allele, --ref-from-fa)
  • Left-normalize alleles (--normalize)
  • Write .cov file (--write-covar; also induced by --make-pgen, --export, and similar commands)
  • Write PLINK 1 or 2 binary fileset, first updating chromosome information if necessary (--make-[b]pgen, --make-bed, --make-just-pvar, --make-just-psam, --make-just-bim, --make-just-fam, --update-chr)
  • Export genotype data to other formats (--export)
  • Compare two filesets (--pgen-diff)
  • Perform LD-based pruning (--indep-pairwise)
  • Display LD statistics for a single pair of variants (--ld)
  • Write LD-statistic matrix/table to disk (--r[2]-[un]phased)
  • F inbreeding coefficient report (--het)
  • FST fixation index report (--fst)
  • Apply linear scoring system(s) to each sample (--score, --score-list)
  • Apply linear scoring system(s) to each variant (--variant-score)
  • Multi-covariate association test (--glm)
  • Reformat new association test results (--gwas-ssf)
  • If --loop-cats, select next category and jump back to "Loop over categories" step, if any categories left
  • Organize association reports into LD-based clumps (--clump)
  • Definitely QUIT
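
One practical consequence of the list above is that the position of a flag on the command line does not decide when it is applied: --mind always runs before --geno, so variant missingness is computed only over the samples that survived the sample filter. The sketch below is a toy model of that behavior, not PLINK 2's implementation. The data, the function names, and the simplified rule "drop anything whose missing-call rate exceeds the threshold" are all assumptions made for illustration.

```python
# Toy model of the fixed ordering described above. All names here
# (GENO, STEP_ORDER, apply_mind, ...) are hypothetical and exist only for
# this illustration; they are not part of PLINK 2.

# Genotypes as ALT-allele counts; None marks a missing call.
GENO = {
    "S1": {"V1": 0, "V2": 1, "V3": 2, "V4": 0},
    "S2": {"V1": 1, "V2": None, "V3": 0, "V4": 1},
    "S3": {"V1": 2, "V2": 0, "V3": 1, "V4": 1},
    "S4": {"V1": None, "V2": None, "V3": None, "V4": 0},
}


def missing_rate(calls):
    """Fraction of calls that are missing."""
    return sum(c is None for c in calls) / len(calls)


def apply_mind(geno, samples, variants, max_missing):
    """Sample filter: drop samples whose missing rate exceeds max_missing."""
    kept = [s for s in samples
            if missing_rate([geno[s][v] for v in variants]) <= max_missing]
    return kept, variants


def apply_geno(geno, samples, variants, max_missing):
    """Variant filter: drop variants whose missing rate exceeds max_missing."""
    kept = [v for v in variants
            if missing_rate([geno[s][v] for s in samples]) <= max_missing]
    return samples, kept


FILTERS = {"--mind": apply_mind, "--geno": apply_geno}

# Relative order taken from the list above: --mind precedes --geno.
STEP_ORDER = ["--mind", "--geno"]


def apply_steps(geno, steps):
    """Apply (flag, threshold) pairs strictly in the order given."""
    samples = list(geno)
    variants = list(next(iter(geno.values())))
    for flag, threshold in steps:
        samples, variants = FILTERS[flag](geno, samples, variants, threshold)
    return samples, variants


def run(geno, requested):
    """Apply the requested flags in STEP_ORDER, ignoring the order they were typed in."""
    steps = [(flag, requested[flag]) for flag in STEP_ORDER if flag in requested]
    return apply_steps(geno, steps)


# The order in which the flags are written makes no difference.
typed_geno_first = run(GENO, {"--geno": 0.2, "--mind": 0.5})
typed_mind_first = run(GENO, {"--mind": 0.5, "--geno": 0.2})

# Forcing the opposite order shows why a fixed order matters.
forced_geno_first = apply_steps(GENO, [("--geno", 0.2), ("--mind", 0.5)])

print(typed_geno_first)
print(forced_geno_first)
```

In this toy dataset, filtering samples first removes S4 and keeps three variants, while filtering variants first would discard three variants because of S4's missing calls and then keep every sample. To get a different order than the fixed one in a real analysis, the usual approach is to split the work into separate runs, writing an intermediate fileset between them.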
