S: 22 Oct 2024 (b.7.7)

D: 22 Oct 2024

Order of operations

The order below is designed to match PLINK 1.07's order of operations (mostly described in the PLINK 1.07 documentation) whenever it's relevant. PLINK 1.9 occasionally deviates from this literal order, but only when the difference does not affect the outcome of any computation.

  • Load additional commands from --script
  • Apply --rerun
  • If --help present, print requested help entries and QUIT
  • If --version present, print version and QUIT
  • Apply --silent
  • Apply --out, start logging
  • Define chromosome set (--chr-set, --cow...; human if unspecified)
  • Parse remaining command line flags in lexicographic order
  • Note chromosome filter (--chr, --not-chr, --autosome, --autosome-xy)
  • Handle nonstandard input:
    • --annotate, --epistasis-summary-merge, --gene-report, --meta-analysis standalone jobs
    • On .cnv input, jump to the CNV branch below
    • With --dosage, jump to the dosage branch below
    • On --grm-gz or --grm-bin input, perform --rel-cutoff or --unrelated-heritability job, then QUIT
    • On PLINK (--file, --tfile, --lfile), VCF (--vcf, --bcf), Oxford (--data), or 23andMe (--23file) text input, perform automatic conversion to binary, then QUIT if no other commands
    • Generate random dataset (--dummy, --simulate, --simulate-qt)
  • Merge one or more filesets (--merge, --bmerge, --merge-list), then QUIT if no other reports requested
  • Read PLINK binary fileset (--bfile, or freshly autoconverted/merged)
  • Assign chromosome-and-position-based names to unnamed variants (--set-missing-var-ids)
  • Select single variant range by ID (--from, --to, --snp, --exclude-snp, --window, --from-bp...)
  • Exclude variants with multi-character allele codes (--snps-only)
  • Zero out centimorgan positions (--zero-cms)
  • Apply "--make-founders first"
  • Swap in alternate phenotype file (--pheno), or make a new phenotype (--make-pheno)
  • Convert scalar phenotype to case/control (--tail-pheno)
  • Select multiple variant ranges by ID (--snps, --exclude-snps)
  • Update variant information (--cm-map, --update-cm, --update-map, --update-name)
  • Update allele information (--update-alleles)
  • Flip strand (--flip)
  • Extract/exclude variants by ID list(s) (--extract, --exclude)
  • Filter variants by attributes (--attrib)
  • Filter variants by quality scores (--qual-scores)
  • Recode alleles to 1234/ACGT (--allele1234, --alleleACGT)
  • Random thinning of variant set (--thin, --thin-count)
  • If current PLINK binary fileset is pre-v0.99, write new binary fileset in v1.00 format (with no filtering)
  • Update sample information (--update-ids, --update-parents, --update-sex)
  • Keep/remove samples by ID list(s) (--keep, --remove, --keep-fam, --remove-fam)
  • Filter samples by attributes (--attrib-indiv)
  • Validate obligatory missing genotypes (--oblig-missing)
  • Filter samples on a covariate (--filter)
  • Set phenotypes of ambiguous-sex samples to missing, unless --allow-no-sex specified, or --make-bed/--recode/--write-covar run without other commands
  • Remove samples with missing phenotypes (--prune)
  • Filter based on sex, phenotype, and/or founder status (--filter-males...)
  • Random thinning of sample set (--thin-indiv, --thin-indiv-count)
  • Calculate per-sample genotyping rate, remove samples below threshold (--mind)
  • Define clusters (--within, --family)
  • Filter based on cluster membership (--keep-clusters, --keep-cluster-names, --remove-clusters, --remove-cluster-names)
  • Set founder status for samples with missing parent(s) (--make-founders)
  • Write .clst file (--write-cluster), then QUIT if no other reports requested
  • Load covariates (--covar)
  • Exclude constant covariates (--no-const-covar)
  • Report current founder/nonfounder counts
  • Main variant filters:
    • Calculate allele and heterozygote frequencies, or load them with --read-freq
    • Set minor alleles to A1, unless --keep-allele-order or "--freq counts" was specified
    • Load A1/A2 allele settings from file (--a1-allele, --a2-allele)
    • Write allele and heterozygote frequencies to file (--freq, --freqx), then QUIT if no other reports requested
    • Determine per-variant genotyping rates
    • Generate missing data reports (--missing), then QUIT if no other reports requested
    • Remove variants below genotyping rate threshold (--geno)
    • Hardy-Weinberg equilibrium report and/or exact test (--hardy, --hwe), then QUIT if no other reports requested
    • Apply minor allele frequency and count filters (--maf, --max-maf, --mac, --max-mac)
    • Enforce minimum spacing (--bp-space)
  • Scan for and/or filter on Mendel errors (--mendel, --me)
  • Define sets (--set, --make-set, --subset, --set-collapse-all)
  • Invert sets (--complement-sets)
  • Extract variants based on set membership (--gene, --gene-all)
  • Report variant and sample counts
  • Calculate relationship matrix and/or inbreeding coefficients, if necessary
  • Perform relationship-based pruning of samples (--rel-cutoff)
  • Write relationship matrix and GCTA inbreeding coefficients to disk (--ibc, --make-rel, --make-grm-gz, --make-grm-bin)
  • Extract principal components (--pca)
  • Regress genomic relationships/pairwise average phenotypes (--regress-rel)
  • REML additive heritability estimate (--unrelated-heritability)
  • Impute sexes from X chromosome genotype calls (--check-sex, --impute-sex)
  • Create pseudo case/control units from trio data (--tucc)
  • Write permuted phenotype table (--make-perm-pheno)
  • Assemble final pedigree, if necessary
  • Write .set.table file (--set-table)
  • Write .set file (--write-set)
  • Write .snplist file (--write-snplist)
  • Write .snp.ranges file (--write-snp-ranges)
  • Write .indel file (--list-23-indels)
  • Duplicate-position-and-allele report (--list-duplicate-vars)
  • Write .cov file (--write-covar; also induced by --make-bed and --recode)
  • Write binary fileset or subset thereof, first updating chromosome information and/or zeroing out genotype blocks if necessary (--make-bed, --make-just-bim, --make-just-fam, --merge-x, --split-x, --update-chr, --zero-cluster)
  • Write text fileset (--recode)
  • Two-locus genotype count report (--twolocus)
  • Generate list of tagging variants (--show-tags)
  • Estimate haplotype blocks (--blocks)
  • Generate runs-of-homozygosity reports (--homozyg...)
  • Perform LD-based pruning (--indep, --indep-pairwise, --indep-pairphase)
  • Perform LD-based scan for strand flips (--flip-scan)
  • Report LD statistics (--ld, --r, --r2)
  • Missing status vs. flanking haplotype association test (--test-mishap)
  • Calculate and write distance matrices (--distance, --ibs-matrix, --distance-matrix)
  • Load previously calculated triangular binary distance matrix (--read-dists)
  • Case/control distance analysis (--ibs-test, --groupdist)
  • Regress distances/pairwise average phenotypes (--regress-distance)
  • Calculate genome-wide IBS and IBD (--genome)
  • F inbreeding coefficient report (--het)
  • FST fixation index report (--fst)
  • Compute nearest neighbor-based outlier detection diagnostics (--neighbour)
  • Perform cluster and MDS analysis (--cluster, --mds-plot)
  • Perform epistasis tests (--fast-epistasis, --epistasis)
  • Apply linear scoring system to each sample (--score)
  • Run R plugin function on dataset (--R)
  • Perform association tests (looping over all phenotypes with --all-pheno, or all clusters with --loop-assoc)
    • Basic association test (--assoc, --model), followed by permutations if necessary
    • Multi-covariate association test (--linear, --logistic), followed by permutations if necessary
    • Quantitative phenotype + case/control covariate association test (--gxe)
    • LASSO regression (--lasso)
    • Stratified association tests (--mh, --mh2, --homog), followed by permutations if necessary
    • Missing status vs. case/control phenotype association test (--test-missing), followed by permutations if necessary
    • Transmission disequilibrium test (--tdt)
    • Sib-TDT-based association test (--dfam)
    • QFAM test (--qfam...)
    • Go to next phenotype/cluster, if necessary
  • Organize association reports into LD-based clumps (--clump)
  • Definitely QUIT
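
The position of a filter in the list above can change results when filters interact. As a toy illustration (not PLINK's implementation), the sketch below applies a per-sample missingness filter before a per-variant one, mirroring the placement of --mind ahead of --geno, and compares it with the reverse order. The genotype matrix, the thresholds, and all function names are made up for this example, and "remove if the missing-call fraction exceeds the threshold" is a simplified model of both filters.

```python
# Toy genotype matrix: rows are samples, columns are variants, None = missing call.
GENOTYPES = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [0, 2, 1, 1],
    [None, None, None, 1],  # one poorly genotyped sample
]


def filter_samples(geno, max_missing):
    """Drop samples (rows) whose missing-call fraction exceeds max_missing."""
    return [row for row in geno
            if sum(g is None for g in row) / len(row) <= max_missing]


def filter_variants(geno, max_missing):
    """Drop variants (columns) whose missing-call fraction exceeds max_missing."""
    if not geno:
        return []
    n_samples = len(geno)
    keep = [j for j in range(len(geno[0]))
            if sum(row[j] is None for row in geno) / n_samples <= max_missing]
    return [[row[j] for j in keep] for row in geno]


def shape(geno):
    """Return (number of samples, number of variants)."""
    return (len(geno), len(geno[0]) if geno else 0)


# Order used in the list above: sample filter first, then variant filter.
samples_first = filter_variants(filter_samples(GENOTYPES, 0.5), 0.2)
# Hypothetical reverse order, shown only for comparison.
variants_first = filter_samples(filter_variants(GENOTYPES, 0.2), 0.5)

print(shape(samples_first))   # (3, 4)
print(shape(variants_first))  # (4, 1)
```

In this toy dataset, removing the poorly genotyped sample first leaves every variant fully called, so all four variants survive; filtering variants first would discard three of them because of that one sample.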

CNV branch (under construction):

  • Automatic --cnv-make-map (ignoring chromosome filter), if necessary
  • Select position range (--from-bp...)
  • Swap in alternate phenotype file (--pheno), or make a new phenotype (--make-pheno)
  • Convert scalar phenotype to case/control (--tail-pheno)
  • Update position information (--update-cm, --update-map, --update-name)
  • Update sample information (--update-ids, --update-parents, --update-sex)
  • Keep/remove samples by ID list(s) (--keep, --remove, --keep-fam, --remove-fam)
  • Filter samples on a covariate (--filter)
  • Filter based on sex, phenotype, and/or founder status (--filter-males, etc.)
  • Explicit --cnv-make-map
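
As a rough aid for reading the CNV branch, the following sketch transcribes the explicitly named flags from the list above into a lookup table and reports the order in which a given set of flags would be handled. The table and the `application_order` helper are hypothetical constructs for this illustration only; flags hidden behind "..." or "etc." in the list are deliberately left out, as is the automatic --cnv-make-map step.

```python
# Step order transcribed from the CNV branch list; each tuple groups the
# flags that the list places in the same step. Elided flags are omitted.
CNV_STEPS = [
    ("--from-bp",),
    ("--pheno", "--make-pheno"),
    ("--tail-pheno",),
    ("--update-cm", "--update-map", "--update-name"),
    ("--update-ids", "--update-parents", "--update-sex"),
    ("--keep", "--remove", "--keep-fam", "--remove-fam"),
    ("--filter",),
    ("--filter-males",),
    ("--cnv-make-map",),
]

# Map each flag to the index of the step it belongs to.
STEP_RANK = {flag: i for i, group in enumerate(CNV_STEPS) for flag in group}


def application_order(flags):
    """Sort flags by their step in the CNV branch list (stable within a step)."""
    unknown = [f for f in flags if f not in STEP_RANK]
    if unknown:
        raise ValueError(f"not in this sketch's table: {unknown}")
    return sorted(flags, key=STEP_RANK.__getitem__)


print(application_order(["--filter", "--keep", "--pheno", "--from-bp"]))
# ['--from-bp', '--pheno', '--keep', '--filter']
```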

Dosage branch:

  • If .map specified, select variant range(s) by ID (--snps, --exclude-snps, --from, --to, --snp, --exclude-snp, --window, --from-bp...)
  • Apply "--make-founders first"
  • Swap in alternate phenotype file (--pheno), or make a new phenotype (--make-pheno)
  • Convert scalar phenotype to case/control (--tail-pheno)
  • If .map specified:
    • Update variant information (--update-map, --update-name)
    • Extract/exclude variants by ID list(s) (--extract, --exclude)
    • Filter variants by attributes (--attrib)
    • Filter variants by quality scores (--qual-scores)
    • Random thinning of variant set (--thin, --thin-count)
  • Update sample information (--update-ids, --update-parents, --update-sex)
  • Keep/remove samples by ID list(s) (--keep, --remove, --keep-fam, --remove-fam)
  • Filter samples by attributes (--attrib-indiv)
  • Filter samples on a covariate (--filter)
  • Set phenotypes of ambiguous-sex samples to missing, unless --allow-no-sex specified
  • Remove samples with missing phenotypes (--prune)
  • Filter based on sex, phenotype, and/or founder status (--filter-males...)
  • Define clusters (--within, --family)
  • Filter based on cluster membership (--keep-clusters, --keep-cluster-names, --remove-clusters, --remove-cluster-names)
  • Set founder status for samples with missing parent(s) (--make-founders)
  • Load covariates (--covar)
  • If .map specified, enforce minimum spacing (--bp-space)
  • Perform main --dosage operation
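
Several steps in the dosage branch only run when a .map file is specified. The sketch below encodes the list above as (needs_map, description) pairs and returns the steps that remain for each case. The `DOSAGE_STEPS` table and the `dosage_branch_steps` function are illustrative names, not part of PLINK, and the descriptions are abbreviated from the list.

```python
# (needs_map, description) pairs transcribed from the dosage branch list.
DOSAGE_STEPS = [
    (True, "select variant range(s) by ID"),
    (False, 'apply "--make-founders first"'),
    (False, "swap in alternate phenotype file or make a new phenotype"),
    (False, "convert scalar phenotype to case/control"),
    (True, "update variant information"),
    (True, "extract/exclude variants by ID list(s)"),
    (True, "filter variants by attributes"),
    (True, "filter variants by quality scores"),
    (True, "random thinning of variant set"),
    (False, "update sample information"),
    (False, "keep/remove samples by ID list(s)"),
    (False, "filter samples by attributes"),
    (False, "filter samples on a covariate"),
    (False, "set phenotypes of ambiguous-sex samples to missing"),
    (False, "remove samples with missing phenotypes"),
    (False, "filter based on sex, phenotype, and/or founder status"),
    (False, "define clusters"),
    (False, "filter based on cluster membership"),
    (False, "set founder status for samples with missing parent(s)"),
    (False, "load covariates"),
    (True, "enforce minimum spacing"),
    (False, "perform main --dosage operation"),
]


def dosage_branch_steps(has_map):
    """Return the step descriptions that apply, in list order."""
    return [desc for needs_map, desc in DOSAGE_STEPS if has_map or not needs_map]


for step in dosage_branch_steps(has_map=False):
    print(step)
```

Calling `dosage_branch_steps(has_map=True)` returns the full sequence, including the variant-level steps that are skipped when no .map file is given.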
