
hail - Scalable genomic data analysis.

  •    Scala

Hail is an open-source, scalable framework for exploring and analyzing genomic data. The Hail project began in Fall 2015 to empower the worldwide genetics community to harness the flood of genomes to discover the biology of human disease. Since then, Hail has expanded to enable analysis of large-scale datasets beyond the field of genomics.

locuszoom - A JavaScript/D3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources

  •    JavaScript

LocusZoom is a JavaScript/D3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources. See github.com/statgen/locuszoom/wiki for full documentation and API reference.

lme4qtl - Linear mixed models (@lme4) + custom covariances + restrictions on model parameters

  •    R

lme4qtl extends the lme4 R package for quantitative trait locus (QTL) mapping. Its focus is the covariance structure of random effects: lme4qtl supports user-defined covariance matrices, e.g. kinship or identity-by-descent (IBD) matrices. See the slides at bit.ly/1UiTZvQ introducing the lme4qtl R package, or read our article / preprint.

rvtests - Rare variant test software for next generation sequencing data

  •    C++

Rvtests, which stands for Rare Variant tests, is a flexible software package for genetic association analysis of sequence datasets. Since its inception, rvtests has been developed as a comprehensive tool to support genetic association analysis and meta-analysis. It can analyze both unrelated and related (family-based) individuals for both quantitative and binary outcomes. It includes a variety of association tests (e.g. single variant score test, burden test, variable threshold test, SKAT test, fast linear mixed model score test). It takes genotype input in VCF, BGEN, or PLINK format, and phenotype and covariate files in PLINK format. With a new implementation of the BOLT-LMM/MINQUE algorithm and a series of software engineering optimizations, our software package can analyze datasets of up to 1,000,000 individuals in linear mixed models on a computer workstation, which makes it one of the very few options for analyzing large biobank-scale datasets such as UK Biobank. Rvtests supports both single variant and gene-level tests. It also allows highly efficient generation of covariance matrices between score statistics in RAREMETAL format, which can be used to support the next wave of meta-analyses that incorporate large biobank datasets.

pyseer - SEER, reimplemented in python 🐍🔮

  •    Python

K-mer-based GWAS analysis is particularly well suited to bacterial samples, given their high genetic variability. This approach has been implemented by Lees, Vehkala et al. in the SEER software. The reimplementation presented here should be consistent with the current version of the C++ seer (though we do not guarantee this for all possible cases).

snpsea - Identify cell types and pathways affected by genetic risk loci.

  •    C++

SNPsea is an algorithm to identify cell types and pathways likely to be affected by risk loci. It requires a list of SNP identifiers and a matrix of genes and conditions.

qqman - An R package for creating Q-Q and manhattan plots from GWAS results

  •    R

Turner, S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. bioRxiv DOI: 10.1101/005165.

adjclust - Adjacency-constrained hierarchical clustering of a similarity matrix

  •    R

adjclust is a package that provides methods to perform adjacency-constrained hierarchical agglomerative clustering. This is hierarchical agglomerative clustering (HAC) in which each observation is associated with a position, and the clustering is constrained so that only adjacent clusters are merged. It is useful in bioinformatics (e.g. Genome Wide Association Studies or Hi-C data analysis). adjclust provides three user-level functions: adjClust, snpClust and hicClust.
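
As a rough illustration of the adjacency constraint described for adjclust, the sketch below merges only neighbouring clusters of an ordered similarity matrix. It is not adjclust's algorithm or API: the function name `adjacency_constrained_hac` is hypothetical, and average between-cluster similarity is an assumed merge criterion chosen for brevity.

```python
def adjacency_constrained_hac(sim):
    """Naive adjacency-constrained agglomerative clustering (illustrative only).

    sim: symmetric n x n similarity matrix (list of lists) whose rows and
         columns are ordered by position.
    Returns the merges in order, each as a pair of (start, end) index ranges.
    """
    n = len(sim)
    # Every cluster is a contiguous run of positions, stored as (start, end) inclusive.
    clusters = [(i, i) for i in range(n)]
    merges = []

    def avg_similarity(a, b):
        # Assumed criterion: mean similarity over all cross-cluster pairs.
        total = sum(
            sim[i][j]
            for i in range(a[0], a[1] + 1)
            for j in range(b[0], b[1] + 1)
        )
        return total / ((a[1] - a[0] + 1) * (b[1] - b[0] + 1))

    while len(clusters) > 1:
        # Only neighbouring clusters are candidates: this is the adjacency constraint.
        scores = [avg_similarity(a, b) for a, b in zip(clusters, clusters[1:])]
        k = max(range(len(scores)), key=scores.__getitem__)
        left, right = clusters[k], clusters[k + 1]
        merges.append((left, right))
        # Replace the two neighbours with their union, which is still contiguous.
        clusters[k:k + 2] = [(left[0], right[1])]
    return merges


# Positions 0 and 2 are the most similar pair, but they are not adjacent,
# so the first merge joins positions 1 and 2 instead.
example = [
    [1.0, 0.2, 0.95],
    [0.2, 1.0, 0.3],
    [0.95, 0.3, 1.0],
]
print(adjacency_constrained_hac(example))
```

Because each cluster is a contiguous range, there are at most n - 1 candidate merges per step rather than one for every pair of clusters, which is what makes the constraint attractive for data ordered along a genome.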