
deepvariant - DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data

  •    Python

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a suite of Python/C++ programs that run on any Unix-like operating system. For convenience the documentation refers to building and running DeepVariant on Google Cloud Platform, but the tools themselves can be built and run on any standard Linux computer, including on-premise machines. Note that DeepVariant currently requires Python 2.7 and does not yet work with Python 3.

khmer - In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more

  •    Python

The official source code repository is at https://github.com/dib-lab/khmer and project documentation is available online at http://khmer.readthedocs.io. See http://khmer.readthedocs.io/en/stable/introduction.html for an overview of the khmer project. khmer is research software, so you should cite us when you use it in scientific publications! Please see the CITATION file for citation information.

galaxy - Data intensive science for everyone.

  •    Python

You may wish to make changes from the default configuration. This can be done in the config/galaxy.ini file. Note that not all dependencies for the tools provided in the tool_conf.xml.sample are included. To install them please visit "Manage dependencies" in the admin interface.

gatk - Official code repository for GATK versions 4 and up

  •    Java

Please see the GATK website, where you can download a precompiled executable, read documentation, ask questions, and receive technical support. This repository contains the next generation of the Genome Analysis Toolkit (GATK). The contents of this repository are 100% open source and released under the BSD 3-Clause license (see LICENSE.TXT).



Implementation of the PCluster model for clustering DNA and other kinds of microarray data. It is written in C#.

.NET Bio

  •    DotNet

.NET Bio is a language-neutral bioinformatics toolkit built using the Microsoft .NET Framework 4.5 to help developers, researchers, and scientists.

nucleus - Python and C++ code for reading and writing genomics data.

  •    Python

Nucleus is a library of Python and C++ code designed to make it easy to read, write and analyze data in common genomics file formats like SAM and VCF. In addition, Nucleus enables painless integration with the TensorFlow machine learning framework, as anywhere a genomics file is consumed or produced, a TensorFlow tfrecords file may be substituted. For all other systems, you will need to first install CLIF by following the instructions at https://github.com/google/clif#installation before running install.sh.

bionode - Modular and universal bioinformatics

  •    Javascript

To use bionode as a command line tool, you can install it globally with -g. Or, if you want to use it as a JavaScript library, you need to install it in your local project folder inside the node_modules directory by doing the same command without -g.

dna - utility functions to handle DNA/RNA string data (JavaScript, Node.js)

  •    Javascript

Gets a complementary strand of str. If rev is true, reverse the sequence. (5' -> 3').
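The behavior described above can be sketched in a few lines. This is a minimal Python illustration, not the dna package's own JavaScript API: the function name complement, the rev parameter, and the decision to handle only DNA letters (plus N) are assumptions made for the example.

```python
# Hypothetical helper for illustration; not the dna package's actual API.
# Maps each DNA base to its Watson-Crick partner; N stays N.
_DNA_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(seq: str, rev: bool = False) -> str:
    """Return the complementary strand of seq.

    With rev=True the result is also reversed, so it reads 5' -> 3'.
    """
    comp = seq.translate(_DNA_COMPLEMENT)
    return comp[::-1] if rev else comp


forward = complement("ATGC")
reverse = complement("ATGC", rev=True)
```

Characters outside the translation table pass through unchanged, so RNA or degenerate codes would need an extended table.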

arv - A fast 23andMe DNA parser and inferrer for Python

  •    C++

Arv (Norwegian; "heritage" or "inheritance") is a Python module for parsing raw 23andMe genome files. It lets you look up SNPs from RSIDs. See below for software requirements.

dna-traits - A fast 23andMe genome text file parser, now superseded by arv

  •    Python

This project has been abandoned in favor of the newer and better https://github.com/cslarsen/arv, which is faster, installable via pip and works on both Python 2 and 3. On my machine, a 2010-era MBP with SSD, I can parse a 24 MB file using Python's csv module and create a dictionary in 2.5 seconds. Pandas takes around 2.1 seconds, and I've seen some parsers take up to 8 seconds.
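The csv-module baseline mentioned above might look roughly like the sketch below. It assumes the raw file is tab-separated with four columns (rsid, chromosome, position, genotype) and "#" comment lines. The function name parse_23andme and the sample rows are made up for illustration, and this is not the code that produced the timings quoted here.

```python
import csv
import io


def parse_23andme(handle):
    """Map each RSID to a (chromosome, position, genotype) tuple.

    Assumes tab-separated columns: rsid, chromosome, position, genotype.
    Lines starting with '#' are treated as comments and skipped.
    """
    data_lines = (line for line in handle if not line.startswith("#"))
    snps = {}
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        rsid, chrom, pos, genotype = row
        snps[rsid] = (chrom, int(pos), genotype)
    return snps


# Illustrative in-memory sample; a real file would be opened with open(path).
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs4477212\t1\t82154\tAA\n"
    "rs3094315\t1\t752566\tAG\n"
)
snps = parse_23andme(sample)
```

A plain dictionary keeps lookups by RSID constant-time, at the cost of holding every row in memory.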

dna-monitor - A simple device monitoring tool for e-cigarettes with Evolv DNA chipset. Works with macOS and Linux

  •    HTML

A simple device monitoring tool for e-cigarettes with Evolv DNA chipset. Works with macOS and Linux. This tool can't - and will never - replace the Escribe software, it's just the device-monitoring part. For configuration of preheat, profiles, wires, themes etc you still need Escribe.

NtSeq - JavaScript (node + browser) bioinformatics library for nucleotide sequence manipulation and analysis

  •    Javascript

NtSeq is an open source Bioinformatics library written in JavaScript that provides DNA sequence manipulation and analysis tools for node and the browser. More specifically, it's a library for dealing with all kinds of nucleotide sequences, including degenerate nucleotides. It's built with the developer (and scientist) in mind with simple, readable methods that are part of the standard molecular biologist's vocabulary.

catch - A package for designing compact and comprehensive probe sets.

  •    Python

CATCH is a Python package for designing probe sets to use in hybrid capture experiments. Installing CATCH with pip, as described below, will install NumPy and SciPy if they are not already installed.

kevlar - Reference-free variant discovery in large eukaryotic genomes

  •    Python

Welcome to kevlar, software for predicting de novo genetic variants without mapping reads to a reference genome! kevlar's k-mer abundance based method calls single nucleotide variants (SNVs) as well as short, medium and long insertion/deletion variants (indels) simultaneously. This software is free for use under the MIT license. If you have questions or need help with kevlar, the GitHub issue tracker should be your first point of contact.

Bio.jl - Bioinformatics and Computational Biology Infrastructure for Julia

  •    Julia

BioJulia is a bioinformatics and computational biology infrastructure project, built with and for the Julia language for technical computing. This package, Bio, is the flagship package of the project. Bio is better described as a meta-package: it consolidates many other smaller packages in the BioJulia package ecosystem and makes them easier to install and use together, with less worry about version compatibility and dependencies.

shasta - Experimental software for de novo assembly from Nanopore sequencing data.

  •    C++

In addition to Oxford Nanopore reads, these methods might also apply to long reads from other technologies, such as the Pacific Biosciences DNA sequencing platforms. This project is at an early stage. Its main output is currently in the form of a marker graph, not assembled sequence. It does include functionality to extract and display a local portion of the global marker graph, and to use it interactively for local assembly.

PHAT - Pathogen-Host Analysis Tool - A modern Next-Generation Sequencing (NGS) analysis platform

  •    TypeScript

The Pathogen-Host Analysis Tool (PHAT) is an application for processing and analyzing next-generation sequencing (NGS) data as it relates to relationships between pathogen and host organisms. PHAT provides quality control (QC) reporting on sequence files, alignment of sequence files against reference files, single-nucleotide polymorphism (SNP) prediction, linear and circular alignment viewing, and Excel and comma-separated values (CSV) output. PHAT is under development in the Zehbe Lab (http://zehbelab.weebly.com/) at the Thunder Bay Regional Health Research Institute (TBRHRI) and Lakehead University (LU) under the supervision of Dr. Ingeborg Zehbe. This work is supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grant to Dr. Ingeborg Zehbe (#RGPIN-2015-03855) and a NSERC Alexander Graham Bell Canada Graduate Scholarship-Doctoral (CGS-D) to Robert Jackson (#454402-2014).

IAP - Illumina analysis pipeline

  •    Perl

Illumina analysis pipeline. IAP is configured using ini files and, at the run/analysis level, using a config file. The idea is to have one ini file per run/analysis type (e.g. exome sequencing). Every setting can be reconfigured in the run/analysis config file. All ini files are located in the settings subfolder. The run/analysis config is created using the illumina_createConfig script and is stored in the output directory.

pyfaidx - Efficient pythonic random access to fasta subsequences

  •    Python

Samtools provides a function "faidx" (FAsta InDeX), which creates a small flat index file ".fai" allowing for fast random access to any subsequence in the indexed FASTA file, while loading a minimal amount of the file into memory. This Python module implements pure Python classes for indexing, retrieval, and in-place modification of FASTA files using a samtools-compatible index. The pyfaidx module is API compatible with the pygr seqdb module. A command-line script "faidx" is installed alongside the pyfaidx module, and facilitates complex manipulation of FASTA files without any programming knowledge. Shirley MD, Ma Z, Pedersen B, Wheelan S. Efficient "pythonic" access to FASTA files using pyfaidx. PeerJ PrePrints 3:e1196. 2015.
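To make the indexing idea concrete, here is a minimal standard-library sketch of how a faidx-style index allows random access. It is not pyfaidx's implementation or API: build_fai and fetch are hypothetical names, the index is kept in a dictionary instead of a ".fai" file, and the input is assumed to be well-formed FASTA whose sequence lines within a record all share one width (except the last).

```python
import io


def build_fai(fasta):
    """Scan a binary FASTA stream once and return
    {name: (length, offset, linebases, linewidth)}, the fields a .fai row holds."""
    index = {}
    name = None
    length = offset = linebases = linewidth = 0
    pos = 0  # byte position of the current line start
    for raw in fasta:
        if raw.startswith(b">"):
            if name is not None:
                index[name] = (length, offset, linebases, linewidth)
            name = raw[1:].split()[0].decode()
            length = linebases = linewidth = 0
            offset = pos + len(raw)  # first sequence byte of this record
        else:
            bases = len(raw.rstrip(b"\r\n"))
            if linebases == 0:
                linebases, linewidth = bases, len(raw)
            length += bases
        pos += len(raw)
    if name is not None:
        index[name] = (length, offset, linebases, linewidth)
    return index


def fetch(fasta, index, name, start, end):
    """Return bases [start, end) (0-based) by seeking, without reading the whole file."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Convert base coordinates to byte positions, accounting for line endings.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    fasta.seek(first)
    chunk = fasta.read(last - first)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Illustrative in-memory FASTA; a real file would be opened in binary mode.
fasta = io.BytesIO(b">seq1 desc\nACGT\nACGT\nAC\n>seq2\nGGGG\nTT\n")
index = build_fai(fasta)
region = fetch(fasta, index, "seq1", 2, 9)
```

Because every full line in a record has the same width, the byte position of any base follows from simple arithmetic, which is why only the requested region needs to be read.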
