DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data. DeepVariant is a suite of Python/C++ programs that run on any Unix-like operating system. For convenience the documentation refers to building and running DeepVariant on Google Cloud Platform, but the tools themselves can be built and run on any standard Linux computer, including on-premise machines. Note that DeepVariant currently requires Python 2.7 and does not yet work with Python 3.
tensorflow deep-neural-network genomics science dna sequencing genome bioinformatics deep-learning ngs deepvariant machine-learning
Built on top of Plotly.js, React, and Flask, Dash ties modern UI elements like dropdowns, sliders, and graphs directly to your analytical Python code. Here’s a 43-line example of a Dash App that ties a Dropdown to a D3.js Plotly Graph. As the user selects a value in the Dropdown, the application code dynamically exports data from Google Finance into a Pandas DataFrame.
dash plotly data-visualization data-science gui-framework flask react finance bioinformatics technical-computing charting plotly-dash web-app
Conda is a platform- and language-independent package manager that sports easy distribution, installation and version management of software. The bioconda channel is a Conda channel providing bioinformatics related packages for Linux and Mac OS. This repository hosts the corresponding recipes. Please visit https://bioconda.github.io for details.
bioinformatics conda package-management
The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology. The NEWS file summarises the changes in each release of Biopython.
bioinformatics genomics protein-structure dna protein biopython phylogenetics sequence-alignment
A list of software packages (and the people developing these methods) for single-cell data analysis, including RNA-seq, ATAC-seq, etc. Contributions welcome. Gender bias at conferences is a well-known problem (http://www.sciencemag.org/careers/2015/07/countering-gender-bias-conferences). Creating a list of potential speakers can help mitigate this bias, and a community of people developing and maintaining the list helps to diversify it further, beyond smaller networks.
rna-seq-data gene-expression scrna-seq-data bioinformatics awesome-list dimensionality-reduction cell-cycle atac-seq analysis cell-differentiation clustering
Note: minimap2 has replaced BWA-MEM for PacBio and Nanopore read alignment. It retains all major BWA-MEM features, but is ~50 times as fast, more versatile, more accurate and produces better base-level alignment. BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to a few megabases. BWA-MEM and BWA-SW share similar features such as the support of long reads and chimeric alignment, but BWA-MEM, which is the latest, is generally recommended as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.
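The selection guidance in the BWA description can be sketched as a small helper. This is only an illustration of the rules of thumb stated there, not part of BWA or minimap2; the function name `suggest_aligner` and the `platform` labels are hypothetical.

```python
from typing import Optional


def suggest_aligner(read_length_bp: int, platform: Optional[str] = None) -> str:
    """Suggest an aligner from the rules of thumb in the BWA description.

    `platform` is a hypothetical label ("pacbio", "nanopore", "illumina")
    introduced for this sketch only.
    """
    if read_length_bp <= 0:
        raise ValueError("read length must be positive")

    # minimap2 is described as the replacement for BWA-MEM on these platforms.
    if platform is not None and platform.lower() in {"pacbio", "nanopore"}:
        return "minimap2"

    # Below 70 bp, only BWA-backtrack is described as applicable.
    if read_length_bp < 70:
        return "BWA-backtrack"

    # From 70 bp upward BWA-MEM is the general recommendation, and it is
    # also said to outperform BWA-backtrack in the 70-100 bp overlap.
    return "BWA-MEM"


if __name__ == "__main__":
    for length, platform in [(50, "illumina"), (100, "illumina"), (12000, "nanopore")]:
        print(length, platform, suggest_aligner(length, platform))
```

The thresholds come directly from the read-length ranges quoted in the description; real projects should still consult the tool documentation before choosing an aligner.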
bioinformatics sequence-alignment
BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines and parsers for common file formats, and allows the manipulation of sequences and 3D structures. The goal of the BioJava project is to facilitate rapid application development for bioinformatics.
biological statistics bio-informatics bioinformatics analytics
Sequence processing includes tasks such as demultiplexing raw read data and trimming low-quality bases. The following tools can be used to visualize genomic data or to construct customized visualizations of genomic data, including sequence data from DNA-Seq, RNA-Seq, and ChIP-Seq, variants, and more.
bioinformatics
With the rise of big data, techniques to analyse and run experiments on large datasets are increasingly necessary. Parallelization and distributed computing are the best ways to tackle this kind of problem, but the tools commonly available to the bioinformatics community traditionally lack good support for these techniques, or provide a model that fits poorly with the specific requirements of the bioinformatics domain and, most of the time, require knowledge of complex tools or low-level APIs.
bioinformatics workflow-engine pipeline pipeline-framework nextflow cloud data-flow sge slurm aws docker singularity hpc singularity-containers reproducible-science reproducible-research
The official source code repository is at https://github.com/dib-lab/khmer and project documentation is available online at http://khmer.readthedocs.io. See http://khmer.readthedocs.io/en/stable/introduction.html for an overview of the khmer project. khmer is research software, so you should cite us when you use it in scientific publications! Please see the CITATION file for citation information.
dna k-mer bloom-filter count-min-sketch graph-traversal bioinformatics
You may wish to make changes from the default configuration. This can be done in the config/galaxy.ini file. Note that not all dependencies for the tools provided in the tool_conf.xml.sample are included. To install them, please visit "Manage dependencies" in the admin interface.
bioinformatics workflow genomics science sequencing ngs dna usegalaxy docker pipeline workflow-engine
The only library dependency is zlib.
bioinformatics sequence-analysis
Please see the GATK website, where you can download a precompiled executable, read documentation, ask questions, and receive technical support. This repository contains the next generation of the Genome Analysis Toolkit (GATK). The contents of this repository are 100% open source and released under the BSD 3-Clause license (see LICENSE.TXT).
genomics spark science dna ngs sequencing genome bioinformatics gatk
Bio4j is a graph-based bioinformatics database that includes most of the data available in UniProt KB (SwissProt + TrEMBL), Gene Ontology (GO), UniRef (50, 90, 100), RefSeq, NCBI Taxonomy, and Expasy Enzyme DB. Bio4j provides a completely new and powerful framework for querying and managing protein-related information. Since it relies on a high-performance graph engine, data is stored in a way that semantically represents its own structure.
bioinformatics graph-database database graphs
Hail is an open-source, scalable framework for exploring and analyzing genomic data. The Hail project began in Fall 2015 to empower the worldwide genetics community to harness the flood of genomes to discover the biology of human disease. Since then, Hail has expanded to enable analysis of large-scale datasets beyond the field of genomics.
genetics vcf genomics spark gwas bioinformatics
MultiQC is a tool to create a single report with interactive plots for multiple bioinformatics analyses across many samples. MultiQC is written in Python (tested with v2.7, 3.4, 3.5 and 3.6). It is available on the Python Package Index and through conda using Bioconda.
bioinformatics analysis pypi bioconda multiqc
A tool designed to provide fast all-in-one preprocessing for FastQ files. It is developed in C++ with multithreading support for high performance. By default, the HTML report is saved to fastp.html (this can be changed with the -h option), and the JSON report is saved to fastp.json (this can be changed with the -j option).
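As a rough sketch of how the report options fit into an invocation, the helper below assembles a fastp command line in Python. The -h and -j options and their defaults are taken from the description above; the -i and -o input/output options are not mentioned there and are included on the assumption that they match fastp's documented single-end usage. The function name and the file names are hypothetical.

```python
import shlex
from typing import List


def build_fastp_command(
    in_fastq: str,
    out_fastq: str,
    html_report: str = "fastp.html",  # default report name from the description
    json_report: str = "fastp.json",  # default report name from the description
) -> List[str]:
    """Assemble an argument list for a single-end fastp run (sketch only)."""
    return [
        "fastp",
        "-i", in_fastq,     # assumed input option, not stated in the description
        "-o", out_fastq,    # assumed output option, not stated in the description
        "-h", html_report,  # HTML report path
        "-j", json_report,  # JSON report path
    ]


if __name__ == "__main__":
    cmd = build_fastp_command(
        "reads.fq", "clean.fq",
        html_report="sample1.html", json_report="sample1.json",
    )
    # Print a shell-safe form; pass `cmd` to subprocess.run to execute it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.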
fastq qc preprocessing filtering adapter overlap quality trimming splitting quality-control filter ngs bioinformatics overlapping error umi sequencing illumina polyg duplication
Dash Bio is a suite of bioinformatics components built to work with Dash. Learn more about Dash at https://plotly.com/products/dash/.
bioinformatics dash biojs
A Dash component library for creating interactive and customizable networks in Python, wrapped around Cytoscape.js. If you want to install the latest versions, check out the Dash docs on installation.
data-science bioinformatics plotly network-graph computational-biology dash biopython graph-theory cytoscape network-visualization cytoscapejs plotly-dash
A powerful open source data warehouse system. InterMine allows users to integrate diverse data sources with a minimum of effort, providing powerful web-services and an elegant web-application with minimal configuration. InterMine powers some of the largest data-warehouses in the life sciences.
data-warehouse bioinformatics genomics genetics clojurescript biology perl postgresql lgplv3 tomcat data-visualization data-visualisation life-science genome biologists