biomaj - BioMAJ

This project is a complete rewrite of BioMAJ (http://biomaj.genouest.org). BioMAJ (BIOlogie Mise A Jour) is a workflow engine dedicated to data synchronization and processing. The software automates the update cycle and the supervision of a locally mirrored databank repository.

http://genouest.github.io/biomaj/
https://github.com/genouest/biomaj

Related Projects

scikit-bio - scikit-bio is an open-source, BSD-licensed, Python package providing data structures, algorithms, and educational resources for bioinformatics

  •    Python

scikit-bio is an open-source, BSD-licensed Python 3 package providing data structures, algorithms and educational resources for bioinformatics. To view scikit-bio's documentation, visit scikit-bio.org.

BioJava - Java Framework for Processing Biological Data

  •    Java

BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines and parsers for common file formats, and it allows the manipulation of sequences and 3D structures. The goal of the BioJava project is to facilitate rapid application development for bioinformatics.

Open Babel - The Open Source Chemistry Toolbox

  •    C++

Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It is an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, biochemistry, or related areas.

Awesome-Bioinformatics - A curated list of awesome Bioinformatics libraries and software.

Sequence processing includes tasks such as demultiplexing raw read data and trimming low-quality bases. The following tools can be used to visualize genomic data or to construct customized visualizations of genomic data, including sequence data from DNA-Seq, RNA-Seq, and ChIP-Seq, variants, and more.

nextflow - A DSL for data-driven computational pipelines

  •    Groovy

With the rise of big data, techniques to analyse and run experiments on large datasets are increasingly necessary. Parallelization and distributed computing are the best ways to tackle this kind of problem, but the tools commonly available to the bioinformatics community traditionally lack good support for these techniques, or provide a model that fits badly with the specific requirements in the bioinformatics domain and, most of the time, require the knowledge of complex tools or low-level APIs.


Bio4j - Bioinformatics Graph based DB

  •    Java

Bio4j is a bioinformatics graph-based DB including most data available in UniProt KB (SwissProt + TrEMBL), Gene Ontology (GO), UniRef (50,90,100), RefSeq, NCBI Taxonomy, and Expasy Enzyme DB. Bio4j provides a completely new and powerful framework for querying and managing protein-related information. Since it relies on a high-performance graph engine, data is stored in a way that semantically represents its own structure.

Bioinformatics-Training - Bioinformatics training resources

  •    R

This is a resource for learning more about PacBio data and bioinformatics analysis.

csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang

  •    Go

Similar to the FASTA/Q formats in bioinformatics, CSV/TSV are basic and ubiquitous file formats in both bioinformatics and data science. People usually use spreadsheet software such as MS Excel to process tabular data. However, that work is all clicking and typing, which is not automated and is time-consuming to repeat, especially when we want to apply similar operations to different datasets or for different purposes.

bioinformatics - :microscope: Path to a free self-taught education in Bioinformatics!

This is a solid path for those of you who want to complete a Bioinformatics course on your own time, for free, with courses from the best universities in the world. In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.

reflow - A language and runtime for distributed, incremental data processing in the cloud

  •    Go

Reflow is a system for incremental data processing in the cloud. Reflow enables scientists and engineers to compose existing tools (packaged in Docker images) using ordinary programming constructs. Reflow then evaluates these programs in a cloud environment, transparently parallelizing work and memoizing results. Reflow was created at GRAIL to manage our NGS (next generation sequencing) bioinformatics workloads on AWS, but has also been used for many other applications, including model training and ad-hoc data analyses. Reflow thus allows scientists and engineers to write straightforward programs and then have them transparently executed in a cloud environment. Programs are automatically parallelized and distributed across multiple machines, and redundant computations (even across runs and users) are eliminated by its memoization cache. Reflow evaluates its programs incrementally: whenever the input data or program changes, only those outputs that depend on the changed data or code are recomputed.
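The memoization idea described above can be illustrated with a minimal Python sketch. This is not Reflow's language or API; the MemoCache class, its run method, the version argument, and the count_bases step are all hypothetical names chosen for illustration. The sketch assumes a result can be keyed by a digest of the step name, a code version label, and the inputs, so a step is only re-executed when one of those changes.

```python
import hashlib
import json


class MemoCache:
    """Toy in-memory cache keyed by a digest of step name, code version, and inputs.

    Hypothetical illustration only; it does not reflect Reflow's implementation.
    """

    def __init__(self):
        self._store = {}
        self.computed = 0  # number of real executions, kept for illustration

    def run(self, func, version, *inputs):
        # Build a stable key from everything the result depends on.
        key_material = json.dumps([func.__name__, version, inputs], sort_keys=True)
        key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Cache miss: execute the step and remember its result.
            self._store[key] = func(*inputs)
            self.computed += 1
        return self._store[key]


def count_bases(sequence):
    """Example step: count each base in a sequence string."""
    return {base: sequence.count(base) for base in sorted(set(sequence))}


cache = MemoCache()
first = cache.run(count_bases, "v1", "ACGTAC")   # executed
second = cache.run(count_bases, "v1", "ACGTAC")  # same key: served from the cache
third = cache.run(count_bases, "v1", "ACGTTT")   # input changed: executed again
fourth = cache.run(count_bases, "v2", "ACGTAC")  # version changed: executed again
print(cache.computed)
```

In this sketch the second call does no work, while a changed input or a changed version label forces recomputation. A real system would also need persistent storage and a reliable way to derive the version from the code itself.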

hail - Scalable genomic data analysis.

  •    Scala

Hail is an open-source, scalable framework for exploring and analyzing genomic data. The Hail project began in Fall 2015 to empower the worldwide genetics community to harness the flood of genomes to discover the biology of human disease. Since then, Hail has expanded to enable analysis of large-scale datasets beyond the field of genomics.

BioDWH: Bioinformatics Data Warehouse

  •    Java

BioDWH is a bioinformatics data warehouse software kit that integrates biological information from multiple public life science data sources into a local RDBMS. It provides up-to-date integrated knowledge, platform and database independence.

awesome - Awesome resources on Bioinformatics, data science, machine learning, programming language (Python, Golang, R, Perl) and miscellaneous stuff

Collection of useful resources on Bioinformatics, data science, machine learning, programming language (Python, Golang, R, Perl, etc.) and miscellaneous stuff.

common-workflow-language - Repository for the CWL standards

  •    Python

The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry. CWL is developed by a multi-vendor working group consisting of organizations and individuals aiming to enable scientists to share data analysis workflows. The CWL project is maintained on GitHub and we follow the Open-Stand.org principles for collaborative open standards development. Legally, CWL is a member project of Software Freedom Conservancy and is formally managed by the elected CWL leadership team; however, everyday project decisions are made by the CWL community, which is open for participation by anyone.

EDAM Ontology

EDAM is a bioinformatics tools and data ontology. Its defined terms and relationships provide a structured, controlled vocabulary to describe in semantic terms bioinformatics web services, data schema, tools, web servers, databases and so on.

rust-bio - This library provides implementations of many algorithms and data structures that are useful for bioinformatics

  •    Rust

This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration. Please see the homepage for examples and documentation.

BOW - Bioinformatics On Windows

A group of bioinformatics tools that run on Windows. It includes tools ported from Linux (e.g. BWA, SAMTOOLS) and, later, original Windows applications.

vuong-mediapp: Multimedia BioInformatics

  •    Java

Multimedia, Medicine Computing and BioInformatics: this project is a collection of several subprojects for solutions in Multimedia, Medicine Computing and BioInformatics, focused on video, EEG and multichannel signals and developed in Web 2.0, J2EE.

MultiQC - Aggregate results from bioinformatics analyses across many samples into a single report.

  •    Python

MultiQC is a tool to create a single report with interactive plots for multiple bioinformatics analyses across many samples. MultiQC is written in Python (tested with v2.7, 3.4, 3.5 and 3.6). It is available on the Python Package Index and through conda using Bioconda.