Benchmarking framework for 3D histology

We developed a benchmarking framework to evaluate the accuracy of several free and commercial 3D reconstruction methods using two whole slide image datasets. The results provide a solid basis for further development and application of 3D histology algorithms and indicate that methods capable of compensating for local tissue deformation are superior to simpler approaches. Published in Bioinformatics. Code available at and benchmarking dataset available here

Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples

The Vipie web application is a tool for multi-sample metagenomic analysis of viral data. It produces searchable hits tables, interactive population maps, alpha diversity measures and clustered heatmaps, grouped by custom sample categories where applicable. Published in BMC Genomics 18(1):378, 2017. Available at:
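Alpha diversity summarizes how diverse a single sample is. As an illustration only, and not Vipie's own code, the sketch below computes one common alpha diversity measure, the Shannon index, from per-virus hit counts. Which measures Vipie reports is not stated here; the function name `shannon_index` and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts.

    `counts` is a hypothetical list of hit counts, one per viral taxon,
    for a single sample.
    """
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical sample with hits spread over three taxa.
sample_counts = [120, 30, 50]
diversity = shannon_index(sample_counts)
```

A sample dominated by one taxon gives a value near zero, while evenly spread hits give higher values.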


Interactive visualization of selected cancer datasets. It offers boxplot, survival analysis, scatterplot, and feature matrix views, depending on the data types and datasets. Available at:

Hemap resource

The Hemap resource contains a dataset of 9,544 gene expression profiles collected and manually curated from the Gene Expression Omnibus (GEO) database. The profiles comprise patient samples representing different cancers and proliferative disorders of hematopoietic lineage origin, as well as cell lines and normal blood cell types. Preprint available at bioRxiv. Available at

POMO – Plotting Omics analysis results for Multiple Organisms

An interactive web-based application to visually explore omics data analysis results and associations in circular, network and grid views. POMO has built-in references for human, mouse, nematode, fly, yeast, zebrafish, rice, tomato, Arabidopsis, and Escherichia coli. In addition, POMO provides custom options that allow integrated plotting of unsupported strains or closely related species associations, such as human and mouse orthologs or two yeast wild types, studied together within a single analysis. Published in BMC Genomics 14:918, 2013. Available at

Pypette – Pythonic utilities for the analysis of high throughput sequencing data

Pypette is a collection of command line utilities and libraries for analyzing biological data. Current functionality includes gene expression quantification, copy number analysis, mutation and SNP analysis, chromosomal rearrangement analysis, and FASTA, VCF and SAM file manipulation, along with other miscellaneous utilities. Available at

Segmentum: a tool for copy number analysis of cancer genomes

Segmentum is a tool for the identification of copy number alterations (CNAs) and copy-neutral loss of heterozygosity (LOH) in tumor samples using whole-genome sequencing data. Segmentum segments the genome by analyzing the read-depth and B-allele fraction profiles using a double sliding window method. It requires a matched normal sample to correct for biases and to discriminate somatic from germline events. Segmentum is written in Python and segments a whole genome in less than two minutes. Published in BMC Bioinformatics 18:215, 2017. Available at
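The general idea of a double sliding window can be sketched as follows: two adjacent windows slide along a profile, and a large difference between their means suggests a breakpoint between them. This is a simplified illustration under stated assumptions, not Segmentum's implementation. It works on a single hypothetical read-depth log-ratio profile, ignores the B-allele fraction, and the function names, window size and threshold are all invented for the example.

```python
def breakpoint_scores(values, window):
    """Score each position i by the absolute difference between the means of
    the adjacent windows [i - window, i) and [i, i + window).

    Positions where either window would run past the ends keep a score of 0.
    """
    n = len(values)
    scores = [0.0] * n
    for i in range(window, n - window + 1):
        left_mean = sum(values[i - window:i]) / window
        right_mean = sum(values[i:i + window]) / window
        scores[i] = abs(right_mean - left_mean)
    return scores


def call_breakpoints(values, window, threshold):
    """Return positions whose score exceeds `threshold` and is a local maximum."""
    scores = breakpoint_scores(values, window)
    calls = []
    for i in range(1, len(scores) - 1):
        is_peak = scores[i] >= scores[i - 1] and scores[i] > scores[i + 1]
        if scores[i] > threshold and is_peak:
            calls.append(i)
    return calls


# Hypothetical log-ratio profile: a copy number gain starts at index 10.
profile = [0.0] * 10 + [1.0] * 10
breakpoints = call_breakpoints(profile, window=3, threshold=0.5)
```

A reported position `i` marks a breakpoint between elements `i - 1` and `i` of the profile. A real caller would also need to handle noise and use the matched normal sample, which this sketch leaves out.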

Breakfast – software for detecting genomic structural variants from DNA sequencing data

Breakfast is a software tool for detecting genomic structural variants from DNA sequencing data. It identifies structural variants based on breakpoint-overlapping reads and is extremely fast, analyzing ~40 million reads per minute per CPU core. It reports the full sequence of all breakpoint-supporting reads and shows mismatched bases. It also identifies PCR/optical duplicates and does not count them as independent sources of evidence. Breakfast can be run on sorted or unsorted BAM files or can read BAM input from a pipe, uses pre-existing Bowtie indexes to speed up alignment (it does not require its own index), and provides tools for filtering out rearrangements that are present in control samples. Breakfast has been used in a number of publications, most recently in the TCGA PanCanAtlas fusion analysis. Available at
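Two of the ideas above, not counting duplicates as independent evidence and filtering out rearrangements seen in controls, can be illustrated with a short sketch. It is not Breakfast's code or command line interface. The duplicate criterion used here (same chromosome, alignment start and strand), the tuple layout, the function names and the `min_reads` cutoff are all assumptions made for the example.

```python
from collections import defaultdict


def count_independent_support(reads):
    """Count supporting reads per breakpoint, collapsing assumed duplicates.

    `reads` is a hypothetical iterable of
    (breakpoint_id, chrom, start, strand) tuples. Reads with the same
    chromosome, alignment start and strand are treated as PCR/optical
    duplicates and counted once.
    """
    unique_alignments = defaultdict(set)
    for breakpoint_id, chrom, start, strand in reads:
        unique_alignments[breakpoint_id].add((chrom, start, strand))
    return {bp: len(keys) for bp, keys in unique_alignments.items()}


def filter_against_controls(support, control_breakpoints, min_reads=2):
    """Keep breakpoints with enough independent reads that are absent from controls."""
    return {
        bp: n
        for bp, n in support.items()
        if n >= min_reads and bp not in control_breakpoints
    }


# Hypothetical supporting reads for two candidate rearrangements.
tumor_reads = [
    ("fusionA", "chr21", 1000, "+"),
    ("fusionA", "chr21", 1000, "+"),  # duplicate of the read above
    ("fusionA", "chr21", 1012, "-"),
    ("fusionB", "chr7", 5000, "+"),
    ("fusionB", "chr7", 5020, "+"),
]
support = count_independent_support(tumor_reads)
somatic = filter_against_controls(support, control_breakpoints={"fusionB"})
```

In this example `fusionA` keeps two independent reads after the duplicate is collapsed, and `fusionB` is dropped because it also appears in the control set.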