Bioinformatics Data Science for Research

Together, we ask questions – and find valuable insights

VUGENE offers a complete bioinformatics platform for statistical and machine learning analyses of multi-omics research data.

Our services are tailored for research laboratories studying the origins, causes, treatments and biomarkers of various complex disorders, such as cancers and neurodegenerative disorders.

Diverse Expertise

We lead the data analysis process from raw data to publication. That includes raw data processing, quality control, statistical analysis, machine learning model training, biological interpretation of the results and their eventual dissemination.

Quality control. Our pipelines include rigorous QC analyses and plenty of graphs that help you judge whether the data meet expectations.
Statistics and machine learning. We find biomarkers, train and evaluate predictive models, and perform functional analysis to elucidate the biological meaning of the findings.
Quick turnaround. We leverage in-house pipelines to quickly perform initial analysis. Our team builds on top of these results to deliver actionable insights into the data.
Data visualization. Our graphic design helps to make your insights clear to any audience.

*A clustering dendrogram is an important part of experiment quality control. Visualizing sample clusters helps to identify outliers and trends in the data.*
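
For readers curious how a dendrogram can flag an outlier, here is a minimal sketch in plain Python. It is an illustration only, not our production pipeline: the sample names, the toy values, the choice of 1 − Pearson correlation as the distance and of average linkage are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_linkage(samples):
    """Agglomerative clustering with average linkage.

    `samples` maps a sample name to its measurements. The distance between
    two samples is 1 - Pearson correlation. Returns the merges in the order
    they happen, as (cluster_a, cluster_b, height) tuples; these are the
    branches a dendrogram would draw from bottom to top.
    """
    dist = {
        frozenset((a, b)): 1.0 - pearson(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }

    def linkage(pair):
        # Mean pairwise distance between the members of two clusters.
        a, b = pair
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in samples]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical toy data: three similar samples and one that behaves differently.
expression = {
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [1.1, 2.0, 3.2, 3.9, 5.1],
    "s3": [0.9, 2.1, 2.9, 4.2, 4.8],
    "s4": [5.0, 1.0, 4.0, 2.0, 3.0],
}

merges = average_linkage(expression)
for cluster_a, cluster_b, height in merges:
    print(cluster_a, "+", cluster_b, f"at height {height:.2f}")
# In this toy data, s4 joins the tree last and at a large height,
# which is the pattern that marks a sample for closer inspection.
```

In a real analysis, an established implementation such as SciPy's `scipy.cluster.hierarchy` module would typically replace the hand-written loop and also draw the tree.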

Transparent workflow

*Clustering and visualization of gene set enrichment pathways enable a quick overview of biological functions and changes in the data.*

Whatever we do, you are always in control of the time, the code and the results.

Reproducible data analysis. Every research finding that we report is backed by a corresponding piece of code, which makes it fully reproducible.
Continuous integration and development. Whenever we make a change to the data analysis code, we review it and repeat the full analysis to make sure all the reports stay consistent.
Agile practices. We talk with our customers and document their research questions as requirements. We tackle the questions in a sprint and then reconvene to discuss the results.
Time tracking. We are your bioinformatics team for as much time as you need us.

Computational platform for any scale

We performed more than 200 analyses of individual datasets last year.

Process automation. For repeated experiments, we will develop a pipeline that triggers as soon as new data arrives.
Cloud data storage close to the customer. If you are in the US, your data will not leave the US. Likewise, for our EU customers the data stays within the EU. We write the code needed to study the data, and it runs on virtual machines located where the data is stored.
Scalable computing infrastructure. Our code runners will run as many analyses as needed on the computers best suited for the task. For example, we use GPU cards to align terabytes of whole-genome bisulfite sequencing (WGBS) reads in mere hours.

*We use GitLab for code versioning and JIRA for project planning and agile development. All of our programs run on Google Cloud.*