THE CHALLENGE

High-throughput DNA sequencing equipment produces huge volumes of data at an ever-increasing rate. While the cost of sequencing drops by ≈5x per year, the cost of computing falls by ≈2x at best. Very soon, interpreting omics data will cost far more than generating it.
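To make the widening gap concrete, here is a minimal sketch of the arithmetic. It assumes both figures are yearly rates (the text gives a period only for sequencing), and the function name and defaults are illustrative only.

```python
def relative_cost_gap(years, seq_drop=5.0, compute_drop=2.0):
    """Factor by which computing cost grows relative to sequencing cost.

    Assumes sequencing gets `seq_drop` times cheaper and computing
    `compute_drop` times cheaper every year (an assumption for this sketch).
    """
    return (seq_drop / compute_drop) ** years


for year in range(1, 4):
    print(f"after {year} year(s): {relative_cost_gap(year):.2f}x")
```

Under these assumptions, computing becomes 2.5 times more expensive relative to sequencing with every year that passes.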

 

 

Cloud computing services dedicated to bioinformatics seem a promising way to keep pace with this massive generation of sequencing data. As a result, we now hear a lot about cloud computing services that provide speedy sequence analysis.

 

The Need

 

Processing sequencing data is difficult because of its volume and structure, but the first problem is transmitting the data from where it is produced to where it will be processed.

 

 

Typically, this means uploading DNA sequencing data from a life science center to a remote cloud computing environment.

 

The Barriers

In such a scenario, regular courier services are not sustainable: given the size of the data to be transmitted in practice, they introduce many problems, such as privacy, tracing, and accountability. The data therefore needs to flow over digital lines. Note that the volume of data is so large that, even with leased lines or direct connections, the importance of efficient transmission is not expected to diminish.

Proposing an efficient solution to this problem requires a deep understanding of both the needs on the life science side and the capabilities on the computing side. The ultimate goal is not better compression of the data but a shorter response time, so the whole operation, from transmission to the end of processing, should be considered as one, from an industry perspective rather than as a pure computer science research question.

The Gap

An interesting gap appears at this point. Life science people are busy perfecting their data generation, and cloud service providers focus on speeding up their processing pipelines, but the transmission of the data is orphaned and not taken seriously.

The classical way to transmit massive data is the compress, transmit, decompress scheme. However, compressing and decompressing sequencing data is not easy because of its structure and massive size. Sometimes it takes more time and resources than analyzing the genomic data itself. Moreover, the resources spent on compression do not help much in the later steps of analysis, so in some sense the effort put into efficient transmission is wasted in the end.
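The classical scheme described above can be sketched with a general-purpose compressor. This is only an illustration, not how Genpute works: zlib stands in for a real sequencing-data compressor, the transmission time is estimated from an assumed bandwidth rather than measured, and the sample reads are a hypothetical stand-in for real sequencer output.

```python
import time
import zlib


def compress_transmit_decompress(data: bytes, bandwidth_bytes_per_s: float):
    """Run the classical scheme and return (restored_data, timings).

    Nothing is actually sent: transmission time is estimated from the
    payload size and the assumed bandwidth.
    """
    t0 = time.perf_counter()
    payload = zlib.compress(data, level=6)   # sender side
    t1 = time.perf_counter()
    restored = zlib.decompress(payload)      # receiver side
    t2 = time.perf_counter()
    timings = {
        "compress_s": t1 - t0,
        "transmit_s": len(payload) / bandwidth_bytes_per_s,
        "decompress_s": t2 - t1,
        "raw_transmit_s": len(data) / bandwidth_bytes_per_s,
    }
    return restored, timings


# Hypothetical, highly repetitive stand-in for sequencing reads.
reads = b"ACGTTGCAAGCT" * 50_000
restored, timings = compress_transmit_decompress(
    reads, bandwidth_bytes_per_s=1_000_000
)
for step, seconds in timings.items():
    print(f"{step}: {seconds:.4f}")
```

The compression and decompression times are pure overhead here: the receiver ends up with exactly the bytes it started from and learns nothing that helps the later analysis, which is the waste the paragraph above points out.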

Interestingly, some DNA sequencing centers submit their data to cloud service providers on a hard drive carried by a regular courier service such as FedEx or UPS. Such a transmission method is not sustainable as the number of sequencing centers increases.

The Opportunity

Recent striking results in genomics will revolutionize medicine. Personalized medicine is closer than ever. In the very near future, sequencing will become a regular diagnostic tool, and not only research centers but almost all health institutions will begin generating data. Not surprisingly, institutions that prefer not to deal with the hassle of computing infrastructure will choose dedicated cloud services.

They will surely seek the service that provides the shortest response time, which is not the time spent on the cloud computing side but the total time elapsed between the beginning of the transmission and the receipt of the final result. Transmitting the data efficiently will therefore become even more important.
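The definition of response time above can be written down in a few lines. The following is a minimal sketch with made-up numbers; the class name, the service labels, and the timings are all hypothetical and only illustrate why the fastest cloud is not necessarily the fastest service.

```python
from dataclasses import dataclass


@dataclass
class ServiceEstimate:
    name: str
    transfer_s: float  # time to move the data to the cloud (hypothetical)
    cloud_s: float     # time spent processing on the cloud side (hypothetical)

    @property
    def response_s(self) -> float:
        """Total time from start of transmission to the final result."""
        return self.transfer_s + self.cloud_s


def fastest(estimates):
    """Pick the service with the shortest total response time."""
    return min(estimates, key=lambda e: e.response_s)


estimates = [
    ServiceEstimate("fast cloud, slow upload", transfer_s=7200.0, cloud_s=1800.0),
    ServiceEstimate("slower cloud, fast upload", transfer_s=1800.0, cloud_s=3600.0),
]
best = fastest(estimates)
print(best.name, best.response_s)
```

With these illustrative figures, the service with the slower cloud still wins on total response time because its upload is much shorter.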

OUR SOLUTION

 

Genpute is a computational system for efficiently transmitting genetic data produced by high-throughput DNA sequencing equipment to cloud computing service providers, where the data is processed and turned into information. The transmission not only lets huge sequencing data be transferred efficiently over the Internet, but also generates highly useful information for the later steps of analysis on the cloud side.

A simple analogy describes what Genpute does: it acts as an acrobat carrying the DNA sequencing data from where it is produced to where it will be processed across a thin rope, the communication channel. The volume it needs to transport is so huge that it must cross the rope many times, so it aims to carry as much as the rope allows on each pass. To achieve this, before carrying the raw data, it packs the data into special vacuum bags that squeeze the goods so that more can be carried on each trip. Only related items are packed together in each bag, so when a package arrives at the destination, the destination knows what is inside and where to put it. Packing the vacuum bags certainly takes time, but because it is done in a novel way, the time spent on packing is gained back at the destination during the operations carried out there.

Genpute offers an innovative way to reorganize the processing pipeline from data generation to the end of the final analysis. It treats the life science center as part of the larger cloud and begins by preprocessing the data on the sender side. This preprocessing not only improves the compression ratio and speed but also generates useful knowledge that helps on the cloud side.


Innovation

We cannot expect every DNA sequencing center to have large computing resources and a perfect internet connection. With that in mind, Genpute offers a flexible structure that aims to make the best use of the resources available on the sender side while also taking care of the bandwidth. Genpute can therefore connect sequencing centers of any size, from those with small commodity computers to those with huge clusters, to bioinformatics cloud service providers over connection lines of varying quality.

THE OBJECTIVES:

The Genpute platform bridges sequencing centers and bioinformatics cloud computing services in an innovative way, with the aim of shortening high-throughput DNA sequencing data analysis time. By integrating the Genpute platform, bioinformatics cloud computing service providers can improve their service quality through reduced response time. They also have an opportunity to enlarge their customer base, since Genpute can support any sequencing center regardless of its computing infrastructure.

CAREERS

We’re looking for more great engineers!

The Genpute engineering team has already transformed the computation side of DNA sequencing, and that’s only the beginning. We believe their creativity and technical depth have the potential to radically disrupt industries while unlocking new value for our business. Want to add your brain to our team? Solve hard problems. Build scalable solutions.

ABOUT US

 

Founded in 2006, ERLAB has successfully completed 28 different R&D projects in 8 years. Today, ERLAB focuses on the bioinformatics and media sectors. Some of ERLAB’s solutions reach 8,000,000 subscribers. The company has deep knowledge and expertise in the following key technologies:

CONTACT US

İSTANBUL

Resitpasa Mahallesi, Katar Caddesi, ITU Ayazaga Kampusu, Ari 1 Teknokent Binasi, No: 2/5/1 
PK: 34467 Sariyer ISTANBUL / TURKEY 

 Phone: +90 (212) 286 63 64 

Business Hours

Mon. - Fri. 8am to 6pm

 

we enjoy sharing our experience & meeting new challenges