97. Developing A Scalable Platform For Next-Generation Sequencing Data Analytics Over Heterogeneous Clouds and HPCs: A Case for Transcriptomes and Metagenomes
Event Type
Poster
Location
Lower Lobby Concourse
Description
A novel scalable pipeline for metagenome and transcriptome analysis is presented. Thanks to the underlying distributed computing platform, a significant roadblock in Next-Generation Sequencing data analytics, namely ever-growing and noisy data sets, can be effectively addressed.
Built on a core capability for accessing and using heterogeneous distributed computing resources, including HPCs and clouds (EC2, OpenStack-based, and IBM Bluemix), the distributed application runtime environment efficiently manages massive workloads and data processing tasks by leveraging high-end HPC technologies, emerging Hadoop-based software models, and Docker. The resulting range of flexible and scalable runtime options allows the pipeline to handle data sets of any size. To maximize the benefit of the scalable platform, a novel method for de novo genome sequence reconstruction, Multiple Assembly Multiple Parameter (MAMP), was developed and is available with the pipeline. Preliminary results indicate that MAMP has great potential.
Authors








