HyperLoom: A platform for defining and executing scientific pipelines in distributed environments

Publisher

Association for Computing Machinery

Abstract

Real-world scientific applications often encompass end-to-end data processing pipelines composed of a large number of interconnected computational tasks of varying granularity. We introduce HyperLoom, an open source platform for defining and executing such pipelines in distributed environments, with a Python interface for defining tasks. HyperLoom is a self-contained system that does not rely on an external scheduler to execute tasks. We have successfully employed HyperLoom to execute chemogenomics pipelines used in the pharmaceutical industry for novel drug discovery.
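To make the idea of a pipeline of interconnected tasks concrete, the following is a minimal, generic sketch in plain Python. It is not HyperLoom's API: the function `run_pipeline`, the task names, and the dictionary-based graph format are all hypothetical and chosen only for illustration. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to order tasks by their dependencies and a local thread pool in place of a distributed runtime.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_pipeline(tasks, deps, max_workers=4):
    """Run tasks in dependency order; tasks that are ready together run concurrently.

    tasks: dict mapping task name -> callable that receives the results of
           its dependencies as positional arguments (in the listed order)
    deps:  dict mapping task name -> list of task names it depends on
    (Both structures are illustrative assumptions, not HyperLoom's format.)
    """
    sorter = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task whose dependencies have all finished is ready now.
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(
                    tasks[name], *(results[d] for d in deps.get(name, []))
                )
                for name in ready
            }
            # Wait for this batch, record results, and unlock dependents.
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# A small diamond-shaped pipeline: "sort" and "total" both depend on "load",
# and "report" depends on both of them.
tasks = {
    "load": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "total": lambda xs: sum(xs),
    "report": lambda ordered, total: (ordered, total),
}
deps = {"sort": ["load"], "total": ["load"], "report": ["sort", "total"]}

results = run_pipeline(tasks, deps)
```

In this sketch the ordering logic lives inside the program itself rather than in an outside job scheduler, which loosely mirrors the self-contained design described in the abstract. A real distributed platform would additionally handle data transfer between nodes, worker placement, and failures, none of which are modeled here.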

Subject(s)

HPC, scientific pipeline, machine learning, big data, distributed computing, chemogenomics, task scheduling

Citation

ACM International Conference Proceeding Series. 2018, p. 1-6.
