The Journey from On-premises to Serverless in Scientific Computing

Germán Moltó, Miguel Caballer, Carlos de Alfonso, Alfonso Pérez, Amanda Calatrava, and Ignacio Blanquer. The Journey from On-premises to Serverless in Scientific Computing. In SIAM Conference on Parallel Processing for Scientific Computing, 2018.

Abstract

The evolution of scientific computing from on-premises infrastructures to Clouds has resulted in unprecedented changes in the last decades. The transition from physical clusters of PCs to cost-aware virtual elastic clusters introduced significant advantages for scientific computing. Then, the advancements of hypervisors and container-based technologies paved the way for serverless computing to surge in the field of scientific computing. Functions coded in different programming languages are executed in response to events on the infrastructure of a public Cloud provider, as is the case of AWS Lambda. This has introduced significant elasticity improvements, with respect to the use of Virtual Machines in an Infrastructure as a Service Cloud. Thus, it is now possible to execute in parallel thousands of invocations of the same function to perform complex distributed computations under a time-constrained execution limit. In this talk we describe the challenges of this journey and the open-source solutions developed in the context of large-scale projects, such as INDIGO-DataCloud, adopted to address these issues. We cover from more mature developments being already used in the EGI Federated Cloud to support parallel computing in scientific communities, to innovative technologies such as the execution of Docker containers on a serverless computing platform (AWS Lambda) to perform parallel Deep Learning analysis of datasets.

BibTeX Entry

@inproceedings{Molto2018jop,
   abstract = {The evolution of scientific computing from on-premises infrastructures to Clouds has resulted in unprecedented changes in the last decades. The transition from physical clusters of PCs to cost-aware virtual elastic clusters introduced significant advantages for scientific computing. Then, the advancements of hypervisors and container-based technologies paved the way for serverless computing to surge in the field of scientific computing. Functions coded in different programming languages are executed in response to events on the infrastructure of a public Cloud provider, as is the case of AWS Lambda. This has introduced significant elasticity improvements, with respect to the use of Virtual Machines in an Infrastructure as a Service Cloud. Thus, it is now possible to execute in parallel thousands of invocations of the same function to perform complex distributed computations under a time-constrained execution limit. In this talk we describe the challenges of this journey and the open-source solutions developed in the context of large-scale projects, such as INDIGO-DataCloud, adopted to address these issues. We cover from more mature developments being already used in the EGI Federated Cloud to support parallel computing in scientific communities, to innovative technologies such as the execution of Docker containers on a serverless computing platform (AWS Lambda) to perform parallel Deep Learning analysis of datasets.},
   author = {Germán Moltó and Miguel Caballer and Carlos de Alfonso and Alfonso Pérez and Amanda Calatrava and Ignacio Blanquer},
   booktitle = {SIAM Conference on Parallel Processing for Scientific Computing},
   title = {The Journey from On-premises to Serverless in Scientific Computing},
   year = {2018}
}