Community Online Office Hour
Today’s conventional wisdom states that network latency across the two ends of a hybrid cloud prevents you from running analytic workloads in the cloud with the data on-prem. As a result, most companies copy their data into a cloud environment and maintain that duplicate data. All of this means that it is challenging to make both on-prem HDFS data accessible with the desired application performance.
In this talk, we will show you how to leverage any public cloud (AWS, Google Cloud Platform, or Microsoft Azure) to scale analytics workloads directly on on-prem data without copying and synchronizing the data into the cloud.
In this Office Hour, we will go over:
- A strategy to embrace the hybrid cloud, including an architecture for running ephemeral compute clusters using on-prem HDFS.
- An example of running on-demand Presto, Spark, and Hive with Alluxio in the public cloud.
- An analysis of experiments with TPC-DS to demonstrate the benefits of the given architecture.
Interested in learning more?
Save your spot
Burst Presto & Spark workloads to AWS EMR
with no data copies
Tuesday, April 28
Adit Madan is a core maintainer and PMC member of the Alluxio Open Source project. His experience is in distributed systems, storage systems, and large scale data analytics. He has an M.S. from Carnegie Mellon University and a B.S. from IIT.
Software Engineer, Alluxio
Speaker: Adit Madan
Speaker: Bin Fan
Founding Engineer & VP of OS, Alluxio
Bin Fan is the founding engineer and VP of Open Source at Alluxio, Inc. Prior to Alluxio, he worked for Google to build the next-generation storage infrastructure. Bin received his Ph.D. in Computer Science from Carnegie Mellon University on the design and implementation of distributed systems.
...a data orchestration layer for compute in any cloud. It unifies data silos on-premise and across any cloud to give you data locality, accessibility, and elasticity.
Whether it’s accelerating big data frameworks on the public cloud, running big data workloads in hybrid cloud environments, or enabling big data on object stores or multiple clouds, Alluxio reduces the complexities associated with orchestrating data for today’s big data and AI/ML workloads.