Many organizations want to run big data analytics with frameworks such as Presto on public clouds. However, reading data from and writing data to S3 directly can result in slow and inconsistent performance. Alluxio is a data orchestration layer for the cloud, and in this use case it caches S3 data, providing high and predictable performance as well as reduced network traffic.
In this Office Hour we'll go over:
How to set up Presto with Alluxio such that Presto jobs can seamlessly read from and write to S3
How the performance of Presto directly on S3 compares with that of Presto and Alluxio on S3
An open session for discussion of any topic, such as the challenges of separating compute and storage
Building Fast SQL Analytics with Presto, Alluxio, and S3
Prior to Alluxio, Nakkul worked as a consultant where he built and supported an entirely open source Hadoop platform for financial services clients.
Software Engineer at Alluxio
Speaker: Nakkul Sreenivas
Bin Fan is the founding engineer of Alluxio, Inc. and a PMC member of the Alluxio open source project. Prior to Alluxio, he worked for Google, where he won the Technical Infrastructure Award. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, working on distributed systems.
Evangelist and Founding Member at Alluxio
Speaker: Bin Fan
...a data orchestration layer for compute in any cloud. It unifies data silos on-premise and across any cloud to give you data locality, accessibility, and elasticity.
Whether it’s accelerating big data frameworks on the public cloud, running big data workloads in hybrid cloud environments, or enabling big data on object stores or multiple clouds, Alluxio reduces the complexities associated with orchestrating data for today’s big data and AI/ML workloads.