Tech Talk | Integrating Google Cloud Dataproc with Alluxio for faster performance in the cloud


Google Cloud Dataproc is a widely used fully managed Spark and Hadoop service to run big data analytics and compute workloads in the cloud. Services like Dataproc reduce hardware spend, eliminate the need to overbuy capacity, and provide business agility. Yet users still face challenges for performance sensitive workloads or workloads running on remote data. 


Alluxio is an open source cloud data orchestration platform that increases performance of analytic workloads running on Dataproc by intelligently caching data and bringing back lost data locality. Alluxio also enables users to run compute workloads against on-prem storage like Hadoop HDFS without any app changes. 


Join us for this tech talk where Chris Crosbie and Roderick Yao from the Google Dataproc team and Dipti Borkar of Alluxio will demo how to set up Google Cloud Dataproc with Alluxio so jobs can seamlessly read from and write to Cloud Storage. They’ll also show how to run Dataproc Spark against a remote HDFS cluster. 


Save your seat!

Tuesday, December 4

10am PDT

Strategic Cloud Engineer

Speaker: Roderick Yao

Speaker: Christopher Crosbie

Product Manager, Google

Speaker: Dipti Borkar

VP, Product and Marketing

Dipti Borkar is the VP of Product & Marketing at Alluxio with over 15 years experience in data and database technology across relational and non-relational. Prior to Alluxio, Dipti was VP of Product Marketing at Kinetica and Couchbase. Dipti holds a M.S. in Computer Science from the UC San Diego, and an MBA from the Haas School of Business at UC Berkeley.