Datasapiens is an international data-analytics startup based in Prague. We help our clients to uncover the value of their data and open up new revenue streams for them. We provide an end-to-end service that manages the data pipeline and automates the process of generating data insights.
In this talk, we will describe how we solved the problem of high S3 API costs incurred by Presto at several concurrency levels by deploying Alluxio as a data orchestration layer between S3 and Presto. We will also show the results of an experiment that estimates per-query S3 API costs using the TPC-DS dataset.
This talk will focus on:
- The Hadoop ecosystem at Datasapiens
- The drastic increase in S3 API costs during performance tests with Presto
- S3 API cost tests with TPC-DS
- Implications for the cloud data lake architecture
Online Meetup | Reducing large S3 API costs using Alluxio at Datasapiens
Tuesday, August 4
Koen holds a Master in Marketing Communication Sciences and an Advanced Master of Science (magna cum laude) in Marketing Analysis from Ghent University. He worked for seven years at dunnhumby in various roles, including promotions, trade intelligence, and shopper thoughts in the UK. Later he became the head of the solutions team for the CZ & SK market, spearheading innovation in cloud technologies, open source software development, and interactive data visualizations. Koen has extensive experience in delivering insights at board level.
CEO, co-founder, Datasapiens
Speaker: Koen Michiels
Juraj leads technical development, covering application development, data engineering, and data science. He studied pure and applied mathematics at the Czech Technical University in Prague. Juraj's past experience includes Deloitte, as a financial modeler, and Deutsche Boerse, as a software developer. Juraj is passionate about modern technologies and mathematical models.
Speaker: Juraj Pohanka
Bin Fan is the founding engineer and VP of Open Source at Alluxio, Inc. Prior to Alluxio, he worked for Google to build the next-generation storage infrastructure. Bin received his Ph.D. in Computer Science from Carnegie Mellon University on the design and implementation of distributed systems.
Speaker: Bin Fan
Founding Engineer & VP of OS, Alluxio
...a data orchestration layer for compute in any cloud. It unifies data silos on-premise and across any cloud to give you data locality, accessibility, and elasticity.
Whether it’s accelerating big data frameworks on the public cloud, running big data workloads in hybrid cloud environments, or enabling big data on object stores or multiple clouds, Alluxio reduces the complexities associated with orchestrating data for today’s big data and AI/ML workloads.