Spark Index


27 Sep 2017

5 ways to rethink your data access strategy

Leveraging data is becoming increasingly critical to business success. However, much of that data is locked in slow data marts, legacy OLAP cubes, and inaccessible Hadoop clusters. As a result, business users are stuck in second gear in an old car, unable to move at the speed needed to execute effectively. Dashboards, AI, and cool visualizations may dominate the conversation around analytics and BI, but the...



06 Sep 2017

ROI with SNAP

Our earlier benchmarking exercises have shown how efficient SNAP can be at providing the best performance at the lowest cost. SNAP does not need specialized hardware, GPUs, or large clusters. Many benchmarks do not focus on real use cases involving truly ad hoc queries that access multiple regions of a large multidimensional dataset. For example, running Tableau workloads is different from running handwritten benchmark queries....



05 Apr 2017

Querying S3 datalakes – SQL and Tableau on S3

For many, AWS S3 is not just deep storage but a viable option for storing data that can be consumed by reporting and analytics tools. Sparkline SNAP works directly with S3 data and exposes it to tools like Tableau with very fast response times across hundreds of concurrent user sessions. Below is a video of data from a Star Schema Benchmark data...



14 Mar 2017

Making data useful and ubiquitous

Data warehouses have evolved over the years. With Hadoop reaching a level of maturity and Spark serving as a powerful engine for various workloads, we are now at a point where we can truly democratize the consumption of data to power insights. Savvy data-driven companies combine the power of automated data analysis with human insights. In order to get everyone in an organization to leverage the data, data...



17 Sep 2016

Going beyond Data Lakes

We often see customers start to build data lakes on Hadoop or S3 as a way to bring their transactional data together with dimensional data in a common place. This data is cleaned and organized in a star schema, as in an enterprise data warehouse. The challenge begins here, since consuming data in a Hadoop data lake is not easy. The first challenge is ad hoc analytics....
