Apache Spark


Oct 2017

Apache Spark for enterprise data warehousing

Most people, when they think of Apache Spark, think of machine learning and data science. Spark is so much more than that. Enterprises today struggle to make sense of the alphabet soup in Hadoop. Big Data was synonymous with Hadoop. However, Hadoop is not one thing. The biggest value of Hadoop for analytics and B.I was, and is, HDFS, which is a distributed file system. But...



Sep 2017

5 ways to rethink your data access strategy

Leveraging data is becoming increasingly critical to business success. However, much of the data is locked in slow data marts, legacy OLAP cubes and inaccessible Hadoop clusters. As a result, business users are stuck in second gear in an old car, unable to move at the speed needed to execute effectively. Dashboards, A.I and cool visualizations may dominate the conversation around analytics and B.I, but the...



Sep 2017


We have seen before, from our benchmarking exercises, how efficient SNAP can be in providing the best performance at the lowest cost. SNAP does not need specialized hardware, GPUs or large clusters. Many benchmarks do not focus on real use cases involving true ad-hoc queries that access multiple regions of a large multi-dimensional dataset. For example, running Tableau workloads is different from running hand-written benchmark queries....



Aug 2017

Connecting Excel to SNAP

SNAP can be accessed using standard B.I tools, and Excel is one of the most widely used tools for analysis, pivots and much more. To connect Excel, simply use an ODBC driver for Spark, such as the one from Simba. On a Mac, you can set up a user DSN. To connect to SNAP, you can use the Simba ODBC Spark Driver. Since SNAP is...



May 2017

IoT data analysis at scale with Tableau and SNAP

SNAP is being used by ad-tech firms, Fortune 50 companies and, in general, companies that have large amounts of data. One area where SNAP is particularly good is analyzing multi-dimensional data with very fast ad-hoc queries. Structured IoT data and machine data from sensors in devices are particularly suited to this. For example, take a sample dataset recording readings from sensors every minute about temperature and...

