Apache Spark


27 Sep 2017

5 ways to rethink your data access strategy

Leveraging data is becoming increasingly critical to business success. However, much of that data is locked in slow data marts, legacy OLAP cubes, and inaccessible Hadoop clusters. As a result, business users are stuck in second gear in an old car, unable to move at the speed needed to execute effectively. Dashboards, AI, and cool visualizations may dominate the conversation around analytics and BI, but the...



06 Sep 2017

ROI with SNAP

We have seen before, from our benchmarking exercises, how efficient SNAP can be in providing the best performance at the lowest cost. SNAP does not need specialized hardware, GPUs, or large clusters. Many benchmarks do not focus on real use cases involving true ad-hoc queries that access multiple regions of a large multi-dimensional dataset. For example, running Tableau workloads is different from running hand-written benchmark queries....



17 Aug 2017

Connecting Excel to SNAP

SNAP can be accessed using standard BI tools, and Excel is one of the most widely used tools for analysis, pivot tables, and much more. To connect Excel, simply use an ODBC driver for Spark, such as the one from Simba. On a Mac, you can set up a user DSN. To connect to SNAP, you can use the Simba ODBC Spark Driver. Since SNAP is...



25 May 2017

IOT data analysis at scale with Tableau and SNAP

SNAP is being used by ad-tech companies, Fortune 50 companies, and, in general, companies that have large amounts of data. One area where SNAP is particularly strong is analyzing multi-dimensional data with very fast ad-hoc queries. Structured IOT data and machine data from sensors in devices are particularly well suited to this. For example, take a sample dataset that records sensor readings every minute about temperature and...



10 May 2017

Tableau and OLAP analytics on Hadoop data

Most customers find Hadoop-based query access from tools like Tableau cumbersome and slow. When you account for the concurrency requirements of enterprises, using a BI tool on top of Hadoop turns into a project with a slew of summarized extracts that consume enormous resources and time and become productivity sinks. With SNAP, we take a different approach to dealing with the ad-hoc query requirements on...



28 Apr 2017

Sensors, IOT data and Spark

When we think of Big Data, we think of media, ad tech, and social apps generating billions of events. Other industries, such as energy and medical devices, have always produced data, generally in large amounts. The explosion in data for these industries is happening with “sensors”. Connected computing devices with sensors collect and transmit data. This data has to be harnessed,...



05 Apr 2017

Querying S3 datalakes – SQL and Tableau on S3

For many, AWS S3 is not just deep storage but a viable option for storing data that can be consumed by reporting and analytics tools. Sparkline SNAP works directly with S3 data and exposes your data to tools like Tableau with very fast response times across hundreds of concurrent user sessions. Below is a video of data from a Star Schema Benchmark data...



31 Mar 2017

Fast analytics on Spark – Really fast

Interactive ad-hoc analytics requires fast responses. In many benchmarks, however, "fast" is measured with single-user tests that do not reflect how business users actually use an analytics or Big Data platform. A good test is a comprehensive simulation of Tableau users pounding a system with a variety of queries. We simulated one such use case on a single r3.2x node on AWS (6...



31 Mar 2017

Integrated Business Intelligence on Big Data Stacks.

Read the new post from our CTO on the changing landscape of BI and analytics on Big Data stacks. Until recently, Big Data stacks focused primarily on SQL capabilities. Of late, support for Business Intelligence (BI) applications and workloads is coming into focus for both Big Data providers and consumers. BI is not just faster, better SQL: in essence, BI is about enabling the Business Analyst...



14 Mar 2017

Making data useful and ubiquitous

Data warehouses have evolved over the years. With Hadoop reaching a level of maturity and Spark serving as a powerful engine for various workloads, we are now at a point where we can truly democratize the consumption of data to power insights. Savvy data-driven companies combine the power of automated data analysis with human insight. To get everyone in an organization to leverage the data, data...



