Fast BI


28

Apr 2017

Sensors, IoT data and Spark

When we think of Big Data, we think of media, ad tech and social apps generating billions of events. But other industries, such as energy and medical devices, have always produced data – generally large amounts of it. For these industries, the explosion in data is happening with sensors: connected computing devices that collect and transmit data. This data has to be harnessed,...

Read More


05

Apr 2017

Querying S3 datalakes – SQL and Tableau on S3

For many, AWS S3 is not just deep storage but a viable option for storing data to be consumed by reporting and analytics tools. Sparkline SNAP works seamlessly with S3 data directly and exposes it to tools like Tableau with very fast response times across hundreds of concurrent user sessions. Below is a video of data from a Star Schema Benchmark data...

Read More


31

Mar 2017

Fast analytics on Spark – Really fast

Interactive ad-hoc analytics requires fast responses. Yet in many benchmarks, "fast" means single-user tests that do not reflect how business users actually use an analytics or Big Data platform. A better test is a comprehensive simulation of Tableau users pounding a system with a variety of queries. We simulated one such use case on a single r3.2x node on AWS (6...
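The kind of multi-user simulation described above can be sketched in a few lines: many concurrent sessions, each firing a batch of queries against the engine. This is a minimal, hypothetical illustration (the `run_query` stub stands in for a real BI query; it is not the actual benchmark harness):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for issuing one BI query against the engine.
def run_query(query_id):
    time.sleep(random.uniform(0.001, 0.005))  # simulated query latency
    return query_id

# Simulate many concurrent user sessions, each issuing a batch of queries,
# and collect every result so completeness can be checked afterwards.
def simulate(sessions=10, queries_per_session=5):
    with ThreadPoolExecutor(max_workers=sessions) as pool:
        return list(pool.map(run_query, range(sessions * queries_per_session)))

results = simulate()
print(len(results))  # 50 queries completed across 10 concurrent sessions
```

A real harness would replace the stub with actual SQL submitted over JDBC/ODBC and record per-query latencies, but the fan-out structure is the same.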

Read More


31

Mar 2017

Integrated Business Intelligence on Big Data Stacks

Read the new post from our CTO on the changing landscape of BI and analytics on Big Data stacks. Until recently, Big Data stacks have primarily focused on SQL capabilities. Of late, support for Business Intelligence (BI) applications and workloads is coming into focus for both Big Data providers and consumers. BI is not just faster, better SQL: in its essence, BI is about enabling the Business Analyst...

Read More


14

Mar 2017

Making data useful and ubiquitous

Data warehouses have evolved over the years. With Hadoop reaching a level of maturity and Spark as a powerful engine for a variety of workloads, we are now at a point where we can truly democratize the consumption of data to power insights. Savvy data-driven companies combine the power of automated data analysis with human insight. In order to get everyone in an organization to leverage the data, data...

Read More


07

Feb 2017

Advanced Tableau on Spark/Hadoop

Most benchmarks on data warehouse optimizations and SQL engines stop with simple examples. The real world uses business intelligence tools, where the use cases are not single-user, single-SQL as in a simulated benchmark. Modern BI on Big Data should satisfy three key requirements. It should be able to respond interactively, in seconds, as a user drills down into data in Hadoop/Spark. While BI is not about retrieving...

Read More


04

Nov 2016

Optimizing an Enterprise Data Warehouse on Hadoop

As companies move from analytic data marts and data warehouses built on Teradata, Vertica or even Oracle/MySQL to a Hadoop-based architecture, consumption of data for BI and analytics workloads becomes critical. Hadoop has traditionally not been geared for consumption of data, as users of Tableau know very well. Hive queries are slow. Products like Impala and Presto have eased the pain a bit, but the challenge...

Read More


17

Sep 2016

Going beyond Data Lakes

We often see customers start to build data lakes on Hadoop or S3 as a way to bring their transactional and dimensional data into a common place. This data is cleaned and organized in a star schema, as in an enterprise data warehouse. The challenge begins here, since consuming data in a Hadoop data lake is not easy. The first challenge is ad hoc analytics....

Read More


24

Jun 2016

Fast BI on Spark SQL

A typical slice and dice query on a database has the following pattern. On large datasets, the response for such interactive queries has to be on the order of 1 or 2 seconds as users navigate across different Tableau worksheets or choose filters in their web application. A standard in-memory solution may be suboptimal for such slice and dice queries. First, caching large amounts of data...
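The slice-and-dice pattern mentioned above is essentially a filtered aggregation: restrict on one dimension, group by another, aggregate a measure. A minimal sketch using an in-memory SQLite table (the `sales` table, its columns, and the values are all hypothetical, not from the post):

```python
import sqlite3

# Tiny in-memory table standing in for a large fact table (names hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("west", "widget", 120.0), ("west", "gadget", 80.0),
     ("east", "widget", 50.0), ("east", "gadget", 200.0)],
)

# A typical slice-and-dice query: filter on one dimension (region),
# group by another (product), and aggregate a measure (revenue).
rows = conn.execute(
    "SELECT product, SUM(revenue) FROM sales "
    "WHERE region = 'west' GROUP BY product ORDER BY product"
).fetchall()
print(rows)  # [('gadget', 80.0), ('widget', 120.0)]
```

At scale, the same shape of query runs against billions of rows, which is why the post argues that interactive response times are the hard part.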

Read More


07

May 2016

Analyzing a billion rows with Tableau

“How do I make Tableau go against a live table with 100+ million rows and perform ad hoc queries on various slices of data?” This is a question we often get from data teams across all industries. With data growing across Hadoop, Oracle, Teradata – whatever the environment – the need to do dimensional analysis on the data in an ad-hoc manner with timely responses is...

Read More



Page 1 of 2