Apache Spark


04

Dec 2017

Last mile analysis and modern B.I

By Srikanth Desikan. Last mile analysis: the analysis performed by a business user with the intent to test a hypothesis or explore an anomaly. Enterprise data warehouses and B.I tools have existed for decades, and yet Excel continues to be the tool of choice for “last mile analytics”. What is last mile analytics? The analysis that is performed...



28

Nov 2017

Multi-Dimensional analytics at scale

In a typical enterprise there are broadly two kinds of B.I projects. Focus on factual reporting and analysis – These projects involve implementing Hadoop or some Big Data stack for organizing and managing an enterprise data warehouse and running SQL queries for reporting, or connecting Tableau etc. for slice-and-dice analysis – Some level of ad hoc querying using tools provided by single node B.I...



25

Nov 2017

10 signs you are going nowhere with B.I on Big Data

1. You are fixated on figuring out whether to use Hadoop or BigQuery, stick with Oracle on-premise, or move to AWS 2. You are spending endless cycles evaluating Cloudera or HDP or something else and arguing the benefits of Impala over Hive when you have no idea about the business query patterns 3. You are worked up about Spark SQL vs Presto and have...



14

Nov 2017

Exploring Big Data with Sparkline SNAP, Spark SQL and nteract with Python

Sparkline SNAP is used as a full-fledged data warehouse in place of traditional MPPs at large enterprises where fast data access is required for ad hoc analysis and reporting. Increasingly we see data engineers, used to tools like Jupyter notebooks, accessing SNAP data because it's fast, iterative, and simple to use with Python. Below is an example of a typical B.I drill-down analysis...



28

Oct 2017

Enterprise B.I platform at scale

More and more Enterprise Data is moving to Data Lakes, which could be on an on-premise scale-out cluster but increasingly could just as well be a cloud object store. Enterprises are in the process of leveraging these datasets for a variety of analyses: from Operational Analytics to Reporting to Business Intelligence/Data Science and everything in between. There is a plethora of scale-out...



