Sensors, IoT data and Spark
When we think of big data, we think of media, ad tech and social apps generating billions of events. Other industries, such as energy and medical devices, have always produced data, generally in large amounts. For these industries, the explosion in data is now coming from sensors.
Connected computing devices with sensors collect and transmit data. This data has to be harnessed, organized and analyzed.
As in media and ad tech, data will become a source of competitive advantage for traditional industries.
The following is a subset of industries and use cases where data becomes a competitive advantage if harnessed properly.
- Transport: sensor data from delivery trucks helps businesses schedule preventive maintenance before mechanical issues can disrupt fleet operations. Timely data from these sensors also improves planning and logistics.
- Healthcare: biosensors enable efficient, cost-effective patient care and affect the delivery of services such as telemedicine and mobile health.
- Product analytics: manufacturers are investing in sensors to monitor the health and performance of products deployed at customer locations, and to address service and maintenance issues proactively. The sensors also enable product feedback, especially in environment-sensitive cases, by monitoring atmospheric conditions, particles, etc., so that device performance can be correlated with the environment.
- Predictive maintenance: a variety of industries use sensor data, such as data from airplane sensors, to manage maintenance proactively. Reducing warranty costs and driving down unplanned maintenance are key factors behind this use of data.
- Compliance: oil, gas and energy companies analyze sensor data collected from oil drilling platforms to verify compliance with safety requirements.
- Smart buildings and security: commercial real estate and facilities personnel use sensors to collect data on a variety of environmental factors, including temperature, humidity, sound levels and building pollution levels. With the advent of electronic security mechanisms, large volumes of data are also being generated on card-key entries and exits. This data is being used to track anomalies in building entry and exit.
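To make the last item concrete, here is a minimal sketch of how unusual card-key entries might be flagged. It is an illustration only, not a description of any particular product: the event layout (`badge_id`, ISO timestamp), the function name `flag_unusual_entries` and the thresholds are all hypothetical, and the rule used (a simple z-score on each badge's usual hour of entry) is one of many possible choices.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def flag_unusual_entries(events, z_threshold=2.0, min_history=5):
    """Flag card-in events whose time of day is unusual for that badge.

    events: iterable of (badge_id, iso_timestamp) tuples (hypothetical layout).
    Returns a list of (badge_id, iso_timestamp) for the flagged events.
    Simplification: hour of day is treated as a plain number, so entries
    just before and just after midnight look far apart.
    """
    by_badge = defaultdict(list)
    for badge_id, ts in events:
        t = datetime.fromisoformat(ts)
        by_badge[badge_id].append((t, t.hour + t.minute / 60))

    flagged = []
    for badge_id, rows in by_badge.items():
        hours = [h for _, h in rows]
        if len(hours) < min_history:
            continue  # too little history to say what is normal
        mu, sigma = mean(hours), pstdev(hours)
        if sigma == 0:
            continue  # every entry at the same time, nothing stands out
        for t, h in rows:
            if abs(h - mu) / sigma > z_threshold:
                flagged.append((badge_id, t.isoformat()))
    return flagged
```

At real volumes the same per-badge grouping would be pushed down to the analytics platform rather than done in a single Python process; the sketch only shows the shape of the calculation.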
The challenge facing traditional industries is that current data and analytics platforms were designed to accommodate "small" human-generated data. Now companies with connected devices are looking at analyzing billions of rows with sub-second latency.
The future of fast data exploration is purpose-built platforms that handle ad-hoc queries on large datasets, whether for business intelligence, analytics, data science or AI.
Open source distributed computing frameworks, coupled with competitive price points on cloud VMs, mean that companies can now analyze billions of rows with just one node. The key to unlocking fast ad-hoc analytics is organizing data to take advantage of distributed computing and having a powerful optimization engine that processes queries and returns results in under a second.
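One common way of "organizing data" for fast queries is to pre-aggregate raw readings into a compact rollup keyed by device and time bucket, so that a query scans a few summary cells instead of every raw row. The sketch below illustrates that idea in plain Python under stated assumptions: the reading layout `(device_id, epoch_seconds, value)`, the function names and the one-hour bucket are hypothetical, and it is not meant to describe how Spark or any specific engine implements it.

```python
from collections import defaultdict


def build_rollup(readings, bucket_seconds=3600):
    """Pre-aggregate raw readings into per-device, per-time-bucket summaries.

    readings: iterable of (device_id, epoch_seconds, value) tuples
    (hypothetical layout). Returns {(device_id, bucket_start): summary}.
    """
    rollup = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for device_id, ts, value in readings:
        bucket = ts - ts % bucket_seconds  # start of the bucket containing ts
        cell = rollup[(device_id, bucket)]
        cell["count"] += 1
        cell["sum"] += value
        cell["min"] = min(cell["min"], value)
        cell["max"] = max(cell["max"], value)
    return dict(rollup)


def average(rollup, device_id, start, end):
    """Average value for one device over buckets starting in [start, end).

    Only the rollup is read; the raw readings are never touched again.
    Returns None when no bucket matches.
    """
    total, count = 0.0, 0
    for (dev, bucket), cell in rollup.items():
        if dev == device_id and start <= bucket < end:
            total += cell["sum"]
            count += cell["count"]
    return total / count if count else None
```

Because counts and sums combine by simple addition, rollups built independently on different partitions of the data can be merged afterwards, which is what makes this layout a natural fit for distributed computation.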
With Sparkline SNAP, analyzing sensor data is simple. Whether the data is in your cloud or on-premises, and whether it is log data or other event data, you can quickly build a fast data lake and data mart around it, then expose the analysis through notebooks such as Jupyter or Zeppelin, or analyze it with traditional BI tools such as Tableau.