PaxChat Two Cloud BI Dominance

The Cloud…

Bring distributed resources up when you need them, shut them down when you don’t.

Focus IT and business resources on the important opportunities and fun challenges.

Watch development and operations skills merge into devops.

Interweave open source and proprietary technologies, and data from open, government, third-party and internal sources.

What’s not to like?

The technologies and business processes for elastic cloud computing, a.k.a. “The Cloud”, are coming into the mainstream, building upon the work of companies such as Salesforce, Amazon and Enomaly, and the vision of avant-garde CIOs such as Ben Haines. Growing out of grid computing in the 1990s, evolving through application service providers to everything-as-a-Service [*aaS], the Cloud is now a major component of IT strategies. There has been one area resistant to this type of IT outsourcing: data management and analytics [DMA] in its various forms. Early attempts to bring BI into the Cloud ended in bankruptcy for pioneering companies such as LucidEra, which ended operations in 2009. Recently, other entrepreneurial ventures have targeted the BI market with Cloud services and Business Intelligence as a Service [BIaaS] and Data as a Service [DaaS] platforms. Self-service BI, data visualization and data preparation companies also operate in the Cloud. There is little doubt [though still some lingering doubt] that data integration and governance, data warehousing, business intelligence, advanced analytics and data science can thrive as Cloud platforms and services.

Cloud BI Market Dominance

Who will dominate in the Cloud DMA market? Will one or two of the current players not only survive as independents but emerge with undeniable victory? Or will established enterprise software and Cloud companies acquire those startups, much as the traditional BI vendors became absorbed into the larger IT vendors?

At the end of 2014, Prakash Nanduri, the CEO of the groundbreaking firm Paxata, turned a forecaster’s gaze upon 2015 to help with his strategy in guiding Paxata and in advising their PaxPro customers. The results were six predictions published as an article in Forbes. If you’ve been following along, you know that the Paxata Data Divas, Lilia Gutnik, Dr. Julie Mayhew, Tricia Lee McNabb and Cari Jaquet, invited us to do a series of short webinars and tweetchats to have some fun with these predictions. The first prediction, “The lines will blur between data scientists and data analysts”, was the subject of the first #PaxChat on March 11. The next will be held on Wednesday, March 25 at 11:00 a.m. PDT, covering Prakash’s second prediction, “Microsoft and Salesforce.com will take a dominant share of the Cloud based BI market”.

What a bold and interesting prediction! One is a large enterprise and consumer player that has built a reputation for proprietary hardware and software, backed by an enthusiastic developer and partner community, with often derisive customers. The other is, arguably, the dominant Cloud Software as a Service [SaaS] company, which made its motto “No More Software” and has grown from customer relationship management through sales force automation into customer service and marketing, growing amazing platforms allowing early adoption of mobile and the Internet of Things [IoT], naming this the Internet of Customers and Internet of Connected Products [or Internet of Carrier Pigeons].

Despite its best efforts over two decades, Microsoft is not accepted as a BI company, let alone a CloudBI company. Salesforce left analytics up to its partner community until very recently, with the acquisition of RelateIQ and EdgeSpring, and the introduction of Wave. Read Prakash’s predictions for his well-reasoned arguments on why these two might just wind up being the dominant players in CloudBI this year!

The Tweetchat #PaxChat Two

Listen to the webinar to find out why the Data Divas and the Data Archons are skeptical ;-)

Join the tweetchat by following #PaxChat, and give your own take on the following questions.

Prediction Two: Microsoft and Salesforce.com will take a dominant share of the Cloud based BI market.
Q1 #PaxChat Are you ready to move your #BI to the #Cloud?
Q2 #PaxChat Which parts of data management and analytics #DMA are enhanced by Cloud?
Q3 #PaxChat Are @Microsoft @Azure #PowerBI @Office365 and @Salesforce #Wave on your roadmap?
Q4 #PaxChat If 80% of the #DMA work is #DataPrep do @Microsoft and @Salesforce offerings address self-service governance?
Bonus Q #PaxChat Will the current #CloudBI players fall behind in 2015?

The action comes alive March 25:

Paxata's CEO and co-founder Prakash Nanduri's 2015 Predictions Tweet Chats

As we started to celebrate the end of 2014 and anticipate all that 2015 will bring, Prakash Nanduri published an article in Forbes with six predictions. This wasn’t to show his prowess as a prognosticator or futurist, but to bring focus to his strategy as CEO of Paxata, and to his advice to Paxata customers. Prakash’s strategic thought can be found through the Paxata newsroom and blogs.

Predictions

  1. The lines will blur between data scientists and data analysts.
  2. Microsoft and Salesforce.com will take a dominant share of the Cloud based BI market.
  3. Data Preparation replaces Big Data as hottest topic in Enterprise Analytics.
  4. Hadoop faces a make-or-break year in the larger enterprise market.
  5. Marketing becomes the biggest driver of BI decisions.
  6. The IoT becomes real for B2B.

Soon after the Forbes article was published, the Paxata Data Divas invited me to do a series of short [less than ten minutes each] webinars, where we discussed each prediction.

If you don't know the Paxata Data Divas, you should:

Diva-in-Charge: Cari Jaquet, VP of Marketing
Diva Control: Tricia Lee McNabb, Director Marketing Programs
Diva-at-Large: Lilia Gutnik, Product Person and Cruise Director
Diva-of-Mayhem: Dr. Julie Mayhew, a.k.a. Doctor Mayhem, Pre-sales Engineer

Following up on these webinars, the Data Divas asked us to host a series of tweetchats based upon Prakash’s predictions, extending our discussion to everyone following on Twitter. The first in this series of tweetchats will begin at 11:00 a.m. Pacific Time, using the hashtag #PaxChat. Coincident with this, Julie will be at The Drive Conference, while Clarise and I will be at IoT day at EclipseCon. As with most tweetchats, we will have five questions based on the first prediction, “The lines will blur between data scientists and data analysts.” As has become the custom, we will use the format Qn as we tweet each question; please respond with An and the hashtag #PaxChat.

Q1: How do you define data science?
Q2: Is data science a solo or team sport?
Q3: What is the difference between a data scientist and a data analyst?
Q4: How has your company introduced data science?
Q5: How do you bring data science into production?

If there is time, we will have a bonus question. You can find a list of all 11 questions that were considered, and more from Cari, at her blog post "PaxChat - The Predictions Come Alive" as well as comment there with any questions or recommendations.

You can view the webinar on YouTube first.

Thank you Divas and Paxata for sponsoring these #PaxChats on the business impact of an increasingly complex and interesting data landscape.

Now that the #PaxChat One is over, you can see the results below. The next PaxChat is scheduled for Wednesday, March 25 at 11:00 a.m. PDT. It will cover the second of Prakash's predictions for 2015: "Microsoft and Salesforce.com will take a dominant share of the Cloud based BI market".

Springbok Leaps into Data Harmonization

Springbok by Informatica is the latest entry in the nascent self-service data preparation market. Springbok is impressive on several fronts.

Data Harmonization.

Rather than data preparation, Informatica uses the term data harmonization, to emphasize the capabilities within Springbok to bridge the divide between the business and information technology. This feature truly differentiates Springbok. Other products for self-service data preparation focus 100% on the business side.

Self-service.

Springbok is truly a self-service tool. Though it is possible to integrate with other Informatica products, Springbok is a Cloud offering, available today, wherein you can upload your data all by yourself. For free. Try it yourself at

http://springbok.is/

Social Collaboration and Permutation Management.

These are Informatica’s terms that represent two sides of the same coin: identification of the most valued data players and the most trusted data sources, allowing collaboration among business users of those data sets, and visibility to IT into the business use of data. Springbok ranks data users, as well as their Springbok recipes, data sources and data permutations, to allow other users of that data to have confidence in unfamiliar data sources. Additionally, IT gains an understanding of what data internal and third-party business users are actually using and how they are actually using it, all before a business user makes a request. This prevents IT being blindsided by Shadow IT.

Connectivity.

While Springbok is fully a Cloud product, it easily connects to both on-premises and Cloud data sources. The family traits from Informatica’s long history in data integration show up here.

Why has a self-service data preparation market come to be, fast on the heels of the adoption of self-service BI tools such as Tableau and Qlik? To solve a problem. With the advent of next-generation BI tools and the trend towards self-service Data Management and Analytics (DMA), business users manipulate the data themselves. They always have. The first question that we are always asked in a data warehouse or business intelligence project is “Can I export that to Excel?” As Big Data and Data Science have moved from buzzwords to business practices, it has become widely known that 80% of the data analytics process involves preparing the data for use by locating, cleansing and standardizing it. Whether this is done by a Data Scientist using Unix shell tools like sed and awk, or as an iterative process between IT and business, it is time consuming. It is also boring. IT gets caught in the dilemma of handling the increasing number of data preparation requests and is left playing catch-up.
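To make the cleansing and standardizing step concrete, here is a minimal Python sketch of the kind of chore that otherwise gets done with sed and awk. The sample records, the field names and the two accepted date formats are assumptions for illustration only; none of this is taken from Springbok or any other product.

```python
import re
from datetime import datetime

# Assumed input date formats for this illustration only.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def clean_text(value):
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()

def standardize_date(value):
    """Return an ISO 8601 date string, or None when no format matches."""
    value = clean_text(value)
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None

# Hypothetical messy records, as they might arrive from a spreadsheet export.
records = [
    {"customer": "  acme   corp ", "signup": "03/25/2015"},
    {"customer": "Globex", "signup": "2015-03-11"},
    {"customer": "initech", "signup": "sometime in March"},
]

cleaned = [
    {
        "customer": clean_text(r["customer"]).title(),
        "signup": standardize_date(r["signup"]),
    }
    for r in records
]

for row in cleaned:
    print(row)
```

Values that cannot be parsed come back as None rather than being guessed at, so a person can review them. That review loop is a large part of why this work is so time consuming.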

Springbok is a self-service data harmonization tool that empowers business users to find the data and guides them through the process of enriching and shaping it, without the need for deep technical skills or dependence on outside help. Let’s take a closer look at the capabilities Springbok brings to all users along the gradient from non-technical to quant to IT specialist.

Automatic data suggestion.

Project Springbok provides a quick and easy way to take data from more than one source and accurately combine them. It can automatically suggest data for data enrichment. Using any single file, any combination of sources, or all available data sources, Springbok suggests completion of spotty records, or of an entire data set required for analysis, through semantic analysis of the data. For example, if a column of data contains city names, but some records are blank, Springbok can use a Zip Code column within that file, or a Golden Record from a Master Data Management system, or a third-party source, such as Dun & Bradstreet, to complete the data set.
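The city and Zip Code example can be sketched in a few lines of Python. This is only an illustration of the general idea of completing blank values from a reference lookup, not a description of how Springbok does it; the ZIP_TO_CITY table, the column names and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical reference table; in practice this might come from a master
# data system or a third-party source.
ZIP_TO_CITY = {
    "94105": "San Francisco",
    "10001": "New York",
}

def fill_missing_cities(rows, zip_to_city):
    """Return copies of rows with blank 'city' values filled from 'zip'."""
    completed = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not mutated
        if not (row.get("city") or "").strip():
            zip_code = (row.get("zip") or "").strip()
            # Leave the city blank when the Zip Code is not in the lookup.
            row["city"] = zip_to_city.get(zip_code, "")
        completed.append(row)
    return completed

# Hypothetical sample file with two blank city values.
SAMPLE = """name,city,zip
Acme,San Francisco,94105
Globex,,10001
Initech,,99999
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
completed = fill_missing_cities(rows, ZIP_TO_CITY)

for row in completed:
    print(row["name"], "|", row["city"] or "(still unknown)")
```

A record whose Zip Code is missing from the lookup stays blank instead of receiving a guess, which keeps the enrichment auditable.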

Business User Social Collaboration.

Springbok promotes business user collaboration by allowing business users to access correct data, to know who is responsible for the evolution of that data, and to understand the lineage of that data. This promotes collaboration in the enterprise by building trust through reputation. Within Springbok, a user can find other users and other data sources that their peers trust and use. This is invaluable both to new employees and to old-timers being confronted by new sources of data and changing business processes.

Again, Self-Service.

The basic tenet of the design philosophy of Springbok is self-service, from a user uploading a file to the Springbok Cloud and immediately being able to play with their data, to that same user being able to export to the self-service BI tool of their choice.

Permutation Management.

This feature is a major differentiator of Springbok. IT is able to have visibility into, and understand, the evolution of a data set as well as the identity of key data influencers. This promotes collaboration between IT and their respective business partners. Permutation Management also aids in finding key external sources, shining a light into Shadow IT. Further, Springbok is a stand-alone product; however, for Informatica customers, data is easily centralized with one click to bring the business users’ recipes into Informatica PowerCenter for production use in the analytic environment. This last function does raise questions about configuration management, change control, and regulatory compliance. We were assured by Informatica that this is a consideration for Springbok, and again, those enterprise roots show. It will be the customer's choice how they wish to use this capability. Regulatory compliance and traceability will be handled by exposing the Springbok logs for audits. Notice the “will be”. The one-click instantiation of a Springbok recipe as a PowerCenter transformation is on the roadmap, but not available in the current version of Springbok, which is freely available.

[Image: Springbok Permutation Management]

One of the most impressive things about Springbok is the rapid adoption among Informatica customers and non-customers. One hundred users representing approximately 30 Informatica customers participated in the development of Springbok. In the three months since the announcement of the public Springbok beta program, over 1700 users (now 2300, since our last briefing) from more than 350 organizations (now 500) have been uploading data into the Springbok cloud, and happily manipulating that data. One other area where Informatica has recently delighted us is the growth of the Informatica Marketplace. We are looking forward to the day when users can contribute non-proprietary Springbok recipes to the Marketplace. In today’s connected world, data management and analytics is the competitive edge. Participation in such a wide-ranging community provides the cross-fertilization necessary to fully leverage the changes coming about through evolving technologies from social media to the Internet of Things.

Good Vibe The Informatica of Things

Vibe Data Stream [Vibe] and Virtual Data Machine [VDM] combine at the center of Informatica’s Internet of Things strategy. They are primarily for Machine-to-Machine [M2M] data and, by connecting through PowerCenter, ultimately lead to Machine-to-Human [M2H] data. The goal is to have VDMs residing in mobile devices, sensor packages, or as part of sensor networks. At this point, VDMs require more processing power than is available in most components. Thus, Vibe and VDM are primarily suited today to data, network operations, and communication centers.

However, Informatica is seeing a broad range of use cases involving both large machines and sensor networks, from many different sectors including

  • telcos,
  • oil and gas,
  • financial services,
  • government,
  • data center operations, and
  • building services.

The Proof is Out There

One Proof of Concept [PoC] currently underway is with a Heating, Ventilation and Air Conditioning [HVAC] company. In the PoC, the HVAC company is looking at streaming data from all of their installations. Using Informatica products, they are bringing this data into their data center for both streaming and batch analytics. There are actually three use cases being examined in this PoC:

  • Improving customer service
  • Internal analytics on generic patterns of use for improved design, reliability and maintainability
  • Predictive maintenance from the provider rather than from the building management team

Other field trials look at Vibe and VDM capabilities in regard to Pub/Sub models working with Informatica Ultra Messaging, as well as persisting data in all forms of data stores from traditional Enterprise Data Warehouses [EDW] to Hadoop [HDFS] and NoSQL databases such as Cassandra. These field trials involve solving the ongoing problems of the different areas mentioned above.

  • In a financial services case, both application log data and Financial Information eXchange [FIX] log data are being pulled in, in real time, for market, order flow and trade data.
  • For online retail, Vibe is used to track web-site visitor paths through the site using log data.
  • For data center operations, log data from switches, servers, applications and call centers is used to optimize operational efficiency, whether for green IT, sustainability or improving the bottom line.
  • For one governmental agency, Informatica Vibe and VDM are maintaining the Service Level Agreement [SLA] in real time, for 800 separate field organizations over more than a million devices, using industry-standard Security Content Automation Protocol [SCAP] data formats.

Perhaps the most involved trials being done to date with Informatica Vibe and VDM are within the telecommunications space. As one might expect, the explosion of data and customer expectations as cellular goes from 2G to 3G to 4G/LTE requires real-time management of ever-increasing amounts of data. The wireline/fiber and cable use cases are also exploding as the traditional marketplaces of voice, entertainment and connectivity intertwine.

Out to the Edge

Informatica is aggressively working with partners, such as chip, sensor and package manufacturers, to understand how to optimally implement Vibe, whether through Vibe’s streaming collection capability on the device itself or as part of the larger infrastructure at some point in the collection tier. Currently, collecting sensor data can hit performance limits when using the sensor or communication base protocols. Thus, for example, in the oil and gas industry, Informatica is working with both vertical-specific sensor manufacturers and large organizations in the industry to determine how Vibe can supplement or even replace the collection tier.

SAE Fit

What Informatica brings to evolving sensor analytics ecosystems [SAE] is not only the specific technologies of Vibe and VDM, but also their combination with a complete package for supporting streaming analytics, operational intelligence, complex event processing [CEP], batch analytics, predictive analytics, reporting, data marts and EDW, through existing technology families such as Ultra Messaging, PowerCenter, Master Data Management, Data Quality, and more, in both traditional and Cloud deployments. This brings mature market features to the SAE in the form of

  • Guaranteed delivery
  • Automated zero latency fail-over
  • Centralized GUI administration
  • No intermediary staging of data at source, broker, or target
  • Fail-over does not require shared file systems

References

This blog post is based upon both the Informatica press release referenced below and a private briefing from the Informatica team that allowed us to gather more information and get answers to our questions. Also referenced, for context, are some of our other blog posts on IoT and Big Data.

  1. Informatica Press Release from Strata + Hadoop World
  2. What does IoT All Mean
  3. The IoT and Change
  4. Big Data: It’s Not the Size, It’s How You Use It
  5. New Hope from Big Data

https://www.constellationr.com/content/internet-things-and-change

Salesforce1

Back in July, I wrote

[An] excellent example of the importance of the Industrial Internet comes from Salesforce.com use of The Social Machine by Digi International and its Etherios business unit, in bringing sensor data into customer relationship management [CRM] by allowing sensors embedded in industrial refrigerators, hot tubs, and heavy and light equipment of all types to open SFDC chatter sessions and to file cases.

At Dreamforce 2013, Salesforce.com is announcing Salesforce1, their new Internet of Customers ecosystem, bringing together Force.com, Heroku, and ExactTarget FUEL platforms under a united series of APIs controlled by the Salesforce1 App.

Today and tomorrow, Dreamforce is all about the Internet of Things, and I'll be providing my analyses of how SFDC is building out its massive existing ecosystem of partners, services and customers into Marc Benioff's evolving vision of the Internet of Customers. The message here is that Salesforce1 is ready today to prepare their customers to leverage the opportunities presented by the Internet of Things. As Cisco states, over a trillion dollars in added value was left on the table this year by companies not taking advantage of IoT. For 2014, SFDC's customers won't have an excuse to leave this money behind.

One challenge for Salesforce1 is its dependence on partners for analytics. Are SFDC partners ready to help in bringing the Internet of Customers to full potential through connected analytics? How will IBM's MQTT, Smarter, and Cognitive Computing, Oracle's Device-to-Data-Center, Teradata's Hub for Monetizing the IoT, Infobright's M2M optimized ADBMS, and many other data management & analytics initiatives focused on M2M and M2H data fit in?

Will Salesforce1 create or be integrated into Sensor Analytics Ecosystems, with the necessary marketplaces for raw data, processed data and insights from M2M & M2H data? SFDC has never been up to the challenge of analytics in the past. While there are many general BI and Analytics partners, SFDC-specific analytics firms have come and gone. Salesforce1 is a broader concept and brings SFDC into a future beyond sales force automation and customer relationship management.

You can hear more about Salesforce1 on this YouTube video, peruse the official Salesforce1 page, and read a more general account of Salesforce1 by R. Ray Wang.

The IoT Keynote at Dreamforce today, and the packed sessions on IoT will answer some of these questions. I'll be providing my analysis of how well these questions are answered in an Event Report blog post after the close of Dreamforce 2013.

The TeleInterActive Press is a collection of blogs by Clarise Z. Doval Santos and Joseph A. di Paolantonio, covering the Internet of Things, Data Management and Analytics, and other topics for business and pleasure.
