I was in a meeting the other day and for the first time I heard the phrase ‘Citizen Data Scientist’ being bandied about. A quick look on Google Trends suggests that it hasn’t quite penetrated the collective lexicon yet, but I promise you, it is on its way. So who are ‘citizen data scientists’ and how do they differ from traditional ‘data scientists’?
A data scientist is an employee who specialises in extracting knowledge and insight from (often big) data. Typically this is quite a niche and specialised role, performed by only a few people within a business. Citizen data scientists, on the other hand, are traditional business users who are now being given the tools to perform this deep analysis themselves.
Take a look at the graph above: it is only in recent years that we have seen the rise of data science. What, then, is behind this new clamour for ‘Citizen’ data science?
One reason is that the traditional data scientist role requires a combination of statistical knowledge, intimate understanding of business processes and a high level of technical ability. The problem is, combining all these skills is really difficult, making quality data scientists a rare and expensive commodity.
On top of this, BI and analytics are simply becoming more prevalent. Whether it is in your online bank account, sleep tracker app, or Fitbit, BI and analysis of data are now commonplace in our lives. This means BI consumers are starting to expect robust reporting as a minimum and are no longer afraid of exploring the data more deeply to find more meaningful analyses themselves. It also conveniently demonstrates the value of analytics in their lives. In turn, this has put analytics on the agenda in the boardroom.
As a result, there has been a rise in the number of data discovery tools. Products like SAP’s Lumira are tailored towards enabling traditional end users to do meaningful analytics. They provide an intuitive UI but sit upon advanced in-memory databases. This means that users have not only potent front-end tools, but also the computing power behind them to run large and complicated queries. Further to that, big data is now being meaningfully brought into data analysis engines: tools like HANA VORA enable the querying of Hadoop ‘big-data’ clusters from existing tools. This democratises Big Data, as users can now get the answers to their questions themselves.
So what does this all mean? Citizen data science is on the agenda in 2016. Expect it to be right up there in conference headlines, just as we have seen ‘big data’, ‘simple’, and ‘Internet of things’ in recent years. But, among all of the recent business fads, this one has some real meat behind it. Whether ‘citizen data science’ sticks around as a buzzword or not, it is reflective of a real and imminent trend in BI. End-user empowerment is here to stay.