Where is My Big Data Coming From and Who Can Handle It

Recently, a reader asked for my insights on the article “Data Scientists are the New Rock Stars as Big Data Demands Big Talent.” Here is my response.

It seems that in today’s world people and organizations are struggling with the concept of big data and do not know where to begin. For this reason, they are collecting everything they can think of in the hope that one day they will be able to use this data in a meaningful way, such as a better customer experience, new products and services, better collaboration, or increased revenue. On the surface, this approach of “let’s collect data and decide later what we can use it for” might seem sound, but last I checked, hope is not a strategy. Perhaps this is one of the reasons that even now less than 1% of the data collected is actually being analyzed. What good is more data when an organization cannot even make sense of the other 99%+ of the data it already has? Are we chasing a ghost?

It is true that vast amounts of data are being, and will continue to be, generated by everything from financial transactions, medical records, mobile phones, and social media to the Internet of Things. But there are questions that need to be asked to understand how data can be put to meaningful use:

  1. How will data be managed?
  2. How will data be shared?

I believe that reaching a point where data becomes meaningful and useful will require, broadly speaking, three phases:

  1. Establishment of standards, governance, and guidelines. (E.g., open architectures)
  2. Creation of industry-specific data exchanges. (E.g., healthcare data exchanges, environmental data exchanges, etc.)
  3. Creation of cross-industry data exchanges. (E.g., healthcare data exchanges seamlessly interacting with environmental data exchanges, etc.)

Additionally, let’s keep in mind that the data we are talking about is data that can be captured by current tools and systems. The data that is perhaps the most difficult to capture is unstructured human data, which within organizations is called Institutional Knowledge. It does not reside in a document or a system but in the minds of the people in an organization who understand what needs to be done to move things forward.

So the question becomes: do we really need Data Scientists who combine coding skills with PhDs in scientific disciplines and business sense, or do we need someone who is able to connect the dots and has the ability to create the future? The answer is not a simple one. Perhaps you need both. The ability to code should not be the deciding factor; the ability to leverage technology and data should be. I agree that there is a shortage of people with diverse talent, but there is also a shortage of people who actually know how to leverage this kind of talent.

Before organizations go on a hiring spree they should consider:

  1. Why do they need a Data Scientist? (E.g., strategic intent, jumping on the bandwagon, etc.)
  2. Who will the Data Scientist report to? (E.g., Board, CEO, CFO, COO, CIO, etc.)
  3. Does the organization have the ability to enhance or change its business model? (E.g., making customers happy, leading employees, etc.)
  4. Is the Data Scientist really an IT person with advanced skills, or does s/he have advanced skills and happen to know how to leverage technology and data?
  5. How often will they measure the relevancy of the data? (E.g., key data indicators)

[Figure: 3 Phases of Big Data Harmonization]

