What is Data Science & What Does a Data Scientist Do?

Intro to Data Science 101

I decided to explore the possibility of bringing my varied education credentials and life experiences into the field of data science. I’m super excited about this new direction in my life, so I hope you don’t mind if I share some highlights with you as I study and learn about this burgeoning field. What better place to start than the beginning? 101 baby!

Apparently, this is the hot new vocation of the 21st century. According to McKinsey & Company:

“Data and analytics are changing the basis of competition. Leading companies are using their capabilities not only to improve their core operations but to launch entirely new business models.”


Pretty cool for those of us who love data, right? The good news is that data science as a career is set to pay off for us data nerds as well: demand for data scientists has grown 6.5x over the past 5 years and is expected to grow to over 2,720,000 jobs by 2020, yet there are far more jobs than qualified candidates.


Not sure about you, but I can see the writing on the wall. With all the 11th house activity happening in my natal chart, the tech industry is where I want to be. I felt this way especially after watching Mike Gualtieri’s YouTube video “What is a Data Scientist?”, published way back in 2013! I included the video below for those of you who are curious and want to hear how he, and by extension Forrester Technopolitics (that name in and of itself excites me to no end), defines the role of a data scientist.

Highlights from Mike’s Video

  • The function of a data scientist is to find meaning and knowledge in data
  • Give a data scientist a question, a set of data, and the proper team and tools, then let them have at it and come back with an answer
  • A data scientist discovers “the new” by
    a.) creating a hypothesis and
    b.) investigating that hypothesis and
    c.) formulating a theory based on what the data says (c is my own addition)
  • Data science includes
    1.) visualizing data
    2.) using algorithms to process the data
  • A data scientist has a unique skill set that is at the intersection of Business Intelligence and Programming/Tech Skills
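To make the hypothesis loop above a bit more concrete for myself, here is a minimal Python sketch of steps a.) through c.). It is my own illustration, not something from Mike’s video: the numbers are made up, the names (old_layout, new_layout, permutation_test) are hypothetical, and a permutation test is just one simple way to investigate a hypothesis.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Return the observed mean difference (b minus a) and the share of
    random relabelings that produce a difference at least as large."""
    rng = random.Random(seed)
    observed = statistics.mean(group_b) - statistics.mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels: if the layout made no difference,
        # any split of the pooled values is equally plausible.
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[n_a:]) - statistics.mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# a.) Hypothesis: the new page layout raises the average order value.
# Made-up order values in dollars, purely for illustration.
old_layout = [20, 22, 19, 24, 21, 23, 20, 22]
new_layout = [25, 27, 24, 29, 26, 28, 25, 27]

# b.) Investigate the hypothesis against the data.
observed, p_value = permutation_test(old_layout, new_layout)

# c.) Formulate a theory based on what the data says.
print(f"Observed difference in means: {observed:.2f}")
print(f"Share of shuffles at least this extreme: {p_value:.4f}")
```

A small share in the last line means a gap this big rarely shows up by chance, which supports the hypothesis without proving it. A real project would add the visualization step and a lot more data.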

And with that – I’m off to dig into my studies a little deeper! I’ll hopefully be back with more to share soon!

Looking at Gaining Advantage from Proprietary Data

What is “free” vs. “expensive” data collected by corporations? This Wall Street Journal article strips away our illusions of personal privacy. What privacy? No such thing any more… I’m an advocate for MIT Media Lab’s Alex “Sandy” Pentland’s New Deal on Data (but that’s a different article).

In this article, the authors, Thomas H. Davenport and Thomas Redman, discuss the difference between “free data” and “expensive data”, also known as Proprietary Data (PD).

PD is owned and controlled by the organization and used for its own purposes. Although collecting and managing PD is expensive, as is developing useful, working algorithms, it can give companies that know how to use it a competitive advantage in everything from ad targeting to more efficient supply chain management.

There is so much room for innovation, disruptors and new leaders, as many corporations lag behind, with only 1% actively harnessing their data. The ones that do lean heavily on “infrastructure” data, which has quickly become available to all and thus loses its competitive advantage. Proprietary data, on the other hand, once put into unique processes and procedures, can be used in creative ways that benefit the company. It can also be protected for longer (e.g. through patents).

Example organizations include Google Inc (the structure of the internet), Uber (where people are willing to pay for rides to), and the American Bankers Association’s CUSIP (a means of identifying securities in order to process trades efficiently).