It’s probably not who you think. It’s not DJ Patil or Hilary Mason. The first data scientist was Tobias Mayer. Who? Yeah, that’s exactly right, I had never heard of him either. Thankfully, John Rauser, a Data Scientist at Amazon, gave a great talk about this person at Strata New York 2011.

Well, Tobias was an astronomer way back in the mid-1700s. He spent a lot of time observing the libration (wobbling) of the moon, and he came up with the following formula:

\beta - x = y \alpha - z \alpha \sin{\theta}

He could measure x, y and z. Thus he needed to solve for \alpha, \beta and \theta. Three observations give three equations, which is enough to solve for the three unknowns. That is when the real problem arose. Tobias had 27 observations instead of 3. He had too much data. This may have been the first known occurrence of big data. For more on Tobias Mayer’s solution, you will need to watch the video below. Hint: he strategically grouped the data.
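To make the "27 equations, 3 unknowns" problem concrete, here is a minimal Python sketch of one grouping approach. It is not a reconstruction of Mayer's actual calculation. The observations are synthetic, the helper names (`make_observations`, `fit_by_grouping`, `solve3`) are made up for this post, and the grouping rule (sort by y, then split into three equal groups) is an assumption chosen for illustration. The idea relies on the formula being linear in \beta, \alpha and \gamma = \alpha \sin{\theta}, so summing the equations inside each group leaves three equations in three unknowns.

```python
import math
import random


def solve3(a, b):
    """Solve the 3x3 linear system a * u = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]  # swap column `col` for the right-hand side
        solution.append(det(m) / d)
    return solution


def make_observations(alpha, beta, theta, n=27, noise=0.0, seed=0):
    """Hypothetical synthetic data: n triples (x, y, z) obeying the formula."""
    rng = random.Random(seed)
    obs = []
    for i in range(n):
        phi = -math.pi / 2 + math.pi * (i + 0.5) / n  # made-up angles
        y, z = math.sin(phi), math.cos(phi)
        # Rearranged from: beta - x = y*alpha - z*alpha*sin(theta)
        x = beta - y * alpha + z * alpha * math.sin(theta)
        obs.append((x + rng.gauss(0.0, noise), y, z))
    return obs


def fit_by_grouping(obs):
    """Estimate (alpha, beta, theta) by summing equations within 3 groups."""
    ordered = sorted(obs, key=lambda o: o[1])  # assumed rule: sort by y
    n = len(ordered) // 3
    groups = [ordered[:n], ordered[n:2 * n], ordered[2 * n:]]
    a, b = [], []
    for g in groups:
        # Sum of (beta - y*alpha + z*gamma = x) over the group,
        # with unknowns ordered as [beta, alpha, gamma].
        a.append([len(g), -sum(o[1] for o in g), sum(o[2] for o in g)])
        b.append(sum(o[0] for o in g))
    beta, alpha, gamma = solve3(a, b)
    return alpha, beta, math.asin(gamma / alpha)


# Made-up "true" values, with a little measurement noise added.
noisy = make_observations(alpha=1.5, beta=14.5, theta=0.07, noise=0.01)
print(fit_by_grouping(noisy))
```

Sorting by y before splitting keeps the three summed equations noticeably different from one another, so the small 3x3 system stays well conditioned. A careless split can produce three nearly identical equations that cannot be solved reliably.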

Rauser has this to say about why Mayer qualifies as the first data scientist.

“As far as I know, the first time in history that someone made a quantitative argument that more data is better.”

Rauser doesn’t stop there though. The rest of his talk goes on to explain the path to becoming a data scientist and the necessary skillset. Below are the skills he mentions.

  • Math
  • Engineering
  • Writing
  • Skepticism
  • Curiosity

So, do yourself a favor, and take a few minutes to watch this great talk.

As I watched this video, I kept asking myself the same question. Why have I never seen this video before?