It is no secret that data science is difficult, and companies struggle to succeed with data science projects. Gartner predicts that by 2022 only 20% of analytics projects will deliver business value, which means about 80% will not. Companies therefore need to be careful about how they run data analytics projects.
There are many reasons data science projects fail. They are well documented online in articles such as "Why Data Science Projects Fail" and "Five Reasons Your Data Science Project is likely to Fail," among many others. Below are a few challenges I have seen while working on data science projects.
Solutions Determined at the Beginning
Here is an example of a flawed start to a project.
"Hey, if I had a chart that goes up and to the right, it would look really great in the presentation. Can you go get the data to build that chart?"
In this scenario, the final solution has already been determined. The project becomes a problem once the data is collected and the resulting chart does not go up and to the right. This can look like a failure of data science, but the data science did not fail. There was no data science or problem solving involved at all, because the solution was predetermined. It is better to start with a business problem than to start with the solution.
Algorithms Provide the Final Answer
The world loves and hates algorithms. They can provide just the information we need at just the right time. However, they can also go wrong and leave us scratching our heads. This can be due to bias, poor data, unclear requirements, or any number of other things.
A better approach is to keep humans in the process. Keeping people involved helps them grow comfortable using algorithms to make decisions, and the algorithm still saves time by narrowing down the options.
For many business problems, there are hundreds and potentially thousands of solutions. It is not a good use of a person's time to filter through hundreds of solutions. That is where algorithms can prove useful. Here is the trick: do not have the algorithm produce just one final answer. Have the algorithm narrow the decision down to 3 or 4 options and have a human select the best choice from those. This allows decisions to be made with both data and human input. Don’t start by removing the human.
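The shortlist idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `shortlist` function, the candidate plans, and the `expected_value` scores are all hypothetical, and a real project would use its own scoring model. The point is that the algorithm returns a handful of ranked options rather than one final answer.

```python
import heapq
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def shortlist(options: Iterable[T], score: Callable[[T], float], k: int = 3) -> List[T]:
    """Return the k highest-scoring options, best first, for a person to review."""
    return heapq.nlargest(k, options, key=score)


# Hypothetical candidate solutions with made-up scores (illustration only).
candidates = [
    {"name": "Plan A", "expected_value": 0.62},
    {"name": "Plan B", "expected_value": 0.81},
    {"name": "Plan C", "expected_value": 0.47},
    {"name": "Plan D", "expected_value": 0.79},
    {"name": "Plan E", "expected_value": 0.30},
]

# The algorithm narrows the field; a human makes the final choice.
top_options = shortlist(candidates, score=lambda c: c["expected_value"], k=3)

for rank, option in enumerate(top_options, start=1):
    print(f"{rank}. {option['name']} (score {option['expected_value']:.2f})")
```

The person reviewing the list can then apply context the scores do not capture before picking one of the remaining options.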
Not Having the Correct Data
More data is not always better. It needs to be the right data. I was once presented with a problem: "Can you predict which customers will leave?" I said, “Maybe,” and asked for some data. I was given tons of data about software bugs and defects. I tried to explain that the data was not very helpful for this problem, but I kept getting the same response: “But it is a lot of data.” Unfortunately, the data had nothing to do with customers, so the predictions could not be made until more relevant data was found. More of the wrong data cannot replace a small amount of the correct data.
Data science is not magic. I will say that again: data science is not magic. There needs to be buy-in from the company, and someone with decision-making authority needs to be invested in the project. There needs to be a goal and a vision for what value the data science can provide. Even better, there needs to be a plan.
Hiring a data scientist and hoping things will magically happen is not a recipe for success.
Have some goals and processes. Plus, provide support when (not if) it is needed.
Create a Data Strategy
Before you start your next data science project, consider creating a Data Strategy. It should include a future vision and a plan to get there.