The moment you say the words “big data,” who do you think should be on the team that brings the benefits home?
Should it be the Information Technology group? To many, this answer makes sense. The project involves data — BIG data. They think that there will be programming, hardware, and software, so it only makes sense for IT to run the show.
I don’t agree. In fact, handing a big data project over to IT is going to ensure a slow, painful, and expensive delivery that fails to produce returns that justify the expense.
Why do I believe that IT is the last department that should run a big data project? Three reasons:
Look at where the benefits of the last big wave of IT hype, business intelligence, really came from. The benefits did not come from the creation of the data warehouse, or the analysis tools. The benefits came from the non-technical users who used the internal data to answer questions about the performance of the business. The answers to these questions provided the insights that allowed managers to make changes to the business processes.
Benefit comes from the application of a tool, not the mere presence of the tool.
Perhaps my view is different because of my experience in leading strategic analysis projects. In so many of these projects, the Management Information Systems (MIS) and Information Technology (IT) groups believed too much in the information part of their name. Getting to the data was hard, and once we did get to the data we found it insufficient. Worse, we found the information to be incorrect. Digging into the causes became difficult because we did not have access to the real data.
The wall between the consumer of the data and the data source is part of the problem. IT folks are great at developing and integrating systems of hardware and software that can collect, store, process, and display data. However, most IT professionals lack the fundamental skills required to analyze things like human behavior. A programmer can identify a data outlier, but has a hard time understanding why the data is an outlier because the causal factor is not in the data.
I made this point with a client a few months ago. It is somewhat easier to identify what is different than it is to understand why it is different. Figuring out the why requires a different level of cognition. It requires experience and the ability to approach the problem with a creative, almost childlike fascination, not just with how things work, but with why they work.
My client asked me who I would pick if I were choosing the team. Here’s how I answered.
First, we need psychology. We are trying to figure out the non-customers, so we need to understand the non-customer’s behavior. Psychology is the science of human behavior. Anthropologists are good too; they look at cultures. I want a psychologist or an anthropologist on the team.
Second, we need strong math and statistics skills. We are going to be looking at piles, hills, mountains of data. We are going to do regression analysis, look at trends, and develop ways to determine correlation and causation. We need someone who can work through theory. This is beyond simple math, so we need somebody with an economics background, a statistician, or a physicist. One of the best business and manufacturing minds of the 20th century was a physicist, not a business school graduate.
Third, we have to communicate. There are three levels of communication we need:
Someone who knows how to write is vital, because that is how we can get clarity. If we can find someone who is an actor, they can tell the story, make the presentations, and most important, help gather the field data we need from the people in the marketplace.
Fourth, we need a lawyer. The legal questions surrounding the use of data are not heating up just because of the government and the NSA. Data privacy is also a sharp stick for Facebook and other social media companies. I want a lawyer on the team to tell me when something we do would break the law. I want the lawyer on the team to help negotiate with the suppliers of the data, because we must get data from companies that make their revenue collecting and distributing data.
Fifth, there are logistical problems to solve. Somebody has to coordinate the moving parts of the process, someone who pulls resources together to support the rest of the team, gets the computers, arranges the meeting space, arranges the transportation, and helps find the data. There are a lot of moving parts in the physical space to support the project. This is a supply chain — a data supply chain.
Sixth, we must have technical support. We don’t need programmers; we need access to the skills of the programmers. We don’t need to build elaborate systems; we need access to the data, and the simple systems that we can use to study and analyze the data. The programming load is small, but we must have someone on our team who knows where all the data is, and how to get it.
Seventh, we need financial skills. As we look at all the opportunities, we have to evaluate the potential financial return of those opportunities. This is not economics; this is financial analysis, the language of business. We have to translate what we discover into an ROI, and that requires someone who can quantify the discoveries and the investments so the CFO and the rest of the senior executive team can understand the financial impacts and trust the returns.
If we bring in a single person with each of these skills, we have seven on the team. The sweet truth is that the people who are good at these projects are polymaths, people who have many of these skills. We may find that our lawyer can write, and our psychologist can act and present.
Does this group sound like a team you can pull from the IT department?