Wednesday, February 15, 2012

Big Data: Data Scientists Need a New Repertoire of Skills

The first company to really use big data was Google. Now its use has spread to most companies, many educational institutions, and individuals. IT has long used structured data in databases, but big data is different.

Big data is fundamentally different in four ways:
It is a rich data model. It contains both structured and unstructured data, so it could be things like video or Twitter feeds as well as structured data.
The size is big: no longer terabytes but petabytes of data. It is also multisource data.
It is real time. “If my Google search took me to Monday morning to do, I wouldn’t do very many of them.”
It is collaborative. Many people will be working on it at the same time.
(Pat Gelsinger, President and COO of EMC, http://blogs.wsj.com/tech-europe/2012/02/10/big-data-demands-new-skills/?mod=google_news_blog)

What are the skill sets for these new data scientists? How do we establish these data scientists, and how should they gather and use big data? We need to use big data in ways that show actual value.

This new field of data science will combine math, computer science, and other interdisciplinary fields.

“We see this confluence of skills across math, across computer science, and across the interdisciplinary applications of these new tools. It is learning new tools combining them with existing statistics and math.” (Gelsinger)
