Data Scientist: Making Sense Out of Data
S. Shyam Prasad
The growth of Information Technology has spawned many new professions; the
evolution of big data has ushered in a new career line, that of the data scientist. This
article attempts to answer: a) What is data science? b) Who is a data scientist? c)
What does he do? and d) Why is this job
attracting so much attention?
Courtesy: Wikipedia – The Free Encyclopedia
The data scientist job has evolved from the business or data analyst role. Data
scientists source rich data sets, analyze them, create visual representations of
the data, build mathematical models and prepare reports on their findings. A
data scientist uses varied techniques and theories drawn from a wide variety of
fields such as mathematics, statistics, operations research, information
science, and computer science, including signal processing, probability
models, machine learning, statistical learning, data mining, database, data
engineering, pattern recognition and learning, visualization, predictive
analytics, uncertainty modeling, data warehousing, data
compression, computer programming, artificial intelligence, and high
performance computing. The data scientist’s role encompasses the whole gamut
of data analysis, in contrast to the traditional role of the data analyst.
According to KDnuggets, 88 percent of data
scientists have at least a master’s degree and 46 percent have PhDs. Anjul
Bhambhri, vice president of big data products at IBM, says, “A data scientist
is somebody who is inquisitive, who can stare at data and spot trends. It’s
almost like a Renaissance individual who really wants to learn and bring change
to an organization.” A good data scientist also needs to possess strong
business acumen and a flair for communicating findings to his or her business
managers in an appropriate manner. The data scientist needs to have an eye for
detail as well as the ability to see the big picture.
The difference between a traditional data analyst and a data scientist is that the
former may look only at data from a single source – for example, from a sales
system – whereas a data scientist examines data from multiple sources. His job
is not just to report on data but to look at it from multiple angles, discern
what it means and report that. He must be somebody like an entrepreneur.
The job of a data scientist is a dynamic one. Data scientists work on all kinds of
data, such as aircraft flight data, footballers’ performance data, court case
data, poll performance data, taxpayers’ spending data, people’s health data,
credit card holders’ usage data and so on. One day they may analyze TV channel
data to find out what drives their business and on another day they may be
analyzing text data to decipher social media responses to their promotion.
Though the job involves handling varied data, some common patterns do emerge
that can be applied elsewhere. Sometimes, though, the analyzed data may spring
some surprises. For example, polling data analyses showed that 88 of the people
who contested the elections at Modakkuruchi, Tamil Nadu, did not even vote
for themselves, something strange and shocking. Another is that teachers who have
a high pass percentage among students in their course don’t grade them honestly. Yet
another is that people taking electric or water meter readings don’t always visit the
premises and cook up numbers that are typically multiples of 10. And children
born in August score much more than children born in July (Anand, chief data
scientist at Gramener).
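The meter-reading observation lends itself to a simple check. The sketch below, written in Python, is only an illustration of the idea and not the method actually used in the analysis cited above: it measures what share of readings are exact multiples of 10. If the final digits of genuine readings are roughly evenly spread, about one reading in ten should end in zero, so a much higher share may hint at made-up numbers. The function names and the sample readings are hypothetical and invented purely for this example.

```python
from collections import Counter


def share_of_multiples_of_ten(readings):
    """Return the fraction of readings that are exact multiples of 10."""
    if not readings:
        raise ValueError("readings must not be empty")
    return sum(1 for r in readings if r % 10 == 0) / len(readings)


def last_digit_counts(readings):
    """Count how often each final digit (0 to 9) occurs in the readings."""
    counts = Counter(r % 10 for r in readings)
    return {digit: counts.get(digit, 0) for digit in range(10)}


# Hypothetical meter readings, invented for illustration only.
honest_looking = [1023, 1187, 1342, 1496, 1651, 1809, 1964, 2118, 2275, 2430]
suspicious = [1020, 1180, 1340, 1500, 1650, 1810, 1960, 2120, 2280, 2430]

print(share_of_multiples_of_ten(honest_looking))  # 0.1
print(share_of_multiples_of_ten(suspicious))      # 1.0
print(last_digit_counts(suspicious))
```

A high share of round numbers is only a pointer for further investigation; with small samples it can occur by chance, so a real analysis would compare many readings per meter reader.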
Data analysis may also throw up counter-intuitive insights. For example, a
data analyst once advised an American wholesale retailer’s credit card programme to
invert its loyalty reward system and reward low spenders instead of the
highest ones. This is because the analyst identified the highest spenders
as mostly re-sellers who did not need to be incentivized. This saved the
company millions of dollars (Raam Nayakar, Myntra).
for data scientists?
The future looks bright for a data scientist. “Data Scientist” has become
a popular occupation, with Harvard Business Review dubbing it
“The Sexiest Job of the 21st Century”. McKinsey & Company has
projected a global excess demand of 1.5 million new data scientists. Accenture
Research predicted 32,000 analytics jobs in India by the year end. Besides
greater opportunity for employment, data scientists are also paid better. Randstad reports
that pay hikes in the analytics industry are 50% higher than in IT, and the median salary
for analytics in India is over 13 lakh per annum (LPA) (Payscale.com).
The increase in demand for data scientists has also prompted many universities
to offer master’s courses in data science. Many startups in India are now
offering data science courses and boot camps to train data analysts. There is
only one way the future of data scientists and the courses to train them can go –
The foregoing discussion clearly establishes the growing importance of data
scientists. Besides being paid handsomely, they have opportunities galore.
Young aspiring post-graduates or graduates who are good at numbers and can visualize
the patterns latent in them have good prospects. They will need to learn
computer programming for mining data, in particular Python and R, the two premier
programming languages for data science. This is
also an opportune moment for educational institutions to offer specialized courses
to prepare students for this profession.
This profession will play an important role in predicting the future or emerging
trends; it will be of immense value to policy makers and in strategic decision
making in industry. Data scientists may also forecast the spread of infectious
diseases, the occurrence of natural phenomena, stock market behaviour and so on. Thus,
making sense of the data available, one may say that the future of the data
scientist is bright.
(Sincere thanks to Dr. Swaroop Reddy for his numerous inputs.)