Data Sciences
Introduction
As discussed earlier in Class 9, Artificial Intelligence is a technology that depends completely on
data. It is the data fed into the machine that makes it intelligent. Depending upon the type of data
we have, AI can be classified into three broad domains:
• Data Sciences (Data): working with numeric and alphanumeric data.
• Computer Vision (CV): working with image and visual data.
• Natural Language Processing (NLP): working with textual and speech-based data.
Each domain has its own type of data that is fed into the machine, and hence its own way of working
with it. Data Sciences is a concept that unifies statistics, data analysis, machine learning and their
related methods in order to understand and analyse actual phenomena with data. It employs techniques
and theories drawn from many fields within the context of Mathematics, Statistics, Computer Science,
and Information Science.
Before we get into the concepts of Data Sciences, let us experience this domain with the help of the
following game:
* Rock, Paper & Scissors: https://www.afiniti.com/corporate/rock-paper-scissors
Go to this link and play Rock, Paper & Scissors against an AI model. The challenge is to win 20 games
against the AI before it wins 20 games against you.
Did you manage to win?
__________________________________________________________________________________
__________________________________________________________________________________
* Images shown here are the property of individual organisations and are used here for reference purpose only.
What was the strategy that you applied to win this game against the AI machine?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Was it different playing Rock, Paper & Scissors with an AI machine as compared to a human?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
What approach was the machine following while playing against you?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Applications of Data Sciences
Data Science is not a new field. It mainly involves analysing data, and when it comes to AI, this
analysis helps in making the machine intelligent enough to perform tasks by itself.
There are various applications of Data Science in today’s world. Some of them are:
Fraud and Risk Detection*: The earliest applications of data
science were in finance. Companies were fed up with bad debts and
losses every year. However, they had a lot of data that used to be
collected during the initial paperwork while sanctioning loans. They
decided to bring in data scientists to rescue them from these
losses.
Over the years, banking companies learned to divide and conquer
data via customer profiling, past expenditures, and other essential
variables to analyse the probabilities of risk and default. Moreover,
it also helped them to push their banking products based on
customers’ purchasing power.
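As a rough illustration of analysing the probability of default from customer profiles, the sketch below groups past loans by income band and computes the share of loans that defaulted in each group. The records, the income bands, and the function name `default_rate_by_group` are all invented for this example; real banks use far richer data and models.

```python
from collections import defaultdict

# Hypothetical past loan records: (income_band, did_the_customer_default)
past_loans = [
    ("low", True), ("low", False), ("low", True), ("low", False),
    ("medium", False), ("medium", True), ("medium", False), ("medium", False),
    ("high", False), ("high", False), ("high", False), ("high", False),
]

def default_rate_by_group(records):
    """Return the fraction of loans that defaulted in each customer group."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for group, defaulted in records:
        totals[group] += 1
        if defaulted:
            defaults[group] += 1
    return {group: defaults[group] / totals[group] for group in totals}

rates = default_rate_by_group(past_loans)
for group, rate in rates.items():
    print(f"{group}: estimated default probability = {rate:.2f}")
```

A new applicant can then be compared against the historical rate of the group they belong to, which is the basic idea behind profiling customers before sanctioning a loan.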
Genetics & Genomics*: Data Science applications also enable
an advanced level of treatment personalisation through research
in genetics and genomics. The goal is to understand the impact
of DNA on our health and to find individual biological
connections between genetics, diseases, and drug response.
Data science techniques allow integration of different kinds of
data with genomic data in disease research, which provides a
deeper understanding of genetic issues in reactions to particular
drugs and diseases. As soon as we acquire reliable personal
genome data, we will achieve a deeper understanding of
human DNA. Advanced genetic risk prediction will be a major step towards more individualised care.
Internet Search*: When we talk about search engines, we think
‘Google’. Right? But there are many other search engines like
Yahoo, Bing, Ask, AOL, and so on. All these search engines
(including Google) make use of data science algorithms to deliver
the best results for our search query in a fraction of a second.
Considering that Google processes more than 20 petabytes
of data every day, had there been no data science, Google wouldn’t
have been the ‘Google’ we know today.
Targeted Advertising*: If you thought Search was
the biggest of all data science applications, here is a challenger:
the entire digital marketing spectrum. From the display
banners on various websites to the digital billboards at airports,
almost all of them are decided using data science algorithms.
This is why digital ads have been able to achieve a much
higher CTR (Click-Through Rate) than traditional advertisements:
they can be targeted based on a user’s past behaviour.
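To make the CTR comparison concrete, the sketch below computes click-through rate as the number of clicks divided by the number of times an ad was shown (impressions). The function name and the campaign figures are assumptions made up for illustration; they are not real advertising data.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a percentage: clicks / impressions * 100."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions

# Hypothetical figures, chosen only to illustrate the calculation.
targeted_ctr = click_through_rate(clicks=50, impressions=1000)
untargeted_ctr = click_through_rate(clicks=5, impressions=1000)

print(f"Targeted ad CTR:   {targeted_ctr:.1f}%")
print(f"Untargeted ad CTR: {untargeted_ctr:.1f}%")
```

With these made-up numbers, the targeted ad is clicked ten times as often for the same number of impressions.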
Website Recommendations*: Aren’t we all used to the
suggestions about similar products on Amazon? They not only
help us find relevant products from the billions of products
available but also add a lot to the user experience.
Many companies have eagerly used this kind of recommendation engine to promote
their products in accordance with the user’s interests and
the relevance of information. Internet giants like Amazon, Twitter,
Google Play, Netflix, LinkedIn, IMDb and many more use this
system to improve the user experience. The recommendations
are made based on a user’s previous search results.
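One very simple way to turn past searches into suggestions is to count which items tend to appear together in users’ search histories. The sketch below is only an illustration of that idea: the search histories and the `recommend` function are hypothetical, and real recommendation engines are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical search histories; each inner list is one user's past searches.
search_histories = [
    ["laptop", "mouse", "keyboard"],
    ["laptop", "mouse"],
    ["laptop", "laptop bag"],
    ["phone", "phone case"],
]

def recommend(item, histories, top_n=2):
    """Suggest the items most often searched alongside `item`."""
    counts = Counter()
    for history in histories:
        if item in history:
            counts.update(other for other in set(history) if other != item)
    # Sort by count (highest first), then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]

print(recommend("laptop", search_histories))
```

Here a user who searches for a laptop would be shown the items that other laptop searchers looked for most often.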
Airline Route Planning*: The airline industry across the world is known to bear heavy losses.
Except for a few airline service providers, companies are struggling to maintain their occupancy
ratio and operating profits. With the steep rise in aviation fuel prices and the need to offer heavy
discounts to customers, the situation has become worse. It wasn’t long before airline companies
started using Data Science to identify strategic areas for improvement. Now, using Data Science,
airline companies can:
• Predict flight delays
• Decide which class of airplanes to buy
• Decide whether to fly directly to the destination or take a halt in between (for example, a
flight from New Delhi to New York can take a direct route, or it can halt in another
country)
• Effectively drive customer loyalty programs
Getting Started
Data Sciences is a combination of Python and mathematical concepts like statistics, data analysis,
probability, etc. Concepts of Data Science can be used in developing applications around AI, as they
give a strong base for data analysis in Python.
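As a small taste of how Python and statistics come together, the sketch below uses Python’s built-in `statistics` module to summarise a list of numbers. The marks are made-up sample data, used only to show the calculations.

```python
import statistics

# Hypothetical marks of ten students, used only as sample data.
marks = [72, 85, 90, 66, 85, 78, 94, 85, 70, 75]

mean_marks = statistics.mean(marks)      # average value
median_marks = statistics.median(marks)  # middle value when sorted
mode_marks = statistics.mode(marks)      # most frequent value
spread = statistics.pstdev(marks)        # population standard deviation

print("Mean:", mean_marks)
print("Median:", median_marks)
print("Mode:", mode_marks)
print("Standard deviation:", round(spread, 2))
```

The mean, median and mode describe the centre of the data, while the standard deviation describes how spread out the values are.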
Revisiting AI Project Cycle
But, before we get deeper into data analysis, let us recall how Data Sciences can be leveraged to solve
some of the pressing problems around us. For this, let us understand the AI project cycle framework
around Data Sciences with the help of an example.
Do you remember the AI Project Cycle?
Fill in all the stages of the cycle here:
The Scenario*
Humans are social animals. We tend to organise and participate in various kinds of social gatherings
all the time. We love eating out with friends and family, which is why restaurants can be found
almost everywhere, and many of them arrange buffets to offer a variety of food items to their
customers. Be it a small shop or a big outlet, every restaurant prepares food in bulk, keeping in mind
the probable number of customers who will walk in that day. But if these expectations are not met, a
lot of food is left over at the end of the day. It becomes unusable for the restaurant, which does not
wish to serve stale food to its customers the next day, so it has to either dump the food or give it to
hungry people for free. This wastage is a loss for the restaurant, and when the daily loss is added up
over a year, it becomes quite a big amount.
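To see how a small daily loss grows over a year, here is a quick calculation in Python. Every number in it is an assumption chosen for illustration; none of them comes from a real restaurant.

```python
# All numbers are assumptions for illustration, not data from a real restaurant.
dishes_wasted_per_day = 20   # unsold buffet portions at the end of the day
cost_per_dish = 150          # cost of preparing one portion, in rupees
days_open_per_year = 360     # days the restaurant is open in a year

daily_loss = dishes_wasted_per_day * cost_per_dish
annual_loss = daily_loss * days_open_per_year

print("Daily loss (rupees):", daily_loss)
print("Annual loss (rupees):", annual_loss)
```

Under these assumptions, a loss of a few thousand rupees a day adds up to more than ten lakh rupees in a year.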
Problem Scoping
Now that we have understood the scenario well, let us take a deeper look into the problem to find out
more about the various factors around it. Let us fill in the 4Ws problem canvas to find out.
Who Canvas – Who is having the problem?
Who are the stakeholders?
o Restaurants offering buffets
o Restaurant chefs
What do we know about them?
o Restaurants cook food in bulk every day for their buffets to meet their customer needs.
o They estimate the number of customers that would walk into their restaurant every day.