1.2.2 Data Acquisition
Lesson Title: Data Acquisition
Approach: Interactive Session + System Maps
Summary: Students will learn how to acquire data from reliable and authentic sources and will
understand how to analyse the data features that affect the problem they have scoped. They will
also learn the concept of System Maps.
Learning Objectives:
• Students will learn various ways to acquire data.
• Students will learn about data features.
• Students will learn about System Maps.
Learning Outcomes:
• Identify the data required for a given problem.
• Draw System Maps.
Pre-requisites: Basic computer literacy
Key-concepts:
• Develop an understanding of reliable and authentic data sources.
• System Mapping
In the previous module, we learnt how to scope a problem and set a Goal for the project. After
setting the goal, we listed all the necessary elements which are directly or indirectly related
to our problem. This was done using the 4Ws problem canvas. The 4Ws were:
1. Who?
   a. Who are the stakeholders?
   b. What do we know about them?
2. What?
   a. What is the problem?
   b. How do you know that it is a problem? (Is there any evidence?)
3. Where?
   a. What is the context/situation in which the stakeholders experience this problem?
   b. Where is the problem located?
4. Why?
   a. What would hold value for the stakeholders?
   b. How will the solution improve their situation?
To summarise, we then use the problem statement template, where we put all the details
together in one place:
Our [stakeholders] has/have a problem that [issue, problem, need] when/while [context, situation].
An ideal situation would be [benefit of solution for them].
What is Data Acquisition?
As we move ahead in the AI Project Cycle, we come to its second stage: Data Acquisition. As the
term suggests, this stage is about acquiring data for the project. Let us first understand what data
is. Data can be a piece of information, or facts and statistics collected together for reference or
analysis. Whenever we want an AI project to be able to predict an output, we first need to train it
using data.
For example, if you want to make an artificially intelligent system that can predict the salary of an
employee based on the employee's previous salaries, you would feed the data of those previous salaries
into the machine. This is the data with which the machine is trained. Once it is trained, it can predict
the employee's next salary. The previous salary data used here is known as Training Data, while the data
set on which the salary predictions are made and checked is known as Testing Data.
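The salary example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the salary figures are made up, the names `fit_line` and `predict` are hypothetical helpers, and a simple straight-line trend stands in for whatever model a real project might use. The first four years act as Training Data and the fifth year is held back as Testing Data.

```python
def fit_line(xs, ys):
    """Fit a straight line y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted line to predict the salary for year x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical salary history of one employee (assumed values).
years = [1, 2, 3, 4, 5]
salaries = [30000, 33000, 36000, 39000, 42000]

# Training Data: the earlier years. Testing Data: the year held back.
train_years, train_salaries = years[:4], salaries[:4]
test_year, test_salary = years[4], salaries[4]

model = fit_line(train_years, train_salaries)
predicted = predict(model, test_year)
error = abs(predicted - test_salary)
print("Predicted:", predicted, "Actual:", test_salary, "Error:", error)
```

Because the testing year is never shown to the model during training, the error gives an honest idea of how well the prediction works on data the machine has not seen.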
For an AI project to work well, the Training Data needs to be relevant and authentic. In the
previous example, if the training data were not the previous salaries but the employee's expenses, the
machine would not predict the next salary correctly, since the whole training would have gone wrong.
Similarly, if the previous salary data were not authentic, that is, not correct, the prediction could
also go wrong. Hence:
For any AI project to be efficient, the training data should be authentic and relevant to the problem
statement scoped.
Data Features
Look at your problem statement once again and try to find the data features required to address this
issue. Data features refer to the type of data you want to collect. In our previous example, data
features would be salary amount, increment percentage, increment period, bonus, etc.
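As a small worked example, one of the data features named above, the increment percentage, can be computed from the salary amounts themselves. The sketch below assumes a made-up salary history and a hypothetical helper name, `increment_percentages`; it is an illustration, not a prescribed method.

```python
def increment_percentages(salaries):
    """Return the percentage increase between each pair of consecutive salaries."""
    return [
        (new - old) * 100 / old
        for old, new in zip(salaries, salaries[1:])
    ]


# Hypothetical salary amounts for three consecutive years (assumed values).
salary_history = [30000, 33000, 36300]

increments = increment_percentages(salary_history)
print(increments)
```

Each value in the result is one increment percentage, so a history of three salaries yields two increments.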
Acquiring Data from reliable sources
After listing the data features, you know what sort of data is to be collected. Now the question
arises: where can we get this data from? There can be various ways in which you can collect
data. Some of them are:
Sometimes, you use the internet and try to acquire data for your project from some random websites.
Such data might not be authentic, as its accuracy cannot be verified. It therefore becomes necessary to
find a reliable source from which authentic information can be taken. At the same time, we should
make sure that the data we collect is open-sourced and not someone's private property.
Extracting private data can be an offence. Among the most reliable and authentic sources of information
are the open-sourced websites hosted by the government. These government portals have general
information collected in a suitable format which can be downloaded and used wisely.
Some of the open-sourced Govt. portals are: data.gov.in, india.gov.in
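Data downloaded from such portals is often available as a CSV file. The sketch below shows one way such a file could be loaded and checked for gaps before it is used. The column names and values in `SAMPLE_CSV` are hypothetical stand-ins for a downloaded file, and `load_rows` and `rows_with_gaps` are illustrative helper names, not part of any portal's tooling.

```python
import csv
import io

# Hypothetical contents of a CSV file downloaded from an open data portal.
SAMPLE_CSV = """employee_id,year,salary_amount,bonus
E01,2021,30000,2000
E01,2022,33000,
E02,2021,28000,1500
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def rows_with_gaps(rows):
    """Return the rows in which at least one column is empty."""
    return [row for row in rows if any(value == "" for value in row.values())]


rows = load_rows(SAMPLE_CSV)
incomplete = rows_with_gaps(rows)
print(len(rows), "rows loaded;", len(incomplete), "with missing values")
```

Checking for empty values is only a first step towards judging whether the data is usable. It does not prove that the data is authentic.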
List down ways of acquiring data for a project below:
1.
2.
3.
System Maps
Session Preparation
Logistics: For a class of 40 students [Group Activity – Groups of 4]
Materials Required:
ITEM          QUANTITY
Computers     10
Chart Paper   10
Sketch-Pens   40
Resources:
Link to make System Maps online using an animated tool: https://ncase.me/loopy/
Purpose: The purpose of this section is to introduce the concept of System Maps and their elements,
relationships and feedback loops.
Say: “Now that we have listed all the data features, let us look at the concept of System Maps.
System Maps help us to find relationships between the different elements of the problem which we have
scoped. They help us in strategizing the solution for achieving the goal of our project. Here is an
example of a system very familiar to you: the Water Cycle. The major elements of this system are
mentioned here. Take a look at these elements and try to understand the System Map for this
system. Also take a look at the relations between all the elements. After this, make your own System
Map for the data features which you have listed. You can also use the online animated tool for
creating your System Maps.”
Brief:
We use system maps to understand complex issues with multiple factors that affect each other. In a
system, every element is interconnected. In a system map, we represent these relationships
with arrows. Within a system map, we identify loops. These loops are important
because each represents a specific chain of causes and effects. A system typically has several such
chains. You may notice that some arrows are longer than others. A longer arrow
represents a longer time for a change to happen. We also call this a time delay. To change the
outcome of a system, as a change maker, we have two options: change the elements in the system or
change the relationships between elements. It is usually more effective to change the relationships
between elements. You may also notice the use of ‘+’ and ‘-’ signs. These
indicate the nature of the relationship between elements. This was a very basic
introduction to systems thinking. You can use Google to find more detailed information on how to
make system maps.
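The ideas of elements, signed arrows and loops can also be sketched in code. The example below is a minimal illustration under stated assumptions: the elements and signs in `LINKS` are hypothetical, and the rule for naming a loop "reinforcing" (an even number of ‘-’ links) or "balancing" (an odd number) is the common convention for causal loop diagrams, not something defined in this handbook.

```python
# Hypothetical system map: each link (cause, effect) carries a sign.
# +1: the effect moves in the same direction as the cause.
# -1: the effect moves in the opposite direction to the cause.
LINKS = {
    ("increment", "salary"): +1,
    ("salary", "savings"): +1,
    ("savings", "expenses"): +1,
    ("expenses", "savings"): -1,
}


def find_loops(links):
    """Return each closed chain of causes and effects once, as a list of elements."""
    graph = {}
    for cause, effect in links:
        graph.setdefault(cause, []).append(effect)

    loops = []

    def walk(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path)
            elif nxt not in path and nxt > start:
                # Only visit elements that sort after the start, so each
                # loop is reported once, beginning at its smallest element.
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops


def loop_kind(loop, links):
    """Multiply the signs around the loop: positive is reinforcing, negative is balancing."""
    sign = 1
    for i, cause in enumerate(loop):
        effect = loop[(i + 1) % len(loop)]
        sign *= links[(cause, effect)]
    return "reinforcing" if sign > 0 else "balancing"


found = find_loops(LINKS)
for loop in found:
    print(" -> ".join(loop + loop[:1]), "is", loop_kind(loop, LINKS))
```

In this made-up map, only savings and expenses form a loop. The other arrows are one-way chains that feed into it.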
A system map shows the components and boundaries of a system and the components of the
environment at a specific point in time. With the help of System Maps, one can easily define the
relationships amongst the different elements of a system. Relating this concept to our
module, the Goal of our project becomes a system whose elements are the data features mentioned
above. Any change in these elements changes the system outcome too. For example, if a person
received a 200% increment in a month, this change in salary would affect the prediction of their
future salary: the larger the present increment, the higher the future salary the system would
predict. Here is a sample System Map:
The Water Cycle
The concept of the Water Cycle is simple to understand and is known to all. It explains how water
completes its cycle, transforming from one form to another. The system map also includes other
elements which affect the water cycle in some way.
The elements which define the Water Cycle system are:
• Clouds
• Snow
• Underground Soil
• Rivers
• Oceans
• Trees
• Land
• Animals