Plotting Data using Matplotlib Chapter Notes | Informatics Practices for Class 12 - Humanities/Arts

Chapter Notes - Plotting Data using Matplotlib

Introduction

  • Data visualisation involves graphical or pictorial representation of data using graphs, charts, etc., to visualise variation or show relationships between variables.
  • Visualisation helps in better understanding of analysis results, especially when inferring from raw results is challenging.
  • It facilitates effective communication of information to intended users.
  • Examples of visualisation in daily life include traffic symbols, ultrasound reports, atlas maps, speedometers, and instrument tuners.
  • Data visualisation is widely used in fields such as health, finance, science, mathematics, and engineering.

Plotting using Matplotlib

  • Matplotlib is a Python library used for creating static, animated, and interactive 2D plots or figures.
  • It can be installed using the command: pip install matplotlib.
  • The Pyplot module of Matplotlib is imported using: import matplotlib.pyplot as plt, where plt is an alias for matplotlib.pyplot.
  • The Pyplot module contains functions to work on plots, such as creating figures, plotting areas, lines, and decorating plots with labels.
  • A figure is the overall window where Pyplot outputs are plotted, containing components like the plotting area, legend, axis labels, ticks, and title.
  • The plot() function creates a figure, and plt.plot(x, y) plots x versus y.
  • The show() function displays the created figure.
  • Charts should include a title, axis labels, and a legend (if multiple datasets are plotted) to ensure clarity.
  • By default, plot() creates a line chart, but other plot types like bar, boxplot, histogram, pie, and scatter are supported, as listed in Table 4.1.
  • Plots can be saved as images using the save button in the output window or the savefig() function, e.g., plt.savefig('x.png').
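
The points above can be put together in a minimal sketch. The data values, labels, and the file name temperature.png are made up for illustration and are not from the chapter.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: day numbers and temperatures in degrees Celsius
days = [1, 2, 3, 4, 5]
temperature = [30.5, 32.0, 31.2, 33.4, 34.1]

# plot() draws a line chart by default; label is used by legend()
plt.plot(days, temperature, label="Temperature")
plt.title("Temperature over five days")
plt.xlabel("Day")
plt.ylabel("Temperature (°C)")
plt.legend()

ax = plt.gca()                   # handle to the current plotting area (axes)
plt.savefig("temperature.png")   # save before show(), which may clear the figure
plt.show()                       # display the figure
```

Calling savefig() before show() is a common precaution, because some backends discard the figure once its window is closed.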

Customisation of Plots

Matplotlib’s Pyplot provides functions to customise charts, such as adding titles, legends, and gridlines, as listed in Table 4.2.
Key customisation functions include:

  • grid(): Configures grid lines.
  • legend(): Places a legend on the axes.
  • savefig(): Saves the current figure.
  • show(): Displays all figures.
  • title(): Sets a title for the axes.
  • xlabel(): Sets the label for the x-axis.
  • xticks(): Sets tick locations and labels for the x-axis.
  • ylabel(): Sets the label for the y-axis.
  • yticks(): Sets tick locations and labels for the y-axis.
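
As an illustration, the sketch below uses several of these functions on one chart. The two shops and their sales figures are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data: units sold over four weeks by two shops
weeks = [1, 2, 3, 4]
shop_a = [20, 35, 30, 45]
shop_b = [25, 30, 40, 35]

plt.plot(weeks, shop_a, label="Shop A")
plt.plot(weeks, shop_b, label="Shop B")

plt.title("Weekly sales")
plt.xlabel("Week")
plt.ylabel("Units sold")
plt.xticks(weeks, ["W1", "W2", "W3", "W4"])  # tick locations, then tick labels
plt.yticks([20, 30, 40, 50])                 # tick locations only
plt.grid(True)                               # switch grid lines on
plt.legend()                                 # uses the label of each line

ax = plt.gca()
plt.show()
```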

Marker

Markers are symbols representing data values in line or scatter plots.

  • Markers can be specified in the plot() function to highlight data points.
  • Table 4.3 lists markers, including:
    • .: Point
    • ,: Pixel
    • o: Circle
    • v: Triangle down
    • ^: Triangle up
    • 8: Octagon
    • s: Square
    • p: Pentagon
    • *: Star
    • h: Hexagon1
    • H: Hexagon2
    • +: Plus
    • x: X
    • D: Diamond
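
A short sketch of markers in use, assuming two made-up series (squares and cubes of small numbers). The marker parameter takes one of the codes listed above.

```python
import matplotlib.pyplot as plt

# Hypothetical data: squares and cubes of the numbers 1 to 5
x = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
cubes = [1, 8, 27, 64, 125]

# plot() returns a list of lines; the trailing comma unpacks the single line
line_a, = plt.plot(x, squares, marker="o", label="Squares")  # circle markers
line_b, = plt.plot(x, cubes, marker="*", label="Cubes")      # star markers

plt.title("Markers on a line chart")
plt.legend()
plt.show()
```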

Colour

Plots can be formatted by changing the colour of plotted data using the color parameter in plot().
Colours can be specified using character codes or colour names, as listed in Table 4.4:

  • b: Blue
  • g: Green
  • r: Red
  • c: Cyan
  • m: Magenta
  • y: Yellow
  • k: Black
  • w: White

Linewidth and Line Style


The linewidth parameter changes the width of the line, measured in points; Matplotlib's default is 1.5, and larger values create thicker lines.
The linestyle parameter sets the line style, with options including:

  • solid
  • dotted
  • dashed
  • dashdot
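
The colour, linewidth, and linestyle parameters can be combined in a single plot() call. The following is a minimal sketch with made-up monthly values.

```python
import matplotlib.pyplot as plt

# Hypothetical data: rainfall in mm for six months
months = [1, 2, 3, 4, 5, 6]
rainfall = [12, 18, 25, 60, 110, 95]

# color accepts a character code ("g") or a colour name ("green")
line, = plt.plot(months, rainfall,
                 color="g",           # green
                 linewidth=3,         # thicker than the default
                 linestyle="dashed")  # or "solid", "dotted", "dashdot"

plt.title("Rainfall by month")
plt.xlabel("Month")
plt.ylabel("Rainfall (mm)")
plt.show()
```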

The Pandas Plot Function (Pandas Visualisation)

  • Pandas Series and DataFrame objects have a built-in plot() method, which wraps around Matplotlib’s plot() function; since version 0.17.0, each plot kind is also available as a method of plot, e.g., df.plot.bar().
  • For a Series s or DataFrame df, the plot can be called using s.plot() or df.plot().
  • The plot() method accepts arguments to customise plots, with the kind parameter specifying the plot type, as listed in Table 4.5:
    • line: Line plot (default)
    • bar: Vertical bar plot
    • barh: Horizontal bar plot
    • hist: Histogram
    • box: Boxplot
    • area: Area plot
    • pie: Pie plot
    • scatter: Scatter plot
  • Matplotlib’s Pyplot methods and functions can be used alongside Pandas’ plot() method.
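
A minimal sketch of the Pandas plot() method used together with Pyplot functions. The DataFrame contents (subjects, terms, and marks) are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical marks in two subjects across three terms
df = pd.DataFrame(
    {"Maths": [72, 80, 85], "Science": [68, 75, 90]},
    index=["Term 1", "Term 2", "Term 3"],
)

# kind selects the plot type; the call returns a Matplotlib axes object
ax = df.plot(kind="bar")

# Pyplot functions act on the chart that Pandas has just drawn
plt.title("Marks by term")
plt.xlabel("Term")
plt.ylabel("Marks")
plt.show()
```

Each numeric column becomes one set of bars, and the DataFrame index supplies the x-axis categories.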

Plotting a Line Chart

  • A line plot shows data points connected by straight line segments and is suitable for continuous datasets.
  • It is used to visualise growth or decline in data over a time interval.
  • Pandas’ plot(kind='line') can plot line charts from DataFrame data, with customisations like colour, markers, and line styles.
  • The x-axis typically uses the DataFrame’s index, and custom tick labels can be set using plt.xticks(ticks, labels).
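
A sketch of a line chart drawn from a DataFrame, assuming made-up weekly sales figures. The default index (0 to 3) is used for the x-axis and then relabelled with plt.xticks().

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical weekly sales of two items
df = pd.DataFrame({
    "Notebooks": [120, 150, 170, 160],
    "Pens": [200, 180, 220, 260],
})

# One line per column; colour, marker and line style are passed to Matplotlib
ax = df.plot(kind="line", color=["red", "blue"], marker="o", linestyle="--")

# Replace the index values 0..3 on the x-axis with readable labels
plt.xticks(list(df.index), ["Week 1", "Week 2", "Week 3", "Week 4"])
plt.title("Weekly sales")
plt.xlabel("Week")
plt.ylabel("Units sold")
plt.show()
```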

Plotting Bar Chart

  • Bar charts are used for comparing data across categories, capable of plotting strings on the x-axis.
  • They are created using df.plot(kind='bar'), with the option to specify columns for x and y axes.
  • Customisations include changing bar colours, edge colours, line styles, and line widths.
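
A sketch of a bar chart with string categories on the x-axis. The city names and rainfall values are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical rainfall (in mm) recorded in three cities
df = pd.DataFrame({
    "City": ["Delhi", "Mumbai", "Chennai"],
    "Rainfall": [80, 240, 140],
})

# x and y name the columns to use for the categories and the bar heights
ax = df.plot(kind="bar", x="City", y="Rainfall",
             color="orange", edgecolor="black",
             linewidth=2, linestyle="--")

plt.title("Rainfall by city")
plt.ylabel("Rainfall (mm)")
plt.show()
```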

Plotting Histogram

  • Histograms are column charts where each column represents a range of values, and the height corresponds to the number of values in that range.
  • Data is sorted into bins, and the height of each column is proportional to the number of data points in the bin.
  • By default, df.plot(kind='hist') divides the range of the data into 10 equal-width bins; the bins parameter changes this number.
  • Customisations include setting bin numbers, edge colour, line style, line width, fill (boolean), and hatch patterns (e.g., -, +, x, o).
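
A sketch of a customised histogram, assuming a made-up list of 20 student heights.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical heights (in cm) of 20 students
df = pd.DataFrame({"Height": [145, 150, 152, 155, 158, 160, 160, 162, 165, 165,
                              166, 168, 170, 170, 172, 175, 175, 178, 180, 185]})

# bins sets the number of ranges; fill=False leaves the columns hollow
# so that the hatch pattern is easy to see
ax = df.plot(kind="hist", bins=5, edgecolor="black",
             linewidth=2, fill=False, hatch="x")

plt.title("Distribution of heights")
plt.xlabel("Height (cm)")
plt.ylabel("Number of students")
plt.show()
```

The column heights add up to the number of observations, since every value falls into exactly one bin.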

Using Open Data

  • Open data refers to freely available datasets for public use, primarily for educational purposes, promoting transparency, accessibility, and innovation.
  • The “Open Government Data (OGD) Platform India” (data.gov.in) provides large datasets on various projects and parameters.
  • Histograms can be plotted for open datasets, such as temperature data, to observe frequency distributions.

Plotting Scatter Chart

  • Scatter charts are two-dimensional plots using dots to represent values of two variables, one on the x-axis and one on the y-axis.
  • They are used to show relationships or correlations between variables, sometimes called correlation plots.
  • Dot size, shape, or colour can represent additional variables.
  • Customisations include changing marker type, size, colour, edge colour, and line width.
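
A sketch of a scatter chart with a few of these customisations. The study hours and marks are hypothetical values chosen to show a rough positive relationship.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: hours studied and marks scored by eight students
df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [35, 42, 50, 55, 63, 70, 78, 85],
})

# s sets the marker size; x and y name the columns for the two axes
ax = df.plot(kind="scatter", x="Hours", y="Marks",
             marker="^", s=80, color="red",
             edgecolor="black", linewidth=1.5)

plt.title("Hours studied versus marks")
plt.show()
```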

Plotting Quartiles and Box Plot

  • Quartiles are the three values (Q1, Q2, and Q3) that divide sorted data into four parts, each containing an equal number of observations; Q2 is the median, so finding quartiles starts with calculating the median.
  • They are used in educational, sales, and survey data to divide populations into groups, e.g., identifying the top 25% of students.
  • A box plot visualises a statistical summary: minimum value, quartile 1, quartile 2 (median), quartile 3, and maximum value, with whiskers extending from the box towards the lowest and highest values.
  • Box plots identify outliers, which are observations numerically distant from the rest of the data.
  • The distance between the box and whiskers indicates data variation; shorter distances suggest small variation, and longer distances suggest larger variation.
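
A sketch of computing quartiles and drawing a box plot, assuming made-up marks for ten students. The quantile() method of Pandas is used here as one way to obtain Q1, Q2, and Q3; different tools may interpolate quartiles slightly differently.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical marks of ten students
marks = pd.Series([35, 42, 48, 55, 60, 62, 68, 75, 81, 98], name="Marks")

# Values below which 25%, 50% and 75% of the observations lie
q1, q2, q3 = marks.quantile([0.25, 0.5, 0.75])
print("Q1:", q1, "Q2 (median):", q2, "Q3:", q3)

# The box spans Q1 to Q3, with a line inside it at the median
ax = marks.plot(kind="box")
plt.title("Distribution of marks")
plt.ylabel("Marks")
plt.show()
```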

Plotting Pie Chart

  • Pie charts are circular graphs divided into sectors, each representing a proportion of the whole, used for numerical data.
  • They are created using df.plot(kind='pie', y='column'), with default labels from the DataFrame’s index.
  • Customisations include:
    • explode: Specifies the fraction of the radius to offset each sector.
    • autopct: Displays the percentage of each sector as a label.
    • colors: Changes the colour of each sector.
    • legend: Can be set to False to hide the legend.
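
A sketch of a pie chart using these options. The sports and student counts are hypothetical, and the colour names are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical number of students preferring each sport
df = pd.DataFrame(
    {"Students": [40, 25, 20, 15]},
    index=["Cricket", "Football", "Hockey", "Tennis"],
)

ax = df.plot(kind="pie", y="Students",
             explode=[0.1, 0, 0, 0],   # pull the first sector outwards
             autopct="%1.1f%%",        # percentage with one decimal place
             colors=["gold", "skyblue", "lightgreen", "salmon"],
             legend=False)             # sector labels already come from the index

plt.title("Preferred sport")
plt.show()
```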

Summary

  • A plot (also called a graph or chart) visually represents a dataset to show relationships between two or more variables.
  • The Pyplot module is imported using import matplotlib.pyplot as plt, where plt is an alias.
  • Pyplot functions create figures, plotting areas, lines, bars, histograms, etc., and decorate plots with labels.
  • Plot components include title, legend, ticks, x-label, and y-label.
  • plt.plot() builds a plot, and plt.show() displays it.
  • plt.xlabel() and plt.ylabel() set axis labels, and plt.title() sets the plot title.
  • Data can be plotted directly from a DataFrame using Pandas’ plot() function.
  • The format for DataFrame plotting is df.plot(kind=''), where kind can be line, bar, barh, hist, box, area, pie, or scatter.

FAQs on Plotting Data using Matplotlib

1. What is Matplotlib and why is it used in data visualization?
Ans. Matplotlib is a popular Python library used for plotting and visualizing data. It provides a wide range of tools to create static, animated, and interactive visualizations in Python. Matplotlib is widely used because it is versatile, easy to use, and integrates well with other libraries such as NumPy and Pandas, making it suitable for a variety of data analysis tasks.
2. How can I customize the appearance of plots in Matplotlib?
Ans. You can customize the appearance of plots in Matplotlib by modifying various parameters such as marker styles, colors, line widths, and line styles. For example, you can change the marker type using the 'marker' parameter, adjust the color with the 'color' parameter, and set the line width using the 'linewidth' parameter in the plotting functions. This allows you to create visually appealing charts that convey data effectively.
3. What is the Pandas plot function and how does it relate to Matplotlib?
Ans. The Pandas plot function is a convenient interface for creating visualizations directly from Pandas DataFrames. It is built on top of Matplotlib, meaning that it uses Matplotlib's capabilities to render plots. This function simplifies the plotting process by allowing users to generate various types of charts (such as line charts, bar charts, and histograms) with minimal code, making it easier for data analysts to visualize their data.
4. How do I plot a line chart using Matplotlib?
Ans. To plot a line chart using Matplotlib, you can use the 'plot()' function. First, you need to import the Matplotlib library and prepare your data (X and Y values). Then, you can call plt.plot(X, Y) to create the line chart. You can customize the chart by adding titles, labels, and adjusting the aesthetics like color and line style. Finally, use plt.show() to display the chart.
5. What steps are involved in creating a histogram with Matplotlib?
Ans. To create a histogram with Matplotlib, follow these steps: First, import the Matplotlib library and your data. Use the 'hist()' function to plot the histogram by passing your data as an argument, and you can specify the number of bins if needed. You can also customize the histogram by changing colors and adding titles or labels. Finally, display the histogram using plt.show(). This provides a visual representation of the distribution of your data.