Data Analysis

Matplotlib

Matplotlib is familiar to almost everyone in the data science and machine learning community, largely because of how simply it lets us plot data in many different kinds of charts.

In Python, we can import the pyplot module, which provides a MATLAB-like plotting interface.

import matplotlib.pyplot as plt
%matplotlib inline

The %matplotlib inline line is a Jupyter notebook specific command that lets you see the plots in the notebook itself. Leave it out when running a plain Python script.

 

# Plot
plt.plot([1,2,3,4,10])
#> [<matplotlib.lines.Line2D at 0x10edbab70>]
# This is just the list of Line2D objects that plt.plot() returned.
# Call plt.show() so that matplotlib displays the plot instead of only returning the object.

Just a list of numbers was given to plt.plot() and it drew a line chart automatically. 

plt.plot() accepts three basic arguments in the following order: (x, y, format).

The format is a shorthand string that combines {color}{marker}{line}. For example, 'gs--' means green, square markers, dashed line.

In the example above, we provided just one list, which matplotlib treated as the y-values. The x-values default to the index of each point, starting from 0.
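To check how a single list is interpreted, the sketch below inspects the Line2D object that plt.plot() returns. The Agg backend is selected only so the snippet runs without a display; that line is an assumption for illustration and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

# plt.plot() returns a list of Line2D objects, one per line drawn
(line,) = plt.plot([1, 2, 3, 4, 10])

# Convert the stored data to plain ints for easy reading
x_values = [int(v) for v in line.get_xdata()]
y_values = [int(v) for v in line.get_ydata()]

print(x_values)  # the x positions matplotlib filled in
print(y_values)  # the list we passed, used as y
plt.close("all")
```

The x positions come out as 0 through 4, one per element of the list, while the list itself is stored as the y data.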

 

plt.plot([1,4,9,16,25], [1,2,3,4,10], 'gs--')
plt.show()
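As a rough illustration of what the 'gs--' shorthand expands to, the sketch below draws the same data twice: once with the format string and once with the color, marker and linestyle keyword arguments. The Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

x = [1, 4, 9, 16, 25]
y = [1, 2, 3, 4, 10]

# Shorthand: color 'g', marker 's' (square), line style '--' (dashed)
(short,) = plt.plot(x, y, 'gs--')

# The same styling spelled out with keyword arguments
(explicit,) = plt.plot(x, y, color='g', marker='s', linestyle='--')

# Both lines report the same marker and line style
for line in (short, explicit):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
plt.close("all")
```

The keyword form is longer, but it is easier to read once a plot needs more styling than three characters can express.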

We can even have two sets of points in a single plot.

# Draw two sets of points
plt.plot([1,2,3,4,5], [1,2,3,10,15], 'gs')  # green squares
plt.plot([1,2,3,4,5], [2,3,4,15,20], 'k*')  # black stars
plt.show()

We can also add the basic plot features: title, legend, and X and Y axis labels.

plt.plot([1,2,3,4,5], [1,2,3,4,10], 'go', label='GreenDots')
plt.plot([1,2,3,4,5], [2,3,4,5,11], 'b*', label='Bluestars')
plt.title('A Simple Scatterplot')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(loc='best')  # legend text comes from the plot's label parameter.
plt.show()
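The link between the label parameter and the legend can be checked directly. The sketch below adds a third, unlabeled series (a hypothetical addition for illustration) and reads back the legend entries. The Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4, 5], [1, 2, 3, 4, 10], 'go', label='GreenDots')
plt.plot([1, 2, 3, 4, 5], [2, 3, 4, 5, 11], 'b*', label='Bluestars')
plt.plot([1, 2, 3, 4, 5], [3, 4, 5, 6, 12], 'r^')  # no label given

ax = plt.gca()  # the Axes that the plt.* calls above drew on
handles, labels = ax.get_legend_handles_labels()

legend = plt.legend(loc='best')
legend_texts = [t.get_text() for t in legend.get_texts()]

print(labels)        # labels collected from the plotted lines
print(legend_texts)  # text actually shown in the legend
plt.close("all")
```

Only the two labeled series appear in the legend; the unlabeled one is drawn but gets no entry.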


We can control the size of a plot using plt.figure(figsize=(10,7)), where 10 is the width and 7 is the height, both in inches.
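Since figsize is given in inches, the size in pixels depends on the figure's dpi. A minimal sketch, assuming dpi=100 purely for illustration and the Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; dpi=100 is an illustrative choice
fig = plt.figure(figsize=(10, 7), dpi=100)

width_in, height_in = fig.get_size_inches()
width_px = width_in * fig.dpi    # pixels = inches * dots per inch
height_px = height_in * fig.dpi

print(width_in, height_in)
print(width_px, height_px)
plt.close(fig)
```

With these values the canvas is 1000 by 700 pixels; raising dpi keeps the layout the same but produces a sharper, larger image.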



plt.subplots(nrows, ncols) creates a figure containing a grid of nrows by ncols subplots, and returns two objects:

  •  the figure
  •  the axes (subplots) inside the figure
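The two returned objects can be inspected as in the sketch below. The 2 by 3 grid is an arbitrary choice for illustration, and the Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

# 2 rows and 3 columns: axes is a NumPy array of Axes objects
fig, axes = plt.subplots(2, 3)
print(type(fig).__name__)
print(axes.shape)

# Each subplot is addressed by [row, column]
top_left = axes[0, 0]
top_left.plot([1, 2, 3], [1, 4, 9], 'gs')

# With the default 1x1 grid, a single Axes is returned instead of an array
fig_single, ax_single = plt.subplots()
print(type(ax_single).__name__)
plt.close("all")
```

For a single row or a single column, the returned array is one-dimensional, which is why it can be unpacked directly into names such as ax1 and ax2.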

 

# Create Figure and Subplots

fig, (ax1, ax2) = plt.subplots(1,2, figsize=(10,4), sharey=True, dpi=120)

 

# Plot
ax1.plot([1,2,3,4,5], [1,2,3,4,10], 'gs')  # greensquares
ax2.plot([1,2,3,4,5], [2,3,4,5,11], 'bo')  # bluedots

 

# Title, X and Y labels, X and Y Lim
ax1.set_title('Scatterplot Greensquares'); ax2.set_title('Scatterplot Bluedots')
ax1.set_xlabel('X1');  ax2.set_xlabel('X2')  # x label
ax1.set_ylabel('Y1');  ax2.set_ylabel('Y2')  # y label
ax1.set_xlim(0, 6) ;  ax2.set_xlim(0, 6)   # setting x axis limits
ax1.set_ylim(0, 12);  ax2.set_ylim(0, 12)  # setting y axis limits
ax2.yaxis.set_ticks_position('none')
plt.tight_layout()
plt.show()


Setting sharey=True in plt.subplots() makes the two subplots share the same Y-axis, so both use the same limits and ticks.

The title, labels and limits above can also be set with a single ax.set() call per subplot:
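The effect of sharey=True can be observed by changing the Y limits on only one subplot. A minimal sketch, assuming the Agg backend so that it runs without a display; the second, unshared figure is added here only for contrast.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt

# Shared Y-axis: a limit set on ax1 also applies to ax2
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
ax1.plot([1, 2, 3, 4, 5], [1, 2, 3, 4, 10], 'gs')
ax2.plot([1, 2, 3, 4, 5], [2, 3, 4, 5, 11], 'bo')
ax1.set_ylim(0, 12)  # set on ax1 only
shared_limits = ax2.get_ylim()
print(shared_limits)

# Independent Y-axes: ax4 keeps its own limits
fig2, (ax3, ax4) = plt.subplots(1, 2)
ax3.set_ylim(0, 12)
independent_limits = ax4.get_ylim()
print(independent_limits)
plt.close("all")
```

In the shared case, ax2 reports the limits that were set on ax1; in the unshared case, ax4 is unaffected.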

ax1.set(title='Scatterplot Greensquares', xlabel='X1', ylabel='Y1', xlim=(0,6), ylim=(0,12))
ax2.set(title='Scatterplot Bluedots', xlabel='X2', ylabel='Y2', xlim=(0,6), ylim=(0,12))



Matplotlib can also be used to plot and view images. After reading an image, we can plot it using the plt.figure() and plt.imshow() functions.

After calling these functions, we call plt.show(), which displays all open figures.

 

import matplotlib.image as img
# reading the image
testImage = img.imread('pic.png')  # 'pic.png' is the path to an image file your editor can access
# displaying the image as an array
print(testImage)    # prints the array of pixel values that represents the image
plt.imshow(testImage)  # draws the image on the current axes
plt.show()  # displays the figure
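To try this without an image file at hand, the sketch below writes a tiny synthetic image to a temporary path and reads it back. The 4 by 6 gradient, the temporary directory and the Agg backend are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt
import matplotlib.image as img

# Build a 4x6 RGB image: a left-to-right red gradient with values in 0..1
pixels = np.zeros((4, 6, 3))
pixels[:, :, 0] = np.linspace(0.0, 1.0, 6)

# Hypothetical location: a temporary directory created for this example
path = os.path.join(tempfile.mkdtemp(), 'pic.png')
plt.imsave(path, pixels)

# Read the file back as an array of pixel values
loaded = img.imread(path)
print(loaded.shape)  # rows, columns, color channels
print(loaded.dtype)

plt.imshow(loaded)   # draws the image on the current axes
plt.close("all")
```

The first two entries of the shape are the image height and width in pixels, and the last is the number of color channels. In an interactive session, calling plt.show() after plt.imshow() displays the image.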

 
