Machine Learning - Scatter Plot

Scatter Plot

A scatter plot is a graph where each value in the dataset is represented by a point.


Matplotlib has a method for drawing scatter plots. It needs two arrays of the same length, one for the x-axis values and one for the y-axis values:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

The x array represents the age of each car.

The y array represents the speed of each car.
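Each position in the two arrays describes one car, so the arrays have to line up. As a quick check, here is a minimal sketch that confirms both lists have the same length and pairs the values by position (the name `cars` is only used for illustration):

```python
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Both lists need one entry per car
print(len(x), len(y))  # 13 13

# zip() pairs the values by position: (age, speed) for each car
cars = list(zip(x, y))
print(cars[0])  # (5, 99)
```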

Example

Use the scatter() method to draw a scatter plot:

import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
plt.scatter(x, y)
plt.show()

Result:

[Figure: scatter plot of car speed (y-axis) against car age (x-axis)]

Scatter Plot Explanation

The x-axis represents the age of the car, and the y-axis represents the speed.
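A plot is easier to read when the axes say what they show. The sketch below is one possible way to label them, using Matplotlib's `plt.subplots()`, `set_xlabel()`, `set_ylabel()` and `set_title()`. The label and title texts are assumptions made for illustration, and no unit is given for speed because the data does not state one.

```python
import matplotlib.pyplot as plt

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Create a figure with a single set of axes and draw the points on it
fig, ax = plt.subplots()
ax.scatter(x, y)

# Label text is illustrative, not taken from the original data source
ax.set_xlabel("Age of car (years)")
ax.set_ylabel("Speed")
ax.set_title("Car speed by age")
```

Calling plt.show() afterwards displays the figure, as in the example above.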

As the figure shows, the two fastest cars were both 2 years old, and the slowest car was 12 years old.
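The same reading can be checked directly against the data. This is a small sketch that sorts the cars by speed; the variable names are chosen here for illustration only:

```python
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Pair each age with its speed, then sort from fastest to slowest
cars = list(zip(x, y))
by_speed = sorted(cars, key=lambda car: car[1], reverse=True)

two_fastest = by_speed[:2]
slowest = by_speed[-1]

print(two_fastest)  # [(2, 111), (2, 103)]
print(slowest)      # (12, 77)
```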

Note: It seems that the newer the car, the faster it drives, but this could be a coincidence. After all, we only registered 13 cars.
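One way to put a number on this impression is the correlation coefficient between age and speed. The sketch below uses NumPy's `corrcoef()`, which is not part of the original example. A value near -1 means speed tends to drop as age rises, but with only 13 cars it still does not rule out a coincidence.

```python
import numpy

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# corrcoef() returns a 2x2 matrix; the off-diagonal entry is the
# correlation between x and y
r = numpy.corrcoef(x, y)[0, 1]

print(round(r, 2))  # -0.76
```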

Random Data Distribution

In machine learning, datasets can contain thousands, even millions, of values.

When testing algorithms, you may not have real data, and you may have to use randomly generated values.

As we learned in the previous chapter, the NumPy module can help us!

Let's create two arrays, both filled with 1000 random numbers from a normal data distribution.

The mean of the first array is set to 5.0, with a standard deviation of 1.0.

The mean of the second array is set to 10.0, with a standard deviation of 2.0:

Example

Scatter plot with 1000 points:

import numpy
import matplotlib.pyplot as plt
x = numpy.random.normal(5.0, 1.0, 1000)
y = numpy.random.normal(10.0, 2.0, 1000)
plt.scatter(x, y)
plt.show()

Result:

[Figure: scatter plot of the 1000 random points]

Scatter Plot Explanation

We can see that the points are concentrated around the value 5 on the x-axis and 10 on the y-axis.

We can also see that the points are spread more widely along the y-axis than along the x-axis, because the second array has the larger standard deviation (2.0 versus 1.0).
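These observations can also be checked numerically. The following sketch regenerates the two arrays and prints their sample mean and standard deviation. The call to `numpy.random.seed()` and the seed value 42 are assumptions added here so that repeated runs give the same numbers; the original example does not set a seed.

```python
import numpy

numpy.random.seed(42)  # arbitrary seed, only to make the run repeatable

x = numpy.random.normal(5.0, 1.0, 1000)
y = numpy.random.normal(10.0, 2.0, 1000)

# Sample statistics should land close to the requested values
print(numpy.mean(x), numpy.std(x))  # close to 5.0 and 1.0
print(numpy.mean(y), numpy.std(y))  # close to 10.0 and 2.0

# The y values are spread more widely than the x values
print(numpy.std(y) > numpy.std(x))
```

Because the values are random, the sample statistics will not match the requested mean and standard deviation exactly, only approximately.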