Lessenrooster

Python Handleiding

Bestandshandeling

Python NumPy

Machine learning

Python MySQL

Python MongoDB

Python Reference Manual

Module referentiemanual

Python How To

Python Example

Electieve cursus

Aanbevolen cursus:

CodeW3C.com Treasure Box

Machine Learning - Scatter Plot

Previous Page Normal Data Distribution
Next Page Linear Regression

Scatter Plot

Een scatterplot is een grafiek waarbij elke waarde in de dataset wordt weergegeven door een punt.

De Matplotlib-module heeft een methode om een scatterplot te tekenen, die twee array's van dezelfde lengte nodig heeft: een array voor de waarden op de x-as en een array voor de waarden op de y-as:

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

De x-array vertegenwoordigt de leeftijd van elk voertuig.

De y-array vertegenwoordigt de snelheid van elke auto.

Example

Gebruik a.u.b. scatter() Methode om een scatterplot te tekenen:

import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
plt.scatter(x, y)
plt.show()

Result:

Run Example

Scatter Plot Explanation

De x-as geeft de leeftijd van de auto aan, de y-as geeft de snelheid aan.

Van de grafiek is te zien dat de twee snelste auto's twee jaar hebben gebruikt, terwijl de langzaamste auto twaalf jaar heeft gebruikt.

Note:It seems that the faster the driving speed, the newer the car, but this may be a coincidence, after all, we only registered 13 cars.

Random Data Distribution

In machine learning, datasets can contain thousands or even millions of values.

When testing algorithms, you may not have real data and you may have to use randomly generated values.

As we learned in the previous chapter, the NumPy module can help us!

Let's create two arrays that are both filled with 1000 random numbers from a normal data distribution.

The mean of the first array is set to 5.0 and the standard deviation is 1.0.

The mean of the second array is set to 10.0 and the standard deviation is 2.0: