Machine Learning - Normal Data Distribution

Normal Data Distribution

In the previous chapter, we learned how to create an array of a given size filled with completely random values between two given values.

In this chapter, we will learn how to create an array with values concentrated around a given value.

In probability theory, this kind of data distribution is known as the normal data distribution, or the Gaussian data distribution, after the mathematician Carl Friedrich Gauss, who came up with its formula.
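For reference, the formula in question is usually written as the probability density below. The symbols are not used elsewhere in this chapter and are introduced here only for illustration: \mu stands for the mean and \sigma for the standard deviation.

```latex
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
```

The density is highest at x = \mu and falls off symmetrically on both sides, which is what produces the bell shape.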

Example

Typical Normal Data Distribution:

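import numpy
import matplotlib.pyplot as plt

# Draw 100000 values from a normal distribution with mean 5.0 and standard deviation 1.0
x = numpy.random.normal(5.0, 1.0, 100000)

# Plot the values as a histogram with 100 bars
plt.hist(x, 100)
plt.show()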

Result:

[Figure: histogram of the 100000 values, forming a bell shape centered near 5.0]

Note: Because the normal distribution curve has a characteristic bell shape, it is also known as the bell curve.

Histogram Explanation

We use the array created by the numpy.random.normal() method, with 100000 values, to draw a histogram with 100 bars.

We specify the mean as 5.0 and the standard deviation as 1.0.

This means that the values should be concentrated around 5.0: about 68% of them fall within 1.0 (one standard deviation) of the mean, and values more than 3.0 away from the mean are rare.
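To see how concentrated the values are, we can count them directly. The following is a minimal sketch, not part of the original example: the variable names and the fixed seed are assumptions added here so that the run is repeatable.

```python
import numpy

# Fixed seed so the run is repeatable (the seed value is arbitrary).
numpy.random.seed(0)

mean = 5.0
std = 1.0
x = numpy.random.normal(mean, std, 100000)

# Fraction of values within one standard deviation of the mean (4.0 to 6.0).
within_one = numpy.mean(numpy.abs(x - mean) <= std)

# Fraction of values within three standard deviations of the mean (2.0 to 8.0).
within_three = numpy.mean(numpy.abs(x - mean) <= 3 * std)

print(within_one)
print(within_three)
```

The first fraction should come out close to 0.68 and the second close to 0.997, so a deviation of more than 1.0 is common, while a deviation of more than 3.0 is rare.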

As can be seen from the histogram, most values are between 4.0 and 6.0, with the peak at about 5.0.
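The same reading can be taken from the numbers behind the histogram instead of the picture. This sketch assumes numpy.histogram() with 100 equal-width bins, which matches the 100 bars drawn above; the variable names and the seed are illustrative only.

```python
import numpy

# Fixed seed so the run is repeatable (the seed value is arbitrary).
numpy.random.seed(0)
x = numpy.random.normal(5.0, 1.0, 100000)

# Count the values in 100 equal-width bins spanning the range of the data.
counts, edges = numpy.histogram(x, bins=100)

# Locate the tallest bar and compute the center of its bin.
peak = numpy.argmax(counts)
peak_center = (edges[peak] + edges[peak + 1]) / 2

print(peak_center)
```

Because the bars near the top of the bell have almost the same height, the tallest one can land slightly to either side of 5.0 from run to run.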