Machine Learning - Training/Testing
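Evaluate Your Model
In machine learning we create models to predict the outcome of certain events, as in the previous chapter, where we predicted the CO2 emission of a car from its weight and engine size.
To measure whether the model is good enough, we can use a method called Train/Test.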
What is Train/Test?
Train/Test is a method for measuring the accuracy of your model.
It is called Train/Test because you split the data set into two sets: a training set and a testing set.
80% for training, and 20% for testing.
You train the model using the training set.
You test the model using the testing set.
Training the model means creating the model.
Testing the model means measuring the accuracy of the model.
Start With a Data Set
Start with a data set you want to test.
Our data set illustrates 100 customers in a shop and their shopping habits.
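Example
import numpy
import matplotlib.pyplot as plt

# Fix the seed so the generated data is the same on every run
numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

plt.scatter(x, y)
plt.show()
Result:
The x axis represents the number of minutes before making a purchase.
The y axis represents the amount of money spent on the purchase.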

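Split Into Train/Test
The training set should be a random selection of 80% of the original data.
The testing set should be the remaining 20%.
In this example x and y were generated at random, so taking the first 80 values already amounts to a random selection:
train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]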
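If the data were stored in a meaningful order (for example sorted by date), slicing off the first 80 values would no longer be a random selection. The following is a minimal sketch of a shuffled split. It assumes NumPy's `default_rng` generator and an arbitrary seed of 0, neither of which appears in the original example:

```python
import numpy

numpy.random.seed(2)
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# Assumption: a separate generator with an arbitrary seed, used only for shuffling.
rng = numpy.random.default_rng(0)
order = rng.permutation(len(x))  # random ordering of the indices 0..99

cut = round(0.8 * len(x))  # 80% of the rows go to the training set
train_idx, test_idx = order[:cut], order[cut:]

# Index x and y with the same indices so each pair stays together.
train_x, train_y = x[train_idx], y[train_idx]
test_x, test_y = x[test_idx], y[test_idx]

print(len(train_x), len(test_x))
```

Shuffling indices rather than the arrays themselves keeps every x value matched with its y value.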
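Display the Training Set
Display the same scatter plot with the training set:
Example
plt.scatter(train_x, train_y)
plt.show()
Result:
It looks like the original data set, so it seems to be a fair selection: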

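Display the Testing Set
To make sure the testing set is not completely different, we will take a look at the testing set as well.
Example
plt.scatter(test_x, test_y)
plt.show()
Result:
The testing set also looks like the original data set: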

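Fit the Data Set
What does the data set look like? In my opinion the best fit would be a polynomial regression, so let us draw a line of polynomial regression.
To draw a line through the data points, we use the plot() method of the matplotlib module:
Example
Draw a polynomial regression line through the data points:
import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

# Fit a degree-4 polynomial to the training data only
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

# 100 evenly spaced x values from 0 to 6 for drawing the curve
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()
Result: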

The result can back up my suggestion that the data set fits a polynomial regression, even though it would give us some odd results if we tried to predict values outside of the data set. Example: the line indicates that a customer spending 6 minutes in the shop would make a purchase worth 200. That is probably a sign of overfitting.
But what about the R-squared score? The R-squared score is a good indicator of how well my data set fits the model.
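One way to see why predictions far from the data are suspect is to compare the query point with the range of x values the model was trained on. The sketch below does that for a few hypothetical query points. The names `low`, `high` and the chosen points are illustrative assumptions, not part of the original example:

```python
import numpy

numpy.random.seed(2)
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x, train_y = x[:80], y[:80]
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

# Range of x values the model actually saw during training.
low, high = float(train_x.min()), float(train_x.max())
print("trained on x between", round(low, 2), "and", round(high, 2))

# Hypothetical query points; any value outside [low, high] is an extrapolation.
for minutes in (1.0, 3.0, 5.0, 6.0, 7.0):
    label = "inside" if low <= minutes <= high else "outside"
    prediction = round(float(mymodel(minutes)), 2)
    print(minutes, label, "training range ->", prediction)
```

Predictions flagged as outside the training range rest on the shape of the polynomial alone, with no data to support them.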
R2
Remember R2, also known as R-squared?
It measures the relationship between the x axis and the y axis. The value normally ranges from 0 to 1, where 0 means no relationship and 1 means totally related. (A model that fits worse than simply predicting the mean can even produce a negative score.)
The sklearn module has a method called r2_score() that will help us find this relationship.
In this case we would like to measure the relationship between the minutes a customer stays in the shop and how much money they spend.
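For reference, the score is defined as R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared differences between the observed and predicted values and SS_tot is the sum of squared differences between the observed values and their mean. The sketch below recomputes the score by hand and compares it with `r2_score()`. `manual_r2` is a hypothetical helper name used only for illustration:

```python
import numpy
from sklearn.metrics import r2_score


def manual_r2(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot (hypothetical helper, for illustration only)."""
    y_true = numpy.asarray(y_true, dtype=float)
    y_pred = numpy.asarray(y_pred, dtype=float)
    ss_res = numpy.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = numpy.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot


numpy.random.seed(2)
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x, train_y = x[:80], y[:80]
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
predicted = mymodel(train_x)

# The two values should agree up to floating-point rounding.
print(manual_r2(train_y, predicted))
print(r2_score(train_y, predicted))
```

Written this way, it is easy to see that always predicting the mean gives a score of 0, and that predictions worse than the mean push the score below 0.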
Example
How well does my training data fit in a polynomial regression?
import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

# Score the model on the data it was trained on
r2 = r2_score(train_y, mymodel(train_x))

print(r2)
Note: The result 0.799 shows that there is an OK relationship.
Bring in the Testing Set
Now we have made a model that is OK, at least when it comes to the training data.
Next we want to test the model with the testing data as well, to see if it gives us the same result.
Example
Let us find the R2 score when using the testing data:
import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

# Score the model on data it has not seen during training
r2 = r2_score(test_y, mymodel(test_x))

print(r2)
Note: The result 0.809 shows that the model fits the testing set as well, and we are confident that we can use the model to predict future values.
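The point of scoring both sets is to compare them: a training score far above the testing score would suggest overfitting. As a rough sketch, the same comparison can be repeated for several polynomial degrees. The set of degrees below is a hypothetical choice for illustration, and only degree 4 is used in the text:

```python
import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x, train_y = x[:80], y[:80]
test_x, test_y = x[80:], y[80:]

# Hypothetical degrees to compare; 4 is the degree used in the text.
scores = {}
for degree in (1, 2, 4, 6):
    model = numpy.poly1d(numpy.polyfit(train_x, train_y, degree))
    scores[degree] = (
        r2_score(train_y, model(train_x)),  # fit on data the model has seen
        r2_score(test_y, model(test_x)),    # fit on held-out data
    )

for degree, (train_r2, test_r2) in scores.items():
    print(f"degree {degree}: train R2 = {train_r2:.3f}, test R2 = {test_r2:.3f}")
```

The training score can only stay the same or rise as the degree grows, so it is the testing score that shows whether the extra flexibility actually helps.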
Predict Values
Now that we have established that our model is OK, we can start predicting new values.
Example
How much money will a buying customer spend if she or he stays in the shop for 5 minutes?
print(mymodel(5))
The example predicts that the customer will spend 22.88 dollars, which seems to correspond to the diagram:
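Because `mymodel` is a `numpy.poly1d` object, it also accepts an array and evaluates every element in one call. A small sketch, assuming a hypothetical list of visit lengths in minutes:

```python
import numpy

numpy.random.seed(2)
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

train_x, train_y = x[:80], y[:80]
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

# Hypothetical visit lengths in minutes (illustrative values only).
minutes = numpy.array([2.0, 3.0, 4.0, 5.0])
predicted_spend = mymodel(minutes)  # poly1d evaluates element-wise on arrays

for m, amount in zip(minutes, predicted_spend):
    print(f"{m:.0f} minutes -> {amount:.2f}")
```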
