feat: added PW3
333
PW-3/ex1/ex1-bayes-stud.ipynb
Normal file
@@ -0,0 +1,333 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Exercise 1 - Bayes classification system"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import some useful libraries\n",
    "\n",
    "import math\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import OrdinalEncoder, StandardScaler"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1a. Getting started with Bayes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "a) Read the training data from file ex1-data-train.csv. The first two columns are x1 and x2. The last column holds the class label y."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "def read_data(file):\n",
    "    dataset = pd.read_csv(file, names=['x1','x2','y'])\n",
    "    print(dataset.head())\n",
    "    return dataset[[\"x1\", \"x2\"]], dataset[\"y\"].values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train, y_train = read_data(\"ex1-data-train.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Prepare a function to compute accuracy\n",
    "def accuracy_score(y_true, y_pred):\n",
    "    return (y_true == y_pred).sum() / y_true.size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "b) Compute the priors of both classes P(C0) and P(C1)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# TODO: Compute the priors\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "c) Compute histograms of x1 and x2 for each class (total of 4 histograms). Plot these histograms. Hint: use the numpy `histogram(a, bins=\"auto\")` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# TODO: Compute histograms\n",
    "\n",
    "\n",
    "\n",
    "# TODO: plot histograms\n",
    "\n",
    "plt.figure(figsize=(16,6))\n",
    "\n",
    "plt.subplot(1, 2, 1)\n",
    "...\n",
    "plt.xlabel('Likelihood hist - Exam 1')\n",
    "\n",
    "plt.subplot(1, 2, 2)\n",
    "...\n",
    "plt.xlabel('Likelihood hist - Exam 2')\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "d) Use the histograms to compute the likelihoods p(x1|C0), p(x1|C1), p(x2|C0) and p(x2|C1). For this, define a function `likelihood_hist(x, hist_values, bin_edges)` that returns the likelihood of x for a given histogram (defined by its values and bin edges as returned by the numpy `histogram()` function)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "def likelihood_hist(x: float, hist_values: np.ndarray, bin_edges: np.ndarray) -> float:\n",
    "    # TODO: compute the likelihood from the histogram outputs\n",
    "\n",
    "    return ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "e) Implement the classification decision according to Bayes' rule and compute the overall accuracy of the system on the test set ex1-data-test.csv:\n",
    "- using only feature x1\n",
    "- using only feature x2\n",
    "- using x1 and x2 making the naive Bayes hypothesis of feature independence, i.e. p(X|Ck) = p(x1|Ck) · p(x2|Ck)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "X_test, y_test = read_data(\"ex1-data-test.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# TODO: predict on test set in the 3 cases described above\n",
    "\n",
    "y_pred = []\n",
    "\n",
    "...\n",
    "\n",
    "accuracy_score(y_test, y_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Which system is the best?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TODO: answer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1b. Bayes - Univariate Gaussian distribution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Do the same as in 1a, but this time use a univariate Gaussian distribution to model the likelihoods p(x1|C0), p(x1|C1), p(x2|C0) and p(x2|C1). You may use the numpy functions `mean()` and `var()` to compute the mean μ and variance σ² of the distribution. To model the likelihood of both features, you may again make the naive Bayes hypothesis of feature independence, i.e. p(X|Ck) = p(x1|Ck) · p(x2|Ck).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "def likelihood_univariate_gaussian(x: float, mean: float, var: float) -> float:\n",
    "    # TODO: compute the likelihood from the Gaussian mean and variance\n",
    "\n",
    "    return ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "# TODO: Compute mean and variance for each class and each feature (8 values)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO: predict on test set in the 3 cases\n",
    "\n",
    "y_pred = []\n",
    "\n",
    "...\n",
    "\n",
    "accuracy_score(y_test, y_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.7"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "metadata": {
     "collapsed": false
    },
    "source": []
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
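
Note on the notebook above: it leaves the priors, both likelihood functions and the decision step as TODOs. The following is only a rough sketch of how those pieces could fit together, not the course's reference solution. `compute_priors` and `bayes_decision` are hypothetical helper names that do not appear in the notebook, and `likelihood_hist` assumes the histogram was computed with `np.histogram(..., density=True)`, which the notebook does not specify.

```python
import numpy as np


def compute_priors(y: np.ndarray) -> dict:
    """Estimate P(Ck) as the fraction of training samples in each class."""
    classes, counts = np.unique(y, return_counts=True)
    return {int(c): n / y.size for c, n in zip(classes, counts)}


def likelihood_hist(x: float, hist_values: np.ndarray, bin_edges: np.ndarray) -> float:
    """Return the histogram density at x, or 0.0 if x lies outside all bins.

    Assumption: hist_values are densities (np.histogram with density=True).
    """
    if x < bin_edges[0] or x > bin_edges[-1]:
        return 0.0
    # Index of the first edge strictly greater than x, minus one, is the bin index.
    idx = np.searchsorted(bin_edges, x, side="right") - 1
    # A value equal to the last edge belongs to the last bin.
    idx = min(idx, len(hist_values) - 1)
    return float(hist_values[idx])


def likelihood_univariate_gaussian(x: float, mean: float, var: float) -> float:
    """Evaluate the Gaussian density N(x; mean, var)."""
    return float(np.exp(-((x - mean) ** 2) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var))


def bayes_decision(likelihoods: dict, priors: dict) -> int:
    """Pick the class k that maximises p(x|Ck) * P(Ck)."""
    scores = {k: likelihoods[k] * priors[k] for k in priors}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy values made up for illustration; they are not the exercise data.
    x_train = np.array([1.0, 2.0, 2.5, 7.0, 8.0, 9.0])
    y_train = np.array([0, 0, 0, 1, 1, 1])
    priors = compute_priors(y_train)
    hists = {
        k: np.histogram(x_train[y_train == k], bins="auto", density=True)
        for k in priors
    }
    x_new = 2.2
    likelihoods = {k: likelihood_hist(x_new, *hists[k]) for k in priors}
    print(bayes_decision(likelihoods, priors))
```

For the two-feature case, the naive Bayes hypothesis stated in the notebook means the per-class likelihood passed to `bayes_decision` would be the product p(x1|Ck) · p(x2|Ck); the same decision function can be reused with either the histogram or the Gaussian likelihoods.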
100
PW-3/ex1/ex1-data-test.csv
Normal file
@@ -0,0 +1,100 @@
39.1963341568658,78.53029405902203,0
40.448499233673424,86.83946993295656,1
65.57192032694599,44.303496565835594,0
79.64811329486565,70.8065641864705,1
66.26022052135889,41.67270317074954,0
97.6637443782087,68.3249232452966,1
30.548823788843436,57.31847952965393,0
89.47322095778219,85.94680780258534,1
50.93087801180052,34.2357678392285,0
39.79292275937423,83.42467462939659,1
47.45440952767612,43.40242137611206,0
69.97497171303611,84.4084067760751,1
66.57906119077748,42.13570922437346,0
85.05872976046471,54.31025004023918,1
66.50445545099684,46.515380367647104,0
75.67274744410004,93.79012528285647,1
30.589637766842877,71.58841488039977,0
43.2174833244174,83.55961536494472,1
58.04023606927604,39.47235992846592,0
40.15801957067056,94.28873609786281,1
65.40785754453304,39.872039582416946,0
58.25386824923051,64.96454852577446,1
90.05150698066501,34.03096751205591,0
72.24873848000416,90.1077757094509,1
32.732305095404456,98.49269418173134,0
74.06410532697512,66.96252809184301,1
30.074888412046263,56.513104954256875,0
87.57197590933474,68.15013081653733,1
54.562040422189284,49.542441977062865,0
78.30902280632358,72.23271250670665,1
57.870305028845,48.514216465966285,0
91.35751201085463,85.6201641726489,1
32.89942225933118,68.89835152862396,0
75.96271751468554,73.37079167632794,1
49.73784613458287,59.13494209712587,0
73.5544567377702,66.04140381033584,1
34.20510941997501,72.62513617755425,0
54.49230689236608,75.50968920375037,1
48.50711697988822,47.74600670205531,0
92.3876668476141,76.82950398511272,1
39.89720264828788,62.09872615693186,0
75.76883065897587,43.6375457580161,1
32.938859931422954,75.6959591164835,0
44.53335294213268,86.44202248365731,1
51.265631719309845,60.12130845234037,0
70.78776945843022,84.2462083261098,1
28.94644639193278,39.599160546805116,0
47.53708530844937,73.62887169594207,1
49.02408652102979,48.50397486087145,0
78.37067490088779,93.91476948225585,1
48.806979396137145,62.206605350437144,0
72.03919354554785,88.5636216577281,1
31.23633606784064,96.30534895479137,0
51.56156298671939,89.15548481990747,1
65.08996501958059,39.488228986986606,0
81.75983894249494,47.952028645978714,1
46.466982795222684,43.17493123886225,0
64.49601863360589,82.20819682836424,1
65.59947425235588,42.79658543523777,0
50.66778894002708,64.22662181783375,1
30.665280235026138,42.70685221873931,0
76.60228200416394,65.62163965042933,1
60.39824874786827,38.54265995207925,0
80.7498890348191,47.942468664004934,1
81.83730756343084,39.62946723071423,0
76.67188156208798,73.0039571691345,1
31.702591304883626,73.4485451232566,0
89.75853252236888,65.1794033434368,1
31.111272744640324,77.90680809560692,0
56.360076920020845,68.81541270666031,1
47.365528695867354,59.268265092300844,0
81.99701278469126,55.477765254828924,1
73.19627144242138,28.399910031060564,0
50.28593379220375,85.68597173591368,1
30.532888808836397,77.17395841411421,0
66.62736064332904,65.14099834530835,1
30.563843972698294,44.15958836055778,0
69.30483520344725,90.15732087213348,1
40.63104177166124,61.47155968946135,0
67.51887729702649,76.70896125160789,1
33.6944962783859,43.961979616998335,0
54.61941030575024,73.60040410454849,1
29.956247697479498,91.60028497230863,0
59.56176709683286,81.89054923262506,1
29.097516205452173,92.0159604576793,0
87.75444054660184,65.2841177353011,1
79.14696413604753,40.118482227299694,0
74.48492746059782,92.34246943037195,1
26.332352061636747,44.9551699040027,0
54.346942016509146,58.43293962287077,1
29.947060203169244,93.06082834209418,0
96.32633710641187,64.80350360838675,1
29.864465690194475,73.11550264372423,0
62.2263271267271,57.84956855286749,1
35.2611254453108,72.85531587549292,0
47.340681257438895,69.41232032562911,1
63.19534209968015,36.963350930620166,0
59.46464897992196,72.40245846384263,1
60.08389682243888,42.48638233127113,0
57.45295498601704,73.67928309399463,1
100
PW-3/ex1/ex1-data-train.csv
Normal file
@@ -0,0 +1,100 @@
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
61.10666453684766,96.51142588489624,1
75.02474556738889,46.55401354116538,1
76.09878670226257,87.42056971926803,1
84.43281996120035,43.53339331072109,1
95.86155507093572,38.22527805795094,0
75.01365838958247,30.60326323428011,0
82.30705337399482,76.48196330235604,1
69.36458875970939,97.71869196188608,1
39.53833914367223,76.03681085115882,0
53.9710521485623,89.20735013750205,1
69.07014406283025,52.74046973016765,1
67.94685547711617,46.67857410673128,0
70.66150955499435,92.92713789364831,1
76.97878372747498,47.57596364975532,1
67.37202754570876,42.83843832029179,0
89.67677575072079,65.79936592745237,1
50.534788289883,48.85581152764205,0
34.21206097786789,44.20952859866288,0
77.9240914545704,68.9723599933059,1
62.27101367004632,69.95445795447587,1
80.1901807509566,44.82162893218353,1
93.114388797442,38.80067033713209,0
61.83020602312595,50.25610789244621,0
38.78580379679423,64.99568095539578,0
61.379289447425,72.80788731317097,1
85.40451939411645,57.05198397627122,1
52.10797973193984,63.12762376881715,0
52.04540476831827,69.43286012045222,1
40.23689373545111,71.16774802184875,0
54.63510555424817,52.21388588061123,0
33.91550010906887,98.86943574220611,0
64.17698887494485,80.90806058670817,1
74.78925295941542,41.57341522824434,0
34.1836400264419,75.2377203360134,0
83.90239366249155,56.30804621605327,1
51.54772026906181,46.85629026349976,0
94.44336776917852,65.56892160559052,1
82.36875375713919,40.61825515970618,0
51.04775177128865,45.82270145776001,0
62.22267576120188,52.06099194836679,0
77.19303492601364,70.45820000180959,1
97.77159928000232,86.7278223300282,1
62.07306379667647,96.76882412413983,1
91.56497449807442,88.69629254546599,1
79.94481794066932,74.16311935043758,1
99.2725269292572,60.99903099844988,1
90.54671411399852,43.39060180650027,1
34.52451385320009,60.39634245837173,0
50.2864961189907,49.80453881323059,0
49.58667721632031,59.80895099453265,0
97.64563396007767,68.86157272420604,1
32.57720016809309,95.59854761387875,0
74.24869136721598,69.82457122657193,1
71.79646205863379,78.45356224515052,1
75.3956114656803,85.75993667331619,1
35.28611281526193,47.02051394723416,0
56.25381749711624,39.26147251058019,0
30.05882244669796,49.59297386723685,0
44.66826172480893,66.45008614558913,0
66.56089447242954,41.09209807936973,0
40.45755098375164,97.53518548909936,1
49.07256321908844,51.88321182073966,0
80.27957401466998,92.11606081344084,1
66.74671856944039,60.99139402740988,1
32.72283304060323,43.30717306430063,0
64.0393204150601,78.03168802018232,1
72.34649422579923,96.22759296761404,1
60.45788573918959,73.09499809758037,1
58.84095621726802,75.85844831279042,1
99.82785779692128,72.36925193383885,1
47.26426910848174,88.47586499559782,1
50.45815980285988,75.80985952982456,1
60.45555629271532,42.50840943572217,0
82.22666157785568,42.71987853716458,0
88.9138964166533,69.80378889835472,1
94.83450672430196,45.69430680250754,1
67.31925746917527,66.58935317747915,1
57.23870631569862,59.51428198012956,1
80.36675600171273,90.96014789746954,1
68.46852178591112,85.59430710452014,1
42.0754545384731,78.84478600148043,0
75.47770200533905,90.42453899753964,1
78.63542434898018,96.64742716885644,1
52.34800398794107,60.76950525602592,0
94.09433112516793,77.15910509073893,1
90.44855097096364,87.50879176484702,1
55.48216114069585,35.57070347228866,0
74.49269241843041,84.84513684930135,1
89.84580670720979,45.35828361091658,1
83.48916274498238,48.38028579728175,1
42.2617008099817,87.10385094025457,1
99.31500880510394,68.77540947206617,1
55.34001756003703,64.9319380069486,1
74.77589300092767,89.52981289513276,1