MachLePublic/PW-3/ex2/ex2-sys-eval-stud.ipynb
Joachim Bach aeb97473ea done ex2
2025-10-04 11:20:37 +02:00

{
"cells": [
{
"cell_type": "markdown",
"id": "bcf79585",
"metadata": {},
"source": [
"# Exercise 2 - System evaluation"
]
},
{
"cell_type": "markdown",
"id": "f642cedb",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "9421a4e1",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "markdown",
"id": "a0d67fa6",
"metadata": {},
"source": [
"## Load data"
]
},
{
"cell_type": "markdown",
"id": "5fe90672",
"metadata": {},
"source": [
"Define the path of the data file"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "ecd4a4cf",
"metadata": {},
"outputs": [],
"source": [
"path = \"ex2-system-a.csv\""
]
},
{
"cell_type": "markdown",
"id": "246e7392",
"metadata": {},
"source": [
"Read the CSV file using `read_csv`"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "623096a5",
"metadata": {},
"outputs": [],
"source": [
"dataset_a = pd.read_csv(path, sep=\";\", index_col=False, names=[\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"y_true\"])"
]
},
{
"cell_type": "markdown",
"id": "6f764c56",
"metadata": {},
"source": [
"Display first rows"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "c59a1651",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" <th>2</th>\n",
" <th>3</th>\n",
" <th>4</th>\n",
" <th>5</th>\n",
" <th>6</th>\n",
" <th>7</th>\n",
" <th>8</th>\n",
" <th>9</th>\n",
" <th>y_true</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>5.348450e-08</td>\n",
" <td>7.493480e-10</td>\n",
" <td>8.083470e-07</td>\n",
" <td>2.082290e-05</td>\n",
" <td>5.222360e-10</td>\n",
" <td>2.330260e-08</td>\n",
" <td>5.241270e-12</td>\n",
" <td>9.999650e-01</td>\n",
" <td>4.808590e-07</td>\n",
" <td>0.000013</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1.334270e-03</td>\n",
" <td>3.202960e-05</td>\n",
" <td>8.504280e-01</td>\n",
" <td>1.669090e-03</td>\n",
" <td>1.546460e-07</td>\n",
" <td>2.412940e-04</td>\n",
" <td>1.448280e-01</td>\n",
" <td>1.122810e-11</td>\n",
" <td>1.456330e-03</td>\n",
" <td>0.000011</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3.643050e-06</td>\n",
" <td>9.962760e-01</td>\n",
" <td>2.045910e-03</td>\n",
" <td>4.210530e-04</td>\n",
" <td>2.194020e-05</td>\n",
" <td>1.644130e-05</td>\n",
" <td>2.838160e-04</td>\n",
" <td>3.722960e-04</td>\n",
" <td>5.150120e-04</td>\n",
" <td>0.000044</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>9.998200e-01</td>\n",
" <td>2.550390e-10</td>\n",
" <td>1.112010e-05</td>\n",
" <td>1.653200e-05</td>\n",
" <td>5.375730e-10</td>\n",
" <td>8.999750e-05</td>\n",
" <td>9.380920e-06</td>\n",
" <td>4.464470e-05</td>\n",
" <td>2.418440e-06</td>\n",
" <td>0.000006</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2.092460e-08</td>\n",
" <td>7.464220e-08</td>\n",
" <td>3.560820e-05</td>\n",
" <td>5.496200e-07</td>\n",
" <td>9.988960e-01</td>\n",
" <td>3.070920e-08</td>\n",
" <td>2.346150e-04</td>\n",
" <td>9.748010e-07</td>\n",
" <td>1.071610e-06</td>\n",
" <td>0.000831</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1 2 3 4 \\\n",
"0 5.348450e-08 7.493480e-10 8.083470e-07 2.082290e-05 5.222360e-10 \n",
"1 1.334270e-03 3.202960e-05 8.504280e-01 1.669090e-03 1.546460e-07 \n",
"2 3.643050e-06 9.962760e-01 2.045910e-03 4.210530e-04 2.194020e-05 \n",
"3 9.998200e-01 2.550390e-10 1.112010e-05 1.653200e-05 5.375730e-10 \n",
"4 2.092460e-08 7.464220e-08 3.560820e-05 5.496200e-07 9.988960e-01 \n",
"\n",
" 5 6 7 8 9 y_true \n",
"0 2.330260e-08 5.241270e-12 9.999650e-01 4.808590e-07 0.000013 7 \n",
"1 2.412940e-04 1.448280e-01 1.122810e-11 1.456330e-03 0.000011 2 \n",
"2 1.644130e-05 2.838160e-04 3.722960e-04 5.150120e-04 0.000044 1 \n",
"3 8.999750e-05 9.380920e-06 4.464470e-05 2.418440e-06 0.000006 0 \n",
"4 3.070920e-08 2.346150e-04 9.748010e-07 1.071610e-06 0.000831 4 "
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_a.head()"
]
},
{
"cell_type": "markdown",
"id": "41f040b0",
"metadata": {},
"source": [
"Store some useful statistics (class names + number of classes)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "fd0adce4",
"metadata": {},
"outputs": [],
"source": [
"class_names = [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\"]\n",
"nb_classes = len(class_names)"
]
},
{
"cell_type": "markdown",
"id": "5a0ab85a",
"metadata": {},
"source": [
"## Exercise's steps"
]
},
{
"cell_type": "markdown",
"id": "66ae582e",
"metadata": {},
"source": [
"a) Write a function to take classification decisions on such outputs according to Bayes' rule."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "3c36b377",
"metadata": {},
"outputs": [],
"source": [
"def bayes_classification(df):\n",
"    \"\"\"\n",
"    Take classification decisions according to Bayes rule.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    df : Pandas DataFrame of shape (n_samples, n_features + ground truth)\n",
"        Dataset.\n",
"\n",
"    Returns\n",
"    -------\n",
"    preds : Numpy array of shape (n_samples,)\n",
"        Class labels for each data sample.\n",
"    \"\"\"\n",
"    # Keep only the class-score columns (drop the trailing y_true column)\n",
"    scores = df.iloc[:, :-1].to_numpy()\n",
"    # Bayes rule: pick the class with the highest posterior for each sample\n",
"    return np.argmax(scores, axis=1)\n"
]
},
{
"cell_type": "markdown",
"id": "b5e8140b",
"metadata": {},
"source": [
"b) What is the overall error rate of the system?"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "f3b21bfb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error rate = 0.10729999999999995\n"
]
}
],
"source": [
"# Your code here: compute and print the error rate of the system\n",
"y_pred_a = bayes_classification(dataset_a)\n",
"\n",
"# Count the samples whose predicted class matches the ground truth (column 10)\n",
"correct = 0\n",
"for i in range(0, len(y_pred_a)):\n",
"    if dataset_a.iloc[i, 10] == y_pred_a[i]:\n",
"        correct += 1\n",
"\n",
"success = correct / len(y_pred_a)\n",
"print(f\"Error rate = {1 - success}\")"
]
},
{
"cell_type": "markdown",
"id": "a4f0fa5f",
"metadata": {},
"source": [
"c) Compute and report the confusion matrix of the system."
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "bb106415",
"metadata": {},
"outputs": [],
"source": [
"def confusion_matrix(y_true, y_pred, n_classes):\n",
"    \"\"\"\n",
"    Compute the confusion matrix.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    y_true : Numpy array of shape (n_samples,)\n",
"        Ground truth.\n",
"    y_pred : Numpy array of shape (n_samples,)\n",
"        Predictions.\n",
"    n_classes : Integer\n",
"        Number of classes.\n",
"\n",
"    Returns\n",
"    -------\n",
"    cm : Numpy array of shape (n_classes, n_classes)\n",
"        Confusion matrix.\n",
"    \"\"\"\n",
"    matrix = np.zeros((n_classes, n_classes))\n",
"\n",
"    # Rows are true classes, columns are predicted classes\n",
"    for t, p in zip(y_true, y_pred):\n",
"        matrix[int(t), int(p)] += 1\n",
"\n",
"    return matrix"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "1b38e3a8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"          0    1    2    3    4    5    6    7    8    9\n",
"   0 |  944    0   11    0    0    2   10    7    5    1\n",
"   1 |    0 1112    2    3    1    4    3    1    9    0\n",
"   2 |   10    6  921   12   15    3   19   15   26    5\n",
"t  3 |    1    1   31  862    2   72    5   14   12   10\n",
"r  4 |    2    3    6    2  910    1   12    6    4   36\n",
"u  5 |   12    3    6   29   19  768   19    9   21    6\n",
"e  6 |   14    3   21    2   22   28  865    0    3    0\n",
"   7 |    0   14   30    9    7    2    1  929    3   33\n",
"   8 |   12   16   18   26   24   46   22   19  772   19\n",
"   9 |   10    4    6   22   53   18    0   48    4  844\n",
"                          predicted\n"
]
}
],
"source": [
"# Your code here: compute and print the confusion matrix\n",
"\n",
"cm_a = confusion_matrix(dataset_a.iloc[:, 10], y_pred_a, nb_classes)\n",
"\n",
"# Header: predicted class labels\n",
"print(\"      \", end=\"\")\n",
"for j in range(nb_classes):\n",
"    print(f\"{j:5d}\", end=\"\")\n",
"print()\n",
"\n",
"# Rows: one per true class; the margin letters spell 'true' vertically\n",
"for i in range(nb_classes):\n",
"    match i:\n",
"        case 3:\n",
"            print(\"t\", end=\"\")\n",
"        case 4:\n",
"            print(\"r\", end=\"\")\n",
"        case 5:\n",
"            print(\"u\", end=\"\")\n",
"        case 6:\n",
"            print(\"e\", end=\"\")\n",
"        case _:\n",
"            print(\" \", end=\"\")\n",
"\n",
"    print(f\"{i:3d} |\", end=\"\")\n",
"    for j in range(nb_classes):\n",
"        print(f\"{int(cm_a[i, j]):5d}\", end=\"\")\n",
"\n",
"    print()\n",
"\n",
"\n",
"print(\"                          predicted\")\n",
"# print(cm_a.astype(int))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0cf5380f",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "ed8db908",
"metadata": {},
"source": [
"d) What are the worst and best classes in terms of precision and recall?"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "0e229ce0",
"metadata": {},
"outputs": [],
"source": [
"def precision_per_class(cm):\n",
"    \"\"\"\n",
"    Compute the precision per class.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    cm : Numpy array of shape (n_classes, n_classes)\n",
"        Confusion matrix (rows = true classes, columns = predicted classes).\n",
"\n",
"    Returns\n",
"    -------\n",
"    precisions : Numpy array of shape (n_classes,)\n",
"        Precision per class.\n",
"    \"\"\"\n",
"    # Column i holds every sample predicted as class i, so\n",
"    # precision = TP / (TP + FP) = diagonal / column sum\n",
"    return np.diag(cm) / np.sum(cm, axis=0)"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "95325772",
"metadata": {},
"outputs": [],
"source": [
"def recall_per_class(cm):\n",
"    \"\"\"\n",
"    Compute the recall per class.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    cm : Numpy array of shape (n_classes, n_classes)\n",
"        Confusion matrix (rows = true classes, columns = predicted classes).\n",
"\n",
"    Returns\n",
"    -------\n",
"    recalls : Numpy array of shape (n_classes,)\n",
"        Recall per class.\n",
"    \"\"\"\n",
"    # Row i holds every sample whose true class is i, so\n",
"    # recall = TP / (TP + FN) = diagonal / row sum\n",
"    return np.diag(cm) / np.sum(cm, axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "a0fb19e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Class 0, precision = 0.9393034825870646\n",
"Class 1, precision = 0.9569707401032702\n",
"Class 2, precision = 0.8754752851711026\n",
"Class 3, precision = 0.8914167528438469\n",
"Class 4, precision = 0.8641975308641975\n",
"Class 5, precision = 0.8135593220338984\n",
"Class 6, precision = 0.9048117154811716\n",
"Class 7, precision = 0.8864503816793893\n",
"Class 8, precision = 0.8987194412107101\n",
"Class 9, precision = 0.8846960167714885\n",
"\n",
"Best = class 1, 0.9569707401032702\n",
"Worst = class 5, 0.8135593220338984\n"
]
}
],
"source": [
"# Your code here: find and print the worst and best classes in terms of precision\n",
"precision_a = precision_per_class(cm_a)\n",
"\n",
"for i in range(len(precision_a)):\n",
"    print(f\"Class {i}, precision = {precision_a[i]}\")\n",
"\n",
"print(\"\")\n",
"\n",
"print(f\"Best = class {np.argmax(precision_a)}, {precision_a[np.argmax(precision_a)]}\")\n",
"print(f\"Worst = class {np.argmin(precision_a)}, {precision_a[np.argmin(precision_a)]}\")\n"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "42c3edd8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Class 0, recall = 0.963265306122449\n",
"Class 1, recall = 0.9797356828193833\n",
"Class 2, recall = 0.8924418604651163\n",
"Class 3, recall = 0.8534653465346534\n",
"Class 4, recall = 0.9266802443991853\n",
"Class 5, recall = 0.8609865470852018\n",
"Class 6, recall = 0.9029227557411273\n",
"Class 7, recall = 0.9036964980544747\n",
"Class 8, recall = 0.7926078028747433\n",
"Class 9, recall = 0.8364717542120912\n",
"\n",
"Best = class 1, 0.9797356828193833\n",
"Worst = class 8, 0.7926078028747433\n"
]
}
],
"source": [
"# Your code here: find and print the worst and best classes in terms of recall\n",
"\n",
"recall_a = recall_per_class(cm_a)\n",
"\n",
"for i in range(len(recall_a)):\n",
"    print(f\"Class {i}, recall = {recall_a[i]}\")\n",
"\n",
"print(\"\")\n",
"\n",
"print(f\"Best = class {np.argmax(recall_a)}, {recall_a[np.argmax(recall_a)]}\")\n",
"print(f\"Worst = class {np.argmin(recall_a)}, {recall_a[np.argmin(recall_a)]}\")\n"
]
},
{
"cell_type": "markdown",
"id": "7ac6fe5d",
"metadata": {},
"source": [
"e) In file `ex2-system-b.csv` you will find the output of a second system B. Which of systems A and B is better in terms of error rate and F1?"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "b98c2545",
"metadata": {},
"outputs": [],
"source": [
"# Your code here: load the data of the system B\n",
"path = \"ex2-system-b.csv\"\n",
"dataset_b = pd.read_csv(path, sep=\";\", index_col=False, names=[\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"y_true\"])\n"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "050091b9",
"metadata": {},
"outputs": [],
"source": [
"def system_accuracy(cm):\n",
"    \"\"\"\n",
"    Compute the system accuracy.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    cm : Numpy array of shape (n_classes, n_classes)\n",
"        Confusion matrix.\n",
"\n",
"    Returns\n",
"    -------\n",
"    accuracy : Float\n",
"        Accuracy of the system.\n",
"    \"\"\"\n",
"    # Correct predictions lie on the diagonal of the confusion matrix\n",
"    return np.trace(cm) / np.sum(cm)"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "adc0f138",
"metadata": {},
"outputs": [],
"source": [
"def system_f1_score(cm):\n",
"    \"\"\"\n",
"    Compute the system F1 score (unweighted mean of the per-class F1 scores).\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    cm : Numpy array of shape (n_classes, n_classes)\n",
"        Confusion matrix.\n",
"\n",
"    Returns\n",
"    -------\n",
"    f1_score : Float\n",
"        F1 score of the system.\n",
"    \"\"\"\n",
"    precision = np.asarray(precision_per_class(cm))\n",
"    recall = np.asarray(recall_per_class(cm))\n",
"\n",
"    # Per-class F1 is the harmonic mean of precision and recall\n",
"    f1 = 2 * (precision * recall) / (precision + recall)\n",
"    return np.sum(f1) / len(f1)\n"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "f1385c87",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"System A accuracy = 0.8927\n",
"System A f1 = 0.8907308492877297\n"
]
}
],
"source": [
"# Your code here: compute and print the accuracy and the F1 score of the system A\n",
"\n",
"acc_a = system_accuracy(cm_a)\n",
"print(f\"System A accuracy = {acc_a}\")\n",
"\n",
"f1_a = system_f1_score(cm_a)\n",
"\n",
"print(f\"System A f1 = {f1_a}\")"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "50c64d08",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"System B accuracy = 0.9613\n",
"System B f1 = 0.9608568150389065\n"
]
}
],
"source": [
"# Your code here: compute and print the accuracy and the F1 score of the system B\n",
"y_pred_b = bayes_classification(dataset_b)\n",
"cm_b = confusion_matrix(dataset_b.iloc[:, 10], y_pred_b, nb_classes)\n",
"\n",
"acc_b = system_accuracy(cm_b)\n",
"print(f\"System B accuracy = {acc_b}\")\n",
"\n",
"f1_b = system_f1_score(cm_b)\n",
"\n",
"print(f\"System B f1 = {f1_b}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}