{ "cells": [ { "cell_type": "markdown", "id": "5da8da61", "metadata": {}, "source": [ "# Exercise 2: Classification system with KNN - To Loan or Not To Loan" ] }, { "cell_type": "markdown", "id": "9669e493", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "markdown", "id": "22bbd869", "metadata": {}, "source": [ "Import the libraries used throughout the exercise." ] }, { "cell_type": "code", "execution_count": 1, "id": "26758936", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from sklearn.preprocessing import OrdinalEncoder, StandardScaler\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "markdown", "id": "abc131ca", "metadata": {}, "source": [ "## a. Getting started" ] }, { "cell_type": "markdown", "id": "45b518e5", "metadata": {}, "source": [ "### Data loading" ] }, { "cell_type": "markdown", "id": "1ef061f2", "metadata": {}, "source": [ "The original dataset comes from Kaggle's [Loan Prediction](https://www.kaggle.com/ninzaami/loan-predication) problem. The provided dataset has already been preprocessed: some columns and invalid data were removed. Pandas is used to read the CSV file." ] }, { "cell_type": "code", "execution_count": 2, "id": "a23f62b5", "metadata": {}, "outputs": [], "source": [ "data = pd.read_csv(\"loandata.csv\")" ] }, { "cell_type": "markdown", "id": "02ca77c7", "metadata": {}, "source": [ "Display the first rows of the data." ] }, { "cell_type": "code", "execution_count": 3, "id": "f4bec500", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | Gender | Married | Education | TotalIncome | LoanAmount | CreditHistory | LoanStatus |
|---|---|---|---|---|---|---|---|
| 0 | Male | Yes | Graduate | 6091.0 | 128.0 | 1.0 | N |
| 1 | Male | Yes | Graduate | 3000.0 | 66.0 | 1.0 | Y |
| 2 | Male | Yes | Not Graduate | 4941.0 | 120.0 | 1.0 | Y |
| 3 | Male | No | Graduate | 6000.0 | 141.0 | 1.0 | Y |
| 4 | Male | Yes | Graduate | 9613.0 | 267.0 | 1.0 | Y |