{ "cells": [ { "cell_type": "markdown", "id": "31256bb0", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "5194a212", "metadata": {}, "source": [ "# Optimization in Machine Learning" ] }, { "cell_type": "markdown", "id": "21f34e76", "metadata": {}, "source": [ "
\n", " Table of Contents \n", "\n", "1. [Introduction](#introduction)\n", "2. [K-Fold](#kfold)\n", "3. [GridSearchCV](#grid)\n", "4. [Exercise](#exercise)" ] }, { "cell_type": "markdown", "id": "183bd208", "metadata": {}, "source": [ "
\n", " I. Introduction \n", "
" ] }, { "cell_type": "markdown", "id": "dd5cc007", "metadata": {}, "source": [ "In machine learning, optimization refers to the process of improving a model’s performance by finding the best possible configuration of its parameters and hyperparameters. The goal is to build a model that not only fits the training data well but also generalizes effectively to unseen data." ] }, { "cell_type": "code", "execution_count": 470, "id": "ad48043f", "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import load_iris\n", "import numpy as np\n", "from sklearn.model_selection import KFold\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.model_selection import GridSearchCV\n", "from sklearn.metrics import accuracy_score, classification_report\n", "from sklearn.svm import SVC as svm" ] }, { "cell_type": "markdown", "id": "56f35a5a", "metadata": {}, "source": [ "We will again use the Iris dataset. To keep things simple, we keep only the first two features (sepal length and sepal width) and turn the problem into a binary one: setosa (1) versus the other two species (0)." ] }, { "cell_type": "code", "execution_count": 471, "id": "0f4b4d60", "metadata": {}, "outputs": [], "source": [ "iris = load_iris()\n", "X = iris.data[:, :2]  # first two features: sepal length and sepal width\n", "y = (iris.target == 0).astype(int)  # 1 = setosa, 0 = any other species" ] }, { "cell_type": "markdown", "id": "eb13b663", "metadata": {}, "source": [ "
\n", " II. K-Folding \n", "
" ] }, { "cell_type": "markdown", "id": "1f6c3f88", "metadata": {}, "source": [ "K-Fold Cross Validation is a model evaluation technique that gives a more robust measure of a classification model's performance, especially when the available dataset is limited. It splits the dataset into k subsets or \"folds\"; each fold is used exactly once for validation while the remaining k-1 folds are used for training. Averaging the k scores gives a more reliable estimate of the model’s ability to generalize to new data.\n", "\n", "A model can perform well on training data but poorly on new data (overfitting). K-Fold Cross Validation helps detect this, because every score is computed on data the model did not see during training.\n", "\n", "To perform this task, we again use scikit-learn, specifically the `KFold` class. Its constructor has the following parameters:\n", "\n", "- `n_splits`: Number of folds.\n", "\n", "- `shuffle`: Boolean indicating whether the data should be shuffled before splitting.\n", "\n", "- `random_state`: Random seed used when `shuffle=True`.\n", "\n", "Its `split` method yields, for each fold, the row indices of the training set and of the test set." ] }, { "cell_type": "markdown", "id": "4dbfaeb6", "metadata": {}, "source": [ "In the code below, we create the ``KFold`` object with these parameters: \n", "- ``n_splits``: how many folds you want.\n", "- ``shuffle``: whether the data is shuffled randomly before splitting.\n", "- ``random_state``: the seed that makes the shuffling reproducible." ] }, { "cell_type": "markdown", "id": "a8b23ac0", "metadata": {}, "source": [ "We can define a function that performs the following steps: \n", "- Split the data into k folds.\n", "- Build the training and test sets for each fold.\n", "- Train a model.\n", "- Test it and evaluate its performance.\n", "\n", "The function returns the average performance, the score of each fold and the best model." ] }, { "cell_type": "code", "execution_count": 472, "id": "e25874d0", "metadata": {}, "outputs": [], "source": [ "def cross_validate(model, X, y, k=5):\n", "    # 1. 
We create a K-Fold object that will split the data into k parts,\n", "    # shuffle the dataset before splitting and use a fixed seed\n", "    from copy import deepcopy  # local import: used to snapshot the best fitted model\n", "    kf = KFold(n_splits=k, shuffle=True, random_state=42)\n", "\n", "    # Variables to store results\n", "    scores = []\n", "    best_score = -1\n", "    best_model = None\n", "\n", "    # split() yields the indices of the rows used for training and testing in each fold\n", "    for train_index, test_index in kf.split(X):\n", "        # split data\n", "        X_train, X_test = X[train_index], X[test_index]\n", "        y_train, y_test = y[train_index], y[test_index]\n", "\n", "        # train model\n", "        model.fit(X_train, y_train)\n", "\n", "        # make prediction\n", "        y_pred = model.predict(X_test)\n", "\n", "        # evaluate performance\n", "        score = accuracy_score(y_test, y_pred)\n", "        scores.append(score)\n", "\n", "        # store a copy of the fitted model: model itself is refit in the next fold\n", "        if score > best_score:\n", "            best_score = score\n", "            best_model = deepcopy(model)\n", "\n", "    return np.mean(scores), scores, best_model  # mean accuracy, per-fold accuracies and best model" ] }, { "cell_type": "markdown", "id": "6376ee71", "metadata": {}, "source": [ "Let's see how this works. We create a Logistic Regression model." ] }, { "cell_type": "code", "execution_count": 473, "id": "05392a88", "metadata": {}, "outputs": [], "source": [ "model = LogisticRegression()" ] }, { "cell_type": "markdown", "id": "b7ec89a2", "metadata": {}, "source": [ "And compute cross validation:" ] }, { "cell_type": "code", "execution_count": 474, "id": "0365ae3f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean score:1.0\n", "Scores: [1.0, 1.0, 1.0, 1.0, 1.0]\n" ] } ], "source": [ "mean_score, scores, best_model = cross_validate(model, X, y, 5)\n", "print(f\"Mean score:{mean_score}\")\n", "print(f\"Scores: {scores}\")" ] }, { "cell_type": "markdown", "id": "0559c29d", "metadata": {}, "source": [ "What can we do with the best_model variable? \n", "\n", "We can print the hyperparameters with which the model was trained. 
" ] }, { "cell_type": "code", "execution_count": 475, "id": "46fc21d0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'C': 1.0, 'class_weight': None, 'dual': False, 'fit_intercept': True, 'intercept_scaling': 1, 'l1_ratio': None, 'max_iter': 100, 'multi_class': 'deprecated', 'n_jobs': None, 'penalty': 'l2', 'random_state': None, 'solver': 'lbfgs', 'tol': 0.0001, 'verbose': 0, 'warm_start': False}\n" ] } ], "source": [ "print(best_model.get_params())" ] }, { "cell_type": "markdown", "id": "4f9f5399", "metadata": {}, "source": [ "
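The manual loop from section II can also be written with scikit-learn's `cross_val_score` helper. The sketch below is a minimal, self-contained version under the same assumptions as the notebook (Iris data, first two features, setosa versus the rest, 5 shuffled folds with seed 42). It is an alternative illustration, not the function defined above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Same data preparation as in the introduction
iris = load_iris()
X = iris.data[:, :2]
y = (iris.target == 0).astype(int)

# Same splitter as in the manual loop: 5 shuffled folds, fixed seed
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# cross_val_score fits a fresh copy of the estimator on each fold
scores = cross_val_score(LogisticRegression(), X, y, cv=kf, scoring='accuracy')

print('Scores:', scores)
print('Mean score:', scores.mean())
```

Passing the `KFold` object as `cv` keeps the shuffling and the seed explicit. Passing an integer instead lets scikit-learn choose the splitter, which for a classifier means stratified folds without shuffling.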
\n", " III. Gridsearch \n", "
" ] }, { "cell_type": "markdown", "id": "ddcdd014", "metadata": {}, "source": [ "In the training of machine learning models, a hyperparameter is any parameter that is not directly learned during the training process but must be defined before training begins. Hyperparameters are configuration choices that influence the behavior of the model, as well as its ability to learn and generalize to new data.\n", "\n", "Unlike parameters that are determined during the training of the model (such as weights in a neural network), hyperparameters must be set beforehand, typically based on the designer's experience, trial and error, or search techniques.\n", "\n", "GridSearchCV automates the search for the best combination of hyperparameters by:\n", "\n", "- Trying all possible combinations from a predefined grid\n", "- Evaluating each one using cross-validation\n", "- Selecting the configuration with the best average performance\n", "\n", "This way the hyperparameters are chosen systematically rather than by hand." ] }, { "cell_type": "markdown", "id": "cc10deeb", "metadata": {}, "source": [ "In order to use GridSearchCV, we need to create a dictionary that maps each hyperparameter name to the list of values we want to try with our model. The grid below defines 3 × 2 × 2 = 12 combinations:" ] }, { "cell_type": "code", "execution_count": 476, "id": "324d196e", "metadata": {}, "outputs": [], "source": [ "param_grid = {\n", "    'C': [0.1, 1, 10],\n", "    'kernel': ['linear', 'rbf'],\n", "    'gamma': ['scale', 'auto']\n", "}" ] }, { "cell_type": "markdown", "id": "7f967c2c", "metadata": {}, "source": [ "Then, we can create a ``GridSearchCV`` object with the following parameters: \n", "- ``estimator``: The model you want to optimize.\n", "- ``param_grid``: Dictionary of parameters to try.\n", "- ``cv``: Number of cross-validation folds; here we use 5.\n", "- ``scoring``: Metric used to compare models, given as a string such as ``'accuracy'``, ``'precision'``, ``'recall'`` or ``'f1'``.\n", "- ``verbose``: Prints progress while running if set to 1."
] }, { "cell_type": "code", "execution_count": 477, "id": "8a4a6492", "metadata": {}, "outputs": [], "source": [ "model = svm()  # svm is the alias of sklearn.svm.SVC imported above\n", "grid = GridSearchCV(\n", "    estimator=model,\n", "    param_grid=param_grid,\n", "    cv=5,\n", "    scoring='accuracy',\n", "    verbose=1\n", ")" ] }, { "cell_type": "markdown", "id": "08602436", "metadata": {}, "source": [ "Once the object is created, we call ``fit`` to run the grid search:" ] }, { "cell_type": "code", "execution_count": 478, "id": "c9f31e6d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fitting 5 folds for each of 12 candidates, totalling 60 fits\n" ] }, { "data": { "text/html": [ "
GridSearchCV(cv=5, estimator=SVC(),\n",
       "             param_grid={'C': [0.1, 1, 10], 'gamma': ['scale', 'auto'],\n",
       "                         'kernel': ['linear', 'rbf']},\n",
       "             scoring='accuracy', verbose=1)
" ], "text/plain": [ "GridSearchCV(cv=5, estimator=SVC(),\n", " param_grid={'C': [0.1, 1, 10], 'gamma': ['scale', 'auto'],\n", " 'kernel': ['linear', 'rbf']},\n", " scoring='accuracy', verbose=1)" ] }, "execution_count": 478, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grid.fit(X, y)" ] }, { "cell_type": "markdown", "id": "67a81064", "metadata": {}, "source": [ "To see the results, we can read the attributes ``best_params_`` and ``best_score_``:" ] }, { "cell_type": "code", "execution_count": 479, "id": "d56c05d7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best parameters: {'C': 0.1, 'gamma': 'scale', 'kernel': 'linear'}\n", "Best accuracy: 1.0\n" ] } ], "source": [ "print(\"Best parameters:\", grid.best_params_)\n", "print(\"Best accuracy:\", grid.best_score_)" ] }, { "cell_type": "markdown", "id": "9b4d27a3", "metadata": {}, "source": [ "We can obtain the best model through the attribute ``best_estimator_``. By default it has been refit on all the data passed to ``fit``:" ] }, { "cell_type": "code", "execution_count": 480, "id": "088dacc7", "metadata": {}, "outputs": [], "source": [ "best_model = grid.best_estimator_" ] }, { "cell_type": "markdown", "id": "9a64ad45", "metadata": {}, "source": [ "Then, we can make predictions and compute more metrics. Keep in mind that here we predict on the same data used for the search, so the report is optimistic: " ] }, { "cell_type": "code", "execution_count": 481, "id": "111d8f8f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Report:\n", " precision recall f1-score support\n", "\n", " 0 1.00 1.00 1.00 100\n", " 1 1.00 1.00 1.00 50\n", "\n", " accuracy 1.00 150\n", " macro avg 1.00 1.00 1.00 150\n", "weighted avg 1.00 1.00 1.00 150\n", "\n" ] } ], "source": [ "predictions = best_model.predict(X)\n", "print(\"\\nReport:\\n\", classification_report(y, predictions))" ] }, { "cell_type": "markdown", "id": "2d5a9dd8", "metadata": {}, "source": [ "
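The report at the end of section III is computed on the same samples that were used for the search. A minimal sketch of the alternative, holding out a test set before searching, could look as follows. The 70/30 stratified split and the seed are assumptions made for illustration and are not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Same data preparation as in the introduction
iris = load_iris()
X = iris.data[:, :2]
y = (iris.target == 0).astype(int)

# Assumed split: 30% of the samples are kept aside and never seen by the search
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Same grid as in section III
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf'], 'gamma': ['scale', 'auto']}

grid = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)

# cv_results_ holds one entry per hyperparameter combination
n_candidates = len(grid.cv_results_['params'])

# best_estimator_ was refit on the whole training split; score it on held-out data
test_accuracy = accuracy_score(y_test, grid.best_estimator_.predict(X_test))

print('Candidates tried:', n_candidates)
print('Best parameters:', grid.best_params_)
print('Held-out accuracy:', test_accuracy)
```

The cross-validated `best_score_` guides the choice of hyperparameters, while the held-out accuracy estimates how the chosen model behaves on unseen data.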
\n", " IV. Exercise \n", "
" ] }, { "cell_type": "markdown", "id": "679a346a", "metadata": {}, "source": [ "In this exercise, we will use the Fish Market dataset, which records several species of fish together with their weight and body measurements.\n", "\n", "You can download the dataset here: [Link](https://github.com/bmalcover/AppOC/tree/main/docs/_static/01/Fishers%20maket.csv)\n", "\n", "You can find more information about the dataset here: [Link](https://www.kaggle.com/datasets/vipullrathod/fish-market)\n" ] }, { "cell_type": "code", "execution_count": 482, "id": "4c7809b5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Species</th><th>Weight</th><th>Length1</th><th>Length2</th><th>Length3</th><th>Height</th><th>Width</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Bream</td><td>242.0</td><td>23.2</td><td>25.4</td><td>30.0</td><td>11.5200</td><td>4.0200</td></tr>\n",
"    <tr><th>1</th><td>Bream</td><td>290.0</td><td>24.0</td><td>26.3</td><td>31.2</td><td>12.4800</td><td>4.3056</td></tr>\n",
"    <tr><th>2</th><td>Bream</td><td>340.0</td><td>23.9</td><td>26.5</td><td>31.1</td><td>12.3778</td><td>4.6961</td></tr>\n",
"    <tr><th>3</th><td>Bream</td><td>363.0</td><td>26.3</td><td>29.0</td><td>33.5</td><td>12.7300</td><td>4.4555</td></tr>\n",
"    <tr><th>4</th><td>Bream</td><td>430.0</td><td>26.5</td><td>29.0</td><td>34.0</td><td>12.4440</td><td>5.1340</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " Species Weight Length1 Length2 Length3 Height Width\n", "0 Bream 242.0 23.2 25.4 30.0 11.5200 4.0200\n", "1 Bream 290.0 24.0 26.3 31.2 12.4800 4.3056\n", "2 Bream 340.0 23.9 26.5 31.1 12.3778 4.6961\n", "3 Bream 363.0 26.3 29.0 33.5 12.7300 4.4555\n", "4 Bream 430.0 26.5 29.0 34.0 12.4440 5.1340" ] }, "execution_count": 482, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.preprocessing import StandardScaler\n", "import seaborn as sns\n", "from sklearn.metrics import mean_absolute_error, mean_squared_error, accuracy_score, confusion_matrix, classification_report\n", "\n", "df = pd.read_csv(\"Fishers maket.csv\")\n", "df.head()" ] }, { "cell_type": "markdown", "id": "7bd0dff7", "metadata": {}, "source": [ "After reading the dataset, let's see if there is any NaN value and other statistical information. " ] }, { "cell_type": "code", "execution_count": 483, "id": "0f7d1a7f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Weight</th><th>Length1</th><th>Length2</th><th>Length3</th><th>Height</th><th>Width</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>159.000000</td><td>159.000000</td><td>159.000000</td><td>159.000000</td><td>159.000000</td><td>159.000000</td></tr>\n",
"    <tr><th>mean</th><td>398.326415</td><td>26.247170</td><td>28.415723</td><td>31.227044</td><td>8.970994</td><td>4.417486</td></tr>\n",
"    <tr><th>std</th><td>357.978317</td><td>9.996441</td><td>10.716328</td><td>11.610246</td><td>4.286208</td><td>1.685804</td></tr>\n",
"    <tr><th>min</th><td>0.000000</td><td>7.500000</td><td>8.400000</td><td>8.800000</td><td>1.728400</td><td>1.047600</td></tr>\n",
"    <tr><th>25%</th><td>120.000000</td><td>19.050000</td><td>21.000000</td><td>23.150000</td><td>5.944800</td><td>3.385650</td></tr>\n",
"    <tr><th>50%</th><td>273.000000</td><td>25.200000</td><td>27.300000</td><td>29.400000</td><td>7.786000</td><td>4.248500</td></tr>\n",
"    <tr><th>75%</th><td>650.000000</td><td>32.700000</td><td>35.500000</td><td>39.650000</td><td>12.365900</td><td>5.584500</td></tr>\n",
"    <tr><th>max</th><td>1650.000000</td><td>59.000000</td><td>63.400000</td><td>68.000000</td><td>18.957000</td><td>8.142000</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " Weight Length1 Length2 Length3 Height Width\n", "count 159.000000 159.000000 159.000000 159.000000 159.000000 159.000000\n", "mean 398.326415 26.247170 28.415723 31.227044 8.970994 4.417486\n", "std 357.978317 9.996441 10.716328 11.610246 4.286208 1.685804\n", "min 0.000000 7.500000 8.400000 8.800000 1.728400 1.047600\n", "25% 120.000000 19.050000 21.000000 23.150000 5.944800 3.385650\n", "50% 273.000000 25.200000 27.300000 29.400000 7.786000 4.248500\n", "75% 650.000000 32.700000 35.500000 39.650000 12.365900 5.584500\n", "max 1650.000000 59.000000 63.400000 68.000000 18.957000 8.142000" ] }, "execution_count": 483, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "id": "30f82c3c", "metadata": {}, "source": [ "There are no None or NaN values. We decided to classify between ``Perch`` and ``No Perch``. The ``No Perch`` class will contain the ``Bream`` and ``Roach`` species, so we keep only the rows of these three species. " ] }, { "cell_type": "code", "execution_count": 484, "id": "e4079bf4", "metadata": {}, "outputs": [], "source": [ "df = df[df[\"Species\"].isin([\"Perch\", \"Bream\", \"Roach\"])].copy()" ] }, { "cell_type": "markdown", "id": "e1af8a13", "metadata": {}, "source": [ "Let's create our input features variable (X) and our target variable (y). The target is 1 if the fish is a ``Perch`` and 0 otherwise: " ] }, { "cell_type": "code", "execution_count": 485, "id": "1836f786", "metadata": {}, "outputs": [], "source": [ "X = df.drop(\"Species\", axis=1)\n", "y = (df[\"Species\"] == \"Perch\").astype(int)" ] }, { "cell_type": "markdown", "id": "2988ad11", "metadata": {}, "source": [ "
\n", "\n", "We decided to use the ``Weight`` and ``Height`` variables as input features. Feel free to try other variables and see what happens!" ] }, { "cell_type": "code", "execution_count": 486, "id": "36fbddbb", "metadata": {}, "outputs": [], "source": [ "X_2 = df[[\"Weight\", \"Height\"]]" ] }, { "cell_type": "markdown", "id": "eba913ba", "metadata": {}, "source": [ "
\n", "\n", "__Task 1__\n", "\n", "Split the data into training and test sets, using ``X_2`` as the input features. The cells below expect the resulting feature sets to be named ``X_train`` and ``X_test``." ] }, { "cell_type": "code", "execution_count": null, "id": "1f1191a7", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "4e8cbaeb", "metadata": {}, "source": [ "If we take a look at our features, we can see that ``Weight`` has much larger values than ``Height``:" ] }, { "cell_type": "code", "execution_count": 488, "id": "d43b037b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Weight Height\n", "15 600.0 15.4380\n", "21 685.0 15.9936\n", "101 218.0 7.1680\n", "26 720.0 16.3618\n", "127 1000.0 12.4888\n", ".. ... ...\n", "86 120.0 6.1100\n", "102 300.0 8.3230\n", "41 110.0 6.1677\n", "30 920.0 18.0369\n", "76 70.0 4.5880\n", "\n", "[77 rows x 2 columns]\n" ] } ], "source": [ "print(X_train)" ] }, { "cell_type": "markdown", "id": "47ef1649", "metadata": {}, "source": [ "Models such as SVM or regularized Logistic Regression are sensitive to the scale of the features: a feature with large values tends to count as more important just because its numbers are bigger. Without scaling, ``Weight`` would dominate the process, so we need to scale the data. \n", "\n", "We are going to use ``StandardScaler``, which standardizes each feature as follows: \n", "\n", "$$z = \\frac{x - \\mu}{\\sigma}$$\n", "\n", "where $\\mu$ is the mean of the feature and $\\sigma$ its standard deviation." ] }, { "cell_type": "code", "execution_count": 489, "id": "bc4fec71", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
StandardScaler()
" ], "text/plain": [ "StandardScaler()" ] }, "execution_count": 489, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scaler = StandardScaler()\n", "scaler.fit(X_train)" ] }, { "cell_type": "markdown", "id": "546b5e20", "metadata": {}, "source": [ "We use ``fit`` so that the scaler learns, from the training data, the mean and the standard deviation of each feature. " ] }, { "cell_type": "code", "execution_count": 490, "id": "194b515a", "metadata": {}, "outputs": [], "source": [ "X_train_scaled = scaler.transform(X_train)\n", "X_test_scaled = scaler.transform(X_test)" ] }, { "cell_type": "markdown", "id": "b9de0603", "metadata": {}, "source": [ "We then use ``transform`` to standardize each value: \n", "\n", "$$x' = \\frac{x - \\mu}{\\sigma}$$\n", "\n", "where $\\mu$ and $\\sigma$ are the mean and standard deviation learned from the training data.\n", "\n", "We also transform the test data using the same scaler as the training data. As the test data simulates real-world, unseen data, it must be transformed using the statistics learned from the training set. " ] }, { "cell_type": "markdown", "id": "48eb72b0", "metadata": {}, "source": [ "
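To make the scaling step concrete, the sketch below checks by hand that `StandardScaler` applies the mean and standard deviation learned from the training data to new data. The arrays are toy data: a few (Weight, Height) rows copied from the printout above, used only for illustration. The names `X_train_toy` and `X_new_toy` are hypothetical and do not replace the split requested in Task 1.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (illustration only): a few (Weight, Height) rows from the printed X_train
X_train_toy = np.array([
    [600.0, 15.4380],
    [685.0, 15.9936],
    [218.0, 7.1680],
    [720.0, 16.3618],
])
# One more row, playing the role of unseen data
X_new_toy = np.array([[1000.0, 12.4888]])

# fit learns the per-feature mean and standard deviation from the training rows only
scaler = StandardScaler().fit(X_train_toy)

# Manual z-score with the training statistics (np.std defaults to the
# population standard deviation, which is what StandardScaler uses)
manual = (X_new_toy - X_train_toy.mean(axis=0)) / X_train_toy.std(axis=0)

matches = np.allclose(scaler.transform(X_new_toy), manual)
print('Learned means:', scaler.mean_)
print('Manual and scaler results match:', matches)
```

The new row is scaled with the training statistics, not with its own, which is why the scaler is fit once and then reused for the test data.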
\n", "\n", "__Task 2__\n", "\n", "Create a Logistic Regression or SVM model and define a parameter grid with the hyperparameter values you want to try: " ] }, { "cell_type": "code", "execution_count": null, "id": "5144e278", "metadata": {}, "outputs": [], "source": [ "model =..." ] }, { "cell_type": "code", "execution_count": null, "id": "8eaa4f4d", "metadata": {}, "outputs": [], "source": [ "param_grid = {\n", " \n", "}" ] }, { "cell_type": "markdown", "id": "4a0bdbdf", "metadata": {}, "source": [ "
\n", "\n", "__Task 3__\n", "\n", "Run the grid search and print the best parameters and the best score. Remember to use the variable ``X_train_scaled``. " ] }, { "cell_type": "code", "execution_count": null, "id": "aec9f7a6", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 502, "id": "cae08405", "metadata": {}, "outputs": [], "source": [ "# Print best_params\n", "# Print best score" ] }, { "cell_type": "markdown", "id": "a951362d", "metadata": {}, "source": [ "
\n", "\n", "__Task 4__\n", "\n", "Evaluate the performance. Remember to use ``X_test_scaled``" ] }, { "cell_type": "code", "execution_count": null, "id": "bfaafe54", "metadata": {}, "outputs": [], "source": [ "y_pred = ...." ] }, { "cell_type": "code", "execution_count": null, "id": "c58db592", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " precision recall f1-score support\n", "\n", " 0 0.93 0.76 0.84 17\n", " 1 0.80 0.94 0.86 17\n", "\n", " accuracy 0.85 34\n", " macro avg 0.86 0.85 0.85 34\n", "weighted avg 0.86 0.85 0.85 34\n", "\n" ] } ], "source": [ "print() # Classification report" ] }, { "cell_type": "code", "execution_count": null, "id": "e0d5a907", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 500, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAf8AAAGiCAYAAADp4c+XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8ekN5oAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAeaklEQVR4nO3dbXhU5Z3H8d9IkgGzMBIieVACqCAaMFCJLPKYmhVZDERXkRbZCG2tGEGIj9mKCGoHarelCEJlq8G2+HCpQbQKl4tgYAEhxKC7rUAkgmITZAXShDBGZvbFXqadOwGZcCZnes7343VezD2Tc+55oT////s+ZzyhUCgkAADgGufYPQEAANC+CH8AAFyG8AcAwGUIfwAAXIbwBwDAZQh/AABchvAHAMBlCH8AAFyG8AcAwGUIfwAAXIbwBwAgRpSVlSkvL0/p6enyeDxavXp1i8/86U9/0vjx4+Xz+ZSYmKjs7GwdOHAgousQ/gAAxIiGhgZlZWVp6dKlrb7/8ccfa/jw4erXr582btyoDz74QHPmzFHHjh0juo6HH/YBACD2eDwelZaWKj8/v3ls0qRJio+P129/+9uzOjeVPwAAURQIBFRXVxd2BAKBiM8TDAb1hz/8QX379tWYMWPUvXt3DRkypNWlgW8TF/FfREnjb+61ewpAzJk47yO7pwDEpNcPvBHV8zcd3mfZufxLntO8efPCxubOnatHHnkkovMcOnRI9fX1WrBggR577DEtXLhQa9eu1Y033qgNGzZo1KhRZ3yumAl/AABiRvCkZacqLi5WUVFR2JjX6434PMFgUJI0YcIEzZ49W5I0cOBAbdmyRcuXLyf8AQCIFV6vt01hb0pOTlZcXJwuv/zysPHLLrtMmzdvjuhchD8AAKZQ0O4ZtJCQkKDs7Gzt3r07bHzPnj3q2bNnROci/AEAMAXtCf/6+npVVVU1v66urlZlZaWSkpKUkZGh++67T7fccotGjhypnJwcrV27Vq+//ro2btwY0XUIfwAADCGbKv/y8nLl5OQ0v/5mr0BBQYFKSkp0ww03aPny5fL7/Zo5c6YuvfRSvfLKKxo+fHhE1yH8AQCIEaNHj9a3PX5n2rRpmjZt2lldh/AHAMBkU9u/
vRD+AACYYnDDn5V4wh8AAC5D5Q8AgMnCh/zEIsIfAAATbX8AAOAkVP4AAJjY7Q8AgLvY9ZCf9kLbHwAAl6HyBwDARNsfAACXcXjbn/AHAMDk8Pv8WfMHAMBlqPwBADDR9gcAwGUcvuGPtj8AAC5D5Q8AgIm2PwAALkPbHwAAOAmVPwAAhlDI2ff5E/4AAJgcvuZP2x8AAJeh8gcAwOTwDX+EPwAAJoe3/Ql/AABM/LAPAABwEip/AABMtP0BAHAZh2/4o+0PAIDLUPkDAGCi7Q8AgMvQ9gcAAE5C5Q8AgMnhlT/hDwCAwem/6kfbHwAAl6HyBwDARNsfAACX4VY/AABcxuGVP2v+AADEiLKyMuXl5Sk9PV0ej0erV68+5WfvuOMOeTweLVq0KOLrEP4AAJhCQeuOCDQ0NCgrK0tLly497edKS0u1bds2paent+nr0fYHAMBkU9t/7NixGjt27Gk/c/DgQc2YMUPr1q3TuHHj2nQdwh8AgCgKBAIKBAJhY16vV16vN+JzBYNBTZkyRffdd58yMzPbPCfa/gAAmCxs+/v9fvl8vrDD7/e3aVoLFy5UXFycZs6ceVZfj8ofAACThW3/4uJiFRUVhY21perfuXOnfvWrX6miokIej+es5kTlDwBAFHm9XnXp0iXsaEv4b9q0SYcOHVJGRobi4uIUFxen/fv365577lGvXr0iOheVPwAAphi8z3/KlCnKzc0NGxszZoymTJmiqVOnRnQuwh8AAJNNT/irr69XVVVV8+vq6mpVVlYqKSlJGRkZ6tatW9jn4+PjlZqaqksvvTSi6xD+AADEiPLycuXk5DS//mavQEFBgUpKSiy7DuEPAIDJprb/6NGjFQqFzvjzn3zySZuuQ/gDAGDih30AAHCZGNzwZyVu9QMAwGWo/AEAMNH2BwDAZWj7AwAAJ6HyBwDA5PDKn/AHAMAUwb32f49o+wMA4DJU/gAAmGj7AwDgMg4Pf9r+AAC4DJU/AAAmHvIDAIDLOLztT/gDAGDiVj8AAOAkVP4AAJho+wMA4DIOD3/a/gAAuAyVPwAAJm71AwDAXUJBdvsDAAAHofIHAMDk8A1/hD8AACaHr/nT9gcAwGWo/AEAMDl8wx/hDwCAiTV/AABcxuHhz5o/AAAuQ+UPAIDJ4T/pS/gDAGByeNuf8HepnZ/+r1Zu/1h/qjmqLxoC+sUNg/XdPmnN7y/bvFvrPjqomr+cUPw55+jyVJ/uGtFPA9K72jhrwF433XmTCh68Ta/95jX9x7wVdk8HaDPW/F2qselr9e3eRcX/NKDV93smJerB3AF6eeooPTt5mNK7nKvpL23Tl8cD7TxTIDb0uaKPrvv+dar+Y7XdU0F7CIasO2IQlb9LDb8oRcMvSjnl+/98+YVhr+/57uUq/fCA9n5RpyE9z4/29ICY0vHcjrpn8b168sEndcuMSXZPB+3B4U/4izj8Dx8+rGeeeUZbt25VTU2NJCk1NVVXX321brvtNp1/PsHgNE0ng3pl1wH9gzdOfc/vYvd0gHZ3x2PTVf7ODu3avIvwhyNEFP47duzQmDFjdO655yo3N1d9+/aVJNXW1mrx4sVasGCB1q1bp8GDB5/2PIFAQIFAePs42PS1vPE0ImJJWVWtHnh9p040nVTyP3TU8olD1fVcr93TAtrViLyRurj/xSrKm233VNCeYrRdb5WI0nbGjBm6+eabtXz5cnk8nrD3QqGQ7rjjDs2YMUNbt2497Xn8fr/mzZsXNvZveUP10ISrI5kOoiw7o5tevG2UjjZ+pVd37df9a8r1u1tHKCmR/wGAOySnJetHj/xID0+eo6ZAk93TQTsKsdv/r3bt2qWSkpIWwS9JHo9Hs2fP1qBBg771PMXFxSoqKgobC656OJKpoB10SohTRkKcMrom6or0rsp7+h2VfnhAP/jHPnZPDWgXlwy4RF3P76pFb/6qeaxDXAdlDsnU9QXX68ZLblDQ4SEBZ4oo/FNTU7V9+3b169ev
1fe3b9+ulJRTbyL7htfrldcbXj020vKPeSGF9NXX/IcO7rHrv3apMLcwbGzWv9+tzz7+TC8/9QrB72S0/f/q3nvv1e23366dO3fqmmuuaQ762tparV+/XitWrNDPf/7zqEwU1jr+1dc6cKSh+fXBo8f1Ue0x+TrF67yOCVqxba9GX5Kq5ESvjjZ+pRff/0SH/nJC/9Qv3cZZA+2rsaFRB/bsDxs7cTyguiN/aTEOh7Fpt39ZWZmeeOIJ7dy5U3/+859VWlqq/Px8SVJTU5Meeughvfnmm9q3b598Pp9yc3O1YMECpadH9t/miMK/sLBQycnJ+uUvf6mnnnpKJ0+elCR16NBBV155pUpKSjRx4sSIJgB7/E/NUf3ohb/uzfj3DX+UJOX1v1APXXuFPvnfet3z3+U62viVzusYr8y08/TM94fpkuTOdk0ZANqPTZV/Q0ODsrKyNG3aNN14441h7x0/flwVFRWaM2eOsrKydOTIEd19990aP368ysvLI7qOJxRq2wOMm5qadPjwYUlScnKy4uPj23KaZo2/ufes/h5woonzPrJ7CkBMev3AG1E9f8P8yZadK+6BZ1rc4dba8rfJ4/GEVf6t2bFjh6666irt379fGRkZZzynNj/hLz4+XmlpaUpLSzvr4AcAIKYEg5Ydfr9fPp8v7PD7/ZZM89ixY/J4PDrvvPMi+jt22QEAYLKw7V/8k5Z3uH1b1X8mTpw4oQceeEDf+9731KVLZA9gI/wBAIiiM2nxR6qpqUkTJ05UKBTSsmXLIv57wh8AAFMMP9v/m+Dfv3+/3nnnnYirfonwBwCgpRi9z/+b4N+7d682bNigbt26tek8hD8AADGivr5eVVVVza+rq6tVWVmppKQkpaWl6aabblJFRYXeeOMNnTx5svkH9pKSkpSQkHDG1yH8AQAw2PVs//LycuXk5DS//majYEFBgR555BGtWbNGkjRw4MCwv9uwYYNGjx59xtch/AEAMNnU9h89erRO9/idNj6ap4U23+cPAAD+PlH5AwBgitENf1Yh/AEAMMXwrX5WIPwBADA5vPJnzR8AAJeh8gcAwBByeOVP+AMAYHJ4+NP2BwDAZaj8AQAw2fSEv/ZC+AMAYKLtDwAAnITKHwAAk8Mrf8IfAACDVT+gE6to+wMA4DJU/gAAmGj7AwDgMoQ/AADu4vTH+7LmDwCAy1D5AwBgcnjlT/gDAGBy9tN9afsDAOA2VP4AABicvuGP8AcAwOTw8KftDwCAy1D5AwBgcviGP8IfAACD09f8afsDAOAyVP4AAJho+wMA4C5Ob/sT/gAAmBxe+bPmDwCAy1D5AwBgCDm88if8AQAwOTz8afsDAOAyVP4AABho+wMA4DYOD3/a/gAAuAyVPwAABqe3/an8AQAwhILWHZEoKytTXl6e0tPT5fF4tHr16vB5hUJ6+OGHlZaWpk6dOik3N1d79+6N+PsR/gAAGOwK/4aGBmVlZWnp0qWtvv+zn/1Mixcv1vLly/Xee+8pMTFRY8aM0YkTJyK6Dm1/AABixNixYzV27NhW3wuFQlq0aJEeeughTZgwQZL03HPPKSUlRatXr9akSZPO+DpU/gAAmEIey45AIKC6urqwIxAIRDyl6upq1dTUKDc3t3nM5/NpyJAh2rp1a0TnIvwBADBY2fb3+/3y+Xxhh9/vj3hONTU1kqSUlJSw8ZSUlOb3zhRtfwAAoqi4uFhFRUVhY16v16bZ/D/CHwAAQyjosexcXq/XkrBPTU2VJNXW1iotLa15vLa2VgMHDozoXLT9AQAw2LXb/3R69+6t1NRUrV+/vnmsrq5O7733noYOHRrRuaj8AQCIEfX19aqqqmp+XV1drcrKSiUlJSkjI0OzZs3SY489pj59+qh3796aM2eO0tPTlZ+fH9F1CH8AAAyhkHVt/0iUl5crJyen+fU3ewUKCgpUUlKi+++/Xw0NDbr99tt19OhRDR8+XGvXrlXHjh0jug7hDwCAwa7H+44e
PVqhUOiU73s8Hs2fP1/z588/q+uw5g8AgMtQ+QMAYLByt38sIvwBADCcpvPuCIQ/AAAGp1f+rPkDAOAyVP4AABicXvkT/gAAGJy+5k/bHwAAl6HyBwDAQNsfAACXsevxvu2Ftj8AAC5D5Q8AgMGuZ/u3F8IfAABDkLY/AABwEip/AAAMTt/wR/gDAGDgVj8AAFyGJ/wBAABHofIHAMBA2x8AAJfhVj8AAOAoVP4AABi41Q8AAJdhtz8AAHAUKn8AAAxO3/BH+AMAYHD6mj9tfwAAXIbKHwAAg9M3/BH+AAAYWPNvJ52nP2/3FICY0/j5JrunALgSa/4AAMBRYqbyBwAgVtD2BwDAZRy+34+2PwAAbkPlDwCAgbY/AAAuw25/AADgKFT+AAAYgnZPIMqo/AEAMITkseyIxMmTJzVnzhz17t1bnTp10sUXX6xHH31UIYufN0zlDwBAjFi4cKGWLVumlStXKjMzU+Xl5Zo6dap8Pp9mzpxp2XUIfwAADEGbbvTfsmWLJkyYoHHjxkmSevXqpeeff17bt2+39Dq0/QEAMATlsewIBAKqq6sLOwKBQKvXvfrqq7V+/Xrt2bNHkrRr1y5t3rxZY8eOtfT7Ef4AABisXPP3+/3y+Xxhh9/vb/W6Dz74oCZNmqR+/fopPj5egwYN0qxZszR58mRLvx9tfwAAoqi4uFhFRUVhY16vt9XPvvTSS/r973+vVatWKTMzU5WVlZo1a5bS09NVUFBg2ZwIfwAADFbe6uf1ek8Z9qb77ruvufqXpAEDBmj//v3y+/2EPwAA0RTpLXpWOX78uM45J3xFvkOHDgoGrX3yAOEPAECMyMvL0+OPP66MjAxlZmbq/fff1y9+8QtNmzbN0usQ/gAAGOx6wt+TTz6pOXPm6M4779ShQ4eUnp6uH//4x3r44YctvY4nZPVjg9ooLuECu6cAxJzGzzfZPQUgJsUnXxTV87+ZMsmyc/1z7QuWncsq3OoHAIDL0PYHAMBg14a/9kL4AwBgCDo7+2n7AwDgNlT+AAAYgrT9AQBwl5i4DS6KCH8AAAx23effXljzBwDAZaj8AQAwBD2s+QMA4CpOX/On7Q8AgMtQ+QMAYHD6hj/CHwAAA0/4AwAAjkLlDwCAgSf8AQDgMuz2BwAAjkLlDwCAwekb/gh/AAAM3OoHAIDLsOYPAAAchcofAAADa/4AALiM09f8afsDAOAyVP4AABicXvkT/gAAGEIOX/On7Q8AgMtQ+QMAYKDtDwCAyzg9/Gn7AwDgMlT+AAAYnP54X8IfAAADT/gDAMBlWPMHAACOQuUPAIDB6ZU/4Q8AgMHpG/5o+wMA4DJU/gAAGJy+25/KHwAAQ9DCI1IHDx7Urbfeqm7duqlTp04aMGCAysvLz/IbhaPyBwAgRhw5ckTDhg1TTk6O3nrrLZ1//vnau3evunbtaul1CH8AAAx2bfhbuHChevTooWeffbZ5rHfv3pZfh7Y/AACGoEKWHYFAQHV1dWFHIBBo9bpr1qzR4MGDdfPNN6t79+4aNGiQVqxYYfn3I/wBAIgiv98vn88Xdvj9/lY/u2/fPi1btkx9+vTRunXrNH36dM2cOVMrV660dE6eUCgUE7czxiVcYPcUgJjT+Pkmu6cAxKT45Iuiev5He0627Fz373mmRaXv9Xrl9XpbfDYhIUGDBw/Wli1bmsdmzpypHTt2aOvWrZbNiTV/AAAMVlbFpwr61qSlpenyyy8PG7vsssv0yiuvWDgjwh8AgBbserzvsGHDtHv37rCxPXv2qGfPnpZehzV/AABixOzZs7Vt2zb99Kc/VVVVlVatWqWnn35ahYWFll6H8AcAwBD0WHdEIjs7W6WlpXr++efVv39/Pfroo1q0aJEmT7ZuD4JE2x8AgBaCNv60z/XXX6/rr78+qteg8gcAwGWo/AEAMMTEPfBRRPgDAGCwa7d/e6HtDwCAy1D5AwBgsHPDX3sg/AEAMDg7+mn7AwDgOlT+AAAYnL7h
j/AHAMDAmj8AAC7j7OhnzR8AANeh8gcAwMCaPwAALhNyeOOftj8AAC5D5Q8AgIG2PwAALuP0W/1o+wMA4DJU/gAAGJxd9xP+AAC0QNsfrjBi+BCtLi3RgU926uuvDmr8+DF2Twlod+WVH6rw/rnKGT9Z/YeN1fqyLS0+8/EnB3TX/Y/oH6/9F2Vfk69bfjBTf645ZMNsgbYj/CFJSkw8Vx988EfNuPsndk8FsE1j4wldeslF+sk9d7b6/oHPPte/Tr9XvXv20LNLFuqVlU/pjtu+rwRvQjvPFNEWtPCIRbT9IUlau26D1q7bYPc0AFuNGJqtEUOzT/n+4qdXasTQbN1T+IPmsYwL09tjamhnPOQHAKBgMKiyLTvUq8cFun32TzRy3CR970ezWl0awN8/p1f+lof/p59+qmnTpp32M4FAQHV1dWFHKOTs/8sC8PftyyNHdbyxUb/53UsaPmSwnv7l47pm5NWa9W+Pacf7H9g9PSAilof/l19+qZUrV572M36/Xz6fL+wIBf9i9VQAwDLB4P8XKDkjhupfJ92gfn0v1g+nTNSoq6/SS6vftHl2sFrIwn9iUcRr/mvWrDnt+/v27fvWcxQXF6uoqChsrGu3fpFOBQDaTdfzuiiuQwdd3CsjbPyiXj1U8cEfbZoVoiVW2/VWiTj88/Pz5fF4Ttum93g8pz2H1+uV1+uN6G8AwE7x8fHKvKyvqg98Fjb+yacHlZ7a3aZZAW0Tcds/LS1Nr776qoLBYKtHRUVFNOaJKEtMPFdZWZnKysqUJPXulaGsrEz16MFOZrjH8eON+mjPx/poz8eSpIOf1+qjPR8338c/9fv/orXry/Tymrd04LPPterlNXr3v97TpBvG2TltREEwFLLsiEWeUIQ77caPH6+BAwdq/vz5rb6/a9cuDRo0SMFgZE2TuIQLIvo8rDVq5FCt/8+XW4yvfO4l/eCHs22YESSp8fNNdk/BVbZXfKBpMx5oMT5hbK4ef+geSdKrb6zTf/z2JdUeOqxeGReq8Ie36rsjhrb3VF0vPvmiqJ7/1p43Wnau3+1/1bJzWSXi8N+0aZMaGhp03XXXtfp+Q0ODysvLNWrUqIgmQvgDLRH+QOsI/7MT8Zr/iBEjTvt+YmJixMEPAEAscfqz/XnCHwAAhli9Rc8qPOEPAACXofIHAMDAff4AALgMa/4AALgMa/4AAMBRqPwBADCw5g8AgMs4/WfmafsDABCDFixYII/Ho1mzZll+bip/AAAMdu/237Fjh37961/riiuuiMr5qfwBADAELTwCgYDq6urCjkAgcMpr19fXa/LkyVqxYoW6du0ale9H+AMAEEV+v18+ny/s8Pv9p/x8YWGhxo0bp9zc3KjNibY/AAAGK+/zLy4uVlFRUdiY1+tt9bMvvPCCKioqtGPHDsuu3xrCHwAAg5Vr/l6v95Rh/7c+/fRT3X333Xr77bfVsWNHy67fGsIfAIAYsHPnTh06dEjf+c53msdOnjypsrIyLVmyRIFAQB06dLDkWoQ/AAAGO+7zv+aaa/Thhx+GjU2dOlX9+vXTAw88YFnwS4Q/AAAt2PGEv86dO6t///5hY4mJierWrVuL8bNF+AMAYHD6D/sQ/gAAxKiNGzdG5byEPwAABruf8BdthD8AAAZ+2AcAADgKlT8AAAba/gAAuIzTd/vT9gcAwGWo/AEAMAQdvuGP8AcAwODs6KftDwCA61D5AwBgYLc/AAAuQ/gDAOAyPOEPAAA4CpU/AAAG2v4AALgMT/gDAACOQuUPAIDB6Rv+CH8AAAxOX/On7Q8AgMtQ+QMAYKDtDwCAy9D2BwAAjkLlDwCAwen3+RP+AAAYgqz5AwDgLk6v/FnzBwDAZaj8AQAw0PYHAMBlaPsDAABHofIHAMBA2x8AAJeh7Q8AAByFyh8AAANtfwAAXIa2PwAAcBQqfwAADKFQ0O4pRBXhDwCAIejwtj/hDwCAIeTwDX+s+QMAECP8fr+ys7PVuXNnde/e
Xfn5+dq9e7fl1yH8AQAwBBWy7IjEu+++q8LCQm3btk1vv/22mpqadO2116qhocHS7+cJxUhvIy7hArunAMScxs832T0FICbFJ18U1fNf0DXTsnMdPPI/bf7bL774Qt27d9e7776rkSNHWjYn1vwBAIiiQCCgQCAQNub1euX1er/1b48dOyZJSkpKsnROtP0BADAEQyHLDr/fL5/PF3b4/f5vn0MwqFmzZmnYsGHq37+/pd+Ptj8Qw2j7A62Ldts/9bzLLDvX/trKNlX+06dP11tvvaXNmzfrwgsvtGw+Em1/AACi6kxb/H/rrrvu0htvvKGysjLLg18i/AEAaMGupngoFNKMGTNUWlqqjRs3qnfv3lG5DuEPAIDBrif8FRYWatWqVXrttdfUuXNn1dTUSJJ8Pp86depk2XVY8wdiGGv+QOuiveZ/vu9Sy871xbEzf0iPx+NpdfzZZ5/VbbfdZtGMqPwBAGjBzrZ/eyD8AQAwBGOjKR41hD8AAIYYWRGPGh7yAwCAy1D5AwBgsGu3f3sh/AEAMND2BwAAjkLlDwCAgd3+AAC4TMjha/60/QEAcBkqfwAADLT9AQBwGXb7AwAAR6HyBwDA4PQNf4Q/AAAGp7f9CX8AAAxOD3/W/AEAcBkqfwAADM6u+yVPyOm9DUQkEAjI7/eruLhYXq/X7ukAMYF/L+A0hD/C1NXVyefz6dixY+rSpYvd0wFiAv9ewGlY8wcAwGUIfwAAXIbwBwDAZQh/hPF6vZo7dy6bmoC/wb8XcBo2/AEA4DJU/gAAuAzhDwCAyxD+AAC4DOEPAIDLEP4AALgM4Y9mS5cuVa9evdSxY0cNGTJE27dvt3tKgK3KysqUl5en9PR0eTwerV692u4pAZYg/CFJevHFF1VUVKS5c+eqoqJCWVlZGjNmjA4dOmT31ADbNDQ0KCsrS0uXLrV7KoCluM8fkqQhQ4YoOztbS5YskSQFg0H16NFDM2bM0IMPPmjz7AD7eTwelZaWKj8/3+6pAGeNyh/66quvtHPnTuXm5jaPnXPOOcrNzdXWrVttnBkAIBoIf+jw4cM6efKkUlJSwsZTUlJUU1Nj06wAANFC+AMA4DKEP5ScnKwOHTqotrY2bLy2tlapqak2zQoAEC2EP5SQkKArr7xS69evbx4LBoNav369hg4dauPMAADREGf3BBAbioqKVFBQoMGDB+uqq67SokWL1NDQoKlTp9o9NcA29fX1qqqqan5dXV2tyspKJSUlKSMjw8aZAWeHW/3QbMmSJXriiSdUU1OjgQMHavHixRoyZIjd0wJss3HjRuXk5LQYLygoUElJSftPCLAI4Q8AgMuw5g8AgMsQ/gAAuAzhDwCAyxD+AAC4DOEPAIDLEP4AALgM4Q8AgMsQ/gAAuAzhDwCAyxD+AAC4DOEPAIDL/B9MrBYAEYbSRQAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cm = ... # Confusion Matrix" ] } ], "metadata": { "kernelspec": { "display_name": "cv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 5 }