{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": [ "# Project: Sea Surface Temperature Prediction\n", "\n", "## Description\n", "\n", "In this project we will use a **TAO/TRITON** buoy dataset to predict sea surface temperature (`s.s.temp.`) in the equatorial Pacific Ocean. Anomalies in sea surface temperature in this region are the main indicator of **El Niño** and **La Niña** events, which have a global impact on climate.\n", "\n", "Starting from the preprocessed DataFrame `df2`, you will build and compare three recurrent architectures (**RNN**, **LSTM**, and **GRU**) and evaluate their ability to capture the temporal dependencies of the series.\n", "\n", "[Download dataset](https://github.com/bmalcover/AppOC/blob/main/docs/_static/03/el%2Bnino.zip)\n", "\n", "The dataset contains the following columns:\n", "\n", "| Column | Description |\n", "|--------|-------------|\n", "| `month` | Month of the observation |\n", "| `day` | Day of the observation |\n", "| `latitude` | Latitude of the buoy |\n", "| `longitude` | Longitude of the buoy |\n", "| `zon.winds` | Zonal wind speed (east-west) |\n", "| `mer.winds` | Meridional wind speed (north-south) |\n", "| `air temp.` | Air temperature (°C) |\n", "| `s.s.temp.` | Sea surface temperature (°C), the **target variable** |\n", "\n", "\n", "**Delivery**\n", "\n", "The project must be submitted as a Jupyter Notebook (.ipynb) file containing all the code, results, and markdown cells explaining each step and the decisions made. Additionally, a PDF export of the notebook must be included so that the results and visualisations are easily accessible without needing to run the code. Both files must be submitted together." 
], "id": "46b90c41f04c5e53" }, { "metadata": {}, "cell_type": "code", "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import torch\n", "import torch.nn as nn\n", "from sklearn.preprocessing import MinMaxScaler\n", "from sklearn.metrics import mean_absolute_error\n" ], "id": "d7bcf2b89a6bda7a", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": [ "## Data loading\n", "\n", "Load the data into a DataFrame called `df2`:" ], "id": "4cfbef1aa8ffbfe7" }, { "metadata": {}, "cell_type": "code", "source": [ "# Read the column names, one per line\n", "with open('tao-all2.col') as file:\n", "    names2 = []\n", "    for name in file:\n", "        n = name.strip(\"\\n\")\n", "        names2.append(n)\n", "\n", "print(names2)" ], "id": "934772aebb2230dc", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "# Whitespace-separated values; '.' marks a missing value\n", "df2 = pd.read_csv(\n", "    'tao-all2.dat.gz',\n", "    compression='gzip',\n", "    sep=r'\\s+',\n", "    na_values='.',\n", "    comment='%',\n", "    header=None,\n", ")\n", "df2.columns = names2" ], "id": "3ec2192407b69125", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "Drop the unused features:", "id": "350b36197b03348f" }, { "metadata": {}, "cell_type": "code", "source": "df2.drop([\"obs\", \"humidity\", \"date\", \"year\"], axis=1, inplace=True)", "id": "c065df3d0853081", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "# Remove rows with missing values and report how many were dropped\n", "original_rows = df2.shape[0]\n", "df2 = df2.dropna()\n", "print(f\"Rows after dropna: {df2.shape[0]}\")\n", "\n", "print(f\"Deleted rows: {original_rows - df2.shape[0]}\")" ], "id": "aa5c2daf2fb034a9", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": "df2.columns", "id": "20e3c789f1b4e5f", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Definition of train and test sets", "id": "2a920ca130d5afaf" }, { 
"metadata": {}, "cell_type": "code", "source": "\n", "id": "dc76d43d69935ded", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Models", "id": "e8d783f8869f0ff2" }, { "metadata": {}, "cell_type": "code", "source": "", "id": "d0ba7a5f4a4d8378", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Training", "id": "d17343583df103d5" }, { "metadata": {}, "cell_type": "code", "source": "", "id": "f5908973c0ceae75", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Evaluation", "id": "11cb5808a86a3400" }, { "metadata": {}, "cell_type": "code", "source": "", "id": "579a674a5740cbb", "outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 5 }