
Project: Sea Surface Temperature Prediction

Description

In this project we will use a TAO/TRITON buoy dataset to predict sea surface temperature (s.s.temp.) in the equatorial Pacific Ocean. Anomalies in sea surface temperature in this region are the main indicator of El Niño and La Niña events, which have a global impact on climate.

Starting from the preprocessed DataFrame df2, you will build and compare three recurrent architectures (RNN, LSTM, and GRU) and evaluate their ability to capture the temporal dependencies of the series.

Download dataset

The dataset contains the following columns:

| Column | Description |
|---|---|
| month | Month of the observation |
| day | Day of the observation |
| latitude | Latitude of the buoy |
| longitude | Longitude of the buoy |
| zon.winds | Zonal wind speed (east-west) |
| mer.winds | Meridional wind speed (north-south) |
| air temp. | Air temperature (°C) |
| s.s.temp. | Sea surface temperature (°C), the target variable |

Delivery

The project must be submitted as a Jupyter Notebook (.ipynb) file containing all the code, results, and markdown cells explaining each step and the decisions made. Additionally, a PDF export of the notebook must be included so that the results and visualisations are easily accessible without needing to run the code. Both files must be submitted together.

[ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error

Data loading

Load the data into a DataFrame called df2:

[ ]:
# Read the column names from the accompanying .col file (one name per line)
with open('tao-all2.col') as file:
    names2 = [line.strip() for line in file]

print(names2)
[ ]:
df2 = pd.read_csv(
    'tao-all2.dat.gz',
    compression='gzip',
    sep=r'\s+',
    na_values='.',
    comment='%',
    header=None,
)
df2.columns = names2

Deleting unused features:

[ ]:
df2.drop(["obs", "humidity", "date", "year"], axis=1, inplace=True)
[ ]:
# Drop rows with missing values and report how many were removed
original_rows = df2.shape[0]
df2 = df2.dropna()
print(f"Rows after dropna: {df2.shape[0]}")
print(f"Deleted rows: {original_rows - df2.shape[0]}")
[ ]:
df2.columns

Definition of train and test sets
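
One possible starting point for this step is sketched below, using NumPy only. It assumes the target is treated as a single ordered series, split chronologically (no shuffling), scaled with the training minimum and maximum only, and cut into fixed-length windows for one-step-ahead prediction. The helper names `make_windows` and `chronological_split`, the window length of 10, the 80/20 split, and the synthetic series are all illustrative assumptions, not part of the assignment.

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (inputs, targets) for one-step-ahead prediction."""
    series = np.asarray(series, dtype=np.float32)
    n = len(series) - window
    # Row i holds series[i : i + window]; its target is series[i + window]
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:]
    return X[..., None], y  # add a feature axis: (samples, window, 1)

def chronological_split(series, train_frac=0.8):
    """Split without shuffling so the test part lies after the training part."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

# Synthetic stand-in for the real target column (assumption for illustration)
values = np.sin(np.linspace(0, 20, 200)) + 27.0
train, test = chronological_split(values)

# Scale with statistics from the training part only, to avoid test leakage
lo, hi = train.min(), train.max()
train_s = (train - lo) / (hi - lo)
test_s = (test - lo) / (hi - lo)

X_train, y_train = make_windows(train_s, window=10)
X_test, y_test = make_windows(test_s, window=10)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

With the real data, `values` could be replaced by `df2["s.s.temp."].to_numpy()`. Note that df2 mixes observations from several buoys and has had rows removed by `dropna`, so consecutive rows are not guaranteed to be consecutive days at one location; grouping by latitude and longitude before windowing may be worth considering. The `MinMaxScaler` imported above performs the same scaling when fitted on the training part only.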

[ ]:

Models
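
A minimal sketch of how the three architectures could share one class is shown below. It assumes univariate input windows of shape (batch, window, 1) and a single recurrent layer followed by a linear head; the class name `RecurrentRegressor` and the hidden size of 32 are illustrative choices, not requirements of the project.

```python
import torch
import torch.nn as nn

class RecurrentRegressor(nn.Module):
    """Recurrent layer followed by a linear head that predicts one value."""

    def __init__(self, cell="lstm", input_size=1, hidden_size=32, num_layers=1):
        super().__init__()
        layers = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}
        self.rnn = layers[cell](input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, window, input_size); out: (batch, window, hidden_size)
        out, _ = self.rnn(x)
        # Use the output at the last time step to predict the next value
        return self.head(out[:, -1, :]).squeeze(-1)

# Build one model per architecture and check output shapes on a dummy batch
models = {name: RecurrentRegressor(name) for name in ("rnn", "lstm", "gru")}
x = torch.randn(4, 10, 1)
for name, model in models.items():
    n_params = sum(p.numel() for p in model.parameters())
    print(name, tuple(model(x).shape), n_params)
```

Keeping the hidden size and head identical across the three variants means that differences in results are more likely to come from the recurrent cell itself. The parameter counts still differ, because an LSTM has four gate weight sets and a GRU three, against one for a plain RNN.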

[ ]:

Training
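
The training loop could look roughly like the sketch below, which assumes mean squared error as the loss, Adam as the optimiser, and mini-batches drawn from already windowed tensors. The function `train_model`, the stand-in `TinyGRU` model, the hyperparameters, and the synthetic sine series are assumptions made so that the example runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_model(model, X, y, epochs=20, lr=1e-2, batch_size=32):
    """Train with Adam on MSE and return the mean training loss per epoch."""
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    history = []
    for _ in range(epochs):
        model.train()
        total = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(xb)  # weight by batch size
        history.append(total / len(X))
    return history

class TinyGRU(nn.Module):
    """Small stand-in model so the example is self-contained."""
    def __init__(self, hidden_size=16):
        super().__init__()
        self.rnn = nn.GRU(1, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out[:, -1, :]).squeeze(-1)

torch.manual_seed(0)
series = torch.sin(torch.linspace(0, 20, 200))      # synthetic series
window = 10
X = series.unfold(0, window, 1)[:-1].unsqueeze(-1)  # (190, 10, 1)
y = series[window:]                                 # (190,)

history = train_model(TinyGRU(), X, y)
print(f"first epoch loss: {history[0]:.4f}, last epoch loss: {history[-1]:.4f}")
```

Shuffling the windows inside the training set is acceptable here because each window already carries its own temporal context; the chronological order only has to be respected when separating training data from test data. Using the same `train_model` call for the RNN, LSTM, and GRU keeps the comparison fair.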

[ ]:

Evaluation
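
As a sketch of the evaluation step, the example below converts scaled predictions back to degrees Celsius and computes the mean absolute error, alongside a persistence baseline that simply repeats the last observed value. The scaling bounds and all numbers are hypothetical, and the persistence baseline is a suggested reference point rather than something the assignment asks for.

```python
import numpy as np

def inverse_minmax(scaled, lo, hi):
    """Undo min-max scaling given the training minimum and maximum."""
    return np.asarray(scaled) * (hi - lo) + lo

def mae(y_true, y_pred):
    """Mean absolute error between two arrays of equal length."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Hypothetical training bounds and scaled values, for illustration only
lo, hi = 20.0, 30.0
y_true_scaled = np.array([0.50, 0.60, 0.70, 0.65])
y_pred_scaled = np.array([0.52, 0.57, 0.71, 0.66])
last_seen_scaled = np.array([0.45, 0.50, 0.60, 0.70])  # last value of each window

y_true = inverse_minmax(y_true_scaled, lo, hi)
y_pred = inverse_minmax(y_pred_scaled, lo, hi)
y_persist = inverse_minmax(last_seen_scaled, lo, hi)

model_mae = mae(y_true, y_pred)
baseline_mae = mae(y_true, y_persist)
print(f"model MAE: {model_mae:.3f} °C, persistence MAE: {baseline_mae:.3f} °C")
```

Reporting the error in the original units makes the three models directly comparable and easier to interpret. The `mean_absolute_error` function imported at the top of the notebook computes the same quantity as the `mae` helper here.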

[ ]: