Project: Sea Surface Temperature Prediction
Description
In this project we will use a TAO/TRITON buoy dataset to predict sea surface temperature (s.s.temp.) in the equatorial Pacific Ocean. Anomalies in sea surface temperature in this region are the main indicator of El Niño and La Niña events, which have a global impact on climate.
Starting from the preprocessed DataFrame df2, you will build and compare three recurrent architectures (RNN, LSTM, and GRU) and evaluate their ability to capture the temporal dependencies of the series.
The dataset contains the following columns:
| Column | Description |
|---|---|
| | Month of the observation |
| | Day of the observation |
| | Latitude of the buoy |
| | Longitude of the buoy |
| | Zonal wind speed (east-west) |
| | Meridional wind speed (north-south) |
| | Air temperature (ºC) |
| | Sea surface temperature (ºC), the target variable |
Delivery
The project must be submitted as a Jupyter Notebook (.ipynb) file containing all the code, results, and markdown cells explaining each step and the decisions made. Additionally, a PDF export of the notebook must be included so that the results and visualisations are easily accessible without needing to run the code. Both files must be submitted together.
[ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error
Data loading
Load the data into a DataFrame called df2:
[ ]:
# Read the column names, one per line, from the .col file
with open('tao-all2.col') as file:
    names2 = []
    for name in file:
        n = name.strip("\n")
        names2.append(n)
print(names2)
[ ]:
df2 = pd.read_csv(
    'tao-all2.dat.gz',
    compression='gzip',
    sep=r'\s+',
    na_values='.',
    comment='%',
    header=None,
)
df2.columns = names2
Deleting unused features:
[ ]:
df2.drop(["obs", "humidity","date","year"], axis=1, inplace=True)
[ ]:
original_rows = df2.shape[0]
df2 = df2.dropna()
print(f"Rows after dropna: {df2.shape[0]}")
print(f"Deleted rows: {original_rows - df2.shape[0]}")
[ ]:
df2.columns
Definition of train and test sets
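One possible way to prepare the data for a recurrent model is sketched below: split the series chronologically, fit the scaler on the training part only, and cut both parts into fixed-length windows. This is a minimal sketch, not the required solution. The synthetic array stands in for `df2.values`, and the helper `make_windows`, the 80/20 split, the look-back length `WINDOW`, and the assumption that the rows form one time-ordered series with the target in the last column are all illustrative choices.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def make_windows(values, window):
    """Build (samples, window, features) inputs and next-step targets.

    The target for each window is the last column of the row that
    immediately follows the window.
    """
    X, y = [], []
    for start in range(len(values) - window):
        X.append(values[start:start + window])
        y.append(values[start + window, -1])
    return np.array(X, dtype=np.float32), np.array(y, dtype=np.float32)


# Synthetic stand-in for df2.values (hypothetical data: 100 rows, 3 features)
rng = np.random.default_rng(0)
series = rng.normal(size=(100, 3))

# Chronological split: no shuffling, so the test set lies in the "future"
split = int(len(series) * 0.8)  # assumed 80/20 split
train_raw, test_raw = series[:split], series[split:]

# Fit the scaler on the training rows only to avoid leaking test statistics
scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train_raw)
test_scaled = scaler.transform(test_raw)

WINDOW = 10  # assumed look-back length
X_train, y_train = make_windows(train_scaled, WINDOW)
X_test, y_test = make_windows(test_scaled, WINDOW)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

A random split would let windows from the test period overlap the training period, which is why the sketch keeps the rows in order.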
[ ]:
Models
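The three architectures differ only in the recurrent layer, so one way to keep the comparison fair is to share everything else. The sketch below assumes a hypothetical class `RecurrentRegressor` with a single linear output head, a hidden size of 32, and 8 input features (one per column listed in the description). None of these values come from the assignment.

```python
import torch
import torch.nn as nn


class RecurrentRegressor(nn.Module):
    """Recurrent layer followed by a linear head that predicts one value."""

    def __init__(self, cell, n_features, hidden_size=32, num_layers=1):
        super().__init__()
        layers = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}
        self.rnn = layers[cell](
            n_features, hidden_size, num_layers, batch_first=True
        )
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # out has shape (batch, seq_len, hidden_size); the second return
        # value (hidden state, plus cell state for LSTM) is not needed here
        out, _ = self.rnn(x)
        # Predict from the representation of the last time step
        return self.head(out[:, -1, :]).squeeze(-1)


# One model per architecture, identical apart from the recurrent cell
models = {
    name: RecurrentRegressor(name, n_features=8)
    for name in ("rnn", "lstm", "gru")
}

x = torch.randn(4, 10, 8)  # hypothetical batch: 4 windows of 10 steps
n_params = {}
for name, model in models.items():
    n_params[name] = sum(p.numel() for p in model.parameters())
    print(name, tuple(model(x).shape), n_params[name])
```

With the same hidden size, the LSTM has the most parameters and the plain RNN the fewest, which is worth keeping in mind when comparing results.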
[ ]:
Training
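A training loop for any of the three models could look roughly like the following. It is only a sketch on synthetic tensors: the class `GRURegressor`, the toy target, the MSE loss, the Adam optimizer, the learning rate, the batch size, and the number of epochs are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (hypothetical): 64 windows, 10 steps, 8 features
X = torch.randn(64, 10, 8)
y = 0.5 * X[:, -1, 0]  # toy target so that the loss can actually decrease
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)


class GRURegressor(nn.Module):
    """GRU layer followed by a linear head that predicts one value."""

    def __init__(self, n_features=8, hidden_size=16):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.gru(x)
        return self.head(out[:, -1, :]).squeeze(-1)


model = GRURegressor()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

history = []  # mean training loss per epoch
for epoch in range(20):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()            # clear gradients from the last step
        loss = criterion(model(xb), yb)
        loss.backward()                  # backpropagate through time
        optimizer.step()
        running += loss.item() * len(xb)
    history.append(running / len(X))

print(f"first epoch loss: {history[0]:.4f}, last epoch loss: {history[-1]:.4f}")
```

Shuffling the windows inside the training set is acceptable here because each window already carries its own temporal context. The train/test boundary itself should stay chronological.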
[ ]:
Evaluation
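Because the models are trained on scaled values, an error computed in scaled space is hard to interpret. One option, sketched below, is to invert the scaling and report the mean absolute error in ºC. The temperatures, the simulated predictions, and the separate `target_scaler` are hypothetical and only illustrate the mechanics.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import MinMaxScaler

# Hypothetical true temperatures in ºC (one column, as the scaler expects)
y_true_c = np.array([[26.0], [27.5], [28.0], [29.0]])

# Assumed: a scaler fitted on the target column alone
target_scaler = MinMaxScaler().fit(y_true_c)
y_true_scaled = target_scaler.transform(y_true_c)

# Simulated model output in scaled space (stand-in for real predictions)
y_pred_scaled = y_true_scaled + np.array([[0.1], [-0.1], [0.0], [0.2]])

# Map predictions back to ºC before computing the metric
y_pred_c = target_scaler.inverse_transform(y_pred_scaled)
mae = mean_absolute_error(y_true_c, y_pred_c)
print(f"MAE: {mae:.3f} ºC")
```

Computing the same metric for the RNN, LSTM, and GRU on the same test windows gives a like-for-like comparison of the three architectures.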
[ ]: