Architecture Deep Dive
DATA INTELLIGENCE
How I transformed raw Nassau shipping records into actionable trade intelligence — uncovering cargo patterns, port performance, route bottlenecks, and seasonal logistics trends through end-to-end data analysis.
The Raw Data Problem
Nassau's shipping dataset contained thousands of unstructured voyage records spanning multiple routes, cargo types, vessel classes, and port pairs. Without structured analysis, the data was noise — no visibility into which routes were profitable, which cargo types caused delays, or which ports created bottlenecks.
The EDA-First Approach
I applied a systematic exploratory data analysis (EDA) pipeline, using pandas for wrangling and Plotly for interactive visualization. Rather than jumping to conclusions, I let distributional patterns, correlations, and time-series decomposition drive the analysis, so the data could surface its own story.
# Load and inspect the raw dataset
import pandas as pd
import plotly.express as px

df = pd.read_csv('nassau_shipping.csv')

# Parse dates and compute transit time
df['Departure Date'] = pd.to_datetime(df['Departure Date'])
df['Arrival Date'] = pd.to_datetime(df['Arrival Date'])
df['Transit Time'] = (
    df['Arrival Date'] - df['Departure Date']
).dt.days

# Identify delayed shipments
df['Is Delayed'] = df['Status'] == 'Delayed'

# Route-level freight cost aggregation
route_stats = (
    df.groupby('Route')
    .agg(
        avg_cost=('Freight Cost ($)', 'mean'),
        total_volume=('Cargo Weight (tons)', 'sum'),
        delay_rate=('Is Delayed', 'mean')
    )
    .sort_values('avg_cost', ascending=False)
    .reset_index()
)

fig = px.bar(
    route_stats, x='Route', y='avg_cost',
    color='delay_rate', title='Route Cost vs Delay Rate'
)
Route Intelligence
Using groupby aggregations, I computed per-route KPIs — average freight cost, total cargo volume, and delay rate. This revealed that just the top 3 routes account for over 60% of total cargo volume, exposing a critical concentration risk in Nassau's logistics network.
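The concentration figure can be checked with a short calculation. The following is a minimal sketch, assuming the `Route` and `Cargo Weight (tons)` columns from the loading code; the `top_n_volume_share` helper and the sample rows are hypothetical and only illustrate the shape of the computation.

```python
import pandas as pd

def top_n_volume_share(df: pd.DataFrame, n: int = 3) -> float:
    """Fraction of total cargo weight carried by the n busiest routes."""
    volume_by_route = df.groupby('Route')['Cargo Weight (tons)'].sum()
    return volume_by_route.nlargest(n).sum() / volume_by_route.sum()

# Made-up rows (not the Nassau data), only to show the call
sample = pd.DataFrame({
    'Route': ['A', 'A', 'B', 'C', 'D', 'E'],
    'Cargo Weight (tons)': [400, 200, 300, 100, 50, 50],
})
share = top_n_volume_share(sample, n=3)
print(f"Top 3 routes carry {share:.0%} of cargo volume")
```

Running the same helper with different values of `n` gives a quick view of how steeply volume falls off beyond the leading routes.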
Seasonal Trend Modeling
A time-series decomposition of monthly shipment volumes revealed a pronounced Q3 peak (July–September), indicating seasonal demand surges. Rolling averages smoothed short-term noise, making the cyclical freight cost pattern clearly visible for business forecasting.
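A rough way to reproduce this kind of view with pandas alone is sketched below. It is not a full decomposition (a library routine such as statsmodels' `seasonal_decompose` would be the usual choice for that); it only aggregates volume by month, applies a 3-month rolling average, and compares quarters. The helper names, the window length, and the synthetic data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def monthly_volume(df: pd.DataFrame) -> pd.Series:
    """Total cargo weight per calendar month, indexed by month start."""
    return (
        df.set_index('Departure Date')['Cargo Weight (tons)']
        .resample('MS')
        .sum()
    )

def quarterly_profile(monthly: pd.Series) -> pd.Series:
    """Average monthly volume for each calendar quarter (1 to 4)."""
    return monthly.groupby(monthly.index.quarter).mean()

# Synthetic daily shipments with a deliberately heavier Q3 (not real data)
dates = pd.date_range('2022-01-01', '2023-12-31', freq='D')
weights = np.where(dates.quarter == 3, 150.0, 100.0)
sample = pd.DataFrame({'Departure Date': dates, 'Cargo Weight (tons)': weights})

monthly = monthly_volume(sample)
smoothed = monthly.rolling(window=3, center=True).mean()  # 3-month rolling average
profile = quarterly_profile(monthly)
print(profile)
```

The centered window leaves the first and last months undefined, which is worth remembering when plotting the smoothed series next to the raw one.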
KEY INSIGHTS
Bulk Cargo Dominance
Bulk cargo constitutes the largest share of all shipments, driving the majority of port throughput volumes and shaping Nassau's overall freight cost structure.
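Cargo mix can be measured by shipment count or by tonnage, and the two do not always agree. A small sketch of both follows, assuming a `Cargo Type` column (the column name and the sample values are hypothetical).

```python
import pandas as pd

# Made-up rows, only to illustrate the two share calculations
sample = pd.DataFrame({
    'Cargo Type': ['Bulk', 'Bulk', 'Bulk', 'Container', 'Perishable'],
    'Cargo Weight (tons)': [500, 300, 200, 150, 50],
})

# Share of shipment count per cargo type
count_share = sample['Cargo Type'].value_counts(normalize=True)

# Share of total tonnage per cargo type
weight_by_type = sample.groupby('Cargo Type')['Cargo Weight (tons)'].sum()
weight_share = weight_by_type / weight_by_type.sum()

print(count_share)
print(weight_share)
```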
Cost–Transit Correlation
Transit time shows a strong positive correlation with freight cost: longer routes tend to incur higher operational expenses, a relationship confirmed with a scatter plot and the Pearson correlation coefficient.
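The coefficient itself is a one-line pandas call. The sketch below assumes the `Transit Time` and `Freight Cost ($)` columns from the loading code; the helper name and the sample values are invented for illustration and are not results from the Nassau data.

```python
import pandas as pd

def cost_transit_correlation(df: pd.DataFrame) -> float:
    """Pearson correlation between transit days and freight cost."""
    return df['Transit Time'].corr(df['Freight Cost ($)'], method='pearson')

# Made-up values with a roughly linear relationship
sample = pd.DataFrame({
    'Transit Time': [2, 4, 6, 8, 10],
    'Freight Cost ($)': [1100, 1900, 3200, 3900, 5100],
})
r = cost_transit_correlation(sample)
print(f"Pearson r = {r:.3f}")
```

Pearson's r measures the strength of a linear association only, so a scatter plot remains useful for spotting curvature or outliers that the single number would hide.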
Perishable Delay Risk
Perishable goods experience twice the delay rate of dry bulk cargo, revealing a critical vulnerability in cold-chain logistics across Nassau's maritime routes.
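A per-cargo delay rate can be derived from the same `Status` flag used earlier. This is a sketch under assumptions: the `Cargo Type` column name, the `'Dry Bulk'` and `'On Time'` labels, and the sample rows are hypothetical, and the rows are constructed so the ratio comes out to exactly two.

```python
import pandas as pd

def delay_rate_by_cargo(df: pd.DataFrame) -> pd.Series:
    """Share of shipments marked 'Delayed' within each cargo type."""
    is_delayed = df['Status'].eq('Delayed')
    return is_delayed.groupby(df['Cargo Type']).mean()

# Constructed example, not the real dataset
sample = pd.DataFrame({
    'Cargo Type': ['Perishable'] * 5 + ['Dry Bulk'] * 5,
    'Status': ['Delayed', 'Delayed', 'On Time', 'On Time', 'On Time',
               'Delayed', 'On Time', 'On Time', 'On Time', 'On Time'],
})
rates = delay_rate_by_cargo(sample)
ratio = rates['Perishable'] / rates['Dry Bulk']
print(rates)
print(f"Perishable vs dry bulk delay ratio: {ratio:.1f}")
```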
DASHBOARD ARCHITECTURE
Interactive Filters
Built dynamic Streamlit sidebar controls — date range sliders, multi-select dropdowns for cargo type, ship type, and port — that push filter state directly into Plotly chart re-renders.
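One way to wire such controls is to keep the filtering logic in a plain pandas function and let the sidebar only collect values. The sketch below assumes columns named `Cargo Type`, `Ship Type`, and `Port`; those names and both helper functions are hypothetical, not the dashboard's actual code.

```python
import pandas as pd

def apply_filters(df: pd.DataFrame, start, end, selections: dict) -> pd.DataFrame:
    """Keep rows inside the date range that match every non-empty selection."""
    mask = df['Departure Date'].between(pd.Timestamp(start), pd.Timestamp(end))
    for column, chosen in selections.items():
        if chosen:  # an empty selection means "no filter" on that column
            mask &= df[column].isin(chosen)
    return df[mask]

def render_sidebar(df: pd.DataFrame) -> pd.DataFrame:
    """Draw the sidebar widgets and return the filtered frame."""
    import streamlit as st  # imported lazily so apply_filters works without Streamlit

    first = df['Departure Date'].min().date()
    last = df['Departure Date'].max().date()
    # A tuple default turns the slider into a range slider
    start, end = st.sidebar.slider(
        'Departure date', min_value=first, max_value=last, value=(first, last)
    )
    selections = {
        column: st.sidebar.multiselect(column, sorted(df[column].unique()))
        for column in ('Cargo Type', 'Ship Type', 'Port')  # assumed column names
    }
    return apply_filters(df, start, end, selections)
```

Streamlit reruns the script from top to bottom whenever a widget changes, so any chart built from the frame returned by `render_sidebar` and passed to `st.plotly_chart` is redrawn with the new filter state.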
Sankey Route Flow
Engineered a Plotly Sankey diagram to visualize cargo flow between loading and discharge ports, making route concentration and freight pathways immediately interpretable at a glance.
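A Plotly Sankey trace needs a list of node labels plus parallel lists of source indices, target indices, and flow values. The sketch below shows one way to derive them from shipment rows; the `Loading Port` and `Discharge Port` column names and both helpers are assumptions made for illustration.

```python
import pandas as pd

def sankey_inputs(df: pd.DataFrame,
                  source_col: str = 'Loading Port',
                  target_col: str = 'Discharge Port',
                  value_col: str = 'Cargo Weight (tons)'):
    """Aggregate flows per port pair and map ports to node indices."""
    flows = df.groupby([source_col, target_col], as_index=False)[value_col].sum()
    # Loading and discharge ports get separate node sets, so a port that
    # appears on both sides does not create a cycle in the diagram
    sources = sorted(flows[source_col].unique())
    targets = sorted(flows[target_col].unique())
    labels = sources + targets
    source_idx = flows[source_col].map({p: i for i, p in enumerate(sources)})
    target_idx = flows[target_col].map(
        {p: i + len(sources) for i, p in enumerate(targets)}
    )
    return labels, source_idx.tolist(), target_idx.tolist(), flows[value_col].tolist()

def build_sankey(df: pd.DataFrame):
    """Build the Sankey figure from shipment rows."""
    import plotly.graph_objects as go  # imported lazily; only needed for drawing

    labels, source, target, value = sankey_inputs(df)
    return go.Figure(go.Sankey(
        node=dict(label=labels, pad=15, thickness=20),
        link=dict(source=source, target=target, value=value),
    ))
```

Keeping the index mapping in its own function makes it possible to check the flow totals separately from the rendering.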
CSV Export Pipeline
Implemented a one-click filtered data export using Streamlit's download button, converting the live-filtered DataFrame to a CSV in-memory buffer for zero-friction user downloads.
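The export step can be sketched as follows: the filtered DataFrame is written to an in-memory buffer and the resulting bytes are handed to `st.download_button`. The helper names, the button label, and the output file name are hypothetical.

```python
import io
import pandas as pd

def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 CSV bytes without touching disk."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    return buffer.getvalue().encode('utf-8')

def render_download(filtered: pd.DataFrame) -> None:
    """Show a download button for the currently filtered rows."""
    import streamlit as st  # imported lazily so to_csv_bytes works without Streamlit

    st.download_button(
        label='Download filtered data as CSV',
        data=to_csv_bytes(filtered),
        file_name='nassau_shipping_filtered.csv',  # hypothetical file name
        mime='text/csv',
    )
```

Because the bytes are rebuilt on each script rerun, the download always reflects whatever filters are active at the moment of the click.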