Weather Data Acquisition

How WindMar sources, processes, and visualizes meteorological data

Data Sources: WindMar integrates NOAA GFS near-real-time forecasts, Copernicus ERA5 reanalysis, and CMEMS ocean data. A synthetic fallback ensures the system always has data available for development and demonstration.

WindMar's weather visualization and route optimization depend on accurate, timely meteorological data. For each data type, the system implements a provider chain (three tiers for wind, two for waves and currents) that automatically selects the best available data source based on freshness and availability.

What Powers the Weather Map

When you open WindMar and see wind particles flowing across the Mediterranean, here's what happens behind the scenes:

  • Wind particles & grid overlay — NOAA GFS 0.25° resolution, updated every 6 hours, with ~3.5h availability lag
  • Wave height contours — Copernicus CMEMS global wave model (0.083°), or synthetic fallback
  • Ocean current particles — Copernicus CMEMS global physics model, or synthetic fallback
  • Forecast animation — 41 GFS forecast frames (f000–f120, 3h steps) animated through a Windy-style timeline

Provider Chain

Each weather data type follows a fallback chain. The system tries the most authoritative and freshest source first, falling back gracefully when APIs are unavailable or credentials are not configured.

Wind Data Chain

GFS Near-Real-Time (0.25°, ~3.5h lag)
    ↓ unavailable
ERA5 Reanalysis (0.25°, ~5-day lag)
    ↓ unavailable
Synthetic Generator (configurable resolution)

Wave Data Chain

CMEMS Global Wave Model (0.083°, near-real-time)
    ↓ unavailable or no credentials
Synthetic Wave Generator (wind-coupled)

Ocean Current Data Chain

CMEMS Global Physics Model (0.083°, near-real-time)
    ↓ unavailable or no credentials
Synthetic Current Generator (climatological patterns)

No credentials needed for wind: GFS data from NOAA NOMADS is freely available without authentication. Only CMEMS wave/current data requires a Copernicus Marine Service account (free registration).
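
The fallback behaviour of the chains above can be sketched as a loop over an ordered list of providers. This is a minimal illustration, not WindMar's actual code: the function names (first_available, fetch_gfs, fetch_era5, fetch_synthetic) and the convention that a provider returns None when it has no data are assumptions made for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider is a (name, fetch) pair; fetch returns data, or None when unavailable.
Provider = Tuple[str, Callable[[], Optional[dict]]]


def first_available(chain: Sequence[Provider]) -> Tuple[str, dict]:
    """Try each provider in order and return the first one that yields data."""
    for name, fetch in chain:
        try:
            data = fetch()
        except Exception:
            # Treat outages, missing credentials, etc. as "unavailable".
            data = None
        if data is not None:
            return name, data
    raise RuntimeError("no provider in the chain returned data")


# Hypothetical stand-ins for the real providers.
def fetch_gfs() -> Optional[dict]:
    raise ConnectionError("NOMADS unreachable")  # simulate an outage


def fetch_era5() -> Optional[dict]:
    return None  # simulate a missing CDS API key


def fetch_synthetic() -> Optional[dict]:
    return {"source": "synthetic", "u": [0.0], "v": [0.0]}


wind_chain = [("gfs", fetch_gfs), ("era5", fetch_era5), ("synthetic", fetch_synthetic)]
provider_name, wind_data = first_available(wind_chain)
```

Because the synthetic generator never fails, placing it last means the chain always returns something, which matches the "always has data available" behaviour described at the top of this page.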

Why GFS for Wind?

GFS was chosen as the primary wind source for its simplicity of access and data freshness. The NOMADS GRIB filter provides direct HTTP access to subregion extracts — no SDK, no authentication, no client library to install. Each request returns an ~80 KB GRIB2 file for the Mediterranean region, parsed directly with pygrib.

By contrast, the Copernicus ecosystem (ERA5 via CDS, waves and currents via CMEMS) requires account registration and dedicated Python clients (cdsapi for CDS, copernicusmarine for CMEMS). While CMEMS access is free, the integration pipeline is heavier: the SDK adds dependencies, the API has higher latency, and ERA5 reanalysis data carries a ~5-day lag that makes it unsuitable as a primary near-real-time source. CMEMS integration for wave and current data remains a work in progress.

Wind fields from GFS are validated visually against Windy.com, which itself uses ECMWF IFS and GFS as underlying models. A comprehensive study of available metocean data sources (ECMWF IFS, ICON, NAM, CMEMS forecasts) will follow once end-to-end route optimization is validated.

GFS (Global Forecast System)

The Global Forecast System (GFS) is NOAA's primary global weather model. It produces forecasts four times daily, covering the entire globe at 0.25° resolution (~28 km at the equator) with forecast hours extending to 384 hours (16 days).

Model Characteristics

Property | Value
Operator | NOAA / NCEP
Resolution | 0.25° (~28 km)
Model runs | 00Z, 06Z, 12Z, 18Z daily
Availability lag | ~3.5 hours after run start
Forecast range | 384 hours (16 days)
Temporal resolution | 1h (f000–f120), 3h (f120–f384)
Variables used | UGRD, VGRD at 10m above ground
Data format | GRIB2
Access | Free, no authentication

How WindMar Uses GFS

WindMar downloads only the 10-metre U and V wind components for a subregion of the globe:

  • Analysis (f000): Current conditions — loaded on page open for the wind overlay and particle animation
  • Forecast (f003–f120): 40 additional forecast frames in 3h steps, downloaded via the forecast prefetch system and animated through the timeline slider

Run Selection Algorithm

WindMar automatically determines the latest available GFS run by subtracting the 3.5-hour availability lag from the current UTC time and rounding down to the nearest run hour:

now_utc = 2026-02-08 14:30Z
available_time = 14:30 - 3:30 = 11:00Z
latest_run = max(h for h in [0, 6, 12, 18] if h <= 11) = 06Z
→ Uses GFS run: 2026-02-08 / 06Z
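
The worked example above translates directly into a few lines of Python. The sketch below is one way to write it, assuming the fixed 3.5-hour lag stated on this page; the function name latest_gfs_run is illustrative. Subtracting the lag before picking the run hour also handles the case just after midnight UTC, where the latest available run belongs to the previous day.

```python
from datetime import datetime, timedelta, timezone

GFS_RUN_HOURS = (0, 6, 12, 18)
AVAILABILITY_LAG = timedelta(hours=3, minutes=30)


def latest_gfs_run(now_utc: datetime) -> datetime:
    """Return the start time of the most recent GFS run assumed to be available."""
    # Subtracting the lag first lets the date roll back across midnight.
    available = now_utc - AVAILABILITY_LAG
    run_hour = max(h for h in GFS_RUN_HOURS if h <= available.hour)
    return available.replace(hour=run_hour, minute=0, second=0, microsecond=0)


run = latest_gfs_run(datetime(2026, 2, 8, 14, 30, tzinfo=timezone.utc))
run_date, run_hour = run.strftime("%Y%m%d"), run.hour  # "20260208", 6
```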

NOMADS GRIB Filter API

NOAA's NOMADS (NOAA Operational Model Archive and Distribution System) provides a GRIB filter service that allows downloading subsets of GFS output without transferring the entire global file (~300 MB per forecast hour).

URL Construction

Each GRIB download is constructed as a parameterized HTTP GET request:

https://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_0p25.pl?
  file=gfs.t06z.pgrb2.0p25.f024       # Run hour + forecast hour
  &lev_10_m_above_ground=on            # Surface wind level
  &var_UGRD=on                          # U-component of wind
  &var_VGRD=on                          # V-component of wind
  &subregion=                            # Enable spatial filtering
  &leftlon=-15                           # Western bound
  &rightlon=40                           # Eastern bound
  &toplat=60                             # Northern bound
  &bottomlat=30                          # Southern bound
  &dir=/gfs.20260208/06/atmos            # Model run directory

Parameters Explained

Parameter | Description | Example
file | GRIB2 filename with run hour and forecast hour | gfs.t06z.pgrb2.0p25.f024
lev_10_m_above_ground | Selects the 10m wind level | on
var_UGRD | U-component (east-west) of wind | on
var_VGRD | V-component (north-south) of wind | on
subregion | Enables spatial subsetting | (empty)
leftlon, rightlon | Longitude bounds (accepts -180..180 or 0..360) | -15, 40
toplat, bottomlat | Latitude bounds | 60, 30
dir | Server directory for the model run | /gfs.20260208/06/atmos
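
As an illustration of how these parameters combine, the following sketch assembles the request URL with the standard library. The function name build_grib_filter_url and its argument layout are assumptions for the example; only the endpoint and parameter names come from the tables above. Note that urlencode percent-encodes the slashes in dir, so the result is a single valid query string rather than the annotated multi-line form shown earlier.

```python
from urllib.parse import urlencode

NOMADS_FILTER_URL = "https://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_0p25.pl"


def build_grib_filter_url(run_date, run_hour, forecast_hour,
                          lat_min=30.0, lat_max=60.0,
                          lon_min=-15.0, lon_max=40.0):
    """Build a GRIB filter URL for 10 m U/V wind over a lat/lon box."""
    params = [
        ("file", f"gfs.t{run_hour:02d}z.pgrb2.0p25.f{forecast_hour:03d}"),
        ("lev_10_m_above_ground", "on"),
        ("var_UGRD", "on"),
        ("var_VGRD", "on"),
        ("subregion", ""),                 # empty value enables subsetting
        ("leftlon", f"{lon_min:g}"),
        ("rightlon", f"{lon_max:g}"),
        ("toplat", f"{lat_max:g}"),
        ("bottomlat", f"{lat_min:g}"),
        ("dir", f"/gfs.{run_date}/{run_hour:02d}/atmos"),
    ]
    return f"{NOMADS_FILTER_URL}?{urlencode(params)}"


url = build_grib_filter_url("20260208", 6, 24)
```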

Rate Limiting

NOMADS rate limit: NOAA asks users to limit requests. WindMar enforces a 2-second delay between consecutive downloads during forecast prefetch. For the default Mediterranean region (30–60°N, 15°W–40°E), each GRIB2 file is approximately 80 KB.

File Size Estimates

Region | Grid Points | Per-File Size | 41 Frames Total
Mediterranean (30–60°N, 15°W–40°E) | 121 × 221 = 26,741 | ~80 KB | ~3.3 MB
Global | 721 × 1440 = 1,038,240 | ~8 MB | ~330 MB
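
The grid-point counts in the table follow from the bounding box and the 0.25° step, counting both edges. A small sketch of that arithmetic (the helper name grid_shape is illustrative; the per-file size of 80 KB is the estimate quoted above, not a computed value):

```python
def grid_shape(lat_min, lat_max, lon_min, lon_max, step=0.25):
    """Return (n_lat, n_lon) for a regular grid that includes both edges."""
    n_lat = round((lat_max - lat_min) / step) + 1
    n_lon = round((lon_max - lon_min) / step) + 1
    return n_lat, n_lon


med_shape = grid_shape(30, 60, -15, 40)           # default Mediterranean box
global_shape = grid_shape(-90, 90, 0, 359.75)     # full 0.25° global grid
med_points = med_shape[0] * med_shape[1]
global_points = global_shape[0] * global_shape[1]

# 41 frames at the quoted ~80 KB each, in megabytes.
forecast_total_mb = 41 * 80 / 1000
```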

ERA5 Reanalysis

ERA5 is ECMWF's fifth-generation atmospheric reanalysis dataset, produced by the Copernicus Climate Change Service (C3S) at ECMWF and distributed through the Copernicus Climate Data Store (CDS). It combines historical observations with numerical weather models to produce a consistent, gap-free record of past weather.

Key Characteristics

Property | Value
Operator | ECMWF / Copernicus CDS
Resolution | 0.25° (~31 km)
Temporal resolution | Hourly
Coverage | 1940 to present
Availability lag | ~5 days
Format | NetCDF / GRIB2
Access | Free with CDS API key

When WindMar Uses ERA5

ERA5 serves as the second-tier fallback in the wind provider chain:

  • When GFS data is temporarily unavailable (NOMADS outages, network issues)
  • When the pygrib library is not installed in the Docker container
  • For historical analysis and route replay

Note: ERA5 requires a CDS API key configured via the CDSAPI_KEY environment variable. Register at cds.climate.copernicus.eu.

GRIB2 Data Format

GRIB2 (GRIdded Binary, Edition 2) is the WMO standard format for meteorological data. Each GRIB2 file contains one or more messages, where each message represents a single field (e.g., U-wind at 10m) on a regular latitude/longitude grid.

GFS GRIB2 Message Structure

WindMar downloads GRIB2 files containing exactly two messages:

Message | Short Name | Description | Units
1 | 10u (UGRD) | U-component of 10m wind (eastward) | m/s
2 | 10v (VGRD) | V-component of 10m wind (northward) | m/s

Reading GRIB2 in Python

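import pygrib

# Open a cached GFS subregion file (two messages: 10 m U and V wind).
grbs = pygrib.open("gfs_20260208_06_f024_lat30_60_lon-15_40.grib2")

# select() returns a list of matching messages; take the first of each.
u_msg = grbs.select(shortName='10u')[0]
v_msg = grbs.select(shortName='10v')[0]

u_data = u_msg.values    # 2D numpy array [lat, lon], m/s
v_data = v_msg.values

# latlons() returns 2D coordinate grids with the same shape as the data.
lats_2d, lons_2d = u_msg.latlons()
lats = lats_2d[:, 0]     # 1D latitude vector
lons = lons_2d[0, :]     # 1D longitude vector

grbs.close()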

Leaflet-Velocity JSON Format

WindMar's frontend uses leaflet-velocity to render animated wind and current particles on the map. This library expects data in a specific JSON format derived from GRIB2 conventions.

Format Structure

[
  {
    "header": {
      "parameterCategory": 2,    // Meteorological: Momentum
      "parameterNumber": 2,      // U-component
      "lo1": -15.0,              // First longitude (western edge)
      "la1": 60.0,               // First latitude (northern edge)
      "lo2": 40.0,               // Last longitude (eastern edge)
      "la2": 30.0,               // Last latitude (southern edge)
      "dx": 0.25,                // Longitude step (degrees)
      "dy": 0.25,                // Latitude step (degrees)
      "nx": 221,                 // Number of longitude points
      "ny": 121,                 // Number of latitude points
      "refTime": "2026-02-08T06:00:00"
    },
    "data": [...]                // Flattened N→S, W→E array
  },
  {
    "header": {
      ...
      "parameterNumber": 3       // V-component
    },
    "data": [...]
  }
]

Data ordering: Leaflet-velocity expects data ordered from North to South (descending latitude), West to East (ascending longitude). GFS GRIB2 data arrives South-to-North, so WindMar flips the array before serving it to the frontend.
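
To make the flip concrete, here is a minimal sketch that turns one south-to-north grid into a leaflet-velocity component. It uses plain Python sequences so it runs without extra dependencies (rows of a numpy array would work the same way). The function name to_velocity_component is an assumption for the example, not WindMar's actual converter.

```python
def to_velocity_component(rows_s_to_n, lats, lons, parameter_number,
                          ref_time, dx=0.25, dy=0.25):
    """Build one leaflet-velocity object from rows ordered south-to-north.

    Each row runs west-to-east. parameter_number is 2 for U, 3 for V.
    """
    # Reverse the row order so the first row is the northern edge.
    rows_n_to_s = [list(row) for row in reversed(list(rows_s_to_n))]
    header = {
        "parameterCategory": 2,
        "parameterNumber": parameter_number,
        "lo1": float(min(lons)), "la1": float(max(lats)),  # north-west corner
        "lo2": float(max(lons)), "la2": float(min(lats)),  # south-east corner
        "dx": dx, "dy": dy,
        "nx": len(rows_n_to_s[0]), "ny": len(rows_n_to_s),
        "refTime": ref_time,
    }
    # Flatten row by row: north to south, and west to east within each row.
    data = [float(v) for row in rows_n_to_s for v in row]
    return {"header": header, "data": data}


# Tiny 2 x 3 grid: the first row is the southern one.
u_component = to_velocity_component(
    rows_s_to_n=[[1, 2, 3], [4, 5, 6]],
    lats=[30.0, 30.25],
    lons=[-15.0, -14.75, -14.5],
    parameter_number=2,
    ref_time="2026-02-08T06:00:00",
)
```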

Coordinate Conventions

Different data sources use different longitude conventions, which WindMar handles automatically:

Source | Longitude Range | Example: 15°W
GFS GRIB2 (native) | 0° to 360° | 345°
ERA5 NetCDF | -180° to 180° | -15°
Leaflet / Frontend | -180° to 180° | -15°
NOMADS API | Accepts both conventions | -15° or 345°

WindMar's GFSDataProvider converts all longitudes to the -180..180 convention after reading GRIB2 data, re-sorting the data array to maintain ascending longitude order.
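
The conversion and re-sort described above might look like the following sketch. It is written with plain lists for portability; the helper name to_signed_longitudes is illustrative and the real GFSDataProvider may implement this differently (for example with numpy indexing).

```python
def to_signed_longitudes(lons_0_360, rows):
    """Convert 0..360 longitudes to -180..180 and reorder columns to match.

    rows is a 2D sequence whose columns correspond to lons_0_360.
    """
    signed = [lon - 360.0 if lon > 180.0 else lon for lon in lons_0_360]
    # Column order that makes the converted longitudes ascend.
    order = sorted(range(len(signed)), key=lambda i: signed[i])
    sorted_lons = [signed[i] for i in order]
    sorted_rows = [[row[i] for i in order] for row in rows]
    return sorted_lons, sorted_rows


# A box straddling Greenwich: 345° (15°W) comes last in 0..360 order.
new_lons, new_rows = to_signed_longitudes(
    [0.0, 20.0, 40.0, 345.0],
    [[1, 2, 3, 4]],
)
```

Without the re-sort, the western columns would stay at the end of each row and the map would draw them on the wrong side of the region.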

Caching Strategy

All downloaded weather data is cached to disk. The cache key encodes the GFS run, forecast hour, and geographic bounds, ensuring that data is never re-downloaded for the same parameters.

Cache File Naming

data/gfs_cache/
  gfs_{YYYYMMDD}_{HH}_f{FFF}_lat{S}_{N}_lon{W}_{E}.grib2

Example:
  gfs_20260208_06_f024_lat30_60_lon-15_40.grib2

  20260208             run date (YYYYMMDD)
  06                   run hour (HH)
  f024                 forecast hour (FFF)
  lat30_60_lon-15_40   geographic bounds (S, N, W, E)
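
A cache path following this naming pattern could be built as below. The helper name cache_path is hypothetical, and the way non-integer bounds are formatted is an assumption; only the pattern and directory come from this page.

```python
from pathlib import Path

CACHE_DIR = Path("data/gfs_cache")


def cache_path(run_date, run_hour, forecast_hour,
               lat_min, lat_max, lon_min, lon_max):
    """Return the cache file path for one GFS frame and bounding box."""
    # ":g" drops trailing zeros, so 30.0 becomes "30" (an assumed convention).
    name = (
        f"gfs_{run_date}_{run_hour:02d}_f{forecast_hour:03d}"
        f"_lat{lat_min:g}_{lat_max:g}_lon{lon_min:g}_{lon_max:g}.grib2"
    )
    return CACHE_DIR / name


path = cache_path("20260208", 6, 24, 30.0, 60.0, -15.0, 40.0)
already_cached = path.exists()  # a hit means no download is needed
```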

Cache Invalidation

  • Automatic: New GFS runs produce new filenames, so stale data coexists without conflict
  • Cleanup: The clear_old_cache(keep_hours=12) method removes files older than 12 hours
  • Disk usage: ~3.3 MB for a full 41-frame forecast (Mediterranean region)
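
The cleanup rule can be sketched as follows. This is a standalone function rather than the actual clear_old_cache method, and it assumes file age is judged by modification time, which this page does not specify. The demonstration uses a temporary directory so it does not touch a real cache.

```python
import os
import tempfile
import time
from pathlib import Path


def clear_old_cache(cache_dir, keep_hours=12, now=None):
    """Delete cached GRIB2 files older than keep_hours; return how many."""
    cutoff = (time.time() if now is None else now) - keep_hours * 3600
    removed = 0
    for path in Path(cache_dir).glob("gfs_*.grib2"):
        if path.stat().st_mtime < cutoff:  # assumption: age = modification time
            path.unlink()
            removed += 1
    return removed


with tempfile.TemporaryDirectory() as tmp:
    fresh = Path(tmp) / "gfs_20260208_06_f000_lat30_60_lon-15_40.grib2"
    stale = Path(tmp) / "gfs_20260207_12_f000_lat30_60_lon-15_40.grib2"
    fresh.write_bytes(b"")
    stale.write_bytes(b"")
    # Backdate one file by 13 hours so it falls outside the 12-hour window.
    old = time.time() - 13 * 3600
    os.utime(stale, (old, old))
    removed_count = clear_old_cache(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```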

Forecast Prefetch Workflow

When the user enables the forecast timeline, the following sequence occurs:

  1. Frontend sends POST /api/weather/forecast/prefetch
  2. Backend starts a background task that downloads f000–f120 in 3h steps (41 files)
  3. Frontend polls GET /api/weather/forecast/status every 3 seconds
  4. Progress bar shows N/41 frames cached
  5. Once all frames are cached, frontend calls GET /api/weather/forecast/frames
  6. Backend reads all 41 GRIB2 files, converts to velocity JSON, returns as single bulk response
  7. Frontend stores all frames in memory and enables the timeline slider

Performance: A full prefetch with an empty cache takes about 82 seconds (41 files × 2 s rate-limit delay). Subsequent loads served from cache complete in under 1 second.
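
Step 2 of the workflow, the background download loop, can be sketched like this. The callables is_cached and download are hypothetical hooks standing in for the cache lookup and the NOMADS request; the sleep function is injectable so the loop can be exercised without waiting.

```python
import time

FORECAST_HOURS = list(range(0, 121, 3))  # f000 to f120 in 3h steps


def prefetch(hours, is_cached, download, delay_s=2.0, sleep=time.sleep):
    """Download every uncached forecast hour, pausing between downloads."""
    downloaded = 0
    for hour in hours:
        if is_cached(hour):
            continue  # cache hit: no request and no delay
        if downloaded > 0:
            sleep(delay_s)  # rate limit between consecutive downloads
        download(hour)
        downloaded += 1
    return downloaded


# Dry run with fakes: two frames are already cached, nothing really sleeps.
cache = {0, 3}
pauses = []
fetched = prefetch(
    FORECAST_HOURS,
    is_cached=lambda h: h in cache,
    download=cache.add,
    sleep=pauses.append,
)
```

Skipping cached hours before sleeping is what makes repeat loads fast: a fully cached run performs no requests and no pauses at all.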

Forecast Timeline

The forecast timeline is a Windy.com-style animation control that lets users scrub through 5 days (120 hours) of GFS wind forecasts. It renders as a bar at the bottom of the map.

Timeline Controls

Control | Function
Play / Pause | Auto-advance through forecast hours
Time display | Shows T+Nh and the valid UTC datetime
Range slider | 0 to 120 in steps of 3, with day markers at 0h/24h/48h/72h/96h/120h
Speed (1x/2x/4x) | Animation speed: 2s/1s/0.5s per frame
Close (X) | Disables forecast mode, reverts to live f000 data
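
The logic behind the time display and the speed control is simple enough to sketch. The example below is in Python for consistency with the rest of this page, although the real controls live in the frontend. The label format and the wrap-around to 0 after the last frame are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Seconds per frame for each speed setting, from the table above.
FRAME_INTERVAL_S = {1: 2.0, 2: 1.0, 4: 0.5}


def time_label(run_time, forecast_hour):
    """Format the T+Nh offset together with the valid UTC time."""
    valid = run_time + timedelta(hours=forecast_hour)
    return f"T+{forecast_hour}h {valid:%Y-%m-%d %H:%M} UTC"


def next_forecast_hour(current, step=3, last=120):
    """Advance one frame; loop back to 0 after the last one (assumed)."""
    return 0 if current >= last else current + step


run_time = datetime(2026, 2, 8, 6, 0, tzinfo=timezone.utc)
label = time_label(run_time, 24)
```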

How Animation Works

Only the wind particles animate through forecast hours. The underlying grid overlay (colored wind speed cells) stays at f000, matching the Windy.com pattern where the particle layer conveys temporal change while the background provides spatial context.

When the user moves the slider or the animation advances, the frontend swaps the windVelocityData state with the pre-loaded forecast frame for that hour. The VelocityParticleLayer component re-renders with the new data, creating the visual impression of wind patterns evolving over time.

Weather Model Comparison

How the data sources WindMar uses compare to other major weather models. All sources marked “Free” require no payment; “Free (account)” means free registration is needed.

Model | Operator | Resolution | Lag | Forecast Range | Cost | WindMar Status
GFS | NOAA | 0.25° | ~3.5h | 384h (16d) | Free | Primary wind source
ERA5 | ECMWF / CDS | 0.25° | ~5 days | Reanalysis only | Free (API key) | Wind fallback
ECMWF IFS | ECMWF | 0.1° | ~6h | 240h (10d) | Commercial | Not used
ICON | DWD | 0.125° | ~4h | 180h (7.5d) | Free | Not used
CMEMS Waves | Copernicus | 0.083° | Near RT | 240h (10d) | Free (account) | Wave source (integration in progress)
CMEMS Physics | Copernicus | 0.083° | Near RT | 240h (10d) | Free (account) | Current source (integration in progress)

Weather & Forecast API Reference

GET /api/weather/wind/velocity

Returns wind data in leaflet-velocity format. Supports forecast hours for timeline animation.

Parameters

  • lat_min float — Southern latitude bound (default: 30.0)
  • lat_max float — Northern latitude bound (default: 60.0)
  • lon_min float — Western longitude bound (default: -15.0)
  • lon_max float — Eastern longitude bound (default: 40.0)
  • forecast_hour int — GFS forecast hour, 0–120 in steps of 3 (default: 0)

Response

Array of two velocity data objects (U-component, V-component) in leaflet-velocity format.

GET /api/weather/forecast/status

Returns GFS run information and which forecast hours are cached.

Response

{
  "run_date": "20260208",
  "run_hour": "06",
  "total_hours": 41,
  "cached_hours": 15,
  "complete": false,
  "prefetch_running": true,
  "hours": [
    {"forecast_hour": 0, "valid_time": "2026-02-08T06:00:00Z", "cached": true},
    {"forecast_hour": 3, "valid_time": "2026-02-08T09:00:00Z", "cached": true},
    ...
  ]
}
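
A client polling this endpoint mostly needs the progress fraction and the list of hours still missing. The sketch below derives both from a status payload; the helper name summarize_status is illustrative, and the sample payload is a shortened, made-up instance of the response shape shown above.

```python
import json


def summarize_status(status):
    """Reduce a /forecast/status payload to what a progress bar needs."""
    total = status["total_hours"]
    cached = status["cached_hours"]
    return {
        "label": f"{cached}/{total} frames cached",
        "fraction": cached / total if total else 0.0,
        "missing": [h["forecast_hour"] for h in status["hours"] if not h["cached"]],
        "ready": status["complete"],
    }


# Shortened illustrative payload (a real response lists all 41 hours).
sample = json.loads("""
{
  "run_date": "20260208", "run_hour": "06",
  "total_hours": 41, "cached_hours": 2,
  "complete": false, "prefetch_running": true,
  "hours": [
    {"forecast_hour": 0, "valid_time": "2026-02-08T06:00:00Z", "cached": true},
    {"forecast_hour": 3, "valid_time": "2026-02-08T09:00:00Z", "cached": true},
    {"forecast_hour": 6, "valid_time": "2026-02-08T12:00:00Z", "cached": false}
  ]
}
""")
summary = summarize_status(sample)
```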

POST /api/weather/forecast/prefetch

Triggers background download of all 41 GFS forecast hours (f000–f120). Returns immediately. Poll /status for progress.

Parameters

  • lat_min, lat_max, lon_min, lon_max — Geographic bounds (same defaults as velocity endpoint)

Response

{"status": "started", "message": "Prefetch triggered in background"}

GET /api/weather/forecast/frames

Bulk endpoint returning all cached forecast frames in velocity format. Call after prefetch completes. Response is gzip-compressed (~5–10 MB).

Response

{
  "run_date": "20260208",
  "run_hour": "06",
  "run_time": "2026-02-08T06:00:00",
  "total_hours": 41,
  "cached_hours": 41,
  "frames": {
    "0":  [{"header": {...}, "data": [...]}, {"header": {...}, "data": [...]}],
    "3":  [{"header": {...}, "data": [...]}, {"header": {...}, "data": [...]}],
    "6":  [...],
    ...
    "120": [...]
  }
}
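
One detail worth handling when consuming this response: the keys of frames are strings, so a plain sort would place "120" before "3". A minimal sketch of walking the frames in forecast order, and of picking U and V by their parameterNumber (2 and 3, as in the format section above), follows. The helper names are illustrative and the sample payload is a stub with empty data arrays.

```python
def split_uv(frame):
    """Return (u, v) from a frame, matched by header parameterNumber."""
    by_number = {c["header"]["parameterNumber"]: c for c in frame}
    return by_number[2], by_number[3]


def ordered_frames(response):
    """Yield (forecast_hour, u, v) in ascending forecast-hour order."""
    # Keys are strings, so sort numerically rather than lexicographically.
    for key in sorted(response["frames"], key=int):
        u, v = split_uv(response["frames"][key])
        yield int(key), u, v


def stub_frame():
    # Stand-in frame; V is listed first to show order does not matter.
    return [
        {"header": {"parameterNumber": 3}, "data": []},
        {"header": {"parameterNumber": 2}, "data": []},
    ]


sample_response = {"frames": {k: stub_frame() for k in ("0", "3", "120", "12")}}
hours_in_order = [hour for hour, _, _ in ordered_frames(sample_response)]
first_u = next(ordered_frames(sample_response))[1]
```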