2 Easy Steps to Download Historical Weather Data

The National Oceanic and Atmospheric Administration (NOAA) Integrated Surface Database (ISD) is one of the richest sources of historical weather data, consisting of hourly and synoptic observations. This post introduces a simple way to retrieve the raw data and process it into a pandas DataFrame in Python.

Selection of weather station

Based on the location of the city or sampling site, we can search for the closest suitable weather station and get its ID. Here, Milan, Italy is used as the example, and the main steps are listed below. Detailed information on weather stations worldwide is available for download (link).

import pandas as pd

isd_st = pd.read_csv("/mnt/d/Dropbox/data/geo/Meteo/NOAA/global/isd-history.csv")
## filter by country code
isd_st[isd_st['CTRY'] == 'IT']
## filter by station name
isd_st[isd_st["STATION NAME"].astype(str).str.contains("MILAN")]

## Choose the preferred monitoring station and note its "USAF" and "WBAN" codes.
## For Milan, USAF == 160800 and WBAN == 99999, so the target ID is "160800-99999".
## The ID must be a string; unquoted, 160800-99999 would be evaluated as a subtraction.
station_ID = "160800-99999"

For a more visual approach, stations can also be looked up through the Find a station link.
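Picking the closest station can also be done programmatically. The following is a minimal sketch, assuming the station table exposes `LAT` and `LON` columns in decimal degrees alongside `USAF` and `WBAN`. The helper names `haversine_km` and `nearest_station_id` are hypothetical, and the sample rows are made up for illustration (only the Milan ID comes from the example above).

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))


def nearest_station_id(stations, lat, lon):
    """Return the 'USAF-WBAN' ID of the station closest to (lat, lon)."""
    valid = stations.dropna(subset=["LAT", "LON"])
    dist = haversine_km(lat, lon, valid["LAT"], valid["LON"])
    row = valid.loc[dist.idxmin()]
    # Zero-pad both parts: USAF has 6 characters, WBAN has 5 digits.
    return f"{str(row['USAF']).zfill(6)}-{int(row['WBAN']):05d}"


# Illustrative rows only; coordinates are approximate and not copied from the station list.
stations = pd.DataFrame({
    "USAF": ["160800", "000001", "000002"],
    "WBAN": [99999, 99999, 99999],
    "STATION NAME": ["STATION A", "STATION B", "STATION C"],
    "LAT": [45.45, 45.63, 41.80],
    "LON": [9.28, 8.72, 12.24],
})

# Approximate coordinates of central Milan
station_ID = nearest_station_id(stations, 45.46, 9.19)
print(station_ID)
```

Returning the ID as a zero-padded string keeps it directly usable when building the file name for the download step.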

FTP downloading and processing

from ftplib import FTP
from pathlib import Path

ftp = FTP("ftp.ncdc.noaa.gov")
ftp.login()  # anonymous login

## NOAA ISD contains multiple datasets. Here we choose ISD-Lite, which is recorded hourly.
ftp.cwd('/pub/data/noaa/isd-lite/2013/')

# List all files in the 2013 directory
files = ftp.nlst()

# Files are named <USAF>-<WBAN>-<year>.gz
target = f"{station_ID}-2013.gz"
out_dir = Path('./isd_data')
out_dir.mkdir(parents=True, exist_ok=True)

# Download the file that matches the station ID
for file in files:
    if file == target:
        print("Downloading..." + file)
        with open(out_dir / file, 'wb') as f:
            ftp.retrbinary(f'RETR {file}', f.write)
ftp.quit()

import numpy as np

def calc_rh(T, TD):
    # Relative humidity (%) from air temperature T and dew point TD, both in deg C
    RH = 100*(np.exp((17.625*TD)/(243.04+TD))/np.exp((17.625*T)/(243.04+T)))
    return RH

def read_year_meteo(filepath):
    # pandas infers gzip compression from the ".gz" extension
    df = pd.read_csv(filepath, sep=r'\s+', header=None)
    # NOTE: missing values are coded as -9999; trace precipitation is coded as -1
    # SCALING FACTOR: 10 (all fields except WD and Cloud)
    df.columns = ['Year','Month','Day','Hour','Air temp','Dew point','Pressure','WD','WS','Cloud','1h prep','6h prep']
    df['Date'] = df['Year'].astype(str)+'-'+df["Month"].apply("{0:0=2d}".format)+'-'+df["Day"].apply("{0:0=2d}".format)+' '+df["Hour"].apply("{0:0=2d}".format)+':00:00'
    df = df[['Date','Air temp','Dew point','Pressure','WD','WS','Cloud','1h prep','6h prep']]
    df = df.replace(-9999, np.nan)
    for t in ['Air temp','Dew point','Pressure','WS','1h prep','6h prep']:
        df[t] = df[t]/10.0
    df['RH'] = calc_rh(df['Air temp'], df['Dew point'])
    return df

milan_2013 = read_year_meteo('./isd_data/160800-99999-2013.gz')
milan_2013.to_csv("./milan_weather_data_hourly.csv", index = False)
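To see what the processing step does without downloading anything, the same logic can be run on a couple of records held in memory. This is only a sketch: the two records below are made up in the ISD-Lite layout (they are not real observations), and the names `COLUMNS`, `SCALED` and `sample` are introduced here for illustration.

```python
import io

import numpy as np
import pandas as pd

COLUMNS = ['Year', 'Month', 'Day', 'Hour', 'Air temp', 'Dew point',
           'Pressure', 'WD', 'WS', 'Cloud', '1h prep', '6h prep']
SCALED = ['Air temp', 'Dew point', 'Pressure', 'WS', '1h prep', '6h prep']

# Two made-up records in the ISD-Lite layout (not real observations).
sample = (
    "2013 01 01 00    25    25 10213   180    20     8    -1 -9999\n"
    "2013 01 01 01    31    12 -9999   200    35 -9999     0 -9999\n"
)

df = pd.read_csv(io.StringIO(sample), sep=r'\s+', header=None, names=COLUMNS)
df = df.replace(-9999, np.nan)      # missing-value code -> NaN
df[SCALED] = df[SCALED] / 10.0      # undo the scaling factor of 10

# Same formula as calc_rh above
t, td = df['Air temp'], df['Dew point']
df['RH'] = 100 * np.exp(17.625 * td / (243.04 + td)) / np.exp(17.625 * t / (243.04 + t))

print(df[['Air temp', 'Dew point', 'Pressure', '1h prep', '6h prep', 'RH']])
```

Two details are worth noticing in the result. When the dew point equals the air temperature (first record), the relative humidity is 100. Trace precipitation, coded as -1 in the raw file, becomes -0.1 after scaling, so it may need separate handling before precipitation is summed.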

Done!

