Time Series Data

Time series data summarizes biological activity across time for each radar station. It is created by integrating vertical profile data across the vertical dimension, which yields one record per volume scan.

Table of contents

  1. Definitions of Daily Periods, Sunrise, and Sunset
  2. Scan-Level Time Series
  3. 5-Minute Time Series
  4. Combining and Unstacking Data

Definitions of Daily Periods, Sunrise, and Sunset

For convenience, each record gives its time relative to local sunrise or sunset, along with the daily period and the solar elevation, in addition to the absolute UTC time.

Three different periods are defined for each radar station and date:

  • day: Between local sunrise and local sunset on the specified date.
  • night: Between local sunset on the specified date and local sunrise the next day.
  • utc_calendar_day: From 00:00 UTC to 23:59 UTC on the specified date.

Sunrise and sunset are defined as the times when the solar elevation (the angle between the sun’s center and the horizon) equals -0.8333 degrees, which, with standard atmospheric refraction, is the moment when the sun’s upper limb crosses the horizon. Calculations use the NREL Solar Position Algorithm (Reda and Andreas, 2004), as implemented in the Python pvlib package.

Reference: Reda, Ibrahim, and Afshin Andreas. “Solar position algorithm for solar radiation applications.” Solar Energy 76.5 (2004): 577-589.
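
The period definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the data: the function name period_fields, the half-open interval boundaries, and the round-number sunrise and sunset times in the usage example are all assumptions chosen for illustration (real values would come from a solar position calculation such as the one in pvlib).

```python
from datetime import datetime, timezone


def period_fields(timestamp, sunrise, sunset, next_sunrise):
    """Classify a UTC timestamp as 'day' or 'night' for one period date.

    sunrise and sunset belong to the period date; next_sunrise is the
    sunrise of the following day. All arguments are timezone-aware.
    Returns (period, period_time_hours, period_length_hours).
    """
    # Assumption: intervals are half-open, [start, end).
    if sunrise <= timestamp < sunset:
        period, start, end = "day", sunrise, sunset
    elif sunset <= timestamp < next_sunrise:
        period, start, end = "night", sunset, next_sunrise
    else:
        raise ValueError("timestamp falls outside this date's day and night")
    period_time = (timestamp - start).total_seconds() / 3600.0
    period_length = (end - start).total_seconds() / 3600.0
    return period, period_time, period_length


# Hypothetical round-number times, not taken from the dataset.
utc = timezone.utc
sunrise = datetime(2016, 12, 31, 12, 0, tzinfo=utc)
sunset = datetime(2016, 12, 31, 21, 0, tzinfo=utc)
next_sunrise = datetime(2017, 1, 1, 12, 0, tzinfo=utc)

result = period_fields(datetime(2017, 1, 1, 0, 30, tzinfo=utc),
                       sunrise, sunset, next_sunrise)
print(result)  # ('night', 3.5, 15.0)
```

Because a night starts on one local date and ends on the next, a record shortly after 00:00 UTC can carry the previous date as its period_date.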

Scan-Level Time Series

The scan-level time series has one record per radar volume scan; each station completes a scan every 4 to 10 minutes. Records are grouped into files by station and year, and the files are organized into directories by year. An example file is:

scans/2017/KBOX-2017.csv

The data looks like this:

Source: KBOX-2017.csv

station datetime solar_elevation period_date period period_time period_length reflectivity reflectivity_unfiltered traffic_rate traffic_rate_unfiltered u v speed direction fraction_rain rmse
KBOX 2017-01-01T00:03:44Z -28.9065 2016-12-31 night 2.6720 14.8185 1.1082 30.2624 129.7419 3609.4937 17.7099 26.9328 32.5198 33.3273 0.0441 11.7738
KBOX 2017-01-01T00:08:40Z -29.8189 2016-12-31 night 2.7542 14.8185 1.9516 42.3745 234.6426 5096.9646 18.5884 27.0999 33.3982 34.4470 0.0380 11.8894
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
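
A minimal sketch of reading such a file with Python's standard library follows. It assumes the file is comma-separated with the header shown above (the table here is rendered with spaces) and that a missing value appears as an empty field; parse_row is a hypothetical helper name. An in-memory string holding the first sample row stands in for the real file.

```python
import csv
import io
from datetime import datetime, timezone

# Columns kept as text; period_date could also be parsed into a date.
TEXT_FIELDS = {"station", "period_date", "period"}


def parse_row(row):
    """Convert one CSV record (a dict of strings) to typed values."""
    out = {}
    for name, text in row.items():
        if name == "datetime":
            # Timestamps look like 2017-01-01T00:03:44Z (UTC).
            parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
            out[name] = parsed.replace(tzinfo=timezone.utc)
        elif name in TEXT_FIELDS:
            out[name] = text
        else:
            # Assumption: an empty field marks a missing value.
            out[name] = float(text) if text != "" else None
    return out


# Stand-in for open("scans/2017/KBOX-2017.csv", newline="")
sample = io.StringIO(
    "station,datetime,solar_elevation,period_date,period,period_time,"
    "period_length,reflectivity,reflectivity_unfiltered,traffic_rate,"
    "traffic_rate_unfiltered,u,v,speed,direction,fraction_rain,rmse\n"
    "KBOX,2017-01-01T00:03:44Z,-28.9065,2016-12-31,night,2.6720,14.8185,"
    "1.1082,30.2624,129.7419,3609.4937,17.7099,26.9328,32.5198,33.3273,"
    "0.0441,11.7738\n"
)
records = [parse_row(row) for row in csv.DictReader(sample)]
print(records[0]["station"], records[0]["datetime"], records[0]["reflectivity"])
```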

Schema: Scan-Level Time Series

Source: scan.json

A time series of biology measurements for radar volume scans. The measurements are obtained by vertically integrating the profile data. In rare cases a value is missing because the profile variable it depends on is missing at all elevations.

Name Description Type Unit
station *

The 4-character station ID. See station metadata.

  • required: true
  • pattern: [A-Z]{4}
string
datetime *

UTC timestamp of record. For scan-level time series, this is the timestamp of the radar volume scan. For 5-minute time series, it is a regularly-spaced 5-minute timestamp.

  • required: true
datetime
solar_elevation *

Sun elevation in degrees above horizon (negative for below horizon) at the radar station’s location for the date and time of the record.

  • required: true
number degree
period_date *

The local date of the start of the period (night or day) to which the record belongs. See the period field.

  • required: true
date
period *

The period (night or day) to which the record belongs, as determined by the station’s location and the date and time of the record.

  • required: true
string
period_time *

Hours from the beginning of the period to the time of the current record. See the period field.

  • required: true
  • minimum: 0.0
number h
period_length *

Length of the period. See the period field.

  • required: true
  • minimum: 0.0
number h
reflectivity

Vertically integrated reflectivity. Represents total radar cross section in cm2 in a vertical column above one square kilometer of the earth’s surface. Only includes scattering volumes identified as biology.

  • minimum: 0.0
number cm2 km-2
reflectivity_unfiltered

Same as reflectivity, but includes all scattering volumes, including those classified as precipitation.

  • minimum: 0.0
number cm2 km-2
traffic_rate

Reflectivity traffic rate, computed by vertically integrating reflectivity multiplied by speed. Represents total radar cross section in cm2 crossing over a one kilometer transect in one hour, where the transect is adaptively chosen for each height bin to be perpendicular to the mean direction of travel in that height bin. Only includes scattering volumes identified as biology.

  • minimum: 0.0
number cm2 km-1 h-1
traffic_rate_unfiltered

Same as traffic_rate, but includes all scattering volumes, including those classified as precipitation.

  • minimum: 0.0
number cm2 km-1 h-1
u

Zonal (east-west) velocity component, computed as reflectivity-weighted average over height bins.

number m s-1
v

Meridional (north-south) velocity component, computed as reflectivity-weighted average over height bins.

number m s-1
speed

Mean (ground) speed of travel, computed as a reflectivity-weighted average over height bins. This is the average speed of scatterers across height bins, regardless of direction. It differs from, and is generally higher than, the magnitude of the reflectivity-weighted average velocity (u, v), because objects moving in different directions partially cancel when velocities are averaged.

  • minimum: 0.0
number m s-1
direction

Mean direction of travel, computed directly from u and v. The angle is given as a compass bearing in degrees clockwise from north.

  • minimum: 0.0
  • maximum: 360.0
number degree
fraction_rain

Fraction of scattering volumes classified as precipitation.

  • minimum: 0.0
  • maximum: 1.0
number fraction
rmse

Root-mean-square error of the VVP fit, computed as a reflectivity-weighted average over height bins.

  • minimum: 0.0
number m s-1
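
The relationship between u, v, speed, and direction can be illustrated with a short sketch. It assumes direction is the compass bearing atan2(u, v) wrapped to the range 0 to 360, with u positive eastward and v positive northward; this convention reproduces the direction in the first sample row above, but the schema does not state the formula. The per-bin values in the second part are made up to show the cancellation effect described for speed, and bearing and weighted_mean are hypothetical helper names.

```python
import math


def bearing(u, v):
    """Compass bearing in degrees clockwise from north for velocity (u, v).

    Assumption: u is positive eastward and v is positive northward.
    """
    return math.degrees(math.atan2(u, v)) % 360.0


def weighted_mean(values, weights):
    """Weighted average of values; weights play the role of reflectivity."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# First sample row of KBOX-2017.csv: u = 17.7099, v = 26.9328.
print(round(bearing(17.7099, 26.9328), 2))  # 33.33 (the row lists 33.3273)

# Hypothetical height bins: three units of reflectivity moving east at
# 10 m/s and one unit moving west at 10 m/s.
u_bins = [10.0, -10.0]
v_bins = [0.0, 0.0]
reflectivity_bins = [3.0, 1.0]

mean_u = weighted_mean(u_bins, reflectivity_bins)
mean_v = weighted_mean(v_bins, reflectivity_bins)
mean_speed = weighted_mean(
    [math.hypot(u, v) for u, v in zip(u_bins, v_bins)], reflectivity_bins
)

print(mean_speed)                  # 10.0
print(math.hypot(mean_u, mean_v))  # 5.0
```

Every scatterer in the made-up example moves at 10 m/s, so the mean speed is 10 m/s, while the opposing directions halve the magnitude of the mean velocity.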

5-Minute Time Series

The 5-Minute Time Series data resamples the Scan-Level Time Series data to a fixed time step of five minutes (instead of the original irregular time step of 4 to 10 minutes), which often makes it easier to analyze and combine with other time series data.

There is one file per station-year, organized into directories by year. An example file would be:

5min/2017/KBOX-2017-5min.csv
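
Both file layouts follow a predictable pattern, which a small helper can capture. The name series_path is hypothetical, and the patterns are inferred only from the two example paths given in this document.

```python
from pathlib import PurePosixPath


def series_path(station, year, kind="scans"):
    """Relative path of a time series file for one station and year.

    kind is "scans" for scan-level data or "5min" for 5-minute data.
    """
    if kind == "scans":
        name = f"{station}-{year}.csv"
    elif kind == "5min":
        name = f"{station}-{year}-5min.csv"
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    return PurePosixPath(kind) / str(year) / name


print(series_path("KBOX", 2017))          # scans/2017/KBOX-2017.csv
print(series_path("KBOX", 2017, "5min"))  # 5min/2017/KBOX-2017-5min.csv
```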

The data is similar to the scan-level time series. It looks like this:

Source: KBOX-2017-5min.csv

station datetime solar_elevation period_date period period_time period_length reflectivity reflectivity_unfiltered traffic_rate traffic_rate_unfiltered u v speed direction fraction_rain rmse filled
KBOX 2017-01-01T00:00:00Z -28.2169 2016-12-31 night 2.6098 14.8185 1.1082 30.2624 129.7419 3609.4937 17.7099 26.9328 32.5198 33.3273 0.0441 11.7738 0
KBOX 2017-01-01T00:05:00Z -29.1406 2016-12-31 night 2.6931 14.8185 1.3247 33.3723 156.6759 3991.4119 17.9355 26.9757 32.7453 33.6189 0.0425 11.8035 0
KBOX 2017-01-01T00:10:00Z -30.0656 2016-12-31 night 2.7764 14.8185 1.6832 37.7803 199.5330 4535.2454 17.7703 26.7170 32.5709 33.6292 0.0440 11.9710 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...

The 5-minute data differs from the scan-level data in the following ways:

  • The datetime column now has timestamps at regular 5-minute intervals, and there is a new filled column recording when values are considered to be “filled in”.
  • The metadata columns from solar_elevation through period_length are computed from the location and timestamp given by station and datetime.
  • The data fields from reflectivity through rmse are linearly interpolated from the scan-level data if the nearest measurement is less than one hour away, and left empty otherwise. If the nearest measurement is more than 10 minutes but less than one hour away, the values are considered “filled” and the filled column is set to 1.
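
The interpolation rule for the data fields can be sketched as follows. This is an illustration under stated assumptions, not the code used to build the files: interpolate_at is a hypothetical name, a timestamp with scans on only one side simply takes the nearest scan's value, and since the documentation does not say what filled holds when the values are left empty, the sketch returns None for both.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone

MAX_GAP = timedelta(hours=1)      # at or beyond this the value is left empty
FILL_GAP = timedelta(minutes=10)  # beyond this the value counts as filled


def interpolate_at(target, times, values):
    """Linearly interpolate a scan-level series at one 5-minute timestamp.

    times must be strictly increasing; values is the parallel list of
    measurements. Returns (value, filled).
    """
    i = bisect_left(times, target)  # index of first scan at or after target
    neighbours = [j for j in (i - 1, i) if 0 <= j < len(times)]
    if not neighbours:
        return None, None
    nearest = min(abs(times[j] - target) for j in neighbours)
    if nearest >= MAX_GAP:
        return None, None
    if len(neighbours) == 1:
        # Assumption: no scan on one side, so hold the nearest value.
        value = values[neighbours[0]]
    else:
        t0, t1 = times[i - 1], times[i]
        weight = (target - t0) / (t1 - t0)
        value = values[i - 1] + weight * (values[i] - values[i - 1])
    filled = 1 if nearest > FILL_GAP else 0
    return value, filled


utc = timezone.utc
# Reflectivity from the first two scans of KBOX-2017.csv.
times = [datetime(2017, 1, 1, 0, 3, 44, tzinfo=utc),
         datetime(2017, 1, 1, 0, 8, 40, tzinfo=utc)]
values = [1.1082, 1.9516]

value, filled = interpolate_at(datetime(2017, 1, 1, 0, 5, tzinfo=utc), times, values)
print(round(value, 4), filled)  # 1.3247 0
```

The result agrees with the reflectivity listed for 00:05:00 in the 5-minute sample, which suggests the sketch matches the documented rule at least for this row.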

Schema: Five-Minute Time Series

Source: 5min.json

A time series of biological measurements for a single radar station at a regular 5-minute interval. Measurements are obtained by interpolating the Scan-Level Time Series to a 5-minute time step.

Name Description Type Unit
station *

The 4-character station ID. See station metadata.

  • required: true
  • pattern: [A-Z]{4}
string
datetime *

UTC timestamp of record. For scan-level time series, this is the timestamp of the radar volume scan. For 5-minute time series, it is a regularly-spaced 5-minute timestamp.

  • required: true
datetime
solar_elevation *

Sun elevation in degrees above horizon (negative for below horizon) at the radar station’s location for the date and time of the record.

  • required: true
number degree
period_date *

The local date of the start of the period (night or day) to which the record belongs. See the period field.

  • required: true
date
period *

The period (night or day) to which the record belongs, as determined by the station’s location and the date and time of the record.

  • required: true
string
period_time *

Hours from the beginning of the period to the time of the current record. See the period field.

  • required: true
  • minimum: 0.0
number h
period_length *

Length of the period. See the period field.

  • required: true
  • minimum: 0.0
number h
reflectivity

Vertically integrated reflectivity. Represents total radar cross section in cm2 in a vertical column above one square kilometer of the earth’s surface. Only includes scattering volumes identified as biology.

  • minimum: 0.0
number cm2 km-2
reflectivity_unfiltered

Same as reflectivity, but includes all scattering volumes, including those classified as precipitation.

  • minimum: 0.0
number cm2 km-2
traffic_rate

Reflectivity traffic rate, computed by vertically integrating reflectivity multiplied by speed. Represents total radar cross section in cm2 crossing over a one kilometer transect in one hour, where the transect is adaptively chosen for each height bin to be perpendicular to the mean direction of travel in that height bin. Only includes scattering volumes identified as biology.

  • minimum: 0.0
number cm2 km-1 h-1
traffic_rate_unfiltered

Same as traffic_rate, but includes all scattering volumes, including those classified as precipitation.

  • minimum: 0.0
number cm2 km-1 h-1
u

Zonal (east-west) velocity component, computed as reflectivity-weighted average over height bins.

number m s-1
v

Meridional (north-south) velocity component, computed as reflectivity-weighted average over height bins.

number m s-1
speed

Mean (ground) speed of travel, computed as a reflectivity-weighted average over height bins. This is the average speed of scatterers across height bins, regardless of direction. It differs from, and is generally higher than, the magnitude of the reflectivity-weighted average velocity (u, v), because objects moving in different directions partially cancel when velocities are averaged.

  • minimum: 0.0
number m s-1
direction

Mean direction of travel, computed directly from u and v. The angle is given as a compass bearing in degrees clockwise from north.

  • minimum: 0.0
  • maximum: 360.0
number degree
fraction_rain

Fraction of scattering volumes classified as precipitation.

  • minimum: 0.0
  • maximum: 1.0
number fraction
rmse

Root-mean-square error of the VVP fit, computed as a reflectivity-weighted average over height bins.

  • minimum: 0.0
number m s-1
filled

Indicates whether the measurement fields are considered “filled”. True if the nearest record in the original scan-level time series is more than 10 minutes but less than one hour from the timestamp (datetime) of this record. If the nearest record in the original data is more than one hour away, the measurement fields are left blank.

boolean
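
The constraints listed in the schema can be checked mechanically. The sketch below hard-codes a subset of them (the required fields, the station pattern, and the numeric bounds) and assumes the station pattern must match the whole ID. The names validate_record, REQUIRED, and BOUNDS are hypothetical, and this is not a substitute for validating against 5min.json itself.

```python
import re

REQUIRED = ["station", "datetime", "solar_elevation", "period_date",
            "period", "period_time", "period_length"]
# Assumption: the pattern applies to the whole station ID.
STATION_PATTERN = re.compile(r"[A-Z]{4}")
# (minimum, maximum) from the schema; None means unbounded.
BOUNDS = {
    "period_time": (0.0, None),
    "period_length": (0.0, None),
    "reflectivity": (0.0, None),
    "reflectivity_unfiltered": (0.0, None),
    "traffic_rate": (0.0, None),
    "traffic_rate_unfiltered": (0.0, None),
    "speed": (0.0, None),
    "direction": (0.0, 360.0),
    "fraction_rain": (0.0, 1.0),
    "rmse": (0.0, None),
}


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in REQUIRED:
        if record.get(name) is None:
            problems.append(f"{name}: required field is missing")
    station = record.get("station")
    if station is not None and not STATION_PATTERN.fullmatch(station):
        problems.append("station: does not match [A-Z]{4}")
    for name, (low, high) in BOUNDS.items():
        value = record.get(name)
        if value is None:
            continue  # optional measurement fields may be missing
        if low is not None and value < low:
            problems.append(f"{name}: {value} is below {low}")
        if high is not None and value > high:
            problems.append(f"{name}: {value} is above {high}")
    return problems


# Values from the first 5-minute sample row; bad is a deliberately
# corrupted copy.
good = {"station": "KBOX", "datetime": "2017-01-01T00:00:00Z",
        "solar_elevation": -28.2169, "period_date": "2016-12-31",
        "period": "night", "period_time": 2.6098, "period_length": 14.8185,
        "direction": 33.3273, "fraction_rain": 0.0441}
bad = dict(good, station="kbox", fraction_rain=1.5)

print(validate_record(good))  # []
print(validate_record(bad))
```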

Combining and Unstacking Data

Both the scan-level and 5-minute time-series data are in a stacked format with one row per timestamp and station. This makes it easy to combine data for many stations or years by simply concatenating the rows:

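import pandas as pd
# One CSV file per station-year; parse the datetime column as timestamps.
files = ['KBOX-2017-5min.csv', 'KENX-2017-5min.csv']
# ignore_index avoids duplicate row labels in the combined frame.
df = pd.concat([pd.read_csv(file, parse_dates=['datetime']) for file in files], ignore_index=True)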

For 5-minute time-series data, timestamps are shared across stations. Analysts may want to pivot the data to an unstacked format with one row per timestamp and columns corresponding to the same variable across different stations. In Python this can be done as follows:

df = df.pivot(
    index="datetime",
    columns="station",
    values="reflectivity"
)

This gives:

Source: 2017-5min-unstacked.csv

datetime KBOX KENX
2017-01-01 00:00:00+00:00 1.1082 0.2102
2017-01-01 00:05:00+00:00 1.3247 0.198
2017-01-01 00:10:00+00:00 1.6832 0.1749
2017-01-01 00:15:00+00:00 0.9276 0.373
2017-01-01 00:20:00+00:00 0.8962 0.2181
... ... ...
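
To keep more than one variable when unstacking, DataFrame.pivot also accepts a list for values, which produces two-level column labels (variable, station). The sketch below builds a tiny stacked frame in memory rather than reading files: the reflectivity values come from the table above and the KBOX traffic rates from the 5-minute sample, while the KENX traffic rates are made-up placeholders.

```python
import pandas as pd

# Tiny stacked frame standing in for the concatenated CSV files.
stacked = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2017-01-01T00:00:00Z", "2017-01-01T00:05:00Z"] * 2, utc=True
    ),
    "station": ["KBOX", "KBOX", "KENX", "KENX"],
    "reflectivity": [1.1082, 1.3247, 0.2102, 0.198],
    # The KENX traffic rates (last two values) are placeholders, not real data.
    "traffic_rate": [129.7419, 156.6759, 20.0, 19.0],
})

wide = stacked.pivot(
    index="datetime",
    columns="station",
    values=["reflectivity", "traffic_rate"],
)

# Select one variable to get back a frame with one column per station.
print(wide["reflectivity"])
```

Note that pivot raises a ValueError if any (datetime, station) pair occurs more than once, so duplicate rows need to be removed before unstacking.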