Time Series Data
Time series data summarizes biological activity across time for each radar station. It is created by integrating vertical profile data across the vertical dimension, which aggregates each profile to the volume-scan level.
Definitions of Daily Periods, Sunrise, and Sunset
For convenience, time series data gives absolute UTC times together with times relative to local sunrise and sunset, the daily period each record belongs to, and the solar elevation.
Three different periods are defined for each radar station and date:
- day: Between local sunrise and local sunset on the specified date.
- night: Between local sunset on the specified date and local sunrise the next day.
- utc_calendar_day: From 00:00 UTC to 23:59 UTC on the specified date.
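As a rough illustration of how the period columns relate to these definitions, the sketch below computes `period_time` and `period_length` for a night record. It takes the sunset and next-sunrise times as given inputs; here they are backed out of the first KBOX sample row shown later on this page rather than computed from a solar position model. The function name `period_fields` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def period_fields(t, period_start, period_end):
    """Return (period_time, period_length) in hours for a record at time t.

    Hypothetical helper: period_start and period_end are the sunrise/sunset
    times that bound the period (e.g. sunset and the next day's sunrise
    for a night).
    """
    if not (period_start <= t < period_end):
        raise ValueError("timestamp is outside the period")
    period_time = (t - period_start).total_seconds() / 3600.0
    period_length = (period_end - period_start).total_seconds() / 3600.0
    return period_time, period_length


# Boundaries of the night of 2016-12-31 at KBOX, backed out of the first
# sample row (period_time = 2.6720 h, period_length = 14.8185 h).
first_scan = datetime(2017, 1, 1, 0, 3, 44, tzinfo=timezone.utc)
sunset = first_scan - timedelta(hours=2.6720)
next_sunrise = sunset + timedelta(hours=14.8185)

# Apply the same boundaries to the second sample scan.
second_scan = datetime(2017, 1, 1, 0, 8, 40, tzinfo=timezone.utc)
period_time, period_length = period_fields(second_scan, sunset, next_sunrise)
print(round(period_time, 4), round(period_length, 4))
```

Because the night starts on the local date of the sunset, both scans carry a `period_date` of 2016-12-31 even though their UTC timestamps fall on 2017-01-01.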
Sunrise and sunset are defined as the times when the solar elevation (the angle between the sun’s center and the horizon) is equal to -0.8333 degrees, which, with standard refraction, is the moment when the sun’s upper limb crosses the horizon. Calculations are done using the NREL Solar Position Algorithm (Reda and Andreas, 2004), as implemented in the Python pvlib package.
Reference: Reda, Ibrahim, and Afshin Andreas. “Solar position algorithm for solar radiation applications.” Solar energy 76.5 (2004): 577-589.
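The -0.8333 degree threshold can be read as the sum of two conventional corrections. The short sketch below assumes the commonly quoted values of 34 arcminutes for atmospheric refraction at the horizon and 16 arcminutes for the sun's apparent angular radius; neither number is stated on this page, so treat them as assumptions.

```python
# Conventional values, assumed here rather than taken from this documentation.
REFRACTION_ARCMIN = 34.0    # standard atmospheric refraction at the horizon
SEMIDIAMETER_ARCMIN = 16.0  # apparent angular radius of the solar disk

# The sun's center is this far below the horizon when the upper limb
# appears to touch it (60 arcminutes per degree).
sunrise_elevation_deg = -(REFRACTION_ARCMIN + SEMIDIAMETER_ARCMIN) / 60.0
print(f"{sunrise_elevation_deg:.4f}")
```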
Scan-Level Time Series
The scan-level time series has one record per radar volume scan, which occurs every 4 to 10 minutes for each station. Records are grouped into files by station and year, and organized into directories by year, so an example file would be:
scans/2017/KBOX-2017.csv
The data looks like this:
Source: KBOX-2017.csv
| station | datetime | solar_elevation | period_date | period | period_time | period_length | reflectivity | reflectivity_unfiltered | traffic_rate | traffic_rate_unfiltered | u | v | speed | direction | fraction_rain | rmse |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KBOX | 2017-01-01T00:03:44Z | -28.9065 | 2016-12-31 | night | 2.6720 | 14.8185 | 1.1082 | 30.2624 | 129.7419 | 3609.4937 | 17.7099 | 26.9328 | 32.5198 | 33.3273 | 0.0441 | 11.7738 |
| KBOX | 2017-01-01T00:08:40Z | -29.8189 | 2016-12-31 | night | 2.7542 | 14.8185 | 1.9516 | 42.3745 | 234.6426 | 5096.9646 | 18.5884 | 27.0999 | 33.3982 | 34.4470 | 0.0380 | 11.8894 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Schema: Scan-Level Time Series
Source: scan.json
A time series of biology measurements for radar volume scans. The measurements are obtained by vertically integrating the profile data. Missing values may rarely occur when a dependent variable in the profile is missing at all elevations.
| Name | Description | Type | Unit |
|---|---|---|---|
| station * | The 4-character station ID. See station metadata. | string | |
| datetime * | UTC timestamp of record. For scan-level time series, this is the timestamp of the radar volume scan. For 5-minute time series, it is a regularly-spaced 5-minute timestamp. | datetime | |
| solar_elevation * | Sun elevation in degrees above horizon (negative for below horizon) at the radar station’s location for the date and time of the record. | number | degree |
| period_date * | The local date of the start of the period (night or day) to which the record belongs. See the Definitions of Daily Periods, Sunrise, and Sunset section. | date | |
| period * | The period (night or day) to which the record belongs, as determined by the station’s location and the date and time of the record. | string | |
| period_time * | Hours from the beginning of the period to the time of the current record. See the Definitions of Daily Periods, Sunrise, and Sunset section. | number | h |
| period_length * | Length of the period. See the Definitions of Daily Periods, Sunrise, and Sunset section. | number | h |
| reflectivity | Vertically integrated reflectivity. Represents total radar cross section in cm2 in a vertical column above one square kilometer of the earth’s surface. Only includes scattering volumes identified as biology. | number | cm2 km-2 |
| reflectivity_unfiltered | Same as `reflectivity`, but not restricted to scattering volumes identified as biology. | number | cm2 km-2 |
| traffic_rate | Reflectivity traffic rate, computed by vertically integrating reflectivity multiplied by speed. Represents total radar cross section in cm2 crossing over a one kilometer transect in one hour, where the transect is adaptively chosen for each height bin to be perpendicular to the mean direction of travel in that height bin. Only includes scattering volumes identified as biology. | number | cm2 km-1 h-1 |
| traffic_rate_unfiltered | Same as `traffic_rate`, but not restricted to scattering volumes identified as biology. | number | cm2 km-1 h-1 |
| u | Zonal (east-west) velocity component, computed as reflectivity-weighted average over height bins. | number | m s-1 |
| v | Meridional (north-south) velocity component, computed as reflectivity-weighted average over height bins. | number | m s-1 |
| speed | Mean (ground) speed of travel, computed as reflectivity-weighted average over height bins. This represents the average speed of scatterers across elevation bins, regardless of direction. This is different from, and generally higher than, the magnitude of the reflectivity-weighted average velocities `u` and `v`. | number | m s-1 |
| direction | Mean direction of travel, computed directly from `u` and `v`. | number | degree |
| fraction_rain | Fraction of scattering volumes classified as precipitation. | number | fraction |
| rmse | Root-mean-square error of VVP fit, reflectivity-weighted average over height bins. | number | m s-1 |
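The relationship between `u`, `v`, `speed`, and `direction` can be checked against the first KBOX sample row. The sketch below assumes that `direction` is a compass bearing in degrees clockwise from north, pointing the way the scatterers are heading; that convention is inferred from the sample values, not stated in the schema. The two-bin example at the end uses made-up velocities and equal weights purely to show why mean speed is at least as large as the magnitude of the mean velocity.

```python
import math


def direction_and_magnitude(u, v):
    """Bearing (degrees clockwise from north, assumed convention) and
    magnitude of the velocity vector with eastward component u and
    northward component v."""
    direction = math.degrees(math.atan2(u, v)) % 360.0
    magnitude = math.hypot(u, v)
    return direction, magnitude


# First KBOX sample row: u, v, and the reported speed (m/s).
u, v, speed = 17.7099, 26.9328, 32.5198
direction, magnitude = direction_and_magnitude(u, v)
print(round(direction, 2), round(magnitude, 2), speed)

# Hypothetical two height bins with equal weight: one heading east,
# one heading north, both at 10 m/s.
bins = [(10.0, 0.0), (0.0, 10.0)]
mean_speed = sum(math.hypot(bu, bv) for bu, bv in bins) / len(bins)
mean_u = sum(bu for bu, _ in bins) / len(bins)
mean_v = sum(bv for _, bv in bins) / len(bins)
mean_vector_magnitude = math.hypot(mean_u, mean_v)
print(mean_speed, round(mean_vector_magnitude, 2))
```

The computed bearing agrees with the reported `direction` for that row, and the vector magnitude is slightly lower than the reported `speed`, as the schema description says it generally will be.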
5-Minute Time Series
The 5-Minute Time Series data resamples the Scan-Level Time Series data to a fixed time step of five minutes (instead of the original irregular time step of 4 to 10 minutes), which often makes it easier to analyze and combine with other time series data.
There is one file per station-year, organized into directories by year. An example file would be:
5min/2017/KBOX-2017-5min.csv
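The two directory layouts follow the same pattern, so file paths can be built from a station ID and a year. This is a minimal sketch based only on the two example paths on this page; the helper names are hypothetical and any root directory or URL prefix is left out.

```python
from pathlib import PurePosixPath


def scan_file(station, year):
    """Relative path of a scan-level file, e.g. scans/2017/KBOX-2017.csv."""
    return PurePosixPath("scans", str(year), f"{station}-{year}.csv")


def five_minute_file(station, year):
    """Relative path of a 5-minute file, e.g. 5min/2017/KBOX-2017-5min.csv."""
    return PurePosixPath("5min", str(year), f"{station}-{year}-5min.csv")


print(scan_file("KBOX", 2017))
print(five_minute_file("KBOX", 2017))
```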
The data is similar to the scan-level time series. It looks like this:
Source: KBOX-2017-5min.csv
| station | datetime | solar_elevation | period_date | period | period_time | period_length | reflectivity | reflectivity_unfiltered | traffic_rate | traffic_rate_unfiltered | u | v | speed | direction | fraction_rain | rmse | filled |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KBOX | 2017-01-01T00:00:00Z | -28.2169 | 2016-12-31 | night | 2.6098 | 14.8185 | 1.1082 | 30.2624 | 129.7419 | 3609.4937 | 17.7099 | 26.9328 | 32.5198 | 33.3273 | 0.0441 | 11.7738 | 0 |
| KBOX | 2017-01-01T00:05:00Z | -29.1406 | 2016-12-31 | night | 2.6931 | 14.8185 | 1.3247 | 33.3723 | 156.6759 | 3991.4119 | 17.9355 | 26.9757 | 32.7453 | 33.6189 | 0.0425 | 11.8035 | 0 |
| KBOX | 2017-01-01T00:10:00Z | -30.0656 | 2016-12-31 | night | 2.7764 | 14.8185 | 1.6832 | 37.7803 | 199.5330 | 4535.2454 | 17.7703 | 26.7170 | 32.5709 | 33.6292 | 0.0440 | 11.9710 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
The differences from the scan-level time series are:
- The `datetime` column now has timestamps at regular 5-minute intervals, and there is a new `filled` column recording when values are considered to be “filled in”.
- The metadata columns from `solar_elevation` through `period_length` are computed from the location and timestamp given by `station` and `datetime`.
- The data fields from `reflectivity` through `rmse` are interpolated from the scan-level data using linear interpolation if the time to the nearest measurement is less than one hour, and left empty otherwise. If the time to the nearest measurement is less than one hour but greater than 10 minutes, the measurement is considered “filled” and the `filled` column is set to `1`.
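The interpolation rule can be sketched in plain Python as follows. This is an illustration of the rule as described, not the code used to produce the files. Two details are assumptions: grid times before the first or after the last scan take the nearest scan's value, and `filled` is returned as `None` when the value is left empty. The function name `resample_to_grid` is hypothetical, and `scan_times` must be sorted and non-empty.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone

MAX_GAP = timedelta(hours=1)      # beyond this, the value is left empty
FILL_GAP = timedelta(minutes=10)  # beyond this, the value counts as "filled"


def resample_to_grid(scan_times, scan_values, grid_times):
    """Return a (value, filled) pair for each grid time."""
    results = []
    for t in grid_times:
        i = bisect_left(scan_times, t)
        before = i - 1 if i > 0 else None
        after = i if i < len(scan_times) else None
        # Time to the nearest scan on either side of the grid time.
        nearest = min(abs(t - scan_times[j])
                      for j in (before, after) if j is not None)
        if nearest >= MAX_GAP:
            results.append((None, None))  # assumption: filled is also empty
            continue
        if before is None or after is None:
            # Assumption: hold the nearest value outside the scan range.
            value = scan_values[after if before is None else before]
        else:
            t0, t1 = scan_times[before], scan_times[after]
            weight = (t - t0) / (t1 - t0)  # timedelta ratio is a float
            value = scan_values[before] + weight * (
                scan_values[after] - scan_values[before])
        results.append((value, int(nearest > FILL_GAP)))
    return results


utc = timezone.utc
# Reflectivity from the two KBOX scan-level sample rows.
scan_times = [datetime(2017, 1, 1, 0, 3, 44, tzinfo=utc),
              datetime(2017, 1, 1, 0, 8, 40, tzinfo=utc)]
scan_values = [1.1082, 1.9516]
grid = [datetime(2017, 1, 1, 0, 5, tzinfo=utc),
        datetime(2017, 1, 1, 0, 30, tzinfo=utc),
        datetime(2017, 1, 1, 2, 0, tzinfo=utc)]
results = resample_to_grid(scan_times, scan_values, grid)
for t, (value, filled) in zip(grid, results):
    print(t.isoformat(), value, filled)
```

With only these two scans as input, the 00:05:00 value matches the `reflectivity` shown for that timestamp in the 5-minute sample table, the 00:30:00 value is marked as filled, and the 02:00:00 value is left empty.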
Schema: Five-Minute Time Series
Source: 5min.json
A time series of biological measurements for a single radar station at a regular 5-minute interval. Measurements are obtained by interpolating the Scan-Level Time Series to a 5-minute time step.
| Name | Description | Type | Unit |
|---|---|---|---|
| station * | The 4-character station ID. See station metadata. | string | |
| datetime * | UTC timestamp of record. For scan-level time series, this is the timestamp of the radar volume scan. For 5-minute time series, it is a regularly-spaced 5-minute timestamp. | datetime | |
| solar_elevation * | Sun elevation in degrees above horizon (negative for below horizon) at the radar station’s location for the date and time of the record. | number | degree |
| period_date * | The local date of the start of the period (night or day) to which the record belongs. See the Definitions of Daily Periods, Sunrise, and Sunset section. | date | |
| period * | The period (night or day) to which the record belongs, as determined by the station’s location and the date and time of the record. | string | |
| period_time * | Hours from the beginning of the period to the time of the current record. See the Definitions of Daily Periods, Sunrise, and Sunset section. | number | h |
| period_length * | Length of the period. See the Definitions of Daily Periods, Sunrise, and Sunset section. | number | h |
| reflectivity | Vertically integrated reflectivity. Represents total radar cross section in cm2 in a vertical column above one square kilometer of the earth’s surface. Only includes scattering volumes identified as biology. | number | cm2 km-2 |
| reflectivity_unfiltered | Same as `reflectivity`, but not restricted to scattering volumes identified as biology. | number | cm2 km-2 |
| traffic_rate | Reflectivity traffic rate, computed by vertically integrating reflectivity multiplied by speed. Represents total radar cross section in cm2 crossing over a one kilometer transect in one hour, where the transect is adaptively chosen for each height bin to be perpendicular to the mean direction of travel in that height bin. Only includes scattering volumes identified as biology. | number | cm2 km-1 h-1 |
| traffic_rate_unfiltered | Same as `traffic_rate`, but not restricted to scattering volumes identified as biology. | number | cm2 km-1 h-1 |
| u | Zonal (east-west) velocity component, computed as reflectivity-weighted average over height bins. | number | m s-1 |
| v | Meridional (north-south) velocity component, computed as reflectivity-weighted average over height bins. | number | m s-1 |
| speed | Mean (ground) speed of travel, computed as reflectivity-weighted average over height bins. This represents the average speed of scatterers across elevation bins, regardless of direction. This is different from, and generally higher than, the magnitude of the reflectivity-weighted average velocities `u` and `v`. | number | m s-1 |
| direction | Mean direction of travel, computed directly from `u` and `v`. | number | degree |
| fraction_rain | Fraction of scattering volumes classified as precipitation. | number | fraction |
| rmse | Root-mean-square error of VVP fit, reflectivity-weighted average over height bins. | number | m s-1 |
| filled | Indicates whether the measurement fields are considered “filled”. True if the nearest record in the original scan-level time series data is more than 10 minutes but less than one hour from the timestamp. | boolean | |
Combining and Unstacking Data
Both the scan-level and 5-minute time-series data are in a stacked format with one row per timestamp and station. This makes it easy to combine data for many stations or years by simply concatenating the rows:
```python
import pandas as pd

# One file per station-year; concatenating simply stacks their rows.
files = ['KBOX-2017-5min.csv', 'KENX-2017-5min.csv']
df = pd.concat([pd.read_csv(file, parse_dates=['datetime']) for file in files])
```
For 5-minute time-series data, timestamps are shared across stations. Analysts may want to pivot the data to an unstacked format with one row per timestamp and columns corresponding to the same variable across different stations. In Python this can be done as follows:
```python
# One row per timestamp, one reflectivity column per station.
df = df.pivot(
    index="datetime",
    columns="station",
    values="reflectivity"
)
```
This gives:
Source: 2017-5min-unstacked.csv
| datetime | KBOX | KENX |
|---|---|---|
| 2017-01-01 00:00:00+00:00 | 1.1082 | 0.2102 |
| 2017-01-01 00:05:00+00:00 | 1.3247 | 0.198 |
| 2017-01-01 00:10:00+00:00 | 1.6832 | 0.1749 |
| 2017-01-01 00:15:00+00:00 | 0.9276 | 0.373 |
| 2017-01-01 00:20:00+00:00 | 0.8962 | 0.2181 |
| ... | ... | ... |