The regulatory Community Multiscale Air Quality (CMAQ) model is a means
of understanding the sources, concentrations and regulatory attainment of air
pollutants within a model’s domain. Substantial resources are allocated
to the evaluation of model performance. The Regionalized Air quality Model
Performance (RAMP) method introduced here explores novel ways of visualizing and
evaluating CMAQ model performance and errors for daily Particulate Matter
≤ 2.5 micrometers (PM2.5) concentrations across the continental United
States. The RAMP method performs a non-homogeneous, non-linear, non-homoscedastic
model performance evaluation at each CMAQ grid cell. This work demonstrates that CMAQ
model performance, for a well-documented 2001 regulatory episode, is
non-homogeneous across space/time. The RAMP correction of systematic errors
outperforms other model evaluation methods, as demonstrated by a 22.1%
reduction in Mean Square Error compared to a constant domain-wide correction.
The RAMP method is able to accurately reproduce simulated performance with a
correlation of

Particulate Matter ≤ 2.5 micrometers in diameter (PM2.5) is one of
the six “criteria air pollutants” regulated in the United States
(

The goal of this work is to address this significant knowledge gap by introducing a method that assesses model performance at any space/time region of interest across the spatiotemporal continuum. Advantages for assessing model performance at any region across a continuum include being able to 1) exactly delineate geographical patterns of modeling errors and 2) correct systematic errors across the modeling domain for individual CMAQ grid concentrations.

Systematic errors are consistent deviations of modeled data from observed
data. Systematic errors, once assessed, can be used to correct the modeled value.
The remaining error, i.e. the random noise of the modeled value around the observed
data, is the random error. While current CMAQ model performance evaluation methods
are multifaceted (

The Regionalized Air quality Model Performance (RAMP) method introduced in
this work assesses model performance across the spatiotemporal continuum of daily
PM2.5 across the continental US. Our framework is a regionalized space/time
extension of the Constant Air quality Model Performance (CAMP) method (

This work demonstrates the use of the RAMP method for daily PM2.5 mass predicted by CMAQ across the entirety of the continental United States. As an evaluation of the RAMP method, we have chosen a regulatory episode developed for the years 2001 and 2002. The model performance for this episode has been well documented and thus provides an ideal case study. The results of the RAMP analysis include maps showing the geographical variations of systematic and random errors displayed at the resolution of an individual CMAQ grid cell. These results provide new insights about model performance that complement existing performance evaluation methods. The RAMP results are helpful in making decisions on resource allocation for further improvement of the air quality model. Furthermore, calculating systematic errors for individual CMAQ grid cells facilitates systematic error correction, leading to maps of PM2.5 concentrations with improved mapping accuracy.

Daily observed PM2.5 values for each space/time location during
2000–2002 were constructed from data collected at monitoring
stations measuring either hourly or daily PM2.5, obtained from the EPA’s
Air Quality System (AQS) database (

Random variables

In this work, metrics divide error in a dichotomous
manner, namely into systematic and random errors.
Systematic errors are consistent errors between observed and modeled CMAQ data
and can be removed by calculating the mean systematic error. Random errors
are the residual errors remaining once the systematic error is removed. Random
errors can be conceptualized as the random noise between CMAQ and observed data.
Total error is the sum of the two. In the naming convention of a statistic, the
first letter(s) identify the statistical operator as follows: M=Mean,
V=Variance, S=Standard deviation, RMS=square Root of the Mean of Squared values.
The last letter(s) identify the value of interest as follows: E=Error
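(modeled minus observed value), S=Standardized
error, NE=Normalized Error. Over a domain 𝒟, ME^{2}(𝒟) quantifies the
systematic error, VE(𝒟) quantifies the random error, and
MSE(𝒟) = ME^{2}(𝒟) + VE(𝒟) quantifies the total error.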
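The split of total error into a systematic and a random part can be checked numerically. The Python sketch below is only an illustration: it assumes the error is defined as modeled minus observed (consistent with a negative mean error denoting underestimation), uses the population variance so that the identity holds exactly, and runs on made-up values rather than data from this study. The function name `error_decomposition` is hypothetical.

```python
import numpy as np

def error_decomposition(modeled, observed):
    """Return (ME, VE, MSE) for paired modeled and observed values."""
    error = np.asarray(modeled, dtype=float) - np.asarray(observed, dtype=float)
    me = error.mean()           # ME: mean error, the systematic component
    ve = error.var()            # VE: population variance of the error, the random component
    mse = np.mean(error ** 2)   # MSE: mean squared error, the total error
    return me, ve, mse

# Made-up illustrative concentrations (ug/m3), not data from this study.
modeled = [8.0, 12.5, 20.1, 5.3, 15.0]
observed = [9.1, 14.0, 18.7, 7.2, 16.4]

me, ve, mse = error_decomposition(modeled, observed)
print(f"ME = {me:.3f}, VE = {ve:.3f}, MSE = {mse:.3f}, ME^2 + VE = {me**2 + ve:.3f}")
```

With the population variance (NumPy's default `ddof=0`), the printed MSE and ME^2 + VE agree to floating-point precision.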

The CAMP method (_{k}_{1}
(_{k}_{k}_{2}
(_{k}_{k}_{k}_{k}_{i}_{i}_{k}_{i}_{k}_{i}_{i}
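As a rough illustration of the kind of calculation a non-linear, non-homoscedastic evaluation involves, the Python sketch below bins paired modeled and observed values by modeled concentration, takes the mean and variance of the observations in each bin, and interpolates those statistics at any modeled value. The function names, the equal-count binning, the number of bins and the linear interpolation are assumptions of this sketch, not details confirmed by the text, and the data are synthetic.

```python
import numpy as np

def camp_curves(modeled, observed, n_bins=10):
    """Bin paired values by modeled concentration and summarize each bin."""
    modeled = np.asarray(modeled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Equal-count bins over the sorted modeled values (an assumption of this sketch).
    groups = np.array_split(np.argsort(modeled), n_bins)
    centers = np.array([modeled[g].mean() for g in groups])  # mean modeled value in bin l
    lam1 = np.array([observed[g].mean() for g in groups])    # mean of observed values in bin l
    lam2 = np.array([observed[g].var() for g in groups])     # variance of observed values in bin l
    return centers, lam1, lam2

def evaluate(x, centers, lam1, lam2):
    """Linearly interpolate the binned mean and variance at modeled value(s) x."""
    return np.interp(x, centers, lam1), np.interp(x, centers, lam2)

# Synthetic pairs: biased observations whose noise grows with the modeled value.
rng = np.random.default_rng(0)
modeled = rng.uniform(0.0, 40.0, 2000)
observed = 2.0 + 0.8 * modeled + rng.normal(0.0, 1.0 + 0.1 * modeled)

centers, lam1, lam2 = camp_curves(modeled, observed)
mean_at_20, var_at_20 = evaluate(20.0, centers, lam1, lam2)
print(f"modeled 20.0 -> expected observed {mean_at_20:.2f}, variance {var_at_20:.2f}")
```

Because the statistics are computed per bin, neither the mean nor the variance is forced to follow a straight line or stay constant across the range of modeled values.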

The CAMP method does not investigate how λ_{1}
(_{k}_{2} (_{k}

The Regionalized Air quality Model Performance (RAMP) method introduced
here consists of extending the CAMP method (_{1} (_{k}_{k}_{2}
(_{k}_{k}_{k}_{k}
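The regionalization idea can be sketched in the same spirit: for one grid cell, keep only the paired values closest to it in space and time, then summarize observations by bins of modeled value within that local set. Everything specific in the Python sketch below is hypothetical, including the function name `ramp_at_cell`, the additive space/time distance with its `km_per_day` scaling, the number of pairs and bins, and the synthetic data in which the sign of the bias differs between the western and eastern halves of the domain.

```python
import numpy as np

def ramp_at_cell(cell_xy, cell_day, cell_modeled, site_xy, day, modeled, observed,
                 n_pairs=150, n_bins=5, km_per_day=50.0):
    """Estimate the local mean and variance of observations at one grid cell."""
    site_xy = np.asarray(site_xy, dtype=float)
    day = np.asarray(day, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Hypothetical space/time distance: kilometres plus days converted to kilometres.
    dist = np.hypot(site_xy[:, 0] - cell_xy[0], site_xy[:, 1] - cell_xy[1])
    dist = dist + km_per_day * np.abs(day - cell_day)
    region = np.argsort(dist)[:n_pairs]           # the n_pairs closest paired values
    m, o = modeled[region], observed[region]
    groups = np.array_split(np.argsort(m), n_bins)
    centers = np.array([m[g].mean() for g in groups])
    bin_mean = np.array([o[g].mean() for g in groups])
    bin_var = np.array([o[g].var() for g in groups])
    # Interpolate the binned statistics at the cell's own modeled value.
    return (float(np.interp(cell_modeled, centers, bin_mean)),
            float(np.interp(cell_modeled, centers, bin_var)))

# Synthetic domain: the model is too low in the west and too high in the east.
rng = np.random.default_rng(1)
n = 20000
site_xy = rng.uniform(0.0, 1000.0, (n, 2))
day = rng.integers(0, 30, n)
modeled = rng.uniform(5.0, 30.0, n)
bias = np.where(site_xy[:, 0] < 500.0, 5.0, -5.0)
observed = modeled + bias + rng.normal(0.0, 2.0, n)

west = ramp_at_cell((250.0, 500.0), 15, 15.0, site_xy, day, modeled, observed)
east = ramp_at_cell((750.0, 500.0), 15, 15.0, site_xy, day, modeled, observed)
print(f"west cell: mean {west[0]:.2f}, variance {west[1]:.2f}")
print(f"east cell: mean {east[0]:.2f}, variance {east[1]:.2f}")
```

A single domain-wide set of bins would average the two opposite biases away, whereas the local estimates recover a different correction for the same modeled value in each half of the domain.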

An efficient numerical implementation of the calculation of
λ_{1} (_{k}_{2}
(_{k}_{i}_{i}_{i}_{l}_{1,l}
(_{l}_{2,l}
(_{l}_{1}
(_{k}_{2}
(_{k}_{i}_{1}
(_{k}_{2}
(_{k}_{k}_{k}^{3},
λ_{1} (_{k}^{3} and

There is a correspondence between the parameters λ_{1}
(_{k}_{2}
(_{k}_{1}
(_{k}_{2}
(_{k}_{k}_{k}_{k}

We also define _{k}_{k}

The RAMP method provides the statistical distribution of observed air
pollution as

Validation is performed by comparing the accuracy of the model
correction performed by three approaches: the Constant, CAMP and RAMP correction
methods. The Constant correction method is defined through _{k}_{1} (_{k}_{2} (_{k}_{k}_{k}_{1} (_{1}
(_{2} (

We also conduct a stochastic simulation to test how well each method
reproduces the simulated values. The maps of λ_{1}
(_{2}
(_{1} (_{2} (_{1}*(_{2}*(_{1}*(_{2}*(_{1} (_{2} (_{1}*(_{2}*(_{1}
(_{2}
(

A demonstration of the RAMP method was performed using daily PM2.5
concentrations predicted by CMAQv4.5 at the 36 km grid level for 2001 across the
continental United States. CMAQv4.5 is the most recent version available for
2001 across the continental US. Although newer versions of CMAQ exist for later
years, it was critical to analyze model performance in 2001 due to an ongoing
epidemiological study focused on novel neurodegenerative PM2.5 health end points
and its association with loss of brain mass in older women (

Results of the RAMP analysis can be visualized for July 1, 2001 (^{2}(^{2}(^{2}(

The domain wide model performance of CMAQ is assessed by the performance
statistics. The ME is −1.05 µg/m^{3}, indicating
that CMAQv4.5 has systematic errors that underestimate PM2.5 by 1.05
µg/m^{3} across the
continental United States in 2001 on average. Interestingly, ^{3})^{2} and
a precision quantified by a correlation

The validation statistics of three model performance evaluation methods
(Constant, CAMP and RAMP) are shown in _{1} (_{2} (

Validation of λ_{1}: the ME changes from −1.05 µg/m^{3} for CMAQ to
0.0304 µg/m^{3}, 0.0281 µg/m^{3} and
−0.0202 µg/m^{3} for
the Constant, CAMP and RAMP methods, respectively. This was expected by design
due to each method eliminating systematic errors across 𝒟. The model
performance evaluation methods differ in their abilities to reduce random
errors, as demonstrated by the SE decreasing from 7.77 µg/m^{3} for CMAQ to
7.18 µg/m^{3}, 6.58 µg/m^{3} and 6.34 µg/m^{3} for the
Constant, CAMP and RAMP methods, respectively. This translates into a total error
that is lower for RAMP (MSE = 40.1 (µg/m^{3})^{2})
than for CAMP (MSE = 43.3 (µg/m^{3})^{2})
and the Constant method (MSE = 51.5 (µg/m^{3})^{2}).
This corresponds to a 22.1% reduction in MSE from the Constant to the
RAMP method. This finding is further confirmed by the correlation between
observed and λ_{1} (_{1} (
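The reported validation statistics can be cross-checked with a few lines of arithmetic. The Python sketch below takes the first three rows of the validation table and assumes they are, in order, the mean error, the standard deviation of the error and the mean squared error; under that reading the squared mean error plus the squared standard deviation should reproduce the mean squared error up to rounding. The dictionary layout and the helper `pct_reduction` are illustrative names only.

```python
# Values copied from the validation table, read as (ME, SE, MSE) per method.
stats = {
    "CMAQ":     (-1.05,   7.77, 61.5),
    "Constant": (0.0304,  7.18, 51.5),
    "CAMP":     (0.0281,  6.58, 43.3),
    "RAMP":     (-0.0202, 6.34, 40.1),
}

# Systematic part (ME^2) plus random part (SE^2) should match the total error.
gaps = {}
for name, (me, se, mse) in stats.items():
    rebuilt = me ** 2 + se ** 2
    gaps[name] = abs(rebuilt - mse)
    print(f"{name:8s} ME^2 + SE^2 = {rebuilt:6.2f}   reported MSE = {mse}")

def pct_reduction(before, after):
    """Percent decrease from `before` to `after`."""
    return 100.0 * (before - after) / before

ramp_vs_constant = pct_reduction(stats["Constant"][2], stats["RAMP"][2])
camp_vs_constant = pct_reduction(stats["Constant"][2], stats["CAMP"][2])
print(f"Constant -> RAMP: {ramp_vs_constant:.1f}% lower MSE")
print(f"Constant -> CAMP: {camp_vs_constant:.1f}% lower MSE")
```

The rebuilt totals agree with the reported values to within rounding of the three-significant-figure table entries, and in every column the squared mean error is a small fraction of the total.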

Validation of λ_{2}: the RAMP method estimates a random error of 5.45 µg/m^{3} across 𝒟
on average. The corresponding estimate for the Constant method is 8.20 µg/m^{3}, indicating that
the Constant method leads to a substantial overestimation of random errors by
50.5% over RAMP estimates. The overestimation of random error is
attenuated with the CAMP method, which has an estimate of 7.04 µg/m^{3},
corresponding to a 29.2% overestimation compared to the RAMP
estimates.
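The two percentages follow directly from the last row of the validation table, assuming that row lists the average random error estimates of the Constant, CAMP and RAMP methods (8.20, 7.04 and 5.45 µg/m^{3}, respectively):

```latex
\frac{8.20 - 5.45}{5.45} \approx 0.505 \quad (50.5\%),
\qquad
\frac{7.04 - 5.45}{5.45} \approx 0.292 \quad (29.2\%)
```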

Overall these validation results demonstrate that the RAMP method
provides a λ_{1} (_{2} (

The map of the true systematic error
_{1} (_{1}*(_{1} (_{1}*(

Similar results were found when comparing the true λ_{2}
(_{2}*(_{2}*(_{2}
(_{2}*(_{2}*(_{2} (_{2}*(

These results demonstrate that in situations where there is regional variability in model performance, the RAMP method is better able to estimate the spatial variability of systematic errors compared to the Constant and CAMP methods. This implies the RAMP method should be considered for performance evaluation in future studies when it is plausible for model performance to vary spatially.

This work contributes novel evidence that the performance of air quality
models is non-linear and non-homoscedastic. That is, λ_{1} and
λ_{2} are a non-linear function of the modeled value
_{k}_{1}
− _{k}_{2} do
not vary as a function of _{k}_{1} and λ_{2} are
non-linear functions of _{k}^{3})^{2} for
the Constant method to 43.3
(µg/m^{3})^{2} for
the CAMP method, corresponding to a 16% reduction in MSE that
demonstrates that model performance improves for a non-linear and
non-homoscedastic model. In the stochastic simulation results, the Constant
method is unable to capture the spatial variability in systematic and random
errors whereas the CAMP method is able to capture domain-wide variability of
these errors. Furthermore, both the validation and stochastic simulation results
indicate that the Constant method significantly overpredicts random errors
compared to the CAMP method. Finally, the non-homoscedastic behavior in model
performance is evidenced by maps of λ_{2}
(_{k}_{k}_{k}

From these results, one should be cautious when using linear and
homoscedastic model performance evaluation methods to explore the spatial
variability of model performance. This is the usual practice of current
approaches in which models can be expressed as
_{0}(_{1}(_{0}(_{1}(

To better understand the magnitude of the systematic errors
^{2}(_{1}
(

The RAMP analysis provides a map of
^{2}(^{2}(^{2}(^{3})^{2}).
The areas of high systematic error are quantified as follows: (1) the Great
Lakes (15,552 km^{2}), (2) the Appalachian Mountains
(116,640 km^{2}), (3) the South East (38,880
km^{2}), (4) Southern California (73,872
km^{2}), (5) Northern California (75,168
km^{2}) and (6) the Rocky Mountains (290,304
km^{2}).
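Because the analysis is carried out on a 36 km grid, each reported area can be read as a count of grid cells. The Python sketch below assumes the areas are in km^{2} and that every region is a union of whole 36 km by 36 km cells; the variable names are illustrative only.

```python
CELL_AREA_KM2 = 36 * 36  # area of one 36 km x 36 km CMAQ grid cell

# Region areas as reported in the text, assumed to be in square kilometres.
areas_km2 = {
    "Great Lakes": 15552,
    "Appalachian Mountains": 116640,
    "South East": 38880,
    "Southern California": 73872,
    "Northern California": 75168,
    "Rocky Mountains": 290304,
}

# Whole-cell count per region, plus any remainder that would contradict the assumption.
cells = {region: area // CELL_AREA_KM2 for region, area in areas_km2.items()}
remainders = {region: area % CELL_AREA_KM2 for region, area in areas_km2.items()}

for region, count in cells.items():
    print(f"{region:22s} {count:4d} grid cells (remainder {remainders[region]} km2)")
```

All six areas divide evenly by the cell area, which supports reading them as sums of individual grid cells.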

Some of the regions identified for their high systematic errors are
corroborated in the literature. The overprediction in region 1 (the Great
Lakes) is in line with an overestimation of residential wood burning in the
region reported in the National Emissions Inventory (NEI) (

The map of

This work introduces a spatiotemporal approach that can estimate and
distinguish systematic error from random error of predictions made by regulatory air
quality models at any location within a given modeling domain. The estimation of
systematic and random errors does not assume that the
relationship between observed and modeled values is linear or homoscedastic, and
it is performed in a regionalized manner. By estimating
errors across a continuous geographical domain for a given day of interest, this
approach permits the production of maps delineating areas of high errors. These maps
are useful to 1) assess model performance by quantifying systematic and random
errors at a fine spatial resolution across the entire space/time domain, including where
monitoring does not exist, and 2) correct systematic errors in the
CMAQv4.5 estimates of PM2.5 for 2001 for individual grid cells. Future work includes
performing a data fusion of RAMP model corrected values and observations using the
geostatistical Bayesian Maximum Entropy (BME) method of PM2.5 (

This research was supported in part by the National Institute on Aging (NIA) under award number R01AG033078, the National Institute of Occupational Safety and Health (NIOSH) under grant 2T42/OH-008673 and the National Institute of Environmental Health Sciences (NIEHS) under grant T32ES007018. CMAQ modeling was performed by the US EPA. This research has not been formally reviewed by the EPA. The views expressed in this document are solely those of the authors and do not necessarily reflect those of the NIA, NIOSH, NIEHS or the EPA.

RAMP analysis for an arbitrary CMAQ grid location on July 1, 2001 for
daily PM2.5. The black dots are all the paired modeled and observed daily PM2.5
concentrations within a space/time region
ℛ(_{1,l}
(_{l}_{2,l}
(_{l}_{1}
(

Maps of RAMP error and RAMP error correction of CMAQ. Daily PM2.5 across
the continental United States on July 1, 2001 displaying (a) RAMP
^{2}(^{3} and (a) and (b)
are in (µg/m^{3})^{2}.
Plot (b) shows 6 regions of large random error delineated by the dashed green
line, with the same regions delineated and labeled in (a). Delineated regions
include (1) the Great Lakes, (2) the Appalachian Mountains, (3) the South East,
(4) Southern California, (5) Northern California and (6) the Rocky
Mountains.

Map of RAMP mean error. Daily PM2.5 across the continental United States
on July 1, 2001 displaying the mean error in µg/m^{3}. The 6 regions of
high random error delineated in

Validation statistics. Statistics of the validation results of daily
paired observed PM2.5 and λ_{1}
(_{2}
(_{2} (

| Statistic | CMAQ | CMAQ corrected: Constant | CMAQ corrected: CAMP | CMAQ corrected: RAMP |
|---|---|---|---|---|
| ME (µg/m^{3}) | −1.05 | 0.0304 | 0.0281 | −0.0202 |
| SE (µg/m^{3}) | 7.77 | 7.18 | 6.58 | 6.34 |
| MSE (µg/m^{3})^{2} | 61.5 | 51.5 | 43.3 | 40.1 |
| Correlation | 0.589 | 0.625 | 0.631 | 0.698 |
| | -- | 0.766 | 0.823 | 1.05 |
| | -- | 0.875 | 0.907 | 1.03 |
| (µg/m^{3}) | -- | 8.20 | 7.04 | 5.45 |

Error correction performed for individual CMAQ grids

Maps created showing model performance at unmonitored locations

Most error coming from CMAQ is random error

There is a need to evaluate model performance in a regionalized manner