Estimate model uncertainty
Usage
bootstrap_error(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
# S3 method for class 'cfp_altres'
bootstrap_error(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
# S3 method for class 'cfp_dat'
bootstrap_error(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
# S3 method for class 'cfp_fgmod'
bootstrap_error(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
# S3 method for class 'cfp_pfmod'
bootstrap_error(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
make_bootstrap_model(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
# S3 method for class 'cfp_pfmod'
make_bootstrap_model(
x,
n_samples = 50,
sd_x_ppm = NULL,
n_replicates = NULL,
sample_from = "gasdata",
rep_cols = NULL
)
calculate_bootstrap_error(x, y)
# S3 method for class 'cfp_pfmod'
calculate_bootstrap_error(x, y)
Arguments
- x
A cfp_pfres model result from a call to pro_flux().
- n_samples
The number of samples to take in the bootstrapping.
- sd_x_ppm
An optional estimate of the standard deviation of x_ppm. Can be one of:
  - a single value applied equally to all observations,
  - a data.frame with a column of the same name that maps a value to every observation depth (see depth_structure() for an easy way to create it), or
  - a column of the same name already present in x$gasdata.
- n_replicates
The number of replicates to be generated if sd_x_ppm is set.
- sample_from
From which dataset to sample the bootstrapping dataset. Can be either 'gasdata', 'soilphys' or 'both'.
- rep_cols
The id_cols that represent repetitions. If removed, the repetitions in soilphys of each profile must match in their structure exactly.
- y
The result of the bootstrap model.
Value
x with added columns DELTA_flux and DELTA_prod as an estimate of the error of the corresponding columns, in the same units.
General procedure
bootstrap_error() is mostly a wrapper around two functions that can also be run separately.
In make_bootstrap_model(), for sample_from = "gasdata", the gasdata concentration data is resampled for every depth and profile a total of n_samples times. This is done by randomly sampling the observations at each depth with replacement, so that the number of observations stays the same. If rep_cols are given, these columns are removed from the id_cols and the resulting profiles are combined into one.
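The resampling step for gasdata can be illustrated with a minimal base R sketch. It does not use the package internals; the toy data, the prof_id column and the helper resample_group() are assumptions made for illustration only. The bootstrap_id column mirrors the name the package uses for the identifier of each bootstrap sample.

```r
# Toy gas data: 2 profiles x 2 depths x 3 observations (hypothetical values).
set.seed(42)
gasdata <- data.frame(
  prof_id = rep(1:2, each = 6),
  depth   = rep(rep(c(0, -10), each = 3), times = 2),
  x_ppm   = c(410, 415, 420, 900, 950, 1000,
              400, 405, 410, 800, 850, 900)
)

# Hypothetical helper: resample the rows of one profile/depth group with
# replacement, keeping the number of observations unchanged.
resample_group <- function(d) {
  d[sample.int(nrow(d), size = nrow(d), replace = TRUE), , drop = FALSE]
}

n_samples <- 5
groups <- split(gasdata, list(gasdata$prof_id, gasdata$depth), drop = TRUE)

# Repeat the resampling n_samples times and label each repetition.
boot <- do.call(rbind, lapply(seq_len(n_samples), function(i) {
  out <- do.call(rbind, lapply(groups, resample_group))
  out$bootstrap_id <- i
  out
}))
rownames(boot) <- NULL
```

Every value of bootstrap_id then holds a complete set of profiles with the same number of observations per depth as the original data.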
For sample_from = "soilphys", the soilphys data is combined using the rep_cols as repetitions. Within every remaining profile and depth, one observation across all repetitions is chosen for each of the n_samples samples. sample_from = "both" applies both methods above.
Each newly sampled profile is identifiable by the added bootstrap_id column, which is also added to id_cols.
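The soilphys sampling described above can be sketched in the same way. This is a simplified base R illustration under assumed column names (prof_id, rep_id as the single rep_cols column, upper as the depth key, DS as the sampled parameter), not the package implementation.

```r
set.seed(1)
# Toy soilphys data: one profile, 2 depth steps, 3 repetitions (hypothetical).
soilphys <- data.frame(
  prof_id = 1,
  rep_id  = rep(1:3, each = 2),        # plays the role of rep_cols
  upper   = rep(c(0, -10), times = 3),
  DS      = c(2.1e-6, 1.0e-6, 2.4e-6, 1.2e-6, 1.9e-6, 0.9e-6)
)

n_samples <- 4
# Group by the remaining id column and depth, ignoring rep_id.
groups <- split(soilphys, list(soilphys$prof_id, soilphys$upper), drop = TRUE)

boot <- do.call(rbind, lapply(seq_len(n_samples), function(i) {
  # Choose one observation across all repetitions for every profile and depth.
  out <- do.call(rbind, lapply(groups, function(d) {
    d[sample.int(nrow(d), size = 1), , drop = FALSE]
  }))
  out$rep_id <- NULL        # repetitions are merged into one profile
  out$bootstrap_id <- i     # identifies the bootstrap sample
  out
}))
rownames(boot) <- NULL
```

Each bootstrap sample contains exactly one row per profile and depth, drawn from any of the repetitions.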
After this new model has been run, the bootstrap error is calculated in calculate_bootstrap_error(). This is the standard deviation of the production and flux parameters across all bootstrapped model runs. It is calculated for each profile and layer of the original model, or for each distinct profile in the new model without rep_cols.
These are returned together with the mean values of prod, flux and F0 across all runs in the PROFLUX data.frame and can thereby be extracted by efflux() and production().
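As a rough illustration of the error calculation, the following base R sketch computes the mean and standard deviation of flux per profile and layer across bootstrap runs. The data and the step_id layer column are hypothetical, and the use of R's sd() (sample standard deviation) is an assumption about the estimator.

```r
# Hypothetical flux results of 4 bootstrap runs for 1 profile with 2 layers.
runs <- data.frame(
  prof_id      = 1,
  step_id      = rep(1:2, times = 4),
  bootstrap_id = rep(1:4, each = 2),
  flux         = c(1.0, 2.0, 1.2, 2.2, 0.8, 1.8, 1.0, 2.0)
)

# Mean and standard deviation per profile and layer across all runs.
flux_mean  <- aggregate(flux ~ prof_id + step_id, data = runs, FUN = mean)
DELTA_flux <- aggregate(flux ~ prof_id + step_id, data = runs, FUN = sd)
names(DELTA_flux)[names(DELTA_flux) == "flux"] <- "DELTA_flux"

# One row per profile and layer with the mean flux and its error estimate.
result <- merge(flux_mean, DELTA_flux, by = c("prof_id", "step_id"))
```

The same pattern would apply to prod, giving DELTA_prod.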
Artificial observations in gasdata
If there are not enough observations per depth, e.g. because there is only one measurement per depth, it is possible to create artificial observations by providing n_replicates and sd_x_ppm. Here, every depth of every profile is first averaged to its mean (redundant if there is only one observation). Then, a random dataset of n_replicates observations is generated that is normally distributed around the mean with a standard deviation (in ppm) of sd_x_ppm. These observations are then resampled as described above. Note that this error should be representative of the sampling error in the field and not the measurement error of the measurement device, which is much lower.
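The generation of artificial observations can be sketched in base R as follows. The toy data, the prof_id column and the chosen values of n_replicates and sd_x_ppm are assumptions for illustration; a single sd_x_ppm value is applied to all depths here.

```r
set.seed(7)
# Toy gas data with a single measurement per depth (hypothetical values).
gasdata <- data.frame(
  prof_id = 1,
  depth   = c(0, -10, -20),
  x_ppm   = c(420, 1500, 3000)
)
n_replicates <- 5
sd_x_ppm     <- 25   # assumed field sampling error in ppm

# Step 1: average every depth of every profile to its mean.
means <- aggregate(x_ppm ~ prof_id + depth, data = gasdata, FUN = mean)

# Step 2: draw n_replicates normally distributed values around each mean.
artificial <- do.call(rbind, lapply(seq_len(nrow(means)), function(i) {
  data.frame(
    prof_id = means$prof_id[i],
    depth   = means$depth[i],
    x_ppm   = rnorm(n_replicates, mean = means$x_ppm[i], sd = sd_x_ppm)
  )
}))
```

The resulting artificial observations could then be resampled per depth in the same way as real repeated measurements.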