Model Selection and Evaluation
Numerical and statistical evaluation
OFV
- Objective Function Value
- parameters are estimated by minimising -2LL (minus twice the log-likelihood)
- lower OFV indicates a better fit
- likelihood ratio test [📖]
- for nested models (the more complex model can be collapsed to the simpler one)
- not suitable for comparing non-nested models
- the difference in OFV (ΔOFV) between nested models approximately follows a \(χ^2\) distribution
- null hypothesis: no difference in fit between the models
- alternative hypothesis: the models differ in fit
- at a significance level (α) of 0.01, the required decrease in OFV is:
- ΔOFV of 6.63 (1 degree of freedom, i.e. 1 additional parameter)
- ΔOFV of 9.21 (2 degrees of freedom, i.e. 2 additional parameters)
- etc.
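For independent, normally distributed observations, -2LL can be written in the standard form below. This is given as an assumption about the expression that the symbols listed next refer to, not as a quotation from a specific software manual:

\[
-2LL = n \ln(2\pi) + \sum_{i=1}^{n} \left[ \ln\big(\text{var}(y_i)\big) + \frac{(y_i - \hat{y}_i)^2}{\text{var}(y_i)} \right]
\]

Some software reports the OFV without the constant term \(n \ln(2\pi)\); this does not change ΔOFV between models fitted to the same data. In this expression: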
\(n\) is the number of observations;
\(y_i\) is the observed value for the \(i^{th}\) observation;
\(\hat{y}_i\) is the predicted/expected value for the \(i^{th}\) observation;
\(\text{var}(y_i)\) is the variance of the \(i^{th}\) observation.
| Degrees of Freedom | α=0.05 | α=0.01 | α=0.001 |
|---|---|---|---|
| 1 | 3.841 | 6.635 | 10.828 |
| 2 | 5.991 | 9.210 | 13.816 |
| 3 | 7.815 | 11.345 | 16.266 |
| 4 | 9.488 | 13.277 | 18.467 |
| 5 | 11.070 | 15.086 | 20.515 |
| 6 | 12.592 | 16.812 | 22.458 |
| 7 | 14.067 | 18.475 | 24.322 |
| 8 | 15.507 | 20.090 | 26.125 |
| 9 | 16.919 | 21.666 | 27.877 |
| 10 | 18.307 | 23.209 | 29.588 |
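The critical values in the table can be reproduced, and a p-value attached to an observed ΔOFV, with a few lines of standard-library Python. This is a minimal sketch: the function names `chi2_sf` and `likelihood_ratio_test` and the OFV numbers in the usage note are illustrative assumptions, not part of any pharmacometric package.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting values for df = 1 (a = 0.5) and df = 2 (a = 1).
    if df % 2 == 1:
        prob, a = math.erfc(math.sqrt(z)), 0.5
    else:
        prob, a = math.exp(-z), 1.0
    # Recurrence for the regularised upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        prob += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return prob


def likelihood_ratio_test(ofv_reduced, ofv_full, extra_params, alpha=0.01):
    """Compare two nested models; returns (delta_ofv, p_value, significant)."""
    delta_ofv = ofv_reduced - ofv_full  # decrease in OFV gained by the extra parameters
    p_value = chi2_sf(delta_ofv, extra_params)
    return delta_ofv, p_value, p_value < alpha
```

For example, `likelihood_ratio_test(1250.0, 1242.0, extra_params=1)` gives a ΔOFV of 8.0, which exceeds the 6.635 threshold for one degree of freedom at α=0.01, so the larger model would be retained.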
AIC
- Akaike Information Criterion
- penalties are applied as a function of the number of parameters [📖] [📖]
- can also be used to compare non-nested models
- a lower AIC indicates a better fit
BIC
- Bayesian Information Criterion
- penalties are applied as a function of the number of parameters [📖] [📖]
- can also be used to compare non-nested models
- a lower BIC indicates a better fit
\(\text{OFV} = -2 \times \ln(L)\) - Objective Function Value, where \(L\) is the likelihood of the model;
\(p_{\text{random}}\) - number of random effects parameters;
\(p_{\text{fixed}}\) - number of fixed effects parameters;
\(n_{\text{id}}\) - number of unique individuals (clusters);
\(n_{\text{obs}}\) - number of observations.
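As a hedged illustration of how these quantities combine, the sketch below uses the textbook definitions AIC = OFV + 2p and BIC = OFV + p·ln(n), with p taken as \(p_{\text{fixed}} + p_{\text{random}}\). Which sample size to use in BIC for mixed-effects models (\(n_{\text{obs}}\), \(n_{\text{id}}\), or a mixture of both) differs between software packages, so `n` is left as an explicit argument; the function names are assumptions for this example.

```python
import math


def aic(ofv: float, p_fixed: int, p_random: int) -> float:
    """Akaike Information Criterion: OFV plus 2 per estimated parameter."""
    return ofv + 2 * (p_fixed + p_random)


def bic(ofv: float, p_fixed: int, p_random: int, n: int) -> float:
    """Bayesian Information Criterion: OFV plus ln(n) per estimated parameter.

    n is the sample size used for the penalty (assumption: the caller decides
    whether this is the number of observations or the number of individuals).
    """
    return ofv + (p_fixed + p_random) * math.log(n)


# Hypothetical comparison of two non-nested models fitted to the same data.
model_a = {"ofv": 1242.0, "p_fixed": 3, "p_random": 2}
model_b = {"ofv": 1238.5, "p_fixed": 5, "p_random": 3}
n_obs = 400
for name, m in (("A", model_a), ("B", model_b)):
    print(name, aic(**m), round(bic(**m, n=n_obs), 2))
```

With these made-up numbers model B has the lower OFV but the higher AIC and BIC, because the penalty for its three extra parameters outweighs the improvement in fit.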
Graphical evaluation
- models can be evaluated by visual examination of data plots [📖]
GOF
- Standard Goodness of Fit
- comparison of the predicted versus the observed concentrations
- observations should be scattered evenly around the line of identity
- CWRES (conditional weighted residuals)
- CWRES vs. population predictions
- identification of concentration-dependent trends
- to assess the appropriateness of the residual unexplained variability (RUV) model
- CWRES vs. time
- identification of time-dependent trends
- shows whether model misspecification appears in the absorption or the elimination phase
VPC
- Visual Predictive Check
- model diagnostics [📖]
- constructing simulated data using the developed model and
- comparing it with the existing dataset
- simulation-based graphical evaluation
- to evaluate predictive performance of a model
- the ability of a model to reproduce the observed data
- procedure [📖]
- the percentiles of interest (commonly the 5th, 50th and 95th) are calculated for the simulated concentrations
- together with the confidence interval of each percentile
- and compared graphically with the same percentiles of the observed concentrations
- percentiles are derived for selected time ranges (bins)
- rather than at every time point, to ease the comparison
- percentiles of the simulated and observed data are compared graphically [📖]
- Categorical VPC is a useful tool to evaluate performance for categorical data [📖]
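The numerical part of the VPC procedure described above (binning, percentiles of the observed data, and confidence intervals of the same percentiles across simulated replicates) can be sketched as follows. This is a minimal standard-library illustration, not the algorithm of any particular VPC tool; the function names, the `(time, concentration)` record layout and the assumption that every simulated replicate covers every bin are all choices made for this example.

```python
from collections import defaultdict


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def bin_index(time, edges):
    """Index of the bin [edges[i], edges[i + 1]) containing time, or None."""
    for i in range(len(edges) - 1):
        if edges[i] <= time < edges[i + 1]:
            return i
    return None


def binned_percentiles(records, edges, qs=(5, 50, 95)):
    """records: iterable of (time, concentration). Returns {bin: {q: value}}."""
    by_bin = defaultdict(list)
    for time, conc in records:
        b = bin_index(time, edges)
        if b is not None:
            by_bin[b].append(conc)
    return {b: {q: percentile(v, q) for q in qs} for b, v in by_bin.items()}


def vpc_summary(observed, simulated_replicates, edges, qs=(5, 50, 95), ci=(2.5, 97.5)):
    """Observed percentiles per bin plus the CI of each percentile across replicates."""
    obs = binned_percentiles(observed, edges, qs)
    sims = [binned_percentiles(rep, edges, qs) for rep in simulated_replicates]
    summary = {}
    for b in obs:
        summary[b] = {}
        for q in qs:
            sim_values = [s[b][q] for s in sims if b in s]
            summary[b][q] = {
                "observed": obs[b][q],
                "sim_ci": (percentile(sim_values, ci[0]), percentile(sim_values, ci[1])),
            }
    return summary
```

In a plot, the `observed` values would be drawn as lines and each `sim_ci` as a shaded band per bin; an observed percentile falling outside its band points to a possible model deficiency in that time range.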
Evaluation of uncertainty in parameter estimates
- variance-covariance matrix
- generated in NONMEM
- standard errors of the parameter estimates
- the square root of the diagonal elements of the variance-covariance matrix
- %RSE (relative standard error)
- to evaluate parameter precision for fixed-effects parameters
- values <30% are considered acceptable [📖]
\(\%RSE = \frac{SE(θ)}{θ} \times 100\), where
\(θ\) is the final population parameter estimate;
\(SE(θ)\) is the standard error of the population parameter.
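- %RSE for random-effects parameters
- values up to 40-50% are considered acceptable [📖]
\(\%RSE = \frac{SE(\omega^2)}{\omega^2} \times 100\), where
\(\omega^2\) is the final variance estimate;
\(SE(\omega^2)\) is the standard error of the final variance.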
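The path from variance-covariance matrix to %RSE can be sketched in a few lines. The example assumes %RSE is the standard error divided by the absolute value of the estimate, times 100, and that the matrix rows and columns are in the same order as the list of estimates; the function names and numbers are hypothetical.

```python
import math


def standard_errors(cov_matrix):
    """Square roots of the diagonal elements of a variance-covariance matrix."""
    return [math.sqrt(cov_matrix[i][i]) for i in range(len(cov_matrix))]


def percent_rse(estimates, cov_matrix):
    """Relative standard error (%) for each parameter estimate."""
    ses = standard_errors(cov_matrix)
    return [100.0 * se / abs(est) for est, se in zip(estimates, ses)]


# Hypothetical estimates for clearance and volume with their covariance matrix.
estimates = [5.0, 50.0]
cov = [
    [0.25, 0.10],
    [0.10, 36.0],
]
print(percent_rse(estimates, cov))  # SEs are 0.5 and 6.0
```

Both made-up parameters here come out below 30%, so they would count as precisely estimated under the rule of thumb quoted above.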
Bootstrap method [📖]
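- new datasets are generated from the original dataset
- by sampling individuals with replacement
- the model is re-fitted to each dataset, generating new parameter estimates
- confidence intervals (e.g. 95% CI) are derived from the distribution of these estimates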
- \(>200\) datasets may be needed to generate the standard errors
- can be generated using PsN software
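The resampling step and the percentile confidence interval can be illustrated as below. This is a sketch only: it assumes the dataset is a list of dictionaries with an `ID` key, it does not re-fit any model (that step belongs to the estimation software), and the helper names are invented for the example.

```python
import random
from collections import defaultdict


def resample_individuals(records, rng):
    """Build one bootstrap dataset by sampling whole individuals with replacement."""
    by_id = defaultdict(list)
    for row in records:
        by_id[row["ID"]].append(row)
    ids = list(by_id)
    sampled = [rng.choice(ids) for _ in ids]  # same number of individuals as the original
    dataset = []
    for new_id, old_id in enumerate(sampled, start=1):
        for row in by_id[old_id]:
            # Renumber so an individual drawn twice is treated as two subjects.
            dataset.append({**row, "ID": new_id})
    return dataset


def percentile_ci(estimates, level=0.95):
    """Percentile confidence interval from a list of bootstrap estimates."""
    s = sorted(estimates)

    def pct(q):
        pos = (len(s) - 1) * q
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    tail = (1.0 - level) / 2.0
    return pct(tail), pct(1.0 - tail)
```

In use, `resample_individuals` would be called once per bootstrap replicate, the model fitted to each returned dataset, and the collected estimates of a parameter passed to `percentile_ci`.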
Log-Likelihood profiling
- to assess if the OFV from the final model refers to the global minimum
- surface of the likelihood between the full and reduced model
- the model is re-estimated with the respective parameter fixed to slightly different values (e.g. ±5% or ±20% of the estimate)
- until the selected significant difference in likelihood (e.g. ΔOFV: 3.84, df=1, α=0.05) is reached
- at that point the lower and upper bounds of the 95% confidence interval for the parameter have been reached [📖] [📖]
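The search for the confidence bounds can be sketched as a step-then-bisect loop. In the sketch, `ofv_at` is a hypothetical callable standing in for "re-estimate the model with the parameter fixed to this value and return the OFV"; a quadratic toy function replaces it here so the example runs on its own. The step size, tolerance and function name are assumptions, and the estimate is assumed to be non-zero.

```python
def profile_bound(ofv_at, estimate, ofv_min, direction,
                  step=0.05, delta=3.84, tol=1e-6, max_steps=200):
    """Parameter value at which OFV - ofv_min reaches delta.

    direction is +1 for the upper bound and -1 for the lower bound.
    """
    inside, outside = estimate, None
    # Move away from the estimate in relative steps until the threshold is crossed.
    for k in range(1, max_steps + 1):
        candidate = estimate * (1.0 + direction * step * k)
        if ofv_at(candidate) - ofv_min >= delta:
            outside = candidate
            break
        inside = candidate
    if outside is None:
        raise ValueError("delta OFV threshold not reached within max_steps")
    # Bisect between the last value below and the first value above the threshold.
    while abs(outside - inside) > tol * abs(estimate):
        mid = 0.5 * (inside + outside)
        if ofv_at(mid) - ofv_min >= delta:
            outside = mid
        else:
            inside = mid
    return 0.5 * (inside + outside)


# Toy likelihood profile: quadratic around an estimate of 10 with SE of 1.
def toy_ofv(value):
    return 100.0 + ((value - 10.0) / 1.0) ** 2


lower = profile_bound(toy_ofv, 10.0, 100.0, direction=-1)
upper = profile_bound(toy_ofv, 10.0, 100.0, direction=+1)
print(round(lower, 3), round(upper, 3))
```

For the quadratic toy profile the bounds sit symmetrically at the estimate ± √3.84; with a real model the profile may be asymmetric, which is the main reason to prefer profiling over standard-error-based intervals.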
Influential Individuals
- may have a large impact on
- model selection
- parameter estimates
- Comparison of individual OFV in the NONMEM output
- Case-deletion diagnostics
- new datasets are created, each with one individual removed, and the model is re-fitted to each
- an individual is considered influential if their removal changes the parameter estimates by more than ±20% [📖]
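The ±20% rule can be applied to case-deletion results with a short helper. This sketch assumes the re-fitted estimates are already available as one dictionary per removed individual; the function name and the numbers are hypothetical.

```python
def influential_individuals(full_estimates, case_deleted_estimates, threshold=0.20):
    """Flag individuals whose removal shifts any parameter by more than the threshold.

    full_estimates: {parameter: estimate from the full dataset}
    case_deleted_estimates: {removed ID: {parameter: estimate without that individual}}
    Returns {ID: {parameter: relative change}} for the flagged individuals only.
    """
    flagged = {}
    for subject, estimates in case_deleted_estimates.items():
        changes = {
            name: (estimates[name] - full) / full
            for name, full in full_estimates.items()
        }
        exceeded = {name: c for name, c in changes.items() if abs(c) > threshold}
        if exceeded:
            flagged[subject] = exceeded
    return flagged


# Hypothetical estimates: removing individual 2 raises CL by 30%.
full = {"CL": 5.0, "V": 50.0}
deleted = {
    1: {"CL": 5.1, "V": 49.0},
    2: {"CL": 6.5, "V": 51.0},
}
print(influential_individuals(full, deleted))
```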
Simulations
- Deterministic
- do not consider the random-effects parameters of the model
- generating the typical concentration-time profile for a given set of covariates
- useful to visualise and assess the impact of dose changes on, e.g., exposure
- Stochastic
- consider the random-effects parameters
- used when generating VPCs
- require appropriate precision of all parameters
- to guide dose selection
- to compare different dosing scenarios
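The difference between the two simulation types can be shown with a deliberately simple model. The sketch assumes a one-compartment model with an intravenous bolus dose, log-normally distributed inter-individual variability on CL and V, and a proportional residual error; none of these choices come from the text above, and all parameter values are made up.

```python
import math
import random


def concentration(dose, cl, v, t):
    """One-compartment IV bolus: C(t) = dose / V * exp(-CL / V * t)."""
    return dose / v * math.exp(-cl / v * t)


def deterministic_profile(dose, cl, v, times):
    """Typical profile: random effects are ignored."""
    return [concentration(dose, cl, v, t) for t in times]


def stochastic_profiles(dose, cl, v, omega2_cl, omega2_v, sigma_prop, times, n, seed=0):
    """n individual profiles including random effects and residual error."""
    rng = random.Random(seed)
    profiles = []
    for _ in range(n):
        # Individual parameters: typical value times exp(eta), eta ~ N(0, omega^2).
        cl_i = cl * math.exp(rng.gauss(0.0, math.sqrt(omega2_cl)))
        v_i = v * math.exp(rng.gauss(0.0, math.sqrt(omega2_v)))
        profiles.append([
            # Proportional residual error: C * (1 + eps), eps ~ N(0, sigma_prop^2).
            concentration(dose, cl_i, v_i, t) * (1.0 + rng.gauss(0.0, sigma_prop))
            for t in times
        ])
    return profiles


times = [0, 1, 2, 4, 8, 12]
typical = deterministic_profile(dose=100, cl=5, v=50, times=times)
population = stochastic_profiles(100, 5, 50, omega2_cl=0.09, omega2_v=0.04,
                                 sigma_prop=0.1, times=times, n=1000, seed=1)
```

Comparing `typical` across several doses answers "what happens to the typical patient", while the spread of `population` at each time point is what a VPC or a dose-selection exercise would summarise.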