Package: psfmi 1.4.0

psfmi: Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets

Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3 and D4 methods, and the median p-values method. This is also possible for mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors, as well as interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, reclassification, R-squared, scaled Brier score, the Hosmer and Lemeshow (H&L) test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.
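
The description names Rubin's Rules (RR) as one of the pooling methods. As a rough illustration of what such pooling does for a single regression coefficient, here is a minimal base-R sketch. The helper name `pool_rubin` and the input numbers are made up for illustration and are not part of psfmi; the package's own `pool_RR` export has its own interface, which is not reproduced here.

```r
# Hypothetical helper, not part of psfmi: pools one coefficient
# across m imputed datasets using Rubin's Rules.
pool_rubin <- function(est, se) {
  stopifnot(length(est) == length(se), length(est) > 1)
  m <- length(est)
  q_bar <- mean(est)            # pooled point estimate
  w <- mean(se^2)               # within-imputation variance
  b <- var(est)                 # between-imputation variance
  t_var <- w + (1 + 1 / m) * b  # total variance
  list(estimate = q_bar, se = sqrt(t_var), within = w, between = b)
}

# Illustrative (made-up) estimates and standard errors from 5 imputed datasets
est <- c(0.50, 0.55, 0.45, 0.52, 0.48)
se  <- c(0.10, 0.11, 0.10, 0.09, 0.10)
pooled <- pool_rubin(est, se)
print(pooled)
```

The pooled standard error is always at least as large as the average within-imputation standard error, because the between-imputation variance adds the uncertainty caused by the missing data.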

Authors: Martijn Heymans [cre, aut], Iris Eekhout [ctb]

psfmi_1.4.0.tar.gz
psfmi_1.4.0.zip (r-4.5), psfmi_1.4.0.zip (r-4.4)
psfmi_1.4.0.tgz (r-4.5-any), psfmi_1.4.0.tgz (r-4.4-any)
psfmi_1.4.0.tar.gz (r-4.5-noble), psfmi_1.4.0.tar.gz (r-4.4-noble)
psfmi_1.4.0.tgz (r-4.4-emscripten)
psfmi.pdf | psfmi.html
psfmi/json (API)
NEWS

# Install 'psfmi' in R:

install.packages('psfmi', repos = c('https://mwheymans.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/mwheymans/psfmi/issues

Pkgdown site: https://mwheymans.github.io

Datasets:

anderson - Data from a placebo-controlled RCT with leukemia patients
aortadis - Dataset of patients with an aortic dissection
bmd - Data of a non-experimental study in more than 300 elderly women
chlrform - Data about the concentration of β2-microglobulin in urine as an indicator of possible kidney damage
chol_long - Long dataset of persons from the Amsterdam Growth and Health Longitudinal Study
chol_wide - Wide dataset of persons from the Amsterdam Growth and Health Longitudinal Study
day2_dataset4_mi - Dataset of low back pain patients with missing values
hipstudy - Dataset of elderly patients with a hip fracture
hipstudy_external - External Dataset of elderly patients with a hip fracture
hoorn_basic - Dataset of the Hoorn Study
infarct - Data of a patient-control study regarding the relationship between MI and smoking
ipdna_md - Example dataset for the psfmi_mm function
lbp_orig - Example dataset for psfmi_perform function, method boot_MI
lbpmi_extval - Example dataset of Low Back Pain Patients for external validation
lbpmicox - Example dataset for psfmi_coxr function
lbpmilr - Example dataset for psfmi_lr function
lbpmilr_dev - Example dataset for mivalext_lr function
lungvolume - Data on the development of lung and heart volume of unborn babies
mammaca - Data of a study among women with breast cancer
men - Data of 613 patients with meningitis
sbp_age - Dataset with blood pressure measurements
sbp_qas - Dataset with blood pressure measurements
smoking - Survival data about smoking
weight - Dataset of persons from the Amsterdam Growth and Health Longitudinal Study

On CRAN:

cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors

7.17 score 10 stars 70 scripts 505 downloads 7 mentions 48 exports 145 dependencies

Last updated 2 years ago from:6afb51f1f1. Checks: 7 OK. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 08 2025
R-4.5-win	OK	Mar 08 2025
R-4.5-mac	OK	Mar 08 2025
R-4.5-linux	OK	Mar 08 2025
R-4.4-win	OK	Mar 08 2025
R-4.4-mac	OK	Mar 08 2025
R-4.4-linux	OK	Mar 08 2025

Exports: boot_MI bw_single clean_P coxph_bw coxph_fw cv_MI cv_MI_RR glm_bw glm_fw hoslem_test km_estimates km_fit mean_auc_log MI_boot MI_cv_naive miceImp mivalext_lr nri_cox nri_est pool_auc pool_compare_models pool_D2 pool_D4 pool_intadj pool_performance pool_performance_internal pool_reclassification pool_RR psfmi_coxr psfmi_coxr_bw psfmi_coxr_fw psfmi_lm psfmi_lm_bw psfmi_lm_fw psfmi_lr psfmi_lr_bw psfmi_lr_fw psfmi_mm psfmi_mm_multiparm psfmi_perform psfmi_stab psfmi_validate risk_coxph RR_diff_prop rsq_nagel rsq_surv scaled_brier stab_single

Dependencies: abind backports base64enc bit bit64 bitops boot broom bslib cachem car carData caTools checkmate cli clipr cluster codetools colorspace cowplot cpp11 crayon cvAUC data.table DBI Deriv digest doBy dplyr evaluate fansi farver fastmap fontawesome forcats foreach foreign Formula fs furrr future generics ggplot2 glmnet globals glue gplots gridExtra gtable gtools haven highr Hmisc hms htmlTable htmltools htmlwidgets isoband iterators jomo jquerylib jsonlite KernSmooth knitr labeling lattice lifecycle listenv lme4 magrittr MASS Matrix MatrixModels memoise mgcv mice microbenchmark mime minqa mitml mitools modelr multcomp munsell mvtnorm nlme nloptr nnet norm numDeriv ordinal pan parallelly pbkrtest pillar pkgconfig plyr polspline prettyunits pROC progress purrr quantreg R6 rappdirs rbibutils RColorBrewer Rcpp RcppEigen Rdpack readr reformulas rlang rmarkdown rms ROCR rpart rsample rstudioapi sandwich sass scales shape slider SparseM stringi stringr survival TH.data tibble tidyr tidyselect tinytex tzdb ucminf utf8 vctrs viridis viridisLite vroom warp withr xfun yaml zoo

Pool Model Performance

Martijn W Heymans

Rendered from Pool_Model_Performance.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2023-06-17
Started: 2021-09-23

Pooling and Selection of Cox Regression Models

Martijn W Heymans

Rendered from psfmi_CoxModels.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2023-06-15
Started: 2020-06-30

Pooling and Selection of Linear Regression Models

Martijn W Heymans

Rendered from psfmi_LinearModels.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2021-08-29
Started: 2021-08-29

Pooling and Selection of Logistic Regression Models

Martijn W Heymans

Rendered from psfmi_LogisticModels.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2021-08-29
Started: 2020-06-30

Pooling AUC values

Martijn W Heymans

Rendered from Pooling_AUC_values.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2023-06-17
Started: 2021-09-23

Working together: mice and psfmi

Martijn W Heymans

Rendered from psfmi_mice.Rmd using knitr::rmarkdown on Mar 08 2025.

Last update: 2021-08-29
Started: 2020-07-01

Help page	Topics
Data from a placebo-controlled RCT with leukemia patients	anderson
Dataset of patients with an aortic dissection	aortadis
Data of a non-experimental study in more than 300 elderly women	bmd
Predictor selection function for backward selection of Linear and Logistic regression models.	bw_single
Data about the concentration of β2-microglobulin in urine as an indicator of possible kidney damage	chlrform
Long dataset of persons from the Amsterdam Growth and Health Longitudinal Study (AGHLS)	chol_long
Wide dataset of persons from the Amsterdam Growth and Health Longitudinal Study (AGHLS)	chol_wide
Predictor selection function for backward selection of Cox regression models in single complete dataset.	coxph_bw
Predictor selection function for forward selection of Cox regression models in single complete dataset.	coxph_fw
Dataset of low back pain patients with missing values	day2_dataset4_mi
Function for backward selection of Linear and Logistic regression models.	glm_bw
Function for forward selection of Linear and Logistic regression models.	glm_fw
Dataset of elderly patients with a hip fracture	hipstudy
External Dataset of elderly patients with a hip fracture	hipstudy_external
Dataset of the Hoorn Study	hoorn_basic
Calculates the Hosmer and Lemeshow goodness of fit test.	hoslem_test
Data of a patient-control study regarding the relationship between MI and smoking	infarct
Example dataset for the psfmi_mm function	ipdna_md
Kaplan-Meier risk estimates for Net Reclassification Index analysis	km_estimates
Kaplan-Meier (KM) estimate at specific time point	km_fit
Example dataset for psfmi_perform function, method boot_MI	lbp_orig
Example dataset of Low Back Pain Patients for external validation	lbpmi_extval
Example dataset for psfmi_coxr function	lbpmicox
Example dataset for psfmi_lr function	lbpmilr
Example dataset for mivalext_lr function	lbpmilr_dev
Data on the development of lung and heart volume of unborn babies	lungvolume
Data of a study among women with breast cancer	mammaca
Data of 613 patients with meningitis	men
External Validation of logistic prediction models in multiply imputed datasets	mivalext_lr
Net Reclassification Index for Cox Regression Models	nri_cox
Calculation of Net Reclassification Index measures	nri_est
Calculates the pooled C-statistic (Area Under the ROC Curve) across Multiply Imputed datasets	pool_auc
Compare the fit and performance of prediction models across Multiply Imputed data	pool_compare_models
Combines the Chi Square statistics across Multiply Imputed datasets	pool_D2
Pools the Likelihood Ratio tests across Multiply Imputed datasets (method D4)	pool_D4
Provides pooled adjusted intercept after shrinkage of pooled coefficients in multiply imputed datasets	pool_intadj
Pooling performance measures across multiply imputed datasets	pool_performance
Function to pool NRI measures over Multiply Imputed datasets	pool_reclassification
Function to combine estimates by using Rubin's Rules	pool_RR
Pooling and Predictor selection function for backward or forward selection of Cox regression models across multiply imputed data.	psfmi_coxr
Pooling and Predictor selection function for backward or forward selection of Linear regression models across multiply imputed data.	psfmi_lm
Pooling and Predictor selection function for backward or forward selection of Logistic regression models across multiply imputed data.	psfmi_lr
Pooling and Predictor selection function for multilevel models in multiply imputed datasets	psfmi_mm
Multiparameter pooling methods called by psfmi_mm	psfmi_mm_multiparm
Internal validation and performance of logistic prediction models across Multiply Imputed datasets	psfmi_perform
Function to evaluate bootstrap predictor and model stability in multiply imputed datasets.	psfmi_stab
Internal validation and performance of logistic prediction models across Multiply Imputed datasets	psfmi_validate
Risk calculation at specific time point for Cox model	risk_coxph
Nagelkerke's R-square calculation for logistic regression / glm models	rsq_nagel
R-square calculation for Cox regression models	rsq_surv
Dataset with blood pressure measurements	sbp_age
Dataset with blood pressure measurements	sbp_qas
Calculates the scaled Brier score	scaled_brier
Survival data about smoking	smoking
Function to evaluate bootstrap predictor and model stability.	stab_single
Dataset of persons from the Amsterdam Growth and Health Longitudinal Study (AGHLS)	weight

