| Title: | Linear Regression and Logistic Regression with Missing Covariates |
|---|---|
| Description: | Estimate parameters of linear regression and logistic regression with missing covariates with missing data, perform model selection and prediction, using EM-type algorithms. Jiang W., Josse J., Lavielle M., TraumaBase Group (2020) <doi:10.1016/j.csda.2019.106907>. |
| Authors: | Wei Jiang [aut], Pavlo Mozharovskyi [ctb], Imke Mayer [ctb], Christophe Muller [ctb], Julie Josse [aut, cre] |
| Maintainer: | Julie Josse <[email protected]> |
| License: | GPL-3 |
| Version: | 1.1.0 |
| Built: | 2026-05-07 07:43:06 UTC |
| Source: | https://github.com/julierennes/misaem |
Given all the possible patterns of missingness.
combinations(p)combinations(p)
p |
Dimension of covariates. |
A matrix containing all the possible missing patterns. Each row indicates a pattern of missingness. "1" means "observed", 0 means "missing".
comb = combinations(5)comb = combinations(5)
Function for imputing single point for linear regression model
imputeEllP(point, Sigma.inv)imputeEllP(point, Sigma.inv)
point |
A single observation containing missing values. |
Sigma.inv |
Inverse of estimated |
Imputed observation.
This function initializes missing values in a data frame using mean imputation.
initialize_df(x, method = "mean", seed = NULL)initialize_df(x, method = "mean", seed = NULL)
x |
A data frame or matrix containing missing values to be initialized. If a matrix is provided, it will be converted to a data frame internally. |
method |
A character string specifying the initialization method. Currently, only "mean" is supported. |
seed |
An optional integer to set the random seed for reproducibility. Not used in the current implementation. |
Missing values are replaced with the column means.
A data frame or matrix with missing values initialized using the mean of each column.
Used in main function miss.saem. Calculate the observed log-likelihood for logistic regression model with missing data, using Monte Carlo version of Louis formula.
likelihood_saem( beta, mu, Sigma, Y, X.obs, rindic = as.matrix(is.na(X.obs)), whichcolXmissing = (1:ncol(rindic))[apply(rindic, 2, sum) > 0], mc.size = 2 )likelihood_saem( beta, mu, Sigma, Y, X.obs, rindic = as.matrix(is.na(X.obs)), whichcolXmissing = (1:ncol(rindic))[apply(rindic, 2, sum) > 0], mc.size = 2 )
beta |
Estimated parameter of logistic regression model. |
mu |
Estimated parameter |
Sigma |
Estimated parameter |
Y |
Response vector |
X.obs |
Design matrix with missingness |
rindic |
Missing pattern of X.obs. If a component in X.obs is missing, the corresponding position in rindic is 1; else 0. |
whichcolXmissing |
The column index in covariate containing at least one missing observation. |
mc.size |
Monte Carlo sampling size. |
Observed log-likelihood.
# Generate dataset N <- 50 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Observed log-likelihood ll_obs = likelihood_saem(beta.true,mu.star,Sigma.star,y,X.obs)# Generate dataset N <- 50 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Observed log-likelihood ll_obs = likelihood_saem(beta.true,mu.star,Sigma.star,y,X.obs)
Calculate the likelihood or log-likelihood for one observation of logistic regression model .
log_reg(y, x, beta, iflog = TRUE)log_reg(y, x, beta, iflog = TRUE)
y |
Response value (0 or 1). |
x |
Covariate vector of dimension |
beta |
Estimated parameter of logistic regression model. |
iflog |
If TRUE, log_reg calculate the log-likelihood; else likelihood. |
Likelihood or log-likelihood.
Used in main function miss.saem. Calculate the variance of estimated parameters for logistic regression model with missing data, using Monte Carlo version of Louis formula.
louis_lr_saem( beta, mu, Sigma, Y, X.obs, pos_var = 1:ncol(X.obs), rindic = as.matrix(is.na(X.obs)), whichcolXmissing = (1:ncol(rindic))[apply(rindic, 2, sum) > 0], mc.size = 2 )louis_lr_saem( beta, mu, Sigma, Y, X.obs, pos_var = 1:ncol(X.obs), rindic = as.matrix(is.na(X.obs)), whichcolXmissing = (1:ncol(rindic))[apply(rindic, 2, sum) > 0], mc.size = 2 )
beta |
Estimated parameter of logistic regression model. |
mu |
Estimated parameter |
Sigma |
Estimated parameter |
Y |
Response vector |
X.obs |
Design matrix with missingness |
pos_var |
Index of selected covariates. |
rindic |
Missing pattern of X.obs. If a component in X.obs is missing, the corresponding position in rindic is 1; else 0. |
whichcolXmissing |
The column index in covariate containing at least one missing observation. |
mc.size |
Monte Carlo sampling size. |
Variance of estimated .
# Generate dataset N <- 50 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Louis formula to obtain variance of estimates V_obs = louis_lr_saem(beta.true,mu.star,Sigma.star,y,X.obs)# Generate dataset N <- 50 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Louis formula to obtain variance of estimates V_obs = louis_lr_saem(beta.true,mu.star,Sigma.star,y,X.obs)
This function is used to perform statistical inference for logistic regression model with missing values, by algorithm SAEM.
miss.glm(formula, data, control = list(...), ...)miss.glm(formula, data, control = list(...), ...)
formula |
an object of class " |
data |
an optional data frame containing the variables in the model. If not found in |
control |
a list of parameters for controlling the fitting process. For |
... |
arguments to be used to form the default control argument if it is not supplied directly. |
An object of class "miss.glm": a list with following components:
coefficients |
Estimated |
ll |
Observed log-likelihood. |
var.covar |
Variance-covariance matrix for estimated parameters. |
s.err |
Standard error for estimated parameters. |
mu.X |
Estimated |
Sig.X |
Estimated |
call |
the matched call. |
formula |
the formula supplied. |
# Generate dataset N <- 100 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA df.obs = data.frame(y,X.obs) # SAEM miss.list = miss.glm(y~., data=df.obs, print_iter=FALSE,seed=100) print(miss.list) print(summary(miss.list)) summary(miss.list)$coef# Generate dataset N <- 100 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA df.obs = data.frame(y,X.obs) # SAEM miss.list = miss.glm(y~., data=df.obs, print_iter=FALSE,seed=100) print(miss.list) print(summary(miss.list)) summary(miss.list)$coef
Auxiliary function for miss.glm fitting. Typically only used internally by miss.glm.fit.
miss.glm.control( maxruns = 500, tol_em = 1e-07, nmcmc = 2, tau = 1, k1 = 50, subsets = NA, seed = NA, print_iter = TRUE, var_cal = TRUE, ll_obs_cal = TRUE, alpha = 0, lambda = 0, save_trace = FALSE, init_method = "mean", ignore_errors = FALSE )miss.glm.control( maxruns = 500, tol_em = 1e-07, nmcmc = 2, tau = 1, k1 = 50, subsets = NA, seed = NA, print_iter = TRUE, var_cal = TRUE, ll_obs_cal = TRUE, alpha = 0, lambda = 0, save_trace = FALSE, init_method = "mean", ignore_errors = FALSE )
maxruns |
maximum number of iterations. The default is maxruns = 500. |
tol_em |
the tolerance to stop SAEM. The default is tol_em = 1e-7. |
nmcmc |
the MCMC length. The default is nmcmc = 2. |
tau |
rate |
k1 |
number of first iterations |
subsets |
Index of selected covariates if any. The default is all the covariates. |
seed |
an integer as a seed set for the random generator. |
print_iter |
logical indicating if output should be produced for each iteration. |
var_cal |
logical indicating if the variance of estimated parameters should be calculated. |
ll_obs_cal |
logical indicating if the observed log-likelihood should be calculated. |
alpha |
elastic-net mixing parameter, with 0 <= alpha <= 1. The default is alpha = 0 (ridge). |
lambda |
non-negative regularization parameter. The default is lambda = 0 (no regularization). |
save_trace |
logical indicating if the trace of beta values over iterations should be saved. |
init_method |
method to initialize the missing data. "mean", its default value, is the only option currently implemented. |
ignore_errors |
logical indicating if errors during the fitting process should be ignored (TRUE, raising a warning instead) or stop the execution (FALSE). Default is FALSE. |
A list with components named as the arguments.
## For examples see example(miss.glm)## For examples see example(miss.glm)
This function is used inside miss.glm to fit logistic regression model with missing values, by algorithm SAEM.
miss.glm.fit(x, y, control = list())miss.glm.fit(x, y, control = list())
x |
design matrix with missingness |
y |
response vector |
control |
a list of parameters for controlling the fitting process. For |
a list with following components:
coefficients |
Estimated |
ll |
Observed log-likelihood. |
var.covar |
Variance-covariance matrix for estimated parameters. |
s.err |
Standard error for estimated parameters. |
mu.X |
Estimated |
Sig.X |
Estimated |
## For examples see example(miss.glm)## For examples see example(miss.glm)
Model selection for the logistic regression model with missing data.
miss.glm.model.select(Y, X, seed = NA)miss.glm.model.select(Y, X, seed = NA)
Y |
Binary response vector |
X |
Design matrix with missingness |
seed |
An integer as a seed set for the random generator. The default value is 200. |
An object of class "miss.glm".
# Generate dataset N <- 40 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) Y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X <- X.complete X[patterns] <- NA # model selection for SAEM miss.model = miss.glm.model.select(Y,X,seed=100) print(miss.model)# Generate dataset N <- 40 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) Y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X <- X.complete X[patterns] <- NA # model selection for SAEM miss.model = miss.glm.model.select(Y,X,seed=100) print(miss.model)
This function is used to perform statistical inference for linear regression model with missing values, by algorithm EM.
miss.lm(formula, data, control = list(...), ...)miss.lm(formula, data, control = list(...), ...)
formula |
an object of class " |
data |
an optional data frame containing the variables in the model. If not found in |
control |
a list of parameters for controlling the fitting process. For |
... |
arguments to be used to form the default control argument if it is not supplied directly. |
An object of class "miss.lm": a list with following components:
coefficients |
Estimated |
ll |
Observed log-likelihood. |
s.resid |
Estimated standard error for residuals. |
s.err |
Standard error for estimated parameters. |
mu.X |
Estimated |
Sig.X |
Estimated |
call |
the matched call. |
formula |
the formula supplied. |
# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 p <- 2 X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 3, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Estimate regression using EM df.obs = data.frame(y,X.obs) miss.list = miss.lm(y~., data=df.obs) print(miss.list) print(summary(miss.list)) summary(miss.list)$coef# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 p <- 2 X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 3, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Estimate regression using EM df.obs = data.frame(y,X.obs) miss.list = miss.lm(y~., data=df.obs) print(miss.list) print(summary(miss.list)) summary(miss.list)$coef
Auxiliary function for miss.lm fitting. Typically only used internally by miss.lm.fit.
miss.lm.control(maxruns = 500, tol_em = 1e-07, print_iter = TRUE)miss.lm.control(maxruns = 500, tol_em = 1e-07, print_iter = TRUE)
maxruns |
maximum number of iterations. The default is maxruns = 500. |
tol_em |
the tolerance to stop EM. The default is tol_em = 1e-4. |
print_iter |
logical indicating if output should be produced for each iteration. |
A list with components named as the arguments.
## For examples see example(miss.lm)## For examples see example(miss.lm)
This function is used inside miss.lm to fit linear regression model with missing values, by EM algorithm.
miss.lm.fit(x, y, control = list())miss.lm.fit(x, y, control = list())
x |
design matrix with missingness |
y |
response vector |
control |
a list of parameters for controlling the fitting process. For |
a list with following components:
coefficients |
Estimated |
ll |
Observed log-likelihood. |
s.resid |
Estimated standard error for residuals. |
s.err |
Standard error for estimated parameters. |
mu.X |
Estimated |
Sig.X |
Estimated |
## For examples see example(miss.lm)## For examples see example(miss.lm)
Model selection for the linear regression model with missing data.
miss.lm.model.select(Y, X)miss.lm.model.select(Y, X)
Y |
Response vector |
X |
Design matrix with missingness |
An object of class "miss.lm".
# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 p <- 2 X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 0, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # model selection miss.model = miss.lm.model.select(y, X.obs) print(miss.model)# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 p <- 2 X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 0, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # model selection miss.model = miss.lm.model.select(y, X.obs) print(miss.model)
Prediction on test with missing values for the logistic regression model.
## S3 method for class 'miss.glm' predict(object, newdata = NULL, seed = NA, method = "map", mc.size = 100, ...)## S3 method for class 'miss.glm' predict(object, newdata = NULL, seed = NA, method = "map", mc.size = 100, ...)
object |
a fitted object of class inheriting from "miss.glm". |
newdata |
a data frame in which to look for variables with which to predict. It can contain missing values. |
seed |
An integer as a seed set for the random generator. |
method |
The name of method to deal with missing values in test set. It can be 'map'(maximum a posteriori) or 'impute' (imputation by conditional expectation). Default is 'map'. |
mc.size |
The number of Monte Carlo samples to use when method is 'map'. Default is 100. |
... |
Further arguments passed to or from other methods. |
pr.saem |
The prediction result for logistic regression: the probability of response y=1. |
# Generate dataset N <- 100 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA df.obs = data.frame(y,X.obs) # SAEM miss.list = miss.glm(y~., data=df.obs, print_iter=FALSE,seed=100) # Generate new dataset for prediction Nt <- 20 Xt <- matrix(rnorm(Nt*p), nrow=Nt)%*%chol(Sigma.star)+ matrix(rep(mu.star,Nt), nrow=Nt, byrow = TRUE) # Generate missingness in new dataset patterns <- runif(Nt*p)<p.miss Xt.obs <- Xt Xt.obs[patterns] <- NA # Prediction with missing values miss.prob = predict(miss.list, data.frame(Xt.obs), method='map') print(miss.prob)# Generate dataset N <- 100 # number of subjects p <- 3 # number of explanatory variables mu.star <- rep(0,p) # mean of the explanatory variables Sigma.star <- diag(rep(1,p)) # covariance beta.star <- c(1, 1, 0) # coefficients beta0.star <- 0 # intercept beta.true = c(beta0.star,beta.star) X.complete <- matrix(rnorm(N*p), nrow=N)%*%chol(Sigma.star) + matrix(rep(mu.star,N), nrow=N, byrow = TRUE) p1 <- 1/(1+exp(-X.complete%*%beta.star-beta0.star)) y <- as.numeric(runif(N)<p1) # Generate missingness p.miss <- 0.10 patterns <- runif(N*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA df.obs = data.frame(y,X.obs) # SAEM miss.list = miss.glm(y~., data=df.obs, print_iter=FALSE,seed=100) # Generate new dataset for prediction Nt <- 20 Xt <- matrix(rnorm(Nt*p), nrow=Nt)%*%chol(Sigma.star)+ matrix(rep(mu.star,Nt), nrow=Nt, byrow = TRUE) # Generate missingness in new dataset patterns <- runif(Nt*p)<p.miss Xt.obs <- Xt Xt.obs[patterns] <- NA # Prediction with missing values miss.prob = predict(miss.list, data.frame(Xt.obs), method='map') print(miss.prob)
Prediction on test with missing values for the linear regression model.
## S3 method for class 'miss.lm' predict(object, newdata = NULL, seed = NA, ...)## S3 method for class 'miss.lm' predict(object, newdata = NULL, seed = NA, ...)
object |
a fitted object of class inheriting from "miss.lm". |
newdata |
a data frame in which to look for variables with which to predict. It can contain missing values. |
seed |
An integer as a seed set for the random generator. |
... |
Further arguments passed to or from other methods. |
pr.y |
The prediction result for linear regression. |
# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 # train set size p <- 2 # number of covariates X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 3, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Estimate regression using EM df.obs = data.frame(y ,X.obs) miss.list = miss.lm(y~., data=df.obs) # Generate new dataset for prediction nt <- 20 Xt <- matrix(rnorm(nt*p), nrow=nt)%*%chol(Sigma.X)+ matrix(rep(mu.X,nt), nrow=nt, byrow = TRUE) # Generate missingness in new dataset patterns <- runif(nt*p)<p.miss Xt.obs <- Xt Xt.obs[patterns] <- NA # Prediction with missing values miss.pred = predict(miss.list, data.frame(Xt.obs)) print(miss.pred)# Generate complete data set.seed(1) mu.X <- c(1, 1) Sigma.X <- matrix(c(1, 1, 1, 4), nrow = 2) n <- 50 # train set size p <- 2 # number of covariates X.complete <- matrix(rnorm(n*p), nrow=n)%*%chol(Sigma.X) + matrix(rep(mu.X,n), nrow=n, byrow = TRUE) b <- c(2, 3, -1) sigma.eps <- 0.25 y <- cbind(rep(1, n), X.complete) %*% b + rnorm(n, 0, sigma.eps) # Add missing values p.miss <- 0.10 patterns <- runif(n*p)<p.miss #missing completely at random X.obs <- X.complete X.obs[patterns] <- NA # Estimate regression using EM df.obs = data.frame(y ,X.obs) miss.list = miss.lm(y~., data=df.obs) # Generate new dataset for prediction nt <- 20 Xt <- matrix(rnorm(nt*p), nrow=nt)%*%chol(Sigma.X)+ matrix(rep(mu.X,nt), nrow=nt, byrow = TRUE) # Generate missingness in new dataset patterns <- runif(nt*p)<p.miss Xt.obs <- Xt Xt.obs[patterns] <- NA # Prediction with missing values miss.pred = predict(miss.list, data.frame(Xt.obs)) print(miss.pred)
Print results for class miss.glm.
## S3 method for class 'miss.glm' print(x, digits = max(3L, getOption("digits") - 3L), ...)## S3 method for class 'miss.glm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
an object of class " |
digits |
minimal number of significant digits. |
... |
further arguments passed to or from other methods. |
No return value, called for coefficient and standard error estimates print.
## For examples see example(miss.glm)## For examples see example(miss.glm)
Print results for class miss.lm.
## S3 method for class 'miss.lm' print(x, digits = max(3L, getOption("digits") - 3L), ...)## S3 method for class 'miss.lm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
an object of class " |
digits |
minimal number of significant digits. |
... |
further arguments passed to or from other methods. |
No return value, called for coefficient and standard error estimates print.
## For examples see example(miss.lm)## For examples see example(miss.lm)
Print results for class summary.miss.glm.
## S3 method for class 'summary.miss.glm' print(x, digits = max(3L, getOption("digits") - 3L), ...)## S3 method for class 'summary.miss.glm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
an object of class " |
digits |
minimal number of significant digits. |
... |
further arguments passed to or from other methods. |
No return value, called for summary print.
## For examples see example(miss.glm)## For examples see example(miss.glm)
Print results for class summary.miss.lm.
## S3 method for class 'summary.miss.lm' print(x, digits = max(3L, getOption("digits") - 3L), ...)## S3 method for class 'summary.miss.lm' print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
an object of class " |
digits |
minimal number of significant digits. |
... |
further arguments passed to or from other methods. |
No return value, called for summary print.
## For examples see example(miss.lm)## For examples see example(miss.lm)
Summary for class miss.glm.
## S3 method for class 'miss.glm' summary(object, ...)## S3 method for class 'miss.glm' summary(object, ...)
object |
an object of class " |
... |
Further arguments passed to or from other methods. |
An object of class "summary.miss.glm", a list with following components:
coefficients |
The matrix of coefficients and standard errors |
loglikelihood |
Observed log-likelihood. |
call |
the component from |
formula |
the component from |
## For examples see example(miss.glm)## For examples see example(miss.glm)
Summary for class miss.lm.
## S3 method for class 'miss.lm' summary(object, ...)## S3 method for class 'miss.lm' summary(object, ...)
object |
an object of class " |
... |
Further arguments passed to or from other methods. |
An object of class "summary.miss.lm", a list with following components:
coefficients |
The matrix of coefficients and standard errors. |
loglikelihood |
Observed log-likelihood. |
call |
the component from |
formula |
the component from |
## For examples see example(miss.lm)## For examples see example(miss.lm)