Parallel AUC and partial AUC calculation with optimized memory usage

Computes bootstrap estimates of partial and complete AUC using parallel processing and optimized binning.

Usage

auc_parallel(
  test_prediction,
  prediction,
  threshold = 5,
  sample_percentage = 50,
  iterations = 500L,
  compute_full_auc = TRUE,
  n_bins = 500L
)

Arguments

test_prediction: Numeric vector of test prediction values
prediction: Numeric vector of model predictions (background suitability data)
threshold: Percentage threshold for partial AUC calculation (default = 5)
sample_percentage: Percentage of test data to sample in each iteration (default = 50)
iterations: Number of bootstrap iterations (default = 500)
compute_full_auc: Logical indicating whether to compute the complete AUC (default = TRUE)
n_bins: Number of bins for discretization (default = 500)

Value

A numeric matrix with `iterations` rows and 4 columns containing:

auc_complete: Complete AUC (NA when compute_full_auc = FALSE)
auc_pmodel: Partial AUC for the model (sensitivity > 1 - threshold/100)
auc_prand: Partial AUC for random model (reference)
ratio: Ratio of model AUC to random AUC (model/reference)
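
The example output further down prints the matrix without column names, so it can help to label the columns in the order listed above before summarizing. The sketch below uses the five rows printed in the Examples section as stand-in data. The object name `res` and the two summary statistics are illustrative assumptions, not part of the package.

```r
# Stand-in for the matrix returned by auc_parallel(): the five rows printed
# in the Examples section, in the documented column order.
res <- matrix(
  c(0.485038, 0.035314, 0.0353520, 0.9989251,
    0.506256, 0.039196, 0.0392000, 0.9998980,
    0.515182, 0.069904, 0.0694080, 1.0071462,
    0.520932, 0.051704, 0.0515955, 1.0021029,
    0.496894, 0.035196, 0.0353520, 0.9955872),
  ncol = 4, byrow = TRUE
)
colnames(res) <- c("auc_complete", "auc_pmodel", "auc_prand", "ratio")

# Average ratio across iterations, and the share of iterations with ratio > 1
mean_ratio     <- mean(res[, "ratio"])
prop_above_one <- mean(res[, "ratio"] > 1)
```

With named columns, `res[, "ratio"]` is easier to read than positional indexing.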

Details

This function implements a highly optimized AUC calculation pipeline:

1. Cleans input data (removes non-finite values)
2. Combines background and test predictions
3. Performs range-based binning (discretization)
4. Computes cumulative distribution of background predictions
5. Runs bootstrap iterations in parallel:
   - Samples test predictions
   - Computes sensitivity-specificity curves
   - Calculates partial and complete AUC
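
The pipeline steps can be illustrated with a minimal, single-threaded sketch in base R. This is not the package's compiled implementation. The variable names, the use of `findInterval()`, sampling with replacement, and the "at or above each bin" direction of the cumulative sums are all assumptions made for illustration.

```r
set.seed(1)
background <- c(runif(200), NA, Inf)  # hypothetical background predictions
test_vals  <- c(runif(50), NaN)       # hypothetical test predictions
n_bins <- 20L

# 1. Keep only finite values
background <- background[is.finite(background)]
test_vals  <- test_vals[is.finite(test_vals)]

# 2-3. Bin both vectors over their shared range
rng      <- range(c(background, test_vals))
breaks   <- seq(rng[1], rng[2], length.out = n_bins + 1L)
bg_bin   <- findInterval(background, breaks, rightmost.closed = TRUE)
test_bin <- findInterval(test_vals, breaks, rightmost.closed = TRUE)

# 4. Share of background values at or above each bin (assumed direction)
bg_counts   <- tabulate(bg_bin, nbins = n_bins)
bg_fraction <- rev(cumsum(rev(bg_counts))) / length(background)

# 5. One bootstrap draw: 50% of the test values, with replacement (assumed)
n_draw      <- ceiling(0.5 * length(test_vals))
idx         <- sample.int(length(test_vals), size = n_draw, replace = TRUE)
test_counts <- tabulate(test_bin[idx], nbins = n_bins)
sensitivity <- rev(cumsum(rev(test_counts))) / n_draw
```

Binning once up front means each bootstrap draw only has to count bin indices, which is presumably where much of the speed-up comes from.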

Key optimizations:

- OpenMP parallelization for binning and bootstrap
- Vectorized operations using Armadillo

Partial AUC

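The partial AUC focuses on the high-sensitivity region defined by: Sensitivity > 1 - (threshold/100). For example, with the default threshold = 5, only the part of the curve where sensitivity exceeds 0.95 contributes to the partial AUC.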
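
As a generic illustration of restricting an area calculation to this region, the sketch below applies the cutoff to a small hypothetical curve and integrates the retained points with the trapezoidal rule. The curve values, the choice of x-axis, and the integration rule are assumptions. The source does not state how the package integrates the curve internally.

```r
threshold <- 5
cutoff <- 1 - threshold / 100  # sensitivity cutoff for threshold = 5

# Hypothetical curve points: x = fraction of background, y = sensitivity
x <- c(0.50, 0.60, 0.70, 0.80, 0.90, 1.00)
y <- c(0.90, 0.94, 0.96, 0.98, 0.99, 1.00)

# Keep only the high-sensitivity part of the curve
keep <- y > cutoff
xk <- x[keep]
yk <- y[keep]

# Trapezoidal area under the retained segment
partial_area <- sum(diff(xk) * (yk[-length(yk)] + yk[-1]) / 2)
```

A larger `threshold` lowers the cutoff, so more of the curve is retained.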

Examples

# Basic usage with random data
set.seed(123)
bg_pred <- runif(1000)   # Background predictions
test_pred <- runif(500)  # Test predictions

# Compute AUC metrics with 100 iterations (reduced from the default 500 for
# this example); complete AUC is included because compute_full_auc defaults to TRUE
results <- auc_parallel(test_pred, bg_pred,
                        threshold = 5.0,
                        iterations = 100)

# View first 5 iterations
head(results, 5)
#>          [,1]     [,2]      [,3]      [,4]
#> [1,] 0.485038 0.035314 0.0353520 0.9989251
#> [2,] 0.506256 0.039196 0.0392000 0.9998980
#> [3,] 0.515182 0.069904 0.0694080 1.0071462
#> [4,] 0.520932 0.051704 0.0515955 1.0021029
#> [5,] 0.496894 0.035196 0.0353520 0.9955872

# Summarize results (complete AUC was computed above, so has_complete_auc = TRUE)
auc_summary <- summarize_auc_results(results, has_complete_auc = TRUE)

# Interpretation:
# - auc_pmodel: Model's partial AUC (higher is better)
# - auc_prand: Random model's partial AUC
# - ratio: Model AUC / Random AUC (>1 indicates better than random)

# Compute both partial and complete AUC (compute_full_auc set explicitly)
full_results <- auc_parallel(test_pred, bg_pred,
                             compute_full_auc = TRUE,
                             iterations = 100)