Adaptive Random Search for Transparent Classification: Bridging Minimalist Machine Learning and Clinical Applications in Medical Datasets
DOI:
https://doi.org/10.37256/cm.6520257403
Keywords:
Minimalist Machine Learning (MML), Adaptive Random Search (ARS), medical datasets, stratified holdout, 1-Nearest-Neighbor (1-NN), interpretability
Abstract
Medical datasets frequently exhibit a "horizontal" structure, in which the number of features greatly exceeds the number of samples (e.g., 1,071 genes vs. 28 patients in glioma classification). Traditional validation methods struggle with such imbalances, while complex machine learning models sacrifice interpretability for performance, producing "black boxes" that hinder clinical trust. To address this challenge, we propose Adaptive Random Search (ARS), a novel metaheuristic method that enhances stratified holdout validation by optimizing training subset selection. Developed at the Intelligent Computing Laboratory of the Computing and Systems Center, National Polytechnic Institute (CIC-IPN), the birthplace of Minimalist Machine Learning (MML), ARS combines transparency with efficiency. By iteratively searching for optimal 20% training subsets, ARS reduces computational costs while maximizing balanced accuracy. To validate the proposed method, we employed a 1-Nearest-Neighbor (1-NN) classifier, selected for its lazy learning nature and interpretability, on eight medical datasets, including glioma gene expression (Nutt), Parkinson's disease, gastrointestinal lesions, toxicity, Darwin, Gene Expression, and sEMG for Basic Hand Movements. The experimental results demonstrate that ARS attains state-of-the-art performance, achieving a balanced accuracy of 100% on the Nutt and Gene Expression datasets and outperforming Support Vector Machine (SVM), Random Forest, and neural networks. In clinical settings, ARS-based classifications of gliomas exhibited a stronger correlation with patient survival (p = 0.05) than histopathology did, underscoring their prognostic significance. By converting high-dimensional data into two-dimensional decision boundaries through statistical metrics (mean, standard deviation), ARS aligns with MML principles, providing a computationally efficient and interpretable solution for medical diagnostics.
This work demonstrates a synthesis of minimalist design with clinical utility, thereby substantiating the notion that simplicity need not compromise efficacy in high-stakes healthcare applications.
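The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' ARS: the abstract does not specify the adaptive rule, so the code below uses a plain (non-adaptive) random search over stratified holdout splits. All function names, the default of 200 iterations, the early stop at a perfect score, and the choice to describe each sample by the mean and standard deviation of its own feature values are assumptions made for illustration only.

```python
import random
import statistics
from collections import defaultdict


def to_mean_std(sample):
    """Map one high-dimensional sample to a 2-D point (mean, std). Assumed mapping."""
    return (statistics.fmean(sample), statistics.pstdev(sample))


def stratified_split(labels, train_fraction, rng):
    """Return (train_idx, test_idx), drawing train_fraction of each class for training."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train = []
    for idx in by_class.values():
        k = max(1, round(train_fraction * len(idx)))  # at least one sample per class
        train.extend(rng.sample(idx, k))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return train, test


def predict_1nn(points, labels, train, query):
    """Label of the training point closest to `query` (squared Euclidean distance)."""
    nearest = min(train, key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)))
    return labels[nearest]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def random_search_holdout(samples, labels, train_fraction=0.2, n_iter=200, seed=0):
    """Keep the stratified training subset with the best held-out balanced accuracy."""
    rng = random.Random(seed)
    points = [to_mean_std(s) for s in samples]
    best_score, best_train = -1.0, None
    for _ in range(n_iter):
        train, test = stratified_split(labels, train_fraction, rng)
        preds = [predict_1nn(points, labels, train, points[i]) for i in test]
        score = balanced_accuracy([labels[i] for i in test], preds)
        if score > best_score:
            best_score, best_train = score, train
        if best_score == 1.0:  # nothing left to improve
            break
    return best_score, best_train
```

Calling `random_search_holdout(samples, labels)` with `samples` as a list of feature vectors and `labels` as the matching class labels returns the best balanced accuracy found and the indices of the training subset that produced it. A faithful implementation would replace the uniform resampling with whatever adaptive update the full paper defines.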
License
Copyright (c) 2025 Mailyn Moreno Espino, et al.

This work is licensed under a Creative Commons Attribution 4.0 International License.
