K-Means Optimization with Kernel Gaussian Radial Basis Function in C#

Authors

  • Harun Nasrullah, Faculty of Information Technology, Budi Luhur University, Jakarta, 12260, Indonesia
  • Arif Bramantoro, School of Computing and Informatics, Brunei University of Technology, Bandar Seri Begawan, BE1410, Brunei Darussalam, https://orcid.org/0000-0003-2772-9427
  • Ahmad A. Alzahrani, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

DOI:

https://doi.org/10.37256/cm.6520257401

Keywords:

elbow, K-Means, Kernel Gaussian, Sum of Squares Error (SSE)

Abstract

K-Means is an iterative clustering technique that uses centroids to organize data into distinct clusters, repeating until convergence. Despite its effectiveness, K-Means struggles with non-linearly separable data and is sensitive to the choice of initial centroids, because each data point is assigned to the cluster whose centroid is nearest. Its performance also depends on the predetermined number of clusters (K): a suboptimal K leads to inferior clustering outcomes. To address these limitations, this study proposes Kernel K-Means with the Gaussian Radial Basis Function kernel, which handles non-linearly separable datasets by working in a high-dimensional feature space. The Elbow method is used to determine the optimal K value, and the Sum of Squares Error (SSE) method helps identify the initial centroids. The clustering results are then evaluated with the Davies-Bouldin Index and the Silhouette coefficient. The implementation is written in C# to expand machine learning applications across different programming languages. Our findings indicate that Kernel K-Means consistently outperforms traditional K-Means, while the effectiveness of the Silhouette coefficient varies with dataset characteristics and size.
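
As an illustration of the idea described in the abstract, the following is a minimal sketch of Kernel K-Means with a Gaussian RBF kernel. It is not the authors' C# implementation: it is written in Python for brevity, and the function names, the `sigma` bandwidth parameter, and the toy data are assumptions made for this example. The sketch uses the standard kernel-trick identity for the squared feature-space distance between a point and a cluster mean, so centroids are never computed explicitly.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(points, sigma=1.0):
    """Pairwise kernel values K[i][j] = k(points[i], points[j])."""
    n = len(points)
    return [[rbf_kernel(points[i], points[j], sigma) for j in range(n)]
            for i in range(n)]

def feature_space_sq_dist(K, i, members):
    """Squared distance in feature space from point i to the mean of a cluster.

    Uses K_ii - (2/|C|) * sum_j K_ij + (1/|C|^2) * sum_{j,l} K_jl,
    where j and l range over the cluster members.
    """
    m = len(members)
    cross = sum(K[i][j] for j in members) / m
    within = sum(K[j][l] for j in members for l in members) / (m * m)
    return K[i][i] - 2.0 * cross + within

def kernel_kmeans(points, initial_labels, k, sigma=1.0, max_iter=100):
    """Reassign points to the nearest cluster mean in feature space until stable."""
    K = kernel_matrix(points, sigma)
    labels = list(initial_labels)
    for _ in range(max_iter):
        clusters = [[i for i, lab in enumerate(labels) if lab == c]
                    for c in range(k)]
        new_labels = []
        for i in range(len(points)):
            # Empty clusters are treated as infinitely far away.
            dists = [feature_space_sq_dist(K, i, members) if members else math.inf
                     for members in clusters]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:  # converged: no assignment changed
            break
        labels = new_labels
    return labels

# Toy data (hypothetical): two well-separated groups, one point mislabeled at start.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = kernel_kmeans(points, [0, 0, 1, 0], k=2, sigma=1.0)
print(labels)
```

Like plain K-Means, this procedure depends on its starting assignment: on the same toy data, the initial labels `[0, 1, 0, 1]` are already a fixed point, so the loop stops without separating the two groups. This is the sensitivity to initialization that the abstract describes.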

Published

2025-09-24