Abstract
Data preprocessing is a vital step in machine learning and data science, aimed at enhancing data quality and improving model performance. Existing preprocessing techniques address different challenges, but effectively mitigating noise in data stands out as a fundamental one. Each method offers unique advantages and limitations, so its applicability depends on the specific characteristics of the dataset at hand.
This paper explores a novel method, Sequential ‘n’ Distance Average (SnDA), introduced as a unified and lightweight approach to denoising data in preparation for more accurate analysis and modelling. The core SnDA method operates by sorting the original data, computing the average of successive differences, and reconstructing a smoothed sequence. This smooths the data, mitigating noise while preserving the underlying structure.
The key advantage of SnDA is that it enhances data quality without distorting the scale or distribution of the data, making it a robust and versatile preprocessing tool for various analytical and predictive tasks.
Because datasets differ in their characteristics and preprocessing requirements, two SnDA variants have been developed: “Modified SnDA” and “Adaptive SnDA Glide.” Each variant maintains the core principle of SnDA, computing the average of successive differences, but incorporates additional steps or modifications to better suit particular data scenarios. Hence SnDA, along with its variants, can be adapted to meet the unique challenges presented by different datasets.
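
The three core steps named above (sort, average successive differences, reconstruct) can be illustrated with a minimal sketch. The abstract does not specify how the averaging window is defined or how the sequence is rebuilt, so everything beyond those three step names is an assumption made here for illustration: the function name `snda_sketch`, the parameter `n` as a trailing window over the sorted differences, reconstruction by cumulative summation from the minimum, and the mapping back to the original order. It should not be read as the authors' implementation.

```python
from typing import List, Sequence


def snda_sketch(values: Sequence[float], n: int = 3) -> List[float]:
    """One possible reading of sort / average differences / reconstruct.

    Assumptions (not taken from the paper): `n` is a trailing window over
    the sorted successive differences, and the smoothed sequence is rebuilt
    by cumulative summation starting from the minimum value.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(values) < 2:
        return list(values)

    # Step 1: sort, remembering each value's original position.
    order = sorted(range(len(values)), key=lambda i: values[i])
    sorted_vals = [values[i] for i in order]

    # Step 2: successive differences of the sorted data, each replaced by
    # the mean of itself and up to n - 1 preceding differences.
    diffs = [b - a for a, b in zip(sorted_vals, sorted_vals[1:])]
    avg_diffs = []
    for k in range(len(diffs)):
        window = diffs[max(0, k - n + 1): k + 1]
        avg_diffs.append(sum(window) / len(window))

    # Step 3: reconstruct a smoothed sorted sequence from the minimum.
    smoothed_sorted = [sorted_vals[0]]
    for d in avg_diffs:
        smoothed_sorted.append(smoothed_sorted[-1] + d)

    # Return the smoothed values in the original order (an assumption).
    result = [0.0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = smoothed_sorted[rank]
    return result
```

Under these assumptions, `n = 1` returns the input unchanged, and larger `n` pulls isolated large gaps (such as an outlier far from its neighbours) closer to the rest of the data. This sketch makes no claim to reproduce the scale or distribution properties reported for SnDA or its variants.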


