Solubility-Driven Phenol Extraction from Olive Tree Derivatives in Ethanol/Methanol: Empirical, UNIFAC & ML Models

Authors

  • Mohamed Abdelkader Hafiene, Laboratory for Materials Applications in Environment, Water, and Energy (LAM3E), Faculty of Sciences, University of Gafsa, Gafsa, 6029, Tunisia
  • Hatem Ksibi, Laboratory for Materials Applications in Environment, Water, and Energy (LAM3E), Faculty of Sciences, University of Gafsa, Gafsa, 6029, Tunisia. ORCID: https://orcid.org/0000-0003-4144-9958

DOI:

https://doi.org/10.37256/fce.6220257412

Keywords:

olive-derived phenolics, solubility prediction, ethanol-methanol solvents, Apelblat equation, Universal Functional Activity Coefficient (UNIFAC) method, machine learning regression

Abstract

This study investigates the solubility behavior and predictive modeling of six key phenolic compounds derived from olive sources (hydroxytyrosol, luteolin, oleuropein, rutin, quercetin, and verbascoside) in methanol, ethanol, and their binary mixtures (10:90, 50:50, and 90:10 v/v) at temperatures ranging from 20 °C to 50 °C. Experimental solubility data were compiled from previously published literature. The compiled data showed a wide solubility range: oleuropein exhibited the highest solubility (~100 mg/100 g in methanol at 50 °C), followed by hydroxytyrosol (~78 mg/100 g), verbascoside (~45 mg/100 g), rutin (~35 mg/100 g), quercetin (~5 mg/100 g), and luteolin (~3 mg/100 g). Solubility generally increased with temperature and ethanol content, though compound-specific effects were observed. Empirical modeling using the Apelblat equation showed strong agreement with the experimental data (R² > 0.98; Mean Absolute Error (MAE) < 5%) across all compounds. Predictive models were also developed using both the Universal Functional Activity Coefficient (UNIFAC) thermodynamic method and Machine Learning (ML) algorithms, namely eXtreme Gradient Boosting (XGBoost) and Random Forest (RF). While UNIFAC captured general solubility trends (R² ≈ 0.75), it was limited by its group-contribution assumptions and lack of interaction-specific parameters. In contrast, the ML models achieved higher accuracy (R² > 0.95; Root Mean Square Error (RMSE) < 3.2 mg/100 g), particularly for highly soluble compounds such as oleuropein and hydroxytyrosol. Slightly lower accuracy (R² ≈ 0.93) was observed for quercetin and luteolin, owing to their lower solubility and narrower data range. Pearson correlation analysis highlighted solvent composition as the dominant factor influencing solubility, with coefficients exceeding 0.90 for most compounds.
Finally, the predictive insights were validated against experimental extraction efficiencies, confirming that solubility-optimized conditions (e.g., high methanol content at 50 °C) led to a 20–35% improvement in phenol recovery, demonstrating the practical relevance of this integrated analytical-modeling approach for the design of efficient extraction processes.
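
The empirical step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used modified Apelblat form, ln x = A + B/T + C ln T (T in kelvin); the abstract does not state the exact form or solubility units used in the fit, so this is an assumption. The function names (fit_apelblat, predict_apelblat, r_squared) are hypothetical, and the demonstration values are synthetic, generated from made-up coefficients rather than taken from the study. Because the equation is linear in A, B, and C, an ordinary least-squares solve is enough.

```python
import numpy as np


def fit_apelblat(T_kelvin, x):
    """Least-squares fit of ln x = A + B/T + C*ln(T); returns (A, B, C)."""
    T = np.asarray(T_kelvin, dtype=float)
    y = np.log(np.asarray(x, dtype=float))
    # The model is linear in (A, B, C), so build a design matrix [1, 1/T, ln T].
    design = np.column_stack([np.ones_like(T), 1.0 / T, np.log(T)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return tuple(coeffs)


def predict_apelblat(T_kelvin, A, B, C):
    """Evaluate the fitted Apelblat curve at the given temperatures (K)."""
    T = np.asarray(T_kelvin, dtype=float)
    return np.exp(A + B / T + C * np.log(T))


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic illustration only: 20, 30, 40, 50 degrees C expressed in kelvin,
# with "solubilities" generated from made-up coefficients (NOT study data).
T_demo = np.array([293.15, 303.15, 313.15, 323.15])
x_demo = predict_apelblat(T_demo, A=-20.0, B=-1500.0, C=4.0)

A, B, C = fit_apelblat(T_demo, x_demo)
x_fit = predict_apelblat(T_demo, A, B, C)
r2 = r_squared(x_demo, x_fit)
```

With real measurements, x_demo would be replaced by the experimental solubilities for one compound in one solvent composition, and r2 would be compared against the agreement threshold quoted above. Note that 1/T and ln T are nearly collinear over a 30 K window, so the individual coefficients can be poorly determined even when the fitted curve itself is accurate.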

Published

2025-09-22

How to Cite

Hafiene, M. A.; Ksibi, H. Solubility-Driven Phenol Extraction from Olive Tree Derivatives in Ethanol/Methanol: Empirical, UNIFAC & ML Models. Fine Chemical Engineering 2025, 6, 344-359.