### 1. Introduction

### 2. Materials and Methods

### 2.1. Dataset Compilation

### 2.2. Model Development and Evaluation

… (R$^{2}$), root mean square error (RMSE), and the hat matrix (leverage approach). The R$^{2}$ and RMSE values were calculated using Eq. (2) and Eq. (3):

##### (2)

$${\text{R}}^{2}=1-\frac{\sum_{\text{i}=1}^{\text{n}}{\left({\text{RE}}_{\text{i}}^{\text{predicted}}-{\text{RE}}_{\text{i}}^{\text{experimental}}\right)}^{2}}{\sum_{\text{i}=1}^{\text{n}}{\left({\text{RE}}_{\text{i}}^{\text{predicted}}-{\text{RE}}^{\text{average}}\right)}^{2}}$$

##### (3)

$$\text{RMSE}=\sqrt{\frac{\sum_{\text{i}=1}^{\text{n}}{\left({\text{RE}}_{\text{i}}^{\text{experimental}}-{\text{RE}}_{\text{i}}^{\text{predicted}}\right)}^{2}}{\text{n}}}$$

##### (4)

$$\text{h}=\frac{1}{\text{n}}+\frac{{\left({\text{X}}_{\text{i}}-\bar{\text{X}}\right)}^{2}}{\sum_{\text{i}=1}^{\text{n}}{\left({\text{X}}_{\text{i}}-\bar{\text{X}}\right)}^{2}}$$

where X$_{\text{i}}$ is the variable value of the *i*th object and $\bar{\text{X}}$ is the variable average. Outliers in the model predictions (Y-outliers, with standardized residuals greater than two standard deviation units) and influential or problematic observations (X-outliers, defined by the critical hat value) were visualized using the Williams plot (the standardized residuals of RE plotted against the leverage values). The critical leverage threshold (h*, the cut-off value) is commonly fixed at 3(p+1)/n, where n is the number of observations in the dataset and p is the number of variables in the model. Predictions for high-leverage compounds (h > h*) were considered extrapolations by the model.
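As a minimal sketch of Eqs. (2) to (4) and the h* cut-off described above, the Python functions below compute the evaluation metrics and flag X- and Y-outliers for a Williams plot. All function names are hypothetical and do not come from the original study. Three assumptions are made: R$^{2}$ is computed in its conventional form, with the total sum of squares taken about the mean experimental value; the leverage follows the single-variable form of Eq. (4), whereas a model with several variables would normally use the diagonal of the full hat matrix; and standardized residuals are taken as residuals divided by their sample standard deviation.

```python
import math
import statistics


def r_squared(experimental, predicted):
    """Conventional R^2 (assumed reading of Eq. (2))."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot


def rmse(experimental, predicted):
    """Root mean square error, Eq. (3)."""
    n = len(experimental)
    ss_res = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    return math.sqrt(ss_res / n)


def leverages(x):
    """Leverage of every object for a single variable, Eq. (4)."""
    n = len(x)
    mean_x = sum(x) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / ss_x for xi in x]


def critical_leverage(n_obs, n_vars):
    """Cut-off value h* = 3(p + 1) / n."""
    return 3.0 * (n_vars + 1) / n_obs


def standardized_residuals(experimental, predicted):
    """Residuals divided by their sample standard deviation (assumption)."""
    residuals = [e - p for e, p in zip(experimental, predicted)]
    sd = statistics.stdev(residuals)
    return [r / sd for r in residuals]


def williams_flags(x, experimental, predicted, n_vars=1, z_limit=2.0):
    """Return (h, z, is_x_outlier, is_y_outlier) for every object."""
    h_star = critical_leverage(len(x), n_vars)
    h = leverages(x)
    z = standardized_residuals(experimental, predicted)
    return [(hi, zi, hi > h_star, abs(zi) > z_limit) for hi, zi in zip(h, z)]
```

Plotting the first two elements of each tuple against each other (leverage on the horizontal axis, standardized residual on the vertical axis) gives the Williams plot; the two boolean flags mark the points that fall beyond h* or beyond the residual limit of two standard deviation units.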

### 3. Results and Discussion

### 3.1. Model Performance of MLR and SVR

##### (5)

$$\begin{array}{l}\text{RE (\%)}=102.71-4.261\times \text{SOM (\%)}+0.303\times \text{Temp. (}{}^{\circ}\text{C)}+0.478\times \text{Time (min)}\\ \qquad -51.89\times \text{Soil/Water ratio}-0.302\times \text{MW (g/mol)}\end{array}$$

… R$^{2}$ values of 0.99 and 0.93 for the training and testing datasets, respectively. After tuning, the optimum cost and epsilon values for the best performance were 8 and 0.01, respectively, as indicated by the darker blue region in Fig. 3(a). Based on the leverage approach, the AD assessment also revealed that the SVR model achieved a prediction coverage of more than 98% (Fig. 3(c)). The small number of outliers identified in the training and testing datasets indicated the accuracy of the proposed SVR model for predicting the PAH removal rate by the SCWE technique.
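To make Eq. (5) and the coverage figure concrete, the following Python sketch evaluates the MLR equation for a single set of conditions and computes prediction coverage from a list of leverage values. The function and parameter names are hypothetical, and coverage is assumed here to mean the share of observations whose leverage does not exceed the cut-off h*; the original study may define it differently. Because Eq. (5) is linear, it can return values below 0% or above 100% for conditions far from the training data, so the sketch does not clip the result.

```python
def predict_re_mlr(som_pct, temp_c, time_min, soil_water_ratio, mw_g_mol):
    """Removal efficiency RE (%) from the MLR coefficients of Eq. (5)."""
    return (
        102.71
        - 4.261 * som_pct            # soil organic matter, %
        + 0.303 * temp_c             # extraction temperature, deg C
        + 0.478 * time_min           # extraction time, min
        - 51.89 * soil_water_ratio   # soil/water ratio, dimensionless
        - 0.302 * mw_g_mol           # molecular weight of the PAH, g/mol
    )


def prediction_coverage(leverage_values, h_star):
    """Percentage of observations with leverage h <= h* (assumed definition)."""
    inside = sum(1 for h in leverage_values if h <= h_star)
    return 100.0 * inside / len(leverage_values)


if __name__ == "__main__":
    # Hypothetical input chosen only for illustration; not taken from the dataset.
    re_pct = predict_re_mlr(
        som_pct=1.0, temp_c=100.0, time_min=10.0,
        soil_water_ratio=0.1, mw_g_mol=100.0,
    )
    print(f"Predicted RE: {re_pct:.2f} %")
```

Keeping the coefficients in one function makes the sign of each effect easy to read: temperature and time raise the predicted removal efficiency, while soil organic matter, the soil/water ratio, and molecular weight lower it.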

### 3.2. Variable Importance Analysis

### 3.3. Partial Dependence Plot Based on SVR Model

### 4. Conclusions

R$^{2}$ and RMSE values demonstrated that the developed predictive model shows considerable promise and could support the process design and optimization of the SCWE method for the removal of PAHs from soils. Although the developed model is suitable for predicting the removal rate of PAHs by SCWE under the conditions covered in this study, larger datasets from more diverse experimental conditions are needed to improve its accuracy.