Qiao, Ma, Shi, Xi, Yang, Huang, and Wang: Which fossil energy source has the highest average contribution rate to US carbon emissions based on a machine learning algorithm?
Research
Environmental Engineering Research 2025; 30(4): 240339.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Coal, oil, and natural gas are the three main fossil energy sources that produce carbon emissions, yet which of them is the main contributor has rarely been studied. In this work, the average contribution rate of carbon emissions is predicted with an innovative two-stage model that combines the optimal layers of wavelet's orders with a long short-term memory network optimized by an improved sparrow search algorithm. The experimental results demonstrate that wavelet preprocessing achieves better prediction results than several other preprocessing methods, and that one-step prediction outperforms multi-step prediction. In addition, the six prediction error indicators used in this study are reasonable, and using the average prediction error evaluation indicator is more reasonable still. It can be concluded that the average contribution rates of carbon emissions, from high to low, are natural gas (46.62%), petroleum (34.90%), and coal (18.48%); therefore, in the near future, natural gas will be the main source of carbon emissions in the US.
Keywords: Average contribution rate, Carbon emission source prediction, Improved sparrow search algorithm, Long short-term memory, Wavelet transform
Introduction
1.1. Background and Rationale
It has become an international consensus that carbon emissions (CE) lead to global warming [1]. To this end, many countries have signed the Paris Agreement to work together to reduce CE [2]. Although various mitigation measures have been tried, some countries still emit large amounts of CE [3]. For example, in recent years the US and China have ranked as the top two emitters, with Japan, Russia, and India also in the top five [4]. Before 2005, the US ranked first in CE, and China has ranked first since then; however, per capita CE in the US is more than twice that of China, and in 2021 the US accounted for 23.03% of the world's total CE [4]. Therefore, the US remains a major CE country in the world.
The CE of the US mainly come from the consumption of fossil energy, such as natural gas, oil, and coal. To actively respond to the impact of CE on the global climate, the US has put forward a "zero CE action plan", which draws on and extends two previous United Nations-led Sustainable Development Solutions Network reports; the plan aims to achieve zero CE by 2050. However, for a long time to come, natural gas, oil, and coal will remain the main fossil fuels [5]. Each of these three energy sources contributes differently to CE, and this contribution rate is very important for the government when formulating corresponding policies [6]. Over the past few decades, the contributions of coal, oil, and natural gas to CE have changed gradually rather than remaining static; for example, the contribution of natural gas is steadily increasing. This fluctuation makes it challenging to accurately predict the contribution of different fossil energy sources to CE.
1.2. Related Works
In recent years, many studies have focused on forecasting CE [7]. From the perspective of models, they can be roughly divided into three categories [8]. The first type is the traditional model. Meng and Niu [9] argue that, except in a few countries, CE from fossil energy follow S-shaped curves; they select the logistic function to simulate the S-shaped curve and improve the goodness of fit by applying three algorithms. The results indicate that their model outperforms the linear model in terms of simulation risk and goodness of fit. Pérez-Suárez and López-Menéndez [10] employ the environmental logistic curve and an extended environmental Kuznets curve to explain CE in different countries, and their results allow them to obtain some ex-ante projections. The second type is the grey model (GM) based on grey theory. Lin et al. [11] proposed the GM (1, 1) to predict CE in Taiwan, and the results show that the average residual error of the GM (1, 1) is below 10%. Wang and Ye [12] introduce a power exponential term of the relevant variables as an exogenous variable in a multivariable GM. Their results indicate that the developed model can reflect the mechanism of the non-linear effects of gross domestic product on CE from fossil energy consumption. Wang and Li [13] proposed a new method for discussing the relationship between CE and economic growth, based on a derived non-equigap grey Verhulst model optimized by particle swarm optimization. Their results show that the Chinese government should take effective measures to reduce CE. In addition to the above GMs, some researchers have also proposed innovative GMs, such as a new nonlinear discrete GM (1, N) in Ding et al. [14], a hybrid Verhulst-GM (1, N) in Ofosu-Adarkwa et al. [15], a novel grey rolling prediction model in Zhou et al. [16], a novel fractional grey Riccati model in Gao et al. [17], a new cumulative GM in Qiao et al.
[18], and a novel continuous fractional nonlinear grey Bernoulli model with a grey wolf optimizer in Xie et al. [19]. The third type is the machine learning model based on artificial intelligence technology. Sun et al. [20] identify the factors impacting CE from 22 influencing factors; on this basis, they put forward a coupling model in which an extreme learning machine (ELM) is optimized by particle swarm optimization (PSO). The results show that the developed PSO-ELM outperforms the ELM and the back propagation neural network in predicting CE. Zhao et al. [21] establish a hybrid of a mixed data sampling regression model and a back propagation neural network (BPNN) to forecast CE, and their model is remarkably better than several benchmark models. Other models have also been developed, such as improved Gaussian regression in [22], a hybrid algorithm based on an improved lion swarm optimizer in [23], and a BPNN model based on random forest and PSO in [24].
In addition to studies that predict CE alone, there are also studies on the relationship between CE and other indicators [25]. For example, Pao and Tsai [26] analyzed the relationship between CE, energy consumption, and economic growth. Mohmmed et al. analyzed the relationship between CE and economic growth, development, and human health [27]. Menyah and Wolde-Rufael analyzed the relationship between CE and nuclear energy, renewable energy, and economic growth in the US [28]. Mason et al. discussed the relationship between CE, energy demand, and wind generation in Ireland using evolutionary neural networks [29].
1.3. Motivations, Innovations, Contributions, and Article Organization
The above research mainly focuses on traditional models, machine learning models, and GMs, and its content mainly concerns predicting CE alone or studying the relationship between CE and other factors. Data preprocessing technology is rarely used in these models, even though it has been shown in other fields that appropriate preprocessing can improve a model's prediction accuracy [30]. These studies also pay little attention to the contribution rate of CE generated by fossil energy, although the government needs to formulate corresponding policies based on predictions of the average contribution rate of CE. That is to say, both the prediction accuracy of the models and the research content on CE need further development. Therefore, in this study, a novel two-stage model based on data preprocessing technology is developed to forecast the average contribution rate of CE generated by fossil energy.
The innovation of the research content is to focus on the average contribution rate of CE generated by different fossil fuels, which is rarely mentioned in the literature. The innovation of the proposed model lies in the introduction of preprocessing methods, which is rarely seen in other literature on CE prediction. The main work of this study is as follows:
A novel two-stage model is studied to forecast the average contribution rate of three kinds of fossil energy in the US.
The optimal layers of wavelet’s order suitable for different fossil energy are fixed, and the average contribution rate of different fossil energy in 2023 is predicted.
Different preprocessing methods are compared based on ISSA-LSTM, and different forecasting steps are compared based on the optimal layers of wavelet's order-ISSA-LSTM.
The rationality of error indicators applied in this study is discussed.
The rest of this paper is organized as follows. Some methods applied in this work are shown in Section 2. Section 3 displays applications including data statistical analysis and forecasting steps. Section 4 presents the key results of this work. Section 5 gives a discussion of four issues related to this work. At last, the conclusion of this work is summarized in Section 6.
Methodologies
2.1. Wavelet Theory
The WT [31, 32] can realize the time subdivision of the signal at high frequency and the detail subdivision at low frequency, complete the multi-scale refinement of the signal, and overcome the shortcomings of the Fourier transform (FT).
The wavelet reconstruction process is the inverse process of wavelet decomposition, which can be referred to in references [33–35] for details.
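To make the decomposition-reconstruction step concrete, the following is a minimal pure-NumPy sketch of one level of the Haar DWT, the simplest of the wavelet bases used in this work (the actual experiments presumably rely on a wavelet toolbox, which the paper does not specify):

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: split an even-length signal into a
    low-frequency (approximation) part and a high-frequency (detail) part."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2)    # approximation coefficients
    high = (even - odd) / np.sqrt(2)   # detail coefficients
    return low, high

def haar_idwt(low, high):
    """Inverse transform (the reconstruction step): recover and
    interleave the even/odd samples."""
    even = (low + high) / np.sqrt(2)
    odd = (low - high) / np.sqrt(2)
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

# Illustrative monthly CE-like values (not the paper's data)
signal = np.array([138.0, 142.0, 151.0, 149.0, 160.0, 158.0, 171.0, 169.0])
low, high = haar_dwt(signal)
reconstructed = haar_idwt(low, high)
assert np.allclose(reconstructed, signal)  # perfect reconstruction
```

Deeper decompositions, as used in the paper (up to seven layers), apply the same split recursively to the low-frequency part; the Daubechies and Symflets bases differ only in their filter coefficients.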
2.2. Long Short-Term Memory Theory
The LSTM network was presented in [36, 37]; it avoids the gradient vanishing and gradient explosion phenomena of the traditional RNN. The LSTM network includes three gates, namely the input, forget, and output gates. The governing equations of the LSTM network are displayed in Eqs. (1) ~ (6):
ft = σ(Wf·[ht−1, xt] + bf) (1)
it = σ(Wi·[ht−1, xt] + bi) (2)
gt = tanh(Wc·[ht−1, xt] + bc) (3)
St = ft ⊙ St−1 + it ⊙ gt (4)
ot = σ(Wo·[ht−1, xt] + bo) (5)
ht = ot ⊙ tanh(St) (6)
where ft means the output of the forget gate at the current moment; xt means the current input value; it and gt mean the values produced by the Sigmoid and tanh functions in the input gate; Wf, Wi, Wc, and Wo and bf, bi, bc, and bo mean the weight matrices and bias vectors of the corresponding gates; ot means the output of the output gate at the current moment; St and St−1 mean the cell states at the current and previous moments; ht and ht−1 mean the output information at the current and previous moments; σ denotes the Sigmoid function and ⊙ denotes element-wise multiplication.
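The gate structure of Eqs. (1) ~ (6) can be traced in a minimal NumPy sketch of a single LSTM step. The weights below are random illustrative values, not the trained parameters of the paper's model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, S_prev, W, b):
    """One LSTM step following Eqs. (1)-(6): forget gate f_t, input gate
    i_t with candidate g_t, cell state S_t, output gate o_t, output h_t.
    W and b hold the weight matrices / bias vectors Wf, Wi, Wc, Wo and
    bf, bi, bc, bo, each acting on the concatenation [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])   # Eq. (1): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])   # Eq. (2): input gate
    g_t = np.tanh(W["c"] @ z + b["c"])   # Eq. (3): candidate cell update
    S_t = f_t * S_prev + i_t * g_t       # Eq. (4): new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])   # Eq. (5): output gate
    h_t = o_t * np.tanh(S_t)             # Eq. (6): new hidden output
    return h_t, S_t

rng = np.random.default_rng(0)
n_in, n_hid = 1, 4                       # e.g. one normalized CE value per step
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, S = np.zeros(n_hid), np.zeros(n_hid)
for x in [0.3, 0.5, 0.4]:                # a short normalized input series
    h, S = lstm_step(np.array([x]), h, S, W, b)
```

Because o_t lies in (0, 1) and tanh(S_t) in (−1, 1), the hidden output h_t is always bounded in (−1, 1), which is why the raw data are normalized before training.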
2.3. Improved Sparrow Search Algorithm
The SSA was proposed in [38] according to the anti-predation and foraging behavior of sparrows. The SSA divides the entire sparrow population into discoverers (finders), joiners, and scouts. The finder is responsible for providing foraging areas and directions for the entire population, the scout is responsible for alerting the population, and the optimal position is determined by the fitness value. It has been found that the finder location update and the initial population play an important role in the performance of the SSA. Therefore, following [39], we improve the SSA in two respects: the Sobol-based initial population and the finder location update. For the initial population, the initial values are generated through a Sobol sequence. For the finder location update, the way the finder's location is updated affects, to some extent, the sparrow's search ability, so the finder's location update formula is improved. Moreover, the T distribution is introduced into the SSA, as proposed by [40], to improve the local search performance in the later stage.
(1) Improvement of Sobol initial population
A good initial position is conducive to finding the optimal value quickly, while a poor one may trap the sparrow in a local optimum. Common ways to generate high-quality initial values include the Tent chaotic map, the Logistic chaotic map, the Sobol sequence, and the Sin chaotic map. Because the initial data generated by the Sobol sequence are uniformly distributed with small discrepancy [41], we use the Sobol sequence to generate the initial data.
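The effect of low-discrepancy initialization can be illustrated without a full Sobol generator (which requires tabulated direction numbers): the base-2 radical-inverse (Van der Corput) sequence below is the one-dimensional building block of the Sobol sequence, and the population-seeding step is a hypothetical one-dimensional search space, not the paper's actual hyper-parameter ranges:

```python
import numpy as np

def van_der_corput(n, base=2):
    """Base-2 radical-inverse (Van der Corput) sequence -- the 1-D building
    block of the Sobol sequence. Returns n points in [0, 1) that fill the
    interval far more evenly than pseudo-random draws."""
    points = np.empty(n)
    for i in range(n):
        q, denom, k = 0.0, 1.0, i + 1
        while k > 0:
            denom *= base
            k, r = divmod(k, base)
            q += r / denom
        points[i] = q
    return points

def init_population(pop_size, lb, ub):
    """Map the low-discrepancy samples onto the search bounds [lb, ub]
    to seed the sparrow population (hypothetical 1-D search space)."""
    u = van_der_corput(pop_size)
    return lb + u * (ub - lb)

pop = init_population(8, lb=0.0, ub=10.0)
```

The first points come out as 0.5, 0.25, 0.75, 0.125, ..., each new point landing in the largest remaining gap, which is exactly the "uniform coverage" property the initialization relies on.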
(2) Improvement of finder location update
The global search ability of the SSA is weak in its early stage, and it is easy to miss optimal solutions located away from the origin. The salp swarm algorithm [42] can reduce this to a certain extent. The coordination term in the salp swarm algorithm is computed as c1((ub−lb)c2+lb); if it is applied to the SSA directly, the early search range of the SSA becomes too large, which slows convergence. Therefore, it is modified to suit the SSA. After improvement, the location update formula of the finder is displayed in Eq. (7):
(7)
where lb and ub mean the lower and upper boundaries; c2 and c3 mean random numbers in [0, 1]; Tmax means the maximum number of iterations; t means the current iteration number; and c1=2e−(4t/Tmax) plays a very key role in balancing local and global search.
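The role of c1 can be checked numerically: it starts near 2 (wide, global exploration) and decays toward 2e−4 ≈ 0.037 at the final iteration (fine, local search):

```python
import numpy as np

def c1(t, t_max):
    """Adaptive coefficient c1 = 2*exp(-4t/Tmax) from Eq. (7): close to 2
    in early iterations (wide, global search) and decaying toward
    2e^-4 ~ 0.037 in late iterations (fine, local search)."""
    return 2.0 * np.exp(-4.0 * t / t_max)

t_max = 100
early, late = c1(1, t_max), c1(t_max, t_max)
# early ~ 1.92 keeps the finders exploring; late ~ 0.037 narrows the step
```

Because c1 multiplies the (ub−lb)-scaled term, this schedule directly shrinks the finder's step length as the iterations proceed.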
(3) Improvement of T distribution
As a transition between the Cauchy and Gaussian distributions, the T distribution combines their advantages by varying its degrees of freedom. The formula for updating the sparrow's position by utilizing the T distribution is displayed in Eq. (8).
(8)
The meaning of the symbols in Eq. (8) can be referred to [43].
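Eq. (8) is not reproduced here, and its symbols are defined in [43]; a common form of this mutation, sketched below as an assumption rather than the paper's exact formula, perturbs the current best position by a T-distributed step whose degrees of freedom equal the iteration number:

```python
import numpy as np

rng = np.random.default_rng(42)

def t_mutation(x_best, iteration):
    """Hedged sketch of a T-distribution perturbation in the spirit of
    Eq. (8) (exact symbols are defined in ref. [43]): with few degrees of
    freedom (early iterations) the step is heavy-tailed like a Cauchy
    draw, helping escapes from local optima; as the degrees of freedom
    grow, it approaches a Gaussian, giving fine local search."""
    step = rng.standard_t(df=iteration)
    return x_best + x_best * step

# Hypothetical best position in a 3-D search space
x_best = np.array([1.5, -0.7, 3.2])
candidate = t_mutation(x_best, iteration=5)
```

The candidate replaces the current position only if its fitness improves, so the heavy-tailed jumps cost nothing when they fail.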
Applications
3.1. Data Collection and Analysis
From 1965, the US ranked first in the world in CE until 2005, when China's CE exceeded those of the US and the US fell to second; however, per capita CE in the US still ranks first in the world. The high CE in the US stems from the massive use of coal, natural gas, and petroleum. As global warming becomes increasingly serious, reducing CE is imminent, and finding the main sources of CE and formulating corresponding policies are the main ways to reduce it. Because of its large CE and the availability of its data, the US is very representative, and it is reasonable to study it as an example.
The raw data for forecasting the average contribution rate of CE generated by coal, natural gas, and petroleum in the US are taken from the EIA (https://www.eia.gov/). A total of 1836 data points are selected, 612 each for CE generated by coal, natural gas, and petroleum, covering January 1973 to December 2023. For the CE data generated by coal, the max, min, mean, and SD are 203, 49, 138, and 34; for natural gas, 195, 50, 101, and 27; and for petroleum, 241, 133, 191, and 16.
3.2. Prediction Steps
In this study, the prediction process consists of five parts, which are described in detail as follows:
3.2.1. Data preprocessing
Data preprocessing is a key step in improving forecasting accuracy [44]. In this study, according to the characteristics of the raw CE data, data normalization is conducted. The normalization equation is displayed in Eq. (9) below:
y = (ymax − ymin) × (x − xmin) / (xmax − xmin) + ymin (9)
where ymax and ymin mean the maximum and minimum of the normalized range; y means the normalized value; x means the raw value; and xmax and xmin mean the maximum and minimum values before normalization.
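The normalization of Eq. (9) and its inverse (used later in the anti-normalization step) can be sketched as follows; the sample values are illustrative, not the paper's data:

```python
import numpy as np

def normalize(x, y_min=0.0, y_max=1.0):
    """Min-max normalization of Eq. (9), mapping raw data x into
    [y_min, y_max]; also returns (x_min, x_max), which are needed
    later for anti-normalization."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    y = y_min + (y_max - y_min) * (x - x_min) / (x_max - x_min)
    return y, x_min, x_max

def denormalize(y, x_min, x_max, y_min=0.0, y_max=1.0):
    """Inverse of Eq. (9), used in the anti-normalization step (3.2.4)."""
    return x_min + (np.asarray(y) - y_min) * (x_max - x_min) / (y_max - y_min)

coal = np.array([203.0, 49.0, 138.0, 160.0])   # illustrative CE values
scaled, x_min, x_max = normalize(coal)
assert np.allclose(denormalize(scaled, x_min, x_max), coal)
```

Note that each wavelet component is normalized with its own (x_min, x_max), so those bounds must be kept per component for the later aggregation.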
3.2.2. Dataset segmentation and prediction principle
The raw data are divided into a training set, a test set, and a prediction set. The training set is used to train the prediction model, the test set to test it, and the prediction set to forecast future data. Because there is no universal division rule, the split of training, test, and prediction samples in this study is 541:59:12. One-step prediction is used to forecast future data. In this process, a sliding time window (STW) must be set, as displayed in Eq. (10); the length of the STW is set to 10 for all experiments and comparisons.
Ft = f(xt−10, xt−9, …, xt−1) (10)
where xt means the input data; f means the forecasting function; Ft means the output value.
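The STW construction of Eq. (10) and the 541:59:12 split can be sketched as follows (the series here is a stand-in for one of the 612-point monthly CE series):

```python
import numpy as np

def sliding_windows(series, window=10):
    """Build (input, target) pairs per Eq. (10): each length-10 window of
    past values is mapped by the forecasting function f to the next value
    F_t (one-step-ahead prediction)."""
    X, y = [], []
    for t in range(window, len(series)):
        X.append(series[t - window:t])
        y.append(series[t])
    return np.array(X), np.array(y)

series = np.arange(612, dtype=float)   # stand-in for 612 monthly values
X, y = sliding_windows(series, window=10)
# 541:59:12 split used in the paper (training : test : prediction)
X_train, y_train = X[:541], y[:541]
X_test, y_test = X[541:600], y[541:600]
```

For multi-step prediction, each predicted value would be fed back into the window; the paper's results indicate that this recursive feedback degrades accuracy relative to one-step prediction.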
3.2.3. Forecasting
ISSA-LSTM is utilized to forecast the CE generated by coal, natural gas, and petroleum in the US; this is the second stage of the algorithm proposed in this work. During training, the LSTM and the ISSA are executed together: the ISSA searches for the optimal hyper-parameters of the LSTM, which is then trained with them. The training process minimizes the following objective function (Eq. (11)):
(11)
where Or means the observation values; Pr represents the forecasting values.
3.2.4. Anti-normalization and integration
In this subsection, the high- and low-frequency components obtained by wavelet decomposition are each predicted by ISSA-LSTM. After prediction, the results for the low- and high-frequency components are denormalized, and the denormalized results are then aggregated. Through this process, the final forecasting results are obtained.
3.2.5. Forecasting error
This work selects the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square percentage error (RMSPE), U1, and U2 as the evaluation indicators of the model. Their formulas are established in [45].
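The formulas themselves are given in [45]; the sketch below uses the standard textbook definitions of these six indicators, which is an assumption where [45] may differ in detail (e.g. percentage scaling of MAPE/RMSPE, or the exact form of Theil's U2):

```python
import numpy as np

def error_metrics(obs, pred):
    """Six indicators used in the paper, in their common textbook forms
    (the exact formulas are established in ref. [45]): RMSE, MAE, MAPE,
    RMSPE, and Theil's inequality coefficients U1 and U2."""
    o, p = np.asarray(obs, float), np.asarray(pred, float)
    e = p - o
    rmse = np.sqrt(np.mean(e ** 2))
    mae = np.mean(np.abs(e))
    mape = np.mean(np.abs(e / o))              # multiply by 100 for percent
    rmspe = np.sqrt(np.mean((e / o) ** 2))     # multiply by 100 for percent
    u1 = rmse / (np.sqrt(np.mean(o ** 2)) + np.sqrt(np.mean(p ** 2)))
    u2 = np.sqrt(np.sum(e ** 2)) / np.sqrt(np.sum(o ** 2))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape,
            "RMSPE": rmspe, "U1": u1, "U2": u2}

m = error_metrics([100.0, 110.0, 120.0], [101.0, 108.0, 123.0])
```

All six indicators are zero for a perfect forecast and grow with the error, so "smallest across all six" is the selection rule applied throughout Section 4.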
Results
4.1. Optimal Layers of Wavelet’s Orders for CE Produced by the Coal
In this section, the CE produced by coal are decomposed using the Haar, Daubechies, and Symflets wavelets into 231 high-frequency components and 231 low-frequency components. ISSA-LSTM is then utilized to forecast the 462 components respectively, and the final prediction results are obtained through reconstruction. Three parameters of the LSTM (learning_rate, maxEpochs, and numHiddenUnits) are optimized by the ISSA.
Fig. 1 gives the comparison of prediction results of the test set based on different layers of different wavelet’s orders and ISSA-LSTM for the coal. In Fig. 1 (a), the prediction results obtained by OH, T1H, T2H, F1H, F2H, S1H, and S2H (from one layer to seven layers of Haar wavelet) can closely follow the changing trend of the raw data at many points. However, the difference between the predicted value obtained by F2H and S1H and the observed value at many points is a little larger than that obtained by other layers, especially for F2H.
Fig. 1 (b)–(f) gives the comparison of forecasting results of the test set based on one–seven layers of Daubechies wavelet's two–six orders (ODBT1, T1DBT1, T2DBT1, F1DBT1, F2DBT1, S1DBT1, S2DBT1, ODBT2, T1DBT2, T2DBT2, F1DBT2, F2DBT2, S1DBT2, S2DBT2, ODBF1, T1DBF1, T2DBF1, F1DBF1, F2DBF1, S1DBF1, S2DBF1, ODBF2, T1DBF2, T2DBF2, F1DBF2, F2DBF2, S1DBF2, S2DBF2, ODBS, T1DBS, T2DBS, F1DBS, F2DBS, S1DBS, S2DBS) and ISSA-LSTM for the coal. From Fig. 1 (b), it can be seen that the prediction results obtained by ODBT1-, S1DBT1-, and S2DBT1-ISSA-LSTM deviate greatly from the raw data, whereas the forecasting results obtained by T1DBT1, T2DBT1, F1DBT1, and F2DBT1 closely follow the changing trend of the raw data, with small differences from the observed values. Moreover, from Fig. 1 (c), the prediction results obtained by F2DBT2-ISSA-LSTM deviate greatly from the raw data, while those obtained by the other layers closely follow the changing trend of the raw data with small differences. In Fig. 1 (d)–(f), the predicted values obtained from ODBF1, T1DBF1, T2DBF1, F1DBF1, F2DBF1, S1DBF1, S2DBF1, ODBF2, T1DBF2, T2DBF2, F1DBF2, F2DBF2, S1DBF2, S2DBF2, ODBS, T1DBS, T2DBS, F1DBS, F2DBS, S1DBS, and S2DBS can follow the trend of the raw data, and the differences are not large.
Fig. 1 (g)–(k) gives the comparison of forecasting results of the test set based on one–seven layers of Symflets wavelet's two–six orders (OSYMT1, T1SYMT1, T2SYMT1, F1SYMT1, F2SYMT1, S1SYMT1, S2SYMT1, OSYMT2, T1SYMT2, T2SYMT2, F1SYMT2, F2SYMT2, S1SYMT2, S2SYMT2, OSYMF1, T1SYMF1, T2SYMF1, F1SYMF1, F2SYMF1, S1SYMF1, S2SYMF1, OSYMF2, T1SYMF2, T2SYMF2, F1SYMF2, F2SYMF2, S1SYMF2, S2SYMF2, OSYMS, T1SYMS, T2SYMS, F1SYMS, F2SYMS, S1SYMS, S2SYMS) and ISSA-LSTM for the coal. For Fig. 1 (g), (h), and (j), the deviation between the forecasting data obtained by different layers of Symflets's two, three, and five orders and the raw data is small. Furthermore, in Fig. 1 (i) and (k), although the deviation between the predicted values obtained by F2SYMF1 and S2SYMS and the observed values is large, the deviation between the forecasting data obtained by the other layers (e.g. T1SYMF1 and F1SYMF1) and the observed values is still small.
Table 1 displays the prediction error based on seven different layers of different orders of different wavelet and ISSA-LSTM for the coal.
In Table 1, the prediction error of the S2H is the smallest compared with the prediction error of the one to six layers of the Haar wavelet according to six error evaluation indicators including U1, U2, RMSPE, RMSE, MAPE, and MAE. Therefore, based on the prediction error comparison results of different layers of the Haar wavelet, using the S2H to predict CE generated by coal is the best.
In Table 1, the prediction error of F2DBT1 is the smallest among the different layers of Daubechies wavelet's two orders. For the three orders, the prediction error of two layers is the smallest; for the four, five, and six orders, the prediction errors of five, six, and seven layers are the smallest, respectively, according to the six error evaluation indicators. Compared with F2DBT1, T1DBT2, S1DBF2, and S2DBS, the prediction error of F2DBF1 is the smallest; its U1, U2, RMSE, MAPE, MAE, and RMSPE are reduced by 0.2912, 0.6088, 1.6227, 0.3125, 2.4501, and 0.2876; by 0.4163, 0.4384, 0.0905, 1.1624, 0.0829, and 2.9960; by 0.0091, 0.0136, 0.0210, 0.0053, 0.0370, and 0.0060; and by 0.0036, 0.0033, 0.0018, 0.0171, 0.0016, and 0.0282, respectively, compared with those of the optimal layers of Daubechies wavelet's other orders.
In Table 1, the prediction error of seven, seven, two, and three layers of Symflets wavelet’s two-, three-, four-, and five-orders is the smallest compared with the prediction error of the different layers of Symflets wavelet’s two-, three-, four-, and five-orders. For the prediction error of different layers of Symflets wavelet’s six orders, MAE and MAPE of two layers are the smallest, however, RMSE, RMSPE, U1, and U2 of six layers are smallest, that is to say, the six prediction error indicators are not uniform. So, it is uncertain whether the prediction performance of two layers is good or the prediction performance of six layers is good based on the Symflets wavelet’s six orders. In a word, compared with the optimal layers of Symflets wavelet’s two-, three-, four-, and six-orders, the prediction error evaluation index of T2SYMF2 is the smallest.
By comparing the prediction error of S2H, F2DBF1, and T2SYMF2, it is found that the prediction error of F2DBF1 is the smallest based on six indexes. Therefore, F2DBF1 is suitable for decomposing CE generated by coal, and using ISSA-LSTM to predict its components can obtain lower prediction error indicators.
4.2. Optimal Layers of Wavelet’s Orders for CE Produced by the Natural Gas
Similar to section 4.1, in this Subsection, the optimal layers of wavelet’s orders based on ISSA-LSTM for CE produced by the natural gas are determined. Fig. 2 represents the comparison of prediction results of the test set based on different layers of different wavelet’s orders and ISSA-LSTM for the natural gas including the comparison of different layers of Haar wavelet, Daubechies wavelet’s different orders, and Symflets’s different orders. To better show the coincidence between the forecasting data and the observed data, some predicted values are hidden.
In Fig. 2 (a), for a total of 59 predicted values, some predicted values deviate greatly from the observed data, for example, the second and 52nd points obtained by OH, and the fourth point obtained by F1H, F2H, S1H, and S2H. The rest of the predicted values obtained from OH et al. have a small deviation from the observed values, such as the 57th, 58th, and 59th points.
In Fig. 2 (b)–(f), different layers of Daubechies wavelet's different orders can follow the trend of the raw data. However, at some points the deviation between the predicted and observed values is very large, such as the predictions obtained by ODBT1 at the 2nd, 3rd, 4th, 51st, 52nd, and 53rd points; by F2DBT1 at the 51st, 52nd, 53rd, 55th, 56th, and 57th points; by ODBT2 at the 2nd, 3rd, 6th, 7th, 8th, and 55th points; by ODBF1 at the 4th point; by F2DBF2 at the 51st–58th points; and by F2DBS at the 51st–57th points. In general, the predicted values obtained from different layers of Daubechies wavelet's four orders agree well with the observed values at most points.
In Fig. 2 (g), at the front of the breakpoint, except for the large difference between the forecasting data obtained from OSYMT1 and the observed data, the predicted value obtained from T1SYMT1, T2SYMT1, F1SYMT1, F2SYMT1, S1SYMT1, and S2SYMT1 is in good agreement with the observed value. At the back of the breakpoint, the predicted value obtained from different layers of Symflets’s different orders is very different from the raw data. In Fig. 2 (h), the predicted value obtained from OSYMT2 is quite different from the raw data at most points. In Fig. 2 (i), behind the breakpoint, the predicted value obtained from S1SYMF1 is quite different from the raw data. In Fig. 2 (j), the difference between the predicted value obtained from different layers of Symflets’s different orders and the observed value is very small. Similar to the prediction result in Fig. 2 (h), the predicted value obtained from OSYMS is quite different from the observed value at most points in Fig. 2 (k).
Table 2 gives the prediction error based on seven different layers of different orders of different wavelets and ISSA-LSTM for the natural gas. It can be seen from Table 2 that the forecasting error of two layers of the Haar wavelet is the smallest among all layers of the Haar wavelet based on RMSE, MAE, MAPE, RMSPE, U1, and U2.
As can be seen from Table 2, for the two, three, four, and six orders of Daubechies wavelet, the prediction error obtained by two layers is the smallest in each case, while for the five orders the prediction error obtained by six layers is the smallest. In addition, six layers of Daubechies wavelet's five orders has the smallest prediction error overall, and its U1, U2, RMSE, MAPE, MAE, and RMSPE are 0.0009, 0.0018, 0.2580, 0.2277, 0.0934, and 0.0030; 0.0009, 0.0018, 0.2529, 0.1508, 0.0847, and 0.0017; 0.0009, 0.0017, 0.2466, 0.1413, 0.0582, and 0.0022; and 0.0003, 0.0005, 0.0772, 0.0711, 0.0459, and 0.0003 lower than those of two layers of the two, three, four, and six orders, respectively.
As can be seen from Table 2, for two, three, and four orders of Symflets wavelet, the prediction error obtained by two, two, and two layers is the smallest. For five orders of Symflets wavelet, the prediction error obtained by six layers is the smallest. For six orders of Symflets wavelet, the prediction error obtained by four layers is the smallest. For optimal layers of Symflets wavelet’s different orders, the prediction error obtained by S1SYMF2 is the smallest.
According to the prediction error results in Table 2, the prediction error of S1SYMF2 is the smallest overall: compared with T1H and S1DBF2, its U1, U2, RMSE, MAPE, MAE, and RMSPE are reduced by 0.0097, 0.0194, 2.7225, 1.0423, 0.4981, and 0.0182, and by 0.0019, 0.0038, 0.5368, 0.3274, 0.1744, and 0.00368, respectively. Therefore, it is most appropriate to use S1SYMF2 to decompose the CE generated by natural gas, which yields the minimum forecasting error and more accurate prediction results.
4.3. Optimal Layers of Wavelet’s Orders for CE Produced by the Petroleum
In this subsection, the optimal layers of wavelet’s orders based on ISSA-LSTM for CE produced by petroleum are determined. The wavelet basis function used is still Haar, Daubechies, and Symflets.
Fig. 3 (a) demonstrates the comparison of forecasting results of the test set according to different layers of the Haar wavelet. It can be seen from Fig. 3 (a) that the predicted values obtained from OH, T1H, T2H, F1H, F2H, S1H, and S2H agree well with the observed values on the whole, but there are still some points where the predicted and observed values differ greatly, for example at point 4 for OH, T1H, and others. Fig. 3 (b)–(f) demonstrates the comparison of forecasting results of the test set based on different layers of Daubechies wavelet's different orders; the forecasting data and the raw data are in good agreement, except for the large deviation of the data obtained from ODBS at some points.
Fig. 3 (g)–(k) demonstrates the comparison of forecasting results of the test set based on different layers of Symflets wavelet's different orders. It can be seen from Fig. 3 that the forecasting data obtained from different layers of Symflets wavelet's different orders agree well with the observed data; in particular, the predicted values obtained from OSYMF2, T1SYMF2, T2SYMF2, F1SYMF2, F2SYMF2, S1SYMF2, and S2SYMF2 agree best with the observed values. However, the forecasting data obtained from OSYMT1 deviate greatly from the raw data at individual points, such as points 4 and 5 in Fig. 3 (g), as do the data obtained from OSYMS at points 4 and 5 in Fig. 3 (k).
Table 3 displays the prediction error based on seven different layers of different orders of different wavelet and ISSA-LSTM for the petroleum. In Table 3, it can be seen that the prediction error of F1H is the smallest, based on U1, U2, RMSE, MAPE, MAE, and RMSPE, but it is not much different from that of S1H and S2H.
In Table 3, compared with the prediction error of the other layers of Daubechies wavelet's two orders, the prediction error of S1DBT1 is the smallest. Among different layers of Daubechies wavelet's three orders, the prediction error of T2DBT2 is the smallest. For the four, five, and six orders, the prediction errors of five, six, and seven layers are the smallest, respectively. Furthermore, the prediction error of S2DBS is 1.8623, 0.3367, 0.6033, 0.0109, 0.0050, and 0.0099; 0.2356, 0.1054, 0.1684, 0.0018, 0.0006, and 0.0013; 0.7782, 0.0622, 0.1270, 0.0060, 0.0020, and 0.0041; and 0.7836, 0.0574, 0.1307, 0.0066, 0.0021, and 0.0042 lower than that of six, three, five, and six layers of Daubechies wavelet's two, three, four, and five orders, respectively. So, among different layers of Daubechies wavelet's different orders, the prediction performance of S2DBS is the best from the perspective of the prediction error evaluation indicators.
It can be seen from Table 3 that four, seven, four, seven, and four layers of Symflets wavelet's two, three, four, five, and six orders have the smallest prediction error and the best prediction performance compared with the other layers of the corresponding orders. Moreover, S2SYMF2 has the smallest prediction error overall: it is 2.6265, 0.4120, 0.7792, 0.0155, 0.0070, and 0.0139; 0.4456, 0.1807, 0.3024, 0.0030, 0.0012, and 0.0024; 1.3517, 0.2814, 0.4887, 0.0078, 0.0036, and 0.0072; and 0.2607, 0.0813, 0.1362, 0.0014, 0.0007, and 0.0014 lower than that of four, seven, four, and four layers of Symflets wavelet's two, three, four, and six orders, respectively.
Based on the comparison of prediction errors in Table 3, S2SYMF2 has smaller prediction errors than F1H and S2DBS, that is, better prediction performance, and is therefore more suitable for processing the CE generated by petroleum to obtain higher prediction accuracy.
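The six indicators used throughout these comparisons can be computed as follows. This sketch assumes the standard textbook definitions of RMSE, MAE, MAPE, RMSPE, and Theil's U1 and U2, which may differ in detail from the exact formulas used in the study; the function name is ours.

```python
import numpy as np

def error_metrics(actual, predicted):
    """Six prediction-error indicators, standard definitions assumed."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - actual
    rmse = np.sqrt(np.mean(err ** 2))              # root mean square error
    mae = np.mean(np.abs(err))                     # mean absolute error
    mape = np.mean(np.abs(err / actual)) * 100.0   # mean absolute percentage error (%)
    rmspe = np.sqrt(np.mean((err / actual) ** 2))  # root mean square percentage error
    # Theil's inequality coefficients
    u1 = rmse / (np.sqrt(np.mean(actual ** 2)) + np.sqrt(np.mean(predicted ** 2)))
    u2 = np.sqrt(np.sum(err ** 2)) / np.sqrt(np.sum(actual ** 2))
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "RMSPE": rmspe, "U1": u1, "U2": u2}
```

Averaging such indicators across configurations, as discussed below, gives the average error evaluation index used to make the comparison more reasonable.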
Discussions Related to This Study
Two key issues in this study are discussed. Model ablation is discussed in Section 5.1, and benchmark preprocessing methods combined with ISSA-LSTM versus the established two-stage model are discussed in Section 5.2.
5.1. Model Ablation Discussions
To verify the impact of each component of the proposed model on the prediction results, taking the prediction of carbon emissions from coal as an example, this study investigates the effects of using LSTM alone and ISSA-LSTM. The RMSE, MAE, MAPE, RMSPE, U1, and U2 of LSTM alone in predicting carbon emissions from coal are 40.9320, 6.1432, 30.7916, 0.3348, 0.1949, and 0.3327, while those of ISSA-LSTM are 8.8167, 2.6239, 7.8382, 0.1002, 0.0501, and 0.0988, respectively. Compared with using LSTM alone, the prediction error of ISSA-LSTM is reduced, indicating that adding the ISSA module reduces the prediction error. Applying ISSA-LSTM to the wavelet-processed time series further reduces the prediction error compared with ISSA-LSTM alone, indicating that adding the wavelet preprocessing module further improves the prediction performance.
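The ISSA module in the ablation above is a population-based optimizer wrapped around the LSTM. The sketch below is a heavily simplified sparrow-search loop, shown only to convey the producer/scrounger mechanism: the improved SSA of this study adds further enhancement strategies not reproduced here, and all names and update rules below are illustrative assumptions.

```python
import numpy as np

def sparrow_search(fitness, bounds, pop=20, iters=50, seed=0):
    """Toy sparrow search: 'producers' explore by contracting their positions,
    'scroungers' move toward the current best individual with small noise."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    dim = bounds.shape[0]
    x = rng.uniform(lo, hi, size=(pop, dim))          # initial population
    f = np.apply_along_axis(fitness, 1, x)
    n_prod = max(1, pop // 5)                         # share of producers
    for _ in range(iters):
        order = np.argsort(f)                         # best individuals first
        x, f = x[order], f[order]
        x[:n_prod] *= np.exp(-rng.random((n_prod, 1)))    # producers contract
        # scroungers: jump toward the best producer, noise scaled by distance
        x[n_prod:] = x[0] + rng.normal(0.0, 0.1, size=(pop - n_prod, dim)) \
            * np.abs(x[n_prod:] - x[0])
        x = np.clip(x, lo, hi)
        f = np.apply_along_axis(fitness, 1, x)
    best = int(np.argmin(f))
    return x[best], f[best]
```

In the two-stage model, the fitness function would be the LSTM validation error evaluated at a candidate hyperparameter vector (for example, hidden-unit count and learning rate); here any black-box function can be plugged in.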
5.2. Benchmark Preprocessing Methods Combined with ISSA-LSTM Versus the Established Two-Stage Model
In this study, a novel two-stage model is developed based on LSTM optimized by ISSA and on WT for predicting the average contribution rate of the CE generated by coal, natural gas, and petroleum in the US. Because the decomposition layers and wavelet orders have a direct impact on the prediction accuracy, the optimal layers of the wavelet orders suitable for the CE generated by coal, natural gas, and petroleum are determined. On this basis, future CE is predicted and the order of the average contribution rates is determined.
Three issues related to this study are discussed. First, different preprocessing methods are compared with the determined optimal layers of the wavelet orders, which shows that using the optimal layers to process the CE generated by coal, natural gas, and petroleum yields better prediction performance. Second, one-step prediction achieves higher prediction accuracy than multi-step prediction, that is, better prediction performance. Third, because some error evaluation indicators are inconsistent across the prediction results, the rationality of the error indicators is discussed. For example, the RMSE of T1SYMS is larger than that of S1SYMS, whereas the MAPE of T1SYMS is smaller; however, this does not affect the determination of the optimal layers of the wavelet orders. At the same time, an average error evaluation index is considered, which makes the comparison more reasonable. Although this study has achieved certain research results, there is still considerable room for improvement. For example, this study only conducted vertical comparisons without horizontal ones; in future research, the results should be compared with those reported in other literature. In addition, the results of this study only demonstrate the average contribution rates of carbon emissions generated by coal, petroleum, and natural gas. Further work is needed to strengthen the contribution of the existing results to environmental science theory and to enrich the theoretical contributions of environmental science models. This will be the main focus of future research.
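Given predicted emission series for the three sources, the average contribution rates can be obtained as follows. This is one plausible reading of "average contribution rate" (per-period shares averaged over the forecast horizon); the paper's exact definition may differ, and the function name is ours.

```python
import numpy as np

def average_contribution_rates(coal, gas, petroleum):
    """Average share of total CE attributable to each source over the horizon."""
    series = {name: np.asarray(s, dtype=float)
              for name, s in (("coal", coal),
                              ("natural gas", gas),
                              ("petroleum", petroleum))}
    total = sum(series.values())                 # total emissions per period
    # per-period share of each source, averaged across the forecast horizon
    return {name: float(np.mean(s / total)) for name, s in series.items()}
```

Applied to the study's predictions, this kind of computation yields the reported ordering of natural gas (46.62%), petroleum (34.90%), and coal (18.48%).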
Acknowledgment
This work was supported by the China Postdoctoral Science Foundation (Grant No. 2021M702948), the Cultivation Project for Basic Research and Innovation of Yanshan University (Grant No. 2021LGQN015), and the Hebei Natural Science Foundation (Grant No. E2022203097).
Notes
Author Contributions
W.Q. (Lecturer) conducted all the experiments. Q.M. (Master student) collected the data used in the experiment and performed exception handling on the data. L.S. (Master student) used wavelet theory for data decomposition and wrote the manuscript. H.X. (Senior engineer) participated in the entire experimental process. X.Y. (Associate professor) provided guidance for the entire experiment. N.H. (Teaching assistant) wrote the manuscript and revised the manuscript. Y.W. (Senior engineer) made revisions to the manuscript.
Conflict-of-Interest Statement
The authors declare that they have no conflict of interest.