Modeling Methane Emission of Wastewater Anaerobic Pond at Palm Oil Mill Using Radial Basis Function Neural Network

Plant-based industries such as palm oil mills generate wastewater rich in organic matter. In Indonesia, palm oil mill effluent (POME) is still treated predominantly with conventional pond systems without methane capture, so the amount of methane emitted into the atmosphere is not known. Measuring and testing biomethane from the anaerobic ponds of palm oil mills is relatively difficult because the gas changes rapidly. An accurate alternative is modeling with a radial basis function neural network (RBFNN) that uses abiotic variables as input. This research aims to obtain a methane emission model for a POME anaerobic pond and to simulate the dynamics of methane emissions. Methane emissions were measured with a CH₄-meter system based on a TGS2611 methane gas sensor, using closed static chambers. Wastewater and methane gas were sampled in October-November 2018. A methane emission model for the anaerobic pond (AP) was obtained with the RBFNN. The best RBFNN model had a 5-5-3 network architecture, a spread of 0.11, an error goal of 0.0005, R = 0.940652, and MSE = 0.003166. The reliability of the RBFNN in modeling non-linear field data was quite good and was influenced by the number of data patterns, the types and accuracy of the variables, the network architecture, and the ANN model used. Simulation and prediction of methane emissions under lowest, moderate, and highest variable-value scenarios showed that the COD-R and VS-R variables strongly affected the methane emissions of the anaerobic pond of a WWTP with a multiple feeding system. In contrast, inlet wastewater temperature and rainfall did not significantly affect methane emissions, because the temperature stayed in the mesophilic range (30-40 °C) and the effect of rainfall depended mainly on the level of organic matter (COD and VS).
Keywords: RBFNN; methane emissions; anaerobic pond; POME; simulation and prediction; COD and VS.


I. INTRODUCTION
The industrial and transportation sectors in Indonesia still rely heavily on fossil fuels, which increases greenhouse gases (GHGs) in the atmosphere. Industrial activities also produce other pollutants in the form of solid and liquid waste. Wastewater from plant-based industries is rich in organic material and can be a source of pollution. These industries include tapioca factories, crumb rubber factories, sugar factories, and palm oil mills [1].
Palm oil mills produce large volumes of wastewater, between 0.75 and 0.9 m³ POME/ton FFB (fresh fruit bunches) [2]. The palm oil mill effluent (POME) needs further treatment before it can be discharged into the environment or used for land application. POME treatment in Indonesia still uses the conventional pond system (ponding), which emits methane directly into the atmosphere. Methane (CH₄) is the second most important greenhouse gas (GHG) after carbon dioxide (CO₂). Although it accounts for less than 0.5% of the concentration of atmospheric carbon gases, it contributes around 20% of global radiative forcing [3]. This is because methane has a much stronger radiative effect (34 times stronger than CO₂) [3]. In the palm oil mill wastewater treatment plant (POM-WWTP), the amount of methane released from the anaerobic ponds is not known with certainty, but it continues to increase with the expansion of oil palm plantations and mills. Methane is not only a pollutant but also a potential renewable energy source that is environmentally friendly and sustainable [4], [5].
A methodology for measuring and estimating methane emissions is needed so that their value can be tracked over time. Such a methodology would supply high-accuracy (tier 3) data for the management and control of methane emissions, especially for the conversion of POME into an energy source. It would also support the policy of raising the share of new and renewable energy (NRE) to > 23% in 2025 and 31% in 2050 (GL 79/2014), as well as the Indonesian government's commitment at the 2015 Climate Change Summit to reduce greenhouse gas emissions by 29% by 2030.
Methane is a gas that is difficult to measure and test. Obtaining accurate results requires a capable but straightforward method for analyzing non-linear environmental data. One option is modeling that uses abiotic environmental factors as input variables, which makes prediction, forecasting, and simulation possible [6]. One such methodology is modeling with artificial neural networks (ANN), specifically a radial basis function neural network (RBFNN).
RBFNNs have been widely studied and reported in various scientific fields, such as health, natural resources, and the environment, with quite good results. These studies include the diagnosis of diabetes [7], the survival chances of burn patients [8], the diagnosis of several diseases [9], the prediction of palm oil production [10], the prediction of surface roughness [11], the estimation of solar radiation [12], and the classification of pests in tea plantations [6].
The RBF neural network can be applied to predict and simulate methane emissions from anaerobic ponds based on abiotic parameters (environmental factors and wastewater factors) that are relatively simple to measure and test in the field. This study aims to obtain an RBFNN-based methane emission model for the POM-WWTP anaerobic pond and to run simulations under various abiotic variable scenarios to determine the dynamics of methane emissions.

II. MATERIALS AND METHODS
A. Description of Palm Oil Mill Wastewater Treatment
Field measurements and wastewater sampling were carried out at a palm oil mill in Banyuasin Regency, South Sumatra Province, with a production capacity of 60 tons of FFB/hour, located ± 21 km from Palembang (2.826° S, 104.732° E). The WWTP consisted of 7 ponds: 3 oil extraction ponds, 1 cooling pond, and 3 anaerobic ponds (AP).
This research focused on the anaerobic ponds, where the degradation of organic matter emits methane, a greenhouse gas (GHG) that contributes to global warming and climate change. Methane is at the same time a potential source of new and renewable energy (NRE). The gas was emitted into the atmosphere by microbial activity in the anaerobic ponds, which was visible as actively bubbling biogas (biomethane). Methane emissions were measured in three anaerobic ponds (AP). The AP dimensions are shown in Fig. 1; the ponds had a depth of ± 6 m, a total volume of ± 46,305 m³, and an HRT (hydraulic retention time) > 130 days.
After treatment, the wastewater was used to irrigate oil palm plantations (land application). The treatment process ran from the oil extraction ponds to the cooling pond by gravity; the wastewater was then pumped and fed simultaneously (multiple feeding) to AP2-AP1 (combined) and AP3 (± 500 meters), at a ratio of 50:50, 40:60, or 60:40 depending on the treatment results, to meet BOD ≤ 5000 mg/L (the maximum content for POME land application).

B. Monitoring the Characteristics of Wastewater and Biomethane Production
Wastewater was sampled at the anaerobic pond inlets and outlets every two weeks for ± 2 months (n = 6) at 6 sampling locations (Fig. 1). The samples from each sampling point were composited from palm oil mill (POM) operations in the morning (±09.00) and afternoon (±16.00). A pond depth composite (0-1 meter) was also taken. The wastewater variables tested were COD (chemical oxygen demand), VS (volatile solids), AP inlet and outlet temperatures, and rainfall. COD was determined with the Lovibond COD-Vario photometer system, and VS with the Standard Methods for the Examination of Water and Wastewater [13]; pH was measured directly in the field with a portable Adwa AD-111 pH meter. Rainfall data (mm/day) were taken from the records of a plantation in the same oil palm plantation group, ± 2,000 meters from the WWTP outlet (2.821° S, 104.700° E).
Methane emissions were measured with closed static chambers equipped with CH₄-meters. The chamber that captured the biogas (methane) was made of transparent polypropylene (PP) in the form of a slightly tapered cylinder measuring 0.30 x 0.28 x 0.415 m (top diameter x bottom diameter x height), with a containment volume of 0.02742 m³ (27.42 liters) and a cross-sectional area of 0.07 m². When the chamber was deployed on the anaerobic pond with 3 cm submerged below the pond surface (effective height = 0.385 m), the containment volume became 25.44 liters. Chambers were placed around the inlet, middle, and outlet of the anaerobic ponds, on AP2-AP1 and AP3. Methane emissions were measured on six days (n = 6), three on the combined AP2-AP1 (n = 3) and three on AP3 (n = 3), for 12 hours per day (06.00 to 18.00), with the chamber air flushed every 2 hours.
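As a quick check of the chamber figures, the following sketch recomputes the containment volume by treating the chamber as a conical frustum. The frustum formula, the linear scaling of the volume by the effective height, and the use of the 0.30 m diameter for the cover area are assumptions made here for illustration; the text reports only the resulting values.

```python
import math

def frustum_volume(d_top, d_bottom, height):
    """Volume (m^3) of a conical frustum with the given end diameters (m)."""
    return math.pi * height / 12.0 * (d_top**2 + d_top * d_bottom + d_bottom**2)

# Chamber dimensions reported in the text (m)
D_TOP, D_BOTTOM, HEIGHT = 0.30, 0.28, 0.415
SUBMERGED = 0.03  # depth below the pond surface (m)

full_volume = frustum_volume(D_TOP, D_BOTTOM, HEIGHT)
# Assumption: the volume is scaled linearly with the effective height (0.385 m)
effective_volume = full_volume * (HEIGHT - SUBMERGED) / HEIGHT
# Assumption: the cover area is taken from the 0.30 m diameter
cover_area = math.pi * (D_TOP / 2) ** 2

print(f"full volume      : {full_volume * 1000:.2f} L")
print(f"effective volume : {effective_volume * 1000:.2f} L")
print(f"cover area       : {cover_area:.3f} m^2")
```

Under these assumptions the sketch gives about 27.42 L and 25.44 L, which agrees with the reported volumes.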
The methane concentration was measured with a sensor system, a CH₄-meter modified from [14]. The CH₄-meter was equipped with a TGS2611 methane sensor, an SHT11 air temperature and humidity sensor, an Arduino Mega 2560 microcontroller (ATmega2560), a 20x4 LCD, and a data logger (micro SD) for storage. The TGS2611 and SHT11 sensors of the CH₄-meter were mounted on the chamber [15]. The methane flux was calculated as:
$$E = \frac{dc}{dt} \cdot \frac{V_{ch}}{A_{ch}} \cdot \frac{W_m}{V_m} \cdot \frac{273.2}{273.2 + T} \qquad (1)$$
Where: $E$ is the CH₄ emission/flux (mg/m²/minute); $dc/dt$ is the change in CH₄ concentration per unit time (ppm/minute); $V_{ch}$ is the containment volume (m³); $A_{ch}$ is the cover area (m²); $W_m$ is the molecular weight of CH₄ (16.04 x 10³ mg); $V_m$ is the molecular volume of CH₄ (22.41 x 10⁻³ m³); and $T$ is the average chamber air temperature during sampling (°C).
The total methane emission per sampling point per 6 hours and per day was calculated by integrating the emission values with Simpson's numerical method [15,18], as follows:
$$\int_a^b f(x)\,dx \approx \frac{b-a}{6}\left[f(a) + 4 f\left(\frac{a+b}{2}\right) + f(b)\right] \qquad (2)$$
Where: the integral of $f(x)$ is the total methane emission (mg/m²/6 hours or day); $a$ is the first hour of the emission measurement and $b$ is the final hour of the emission measurement.
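The flux and integration steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used closed-chamber form E = dc/dt · (Vch/Ach) · (Wm/Vm) · 273.2/(273.2 + T) with ppm converted to a volume fraction, and a composite Simpson 1/3 rule. The function names and the sample readings are hypothetical.

```python
def ch4_flux(dc_dt_ppm_min, v_ch, a_ch, temp_c, w_m=16.04e3, v_m=22.41e-3):
    """CH4 flux in mg/m^2/minute from a closed static chamber.

    Assumed form: E = dc/dt * (Vch/Ach) * (Wm/Vm) * 273.2 / (273.2 + T),
    with ppm converted to a volume fraction (1 ppm = 1e-6).
    """
    return (dc_dt_ppm_min * 1e-6 * (v_ch / a_ch) * (w_m / v_m)
            * 273.2 / (273.2 + temp_c))

def simpson(values, step):
    """Composite Simpson 1/3 rule for equally spaced samples."""
    n = len(values) - 1  # number of intervals
    if n < 2 or n % 2:
        raise ValueError("Simpson's 1/3 rule needs an even number of intervals")
    odd = sum(values[1:-1:2])   # interior points with weight 4
    even = sum(values[2:-1:2])  # interior points with weight 2
    return step / 3.0 * (values[0] + 4.0 * odd + 2.0 * even + values[-1])

# Hypothetical readings at 06.00, 09.00 and 12.00:
# (dc/dt in ppm/minute, chamber air temperature in degrees C)
readings = [(40.0, 28.0), (55.0, 31.0), (50.0, 33.0)]
fluxes = [ch4_flux(rate, v_ch=0.02544, a_ch=0.07, temp_c=t) for rate, t in readings]

# Samples are 3 h = 180 minutes apart, so the total is in mg/m^2/6 hours
total_6h = simpson(fluxes, step=180.0)
print([round(f, 3) for f in fluxes], round(total_6h, 1))
```

With three equally spaced samples the composite rule reduces to the basic Simpson formula with end points a and b and their midpoint.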

C. Radial Basis Function (RBF) Algorithm
The RBF neural network structure consisted of three layers: the input layer, the hidden layer, and the output layer [19,20,21]. The input layer consisted of source nodes (sensor units) that connect the network to its environment. The hidden layer applied a nonlinear transformation to the input, and its parameters were determined with an unsupervised learning method. The output layer was linear and was trained with a supervised learning method [21,22]. The connections between the input and the hidden layer had no weights. The hidden neurons were processing units that evaluated radial basis functions [20], [24] (Fig. 2).
In the RBF neural network, the hidden layer used the Gaussian activation function as the radial basis function [24], with the mathematical notation:
$$\varphi_j(x) = \varphi\left(\lVert x - c_j \rVert\right) \qquad (3)$$
and in the form of the equation [19]-[22], [24]:
$$\varphi_j(x) = \exp\left(-\frac{\lVert x - c_j \rVert^2}{2\sigma_j^2}\right) \qquad (4)$$
Where: $\varphi_j$ is the Gaussian function, $\lVert \cdot \rVert$ is the Euclidean norm (distance), and $\sigma_j$ is the standard deviation (hidden layer node width) of the $j$-th Gaussian function with center value $c_j$. The output values of the RBF network are:
$$y_k = f_k(x) = \sum_{j=1}^{N} w_{jk}\,\varphi_j(x) \qquad (5)$$
Where: $N$ is the number of neurons (clusters) of hidden units and $w_{jk}$ is the connection weight between node $j$ in the hidden layer and output $k$. The width $\sigma_j$ is set in accordance with [6], [19], [25]:
$$\sigma_j = \frac{d_{max}}{\sqrt{2h}} \qquad (6)$$
Where: $d_{max}$ is the largest distance value for hidden node $j$ and $h$ is the number of centers.
An unsupervised learning method was used to determine the center and standard deviation of the input variables at each node in the hidden layer.
After the hidden layer node values are obtained, the next step is to compute the mapping from the hidden layer to the output layer with a supervised learning method, in the same way as in a multilayer perceptron (MLP) [26], [36].
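A forward pass through such a network can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the hidden layer uses the Gaussian function exp(-||x - c_j||² / (2σ_j²)), the width follows the common heuristic σ = d_max / sqrt(2h) with d_max taken as the largest distance between centers, and the centers and weights are random placeholders standing in for the results of the unsupervised (for example K-means) and supervised steps.

```python
import numpy as np

def rbf_widths(centers):
    """Width heuristic sigma = d_max / sqrt(2 h), shared by all hidden nodes."""
    h = len(centers)
    diffs = centers[:, None, :] - centers[None, :, :]
    d_max = np.sqrt((diffs ** 2).sum(axis=2)).max()  # largest center distance
    return np.full(h, d_max / np.sqrt(2.0 * h))

def hidden_layer(x, centers, sigmas):
    """Gaussian activations phi_j(x) = exp(-||x - c_j||^2 / (2 sigma_j^2))."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigmas ** 2))

def rbf_forward(x, centers, sigmas, weights):
    """Linear output layer: y_k = sum_j w_jk * phi_j(x)."""
    return hidden_layer(x, centers, sigmas) @ weights

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 0.9, size=(3, 5))        # 3 hypothetical patterns, 5 inputs
centers = rng.uniform(0.1, 0.9, size=(5, 5))  # 5 hidden centers (placeholders)
weights = rng.normal(size=(5, 3))             # 5 hidden nodes -> 3 outputs
sigmas = rbf_widths(centers)
print(rbf_forward(x, centers, sigmas, weights).shape)  # (3, 3)
```

The shapes mirror a 5-5-3 architecture: five inputs, five hidden nodes, and three outputs per pattern.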
The training algorithm for analyzing data characteristics with the RBFNN followed these stages [19]:
a. Initialize the weights from the hidden layer to the output layer randomly. Then calculate all the outputs ($y_k$) using equation (5).
b. Calculate the error, that is, the difference between the target and the output ($\delta_k$), which is the error term used for the weight changes:
$$\delta_k = t_k - y_k \qquad (7)$$
Where: $t_k$ is the target of the input data and $y_k$ is the output at node $k$.
c. If the error level does not match what is desired (close to zero), calculate the weight change ($\Delta w_{jk}$) with the learning rate (acceleration) $\alpha$:
$$\Delta w_{jk} = \alpha\,\delta_k\,\varphi_j(x) \qquad (8)$$
In this phase, no error is calculated for the hidden layer, because the K-means algorithm has already been applied between the input layer and the hidden layer, so the values obtained there are appropriate.
d. Update the weights by applying all the weight changes:
$$w_{jk}(\text{new}) = w_{jk}(\text{old}) + \Delta w_{jk} \qquad (9)$$
This process continues until the weights no longer change (fixed).
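Stages a to d can be sketched as a delta-rule loop over the hidden-to-output weights. This is an illustrative sketch only: it assumes the update Δw_jk = α · (t_k − y_k) · φ_j applied pattern by pattern, and the activation matrix `phi`, the targets, the learning rate, and the stopping tolerance are hypothetical values.

```python
import numpy as np

def train_output_weights(phi, targets, alpha=0.1, tol=1e-10,
                         max_epochs=100_000, seed=0):
    """Delta-rule training of the hidden-to-output weights of an RBF network.

    phi     : (n_patterns, n_hidden) Gaussian activations, held fixed
    targets : (n_patterns, n_outputs) target values t_k
    """
    rng = np.random.default_rng(seed)
    # Stage a: random initial weights
    w = rng.uniform(-0.5, 0.5, size=(phi.shape[1], targets.shape[1]))
    epochs = 0
    for epochs in range(1, max_epochs + 1):
        largest_change = 0.0
        for phi_p, t_p in zip(phi, targets):
            delta = t_p - phi_p @ w              # stage b: error per output
            dw = alpha * np.outer(phi_p, delta)  # stage c: weight change
            w += dw                              # stage d: update
            largest_change = max(largest_change, float(np.abs(dw).max()))
        if largest_change < tol:                 # weights no longer change
            break
    return w, epochs

# Hypothetical hidden activations (3 patterns, 5 hidden nodes) and targets
phi = np.array([[1.0, 0.5, 0.1, 0.0, 0.2],
                [0.2, 1.0, 0.4, 0.1, 0.0],
                [0.0, 0.3, 1.0, 0.6, 0.1]])
targets = np.array([[0.2, 0.3, 0.5],
                    [0.4, 0.4, 0.8],
                    [0.3, 0.6, 0.9]])
weights, epochs = train_output_weights(phi, targets)
print(epochs, np.abs(phi @ weights - targets).max())
```

Because only the linear output layer is trained, the hidden activations are computed once and reused in every epoch.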

D. Pre-Processing and Compiling Input-Output Data
The data were obtained from field measurements and laboratory analysis. They were then processed and displayed using Matlab R2017b and MS Excel. Before being used in the training and testing of the RBF neural network, the data were normalized using the following function:
$$x' = \frac{0.8\,(x - a)}{b - a} + 0.1 \qquad (10)$$
Where: $a$ is the minimum value, $b$ is the maximum value, $x$ is the original data, and $x'$ is the normalized data. Normalization scales all data to values between 0.1 and 0.9 [26,27]. The model input used 5 variables covering environmental and wastewater factors: chemical oxygen demand removed (COD-R, mg/L), volatile solids degraded (VS-R, mg/L), the anaerobic pond inlet and outlet wastewater temperatures (°C), and rainfall (mm/day). The model outputs were methane emissions for the morning-noon period (mg/m²/6 hours), methane emissions for the noon-afternoon period (mg/m²/6 hours), and methane emissions per day (mg/m²/day). The 6 field data patterns were divided into two sets, one for training and one for testing. The training data were used to build the radial basis function neural network model, and the testing data were used to test the resulting network.
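Scaling to the 0.1-0.9 range and back can be sketched as below. The min-max form x' = 0.8 (x − a)/(b − a) + 0.1 is assumed from the stated range, and the sample values are hypothetical rather than the measured field data.

```python
import numpy as np

def normalize(x, lo=0.1, hi=0.9):
    """Scale each column of x to [lo, hi]; return the scaled data and (min, max).

    Note: a constant column (max == min) would divide by zero.
    """
    a, b = x.min(axis=0), x.max(axis=0)
    return (hi - lo) * (x - a) / (b - a) + lo, (a, b)

def denormalize(x_scaled, bounds, lo=0.1, hi=0.9):
    """Invert normalize() using the stored column minima and maxima."""
    a, b = bounds
    return (x_scaled - lo) * (b - a) / (hi - lo) + a

# Hypothetical values: columns are COD-R (mg/L) and rainfall (mm/day)
data = np.array([[12000.0, 0.0], [21000.0, 35.0], [30000.0, 88.0]])
scaled, bounds = normalize(data)
print(scaled)
print(denormalize(scaled, bounds))
```

The stored minima and maxima are what allow the network outputs to be converted back to physical units after testing.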

E. Evaluation of Predictive Accuracy
The reliability of the prediction and simulation models was demonstrated through statistical analysis. The statistics commonly used to assess the predictive capacity of a trained ANN are R, MSE, MAE, and RMSE [22,24,29,30,31,32], with the following formulas:
$$R = \frac{\sum_{i=1}^{n} (Q_i - \bar{Q})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (Q_i - \bar{Q})^2 \, \sum_{i=1}^{n} (P_i - \bar{P})^2}}$$
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (Q_i - P_i)^2$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \lvert Q_i - P_i \rvert$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Q_i - P_i)^2}$$
Where: $R$ is the correlation coefficient between the observed and predicted data, $Q_i$ is the observed value, $P_i$ is the predicted value, $\bar{Q}$ and $\bar{P}$ are their means, and $n$ is the number of data points. MSE (mean square error) is a general measure of the difference between the values predicted by the model and the observed values. MAE (mean absolute error) measures prediction accuracy as the average error in the same units as the original data (how close the predicted values are to the observed values). RMSE (root mean square error) gives quick performance information as a measure of the spread of the predicted values around the observed data.
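The four statistics can be computed in a few lines. The sketch below assumes their standard definitions (Pearson correlation for R) and uses hypothetical observed and predicted values.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R, MSE, MAE and RMSE for paired observed/predicted values."""
    q = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = q - p
    mse = np.mean(err ** 2)
    return {
        "R": np.corrcoef(q, p)[0, 1],   # Pearson correlation coefficient
        "MSE": mse,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(mse),
    }

# Hypothetical normalized observations and predictions
scores = evaluate([0.2, 0.5, 0.8], [0.25, 0.45, 0.85])
print(scores)
```

MSE and RMSE penalize large errors more heavily than MAE, which is why they are usually reported together.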

III. RESULTS AND DISCUSSION
A. Modeling Methane Emissions with RBFNN
The degradation of organic matter in the palm oil mill wastewater treatment plant (POM-WWTP) produced methane, the simplest gaseous hydrocarbon. The production of biomethane in biogas by microorganisms is affected by several factors, such as temperature, pH, nutrients (organic matter), toxicity, HRT, OLR, reactor design, and redox potential (Eh) [33], [34]. The influence of environmental and wastewater factors (abiotic factors) on methane production can be studied through modeling, for example with a radial basis function neural network (RBFNN).
The input layer was determined from a literature review and trial and error [6]; that is, variables closely related to methane emissions were selected. Trial and error was also used to determine the optimum configuration in network training [35] and to obtain the best ANN architecture. The type and number of variables were those correlated with, and influencing, the microbial breakdown of organic matter in anaerobic ponds that produces biogas (methane). Matlab R2017b was used to find and test combinations of variables related to methane production. The RBF network was trained and tested in several trials with different spread values, error goals, and types and numbers of variables, and the network with the smallest MSE, RMSE, and MAE and the highest R was taken as the best architecture [22]-[25].
The training and testing data each comprised 3 sets of patterns. After the data were normalized, tests were run to obtain the best RBFNN architecture and model for the available input variables and the objectives of the model development (Table 1).
The training and testing runs with spread values of 0.11, 0.2, and 0.3 gave the three lowest error values, and the best model was chosen among them based on the highest R and the lowest MSE on the training and/or testing data. Changing the error goal (eg) did not lower the MSE further in the trial-and-error runs, because training had already reached an error very close to zero (MSE < 10⁻³⁵). The spread value is the width of the radial basis function: the larger the spread, the smoother the function (default spread = 1) [25]. An unsuitable spread constant can cause the RBF network to overfit or underfit [36], so the spread must be chosen carefully, neither too high nor too low.
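The effect of the spread on the width of a radial basis neuron can be illustrated as follows. The text does not state which Matlab function was used, so this sketch assumes the convention in which the neuron response falls to 0.5 at a distance equal to the spread; the function name and the sample distances are hypothetical.

```python
import math

def radbas_response(distance, spread):
    """Response of a radial basis neuron at a given distance from its center.

    Assumed convention: a = exp(-(b * d)^2) with b = sqrt(ln 2) / spread,
    so the response is 0.5 when distance == spread.
    """
    b = math.sqrt(math.log(2.0)) / spread
    return math.exp(-(b * distance) ** 2)

for spread in (0.11, 0.2, 0.3, 1.0):
    row = [round(radbas_response(d, spread), 3) for d in (0.05, 0.11, 0.3)]
    print(spread, row)
```

A small spread makes each neuron respond only to inputs very close to its center, which fits the training patterns tightly; a large spread makes neighboring neurons overlap and smooths the fitted function.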
Based on the R and MSE values of the training and testing data, the best model was at a spread of 0.11 and an eg of 0.0005 (Fig. 3). The optimal RBFNN architecture was 5-5-3 (5 input layer neurons, 5 hidden layer clusters, and 3 output layer neurons), which had the highest correlation coefficient (R = 0.940652) (Table 2). The regression line showed the correlation between the outputs of the best network model and the targets, with a very high R value and a coefficient of determination (R²) of 0.8848 (Fig. 3). This meant that the RBFNN model with the input variables used was able to explain about 88% of the variation in the measured methane emissions from a real anaerobic pond, while other variables accounted for the remaining 1 − R². The input variables were COD, VS, temperature, and rainfall; this was in line with previous studies [15].
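As a worked check of the reported figures, squaring the correlation coefficient reproduces the coefficient of determination and gives the share left to other variables:

$$R^2 = 0.940652^2 \approx 0.8848, \qquad 1 - R^2 \approx 0.1152$$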
The inputs of this network were COD-R, VS-R, AP inlet temperature, AP outlet temperature, and rainfall, which are abiotic environmental variables affecting the production of biogas (methane) in the anaerobic ponds of palm oil mills [15].
The network whose test data gave the smallest MSE, RMSE, and MAE at each spread was taken as the best network model; this indicated that the network had reached optimal conditions (convergence). On the training data these error values were very small, close to zero (10⁻¹⁶), which meant that training was very effective in building the model. Therefore, selecting the best network model with R, MSE, RMSE, and MAE was appropriate for the training and testing data of this study. The same approach was used by Haviluddin and Tahyudin [37] with an RBFNN for predicting internet network data traffic in East Kalimantan, and by Sofian et al. [28] for predicting monthly rainfall in South Sumatra.
After training, testing, and data denormalization, the network output values were obtained. Comparing the network outputs with the actual data gave average output error coefficients of 0.2593 for Y1 and Y2 and 0.2245 for Y3 (Table 3). These values were strongly influenced by the number of data patterns, the types and accuracy of the variables, the network architecture, and the ANN model used [22,27]. The RBFNN performed very well in modeling natural resource and environmental data with non-linear characteristics, as in previous studies [6], [10]-[12], [38].

B. Simulation of Methane Emissions in Anaerobic Ponds
The network model obtained with the RBFNN could be used for simulation and prediction to examine the performance and dynamics of biomethane production from the palm oil mill AP. Simulations were run with lowest, mean, and highest scenarios for the input variables: COD-R, VS-R, AP inlet temperature, and rainfall. The highest AP methane emission rate was 67,768 mg/m²/day, at the highest VS-R of 20,220 mg/L with the other variables at their mean values (Fig. 4). This showed that high methane emissions did not always coincide with high COD values and that VS played an important role in the formation of methane in the AP (the conversion coefficient of VS to methane). This was in line with [15], [34], [40]. According to [34], the ratio of methane production to VS-R was 0.7 m³ CH₄/kg VS degraded in the anaerobic treatment of urban domestic wastewater.
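The lowest-mean-highest scenarios can be organized as a one-at-a-time design, in which one input is varied while the others stay at their mean. The sketch below only builds the scenario table; the function name is hypothetical and the level values are placeholders for illustration, not the measured field data.

```python
def one_at_a_time(levels):
    """Vary one variable over its low/mean/high levels; keep the rest at mean."""
    base = {name: lv["mean"] for name, lv in levels.items()}
    scenarios = []
    for name, lv in levels.items():
        for label in ("low", "mean", "high"):
            case = dict(base)        # copy so each scenario is independent
            case[name] = lv[label]
            scenarios.append((f"{name}:{label}", case))
    return scenarios

# Placeholder levels for illustration only
levels = {
    "COD-R": {"low": 10000.0, "mean": 20000.0, "high": 30000.0},    # mg/L
    "VS-R": {"low": 5000.0, "mean": 12000.0, "high": 20220.0},      # mg/L
    "T_inlet": {"low": 30.0, "mean": 35.0, "high": 41.0},           # degrees C
    "rainfall": {"low": 0.0, "mean": 30.0, "high": 88.0},           # mm/day
}
scenarios = one_at_a_time(levels)
for label, case in scenarios:
    print(label, case)
```

Each scenario row would then be normalized in the same way as the training data and passed through the trained network to obtain the simulated emissions.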
At the highest COD-R value (30,000 mg/L), 35,021 mg/m²/day of methane was emitted, about half of the emissions at the highest VS-R. This supported the previous observation that a medium (average) COD-R combined with the maximum VS-R emitted the most methane. Thus, the highest COD-R value did not always cause the highest methane emissions; other factors, such as volatile solids, also played a role. The VS value of organic solid waste is a variable for estimating the amount of methane [39].
Within the COD-R scenario, the highest methane emission occurred at the average COD-R value (49,913 mg/m²/day); this was believed to reflect the influence of other factors such as pH and oil-fat content. This could occur, and be reinforced, because the POM-WWTP applied the multiple-feeding method, in which the wastewater was fed to several anaerobic ponds simultaneously after passing through the oil extraction ponds and the cooling pond. This system could cause dynamic (fluctuating) pH values and oil-fat levels, which could affect the performance of the microorganisms that degrade the organic matter in the wastewater and form biomethane.
Anaerobic pond methane emissions did not appear to be affected by changes (minimum-moderate-maximum simulation) in the temperature of the wastewater fed to the AP inlet (49,913 mg/m²/day). This is understandable because the temperatures of the inlet wastewater and of the pond were both within the mesophilic range (30-41 °C), so microbial activity was relatively constant [5,40]. Anaerobic decomposition can occur in three temperature ranges: psychrophilic (< 30 °C), mesophilic (30-40 °C), and thermophilic (50-60 °C). The anaerobic decomposition process is very sensitive to temperature changes; the optimal thermophilic temperature ranges from 52 to 58 °C, but negative impacts can occur at temperatures above 60 °C. This is caused by the toxicity of ammonia, which increases with temperature [41], [42].
Simulations of the rainfall variable in the POM-WWTP environment did not affect methane emissions (Fig. 4). This was believed to be due to relatively low rainfall (< 90 mm/day). Rainfall can change the pH and temperature of anaerobic ponds, but during the study period it was not sufficient to change these two variables, so its effect on simulated methane emissions was not yet significant. In addition, at the highest rainfall with mean COD-R and VS-R, methane emissions neither decreased nor increased, which reinforced the explanation that methane emissions were determined mainly by the COD-R and VS-R variables. COD indicates the level of organic matter, while VS indicates the level of volatile organic matter, the basic component from which methanogenic bacteria form biomethane. COD and VS can be used to measure the biodegradation of organic matter [39]. Irvan et al. [43] reported COD and VS decomposition efficiencies of 77% and 63.5%, respectively, at an 8-day HRT in CSTR reactors processing POME in North Sumatra.
With all variables at their highest levels, the methane emission was not the highest: 53,396 < 67,768 mg/m²/day (Fig. 4 and 5). This coincided with the highest rainfall value, so under this condition rainfall was a factor that held back methane emissions, up to a certain rainfall intensity. This differed from the earlier case, in which maximum rainfall with the other variables at average (moderate) values left methane emissions unchanged. These results showed that the increase in methane emissions was largely determined by the COD and VS variables [34]. With all variables at their lowest levels, methane emissions were 22,470 mg/m²/day, about half of the emissions at the average variable levels (44,913) (Fig. 5). This was the lowest emission in the simulation (Y3), because the input variables then included the lowest levels of organic matter (COD and VS). Methane emissions for the 6-hour noon-afternoon period (12.00-18.00) (Y2) were higher than for the morning-noon period (06.00-12.00) (Y1) when the variables were at average values (12,980 > 10,487 mg/m²/6 hours) and at maximum values (12,858 > 12,619 mg/m²/6 hours), but not at the minimum variable levels (Fig. 5). This indicated that methanogenic bacteria were more active during the noon-afternoon period, when the temperature of the wastewater pond was higher [40]. The sharp rise in air temperature during this period raised the pond temperature, which boosted the microbial degradation of organic matter in the anaerobic pond wastewater and thus the methane emitted. Another finding was that at the lowest variable levels, methane emissions in the morning-noon period were higher than in the noon-afternoon period (7,605 > 4,492 mg/m²/6 hours).
This could be explained by changes in wastewater temperature, driven by air temperature changes due to solar radiation or rainfall intensity, which significantly affected methane emissions in addition to the organic matter content (COD and VS) of the wastewater. This finding is preliminary and points to the need for further research on how the dynamics of wastewater temperature, anaerobic pond temperature, and organic matter content affect methane emissions from anaerobic ponds.

IV. CONCLUSIONS
The RBFNN was successfully used to build a methane emission model for the POM-WWTP anaerobic pond. The best RBFNN model had a 5-5-3 network architecture, a spread of 0.11, an error goal (eg) of 0.0005, R = 0.940652, and MSE = 0.003166. The reliability of the RBFNN in modeling non-linear field data was quite good; it was influenced by the number of data patterns, the types and accuracy of the variables, the network architecture, and the ANN model used. Simulation and prediction of methane emissions under lowest, average, and highest variable-value scenarios showed that the COD-R and VS-R variables significantly affected the anaerobic pond methane emissions of the palm oil mill WWTP with a multiple feeding system. However, inlet wastewater temperature and rainfall did not yet show a clear effect on methane emissions, because the temperature was in the mesophilic range (30-40 °C) and the effect of rainfall depended largely on the level of organic matter (COD and VS).

ACKNOWLEDGMENT
The authors thank PT SPOI's management for facilitating the fieldwork. The first author thanks the Dean of the Faculty of Science and Technology, State Islamic University of Raden Fatah Palembang for their support and permission to carry out research. Thank you to Muhammad Syafriansyah (Sriwijaya University) and Adi Kurniadi (State Islamic University of Raden Fatah Palembang) for their support in taking field data.

REFERENCES
[1] Suprihatin. 2009. Ecological and financial benefits of using agroindustrial wastewater as a raw material in biogas production to reduce greenhouse gas emissions. Journal Agromet, 23 (2)