SENSITIVITY STUDY OF THE EFFECT OF POLYMER FLOODING PARAMETERS TO IMPROVE OIL RECOVERY USING X-GRADIENT BOOSTING ALGORITHM

Waterflooding alone sometimes cannot increase oil recovery effectively, so additional methods are required. Polymer flooding is a common chemical EOR method that has been implemented over the last few decades; it is effective in increasing oil recovery and can reduce the amount of fluid injected into the reservoir. Given this success, it is necessary to know which parameters influence the outcome of polymer flooding so that they can be evaluated and taken into consideration when designing a new polymer flooding scheme. The parameters tested in this study are Injection Rate, Injection Time, Injection Pressure, Adsorption, Inaccessible Pore Volume, and Residual Resistance Factor. This research uses the X-Gradient Boosting Algorithm to identify the most influential parameters in polymer flooding. The results highlight injection time and injection rate as the key factors affecting the effectiveness of polymer flooding in the studied case.


Introduction
The recovery of oil and natural gas in fields that have entered the tertiary recovery stage is generally carried out using Enhanced Oil Recovery (EOR) methods. EOR has gained attention from many oil companies as a way to increase the recovery of oil and gas reserves, and chemical EOR, especially polymer injection/flooding, has been commonly implemented in the last decade (Juárez-Morejón et al., 2019; Koh et al., 2016). Polymer flooding is expected to reduce the mobility ratio of the injection fluid, thus reducing fingering and increasing the efficiency of fluid displacement (Gogarty, 1967; Koh et al., 2016; Lake, 1989; Needham & Doe, 1987; Saqer & Osama, 2016; Skauge et al., 2014; Wassmuth et al., 2007).
Several parameters are considered to contribute to the success of polymer flooding, and this study aims to determine the most influential ones in order to provide a good scheme for implementing polymer flooding in the field. If the mobility ratio is greater than one, polymer flooding is carried out to increase the viscosity of the injection fluid and reduce the residual oil saturation in the reservoir (Cenk et al., 2017). Residual oil saturation is one of the parameters used to measure the success of polymer flooding (Koh et al., 2016): the value obtained after polymer flooding can be used as an indicator of the method's success. The success of polymer injection/flooding can also be influenced by factors such as reservoir salinity or injected water salinity, reservoir temperature or injected water temperature, and reservoir rock properties (Hidayat & ALMolhem, 2019).
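For reference, the mobility ratio mentioned above is commonly defined, for water or polymer solution displacing oil, as the ratio of the displacing-phase mobility to the displaced-phase mobility. The symbols below are standard textbook notation and are an assumption here, not notation taken from this study: k_rw and k_ro are the relative permeabilities to water and oil, and mu_w and mu_o are the corresponding viscosities.

```latex
M = \frac{\lambda_w}{\lambda_o} = \frac{k_{rw}/\mu_w}{k_{ro}/\mu_o}
```

Under this definition, adding polymer raises mu_w, which lowers M toward or below one.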
Experiments have been conducted on porous media that show the properties of the injected polymer can be influenced by polymer adsorption in the reservoir, residual resistance factor, and inaccessible pore volume (Hidayat & ALMolhem, 2019). There are several parameters that determine the success of polymer flooding, including Pore Volume (PV), Injection Rate, Injection Time, and Injection Pressure (Erfando et al., 2019). In order to achieve efficiency in the implementation of polymer flooding for economic oil recovery, it is important to study the parameters that have a significant impact on polymer flooding. Therefore, this study will perform sensitivity analysis using different methods and adding parameters that are considered to have an reservoir with oil-wet rock characteristics. The parameters used for this study were rock properties, and the most influential result was connate water saturation, which had a significant impact on increasing oil recovery in the field (Firozjaii & S, 2018).
In 2014, oil recovery prediction was also carried out for Surfactant Polymer Flooding using the Response Surface Modeling (RSM) method with randomly varied parameter values in order to obtain the best scheme for the Surfactant Polymer Flooding process (Douarche et al., 2014). Many methods have been used to analyze the performance of chemical EOR, particularly for sensitivity analysis. Since the 1990s, the application of artificial intelligence in analyzing EOR operations has been an interesting subject for researchers, and recently, approaches such as machine learning have also been widely used in the oil industry (Larestani et al., 2022). Machine learning has also been widely applied to the polymer flooding process. For example, Tadjer et al. (2021) evaluated and compared various machine learning algorithms in the Approximate Dynamic Programming (ADP) domain to obtain the optimal Value of Information (VOI) for determining an appropriate polymer flooding scheme.
One machine learning algorithm, the X-Gradient Boosting Algorithm, is considered among the most powerful algorithms for building prediction models (Freund & Schapire, 1997). This algorithm has been used in several studies related to polymer flooding to develop prediction models and provide a good scheme for polymer flooding projects. Phankokkruad & Wacharawichanant (2019) predicted the mechanical properties of high molecular weight polymers using the X-Gradient Boosting Algorithm. The results were compared with laboratory testing to validate the model accuracy. Applying the X-Gradient Boosting Algorithm gave a good level of efficiency and saved time in method development.
In the same year, Shaik et al. (2019) conducted a study on predicting the response of polymer properties to monovalent and divalent ions. The testing examined the effects of concentration, temperature, shear rate, and salinity on the rheology of the polymer. Recently, Larestani et al. (2022) compared two algorithms, the cascade neural network and the gradient boosting decision tree, to predict the performance of surfactant-polymer flooding. A sensitivity analysis was also conducted on the input parameters used. Its results showed that surfactant concentration and surfactant slug size were the most influential factors in predicting the Recovery Factor (RF). Using the X-Gradient Boosting algorithm to determine the sensitivity factors is a breakthrough that yields fast, accurate, and easily obtained results, especially for polymer flooding.

Research Methods
This research used the CMG (Computer Modelling Group) software, with the STARS simulator for modelling the base case and CMOST to assist in iterative modelling. A machine learning analysis was then performed using the X-Gradient Boosting Algorithm to build a predictive model and compute feature importance in order to determine the influential parameters in polymer flooding. The rock and reservoir fluid characteristic data used for the base case model were obtained from Erfando et al. (2019) and Hidayat & ALMolhem (2019). The rock type used in this research is conglomerate originating from the Barito Basin, Borneo. The reservoir model is assumed to be heterogeneous, with varied porosity and permeability values. Porosity varies in the range of 5% to 40% and permeability in the range of 30 mD to 400 mD, based on the criteria for conglomerate rock published by Huggett (2006).

Results and Discussions
Model initialization and simulation were carried out using CMG-STARS. The simulation model was run for a period of 5 years, from 2022 to 2027, and the results obtained for OOIP can be seen in Table 4 below. From the simulation results, a graph of oil recovery versus time was obtained in order to examine the effect of polymer injection on the case under study. The graph can be seen in Figure 2. After initializing the reservoir model, modeling iterations were performed in CMOST to generate 500 datasets. Six independent parameters were set as input values, and oil recovery was set as the output value. The upper and lower limits were based on the actual values from the model, which are listed in Table 2. The distribution patterns used for the iterations were continuous real and discrete real. In this study, the continuous real distribution pattern was used for subsurface parameters such as adsorption, IPV (Inaccessible Pore Volume), and RRF (Residual Resistance Factor), whose values may fall anywhere within the range and are uncertain. The discrete real distribution pattern was used for surface parameters such as injection pressure, injection time, and injection rate, whose values are more measurable and limited to the specified levels (Fu et al., 1991; Johnson, 2020; Mortimer, 2013).
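The sampling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions: the bounds and levels are hypothetical placeholders (not the values in Table 2), and plain random sampling stands in for whatever sampling engine CMOST actually applies.

```python
import random

# Hypothetical bounds and levels: placeholders only, not the values of Table 2.
CONTINUOUS_BOUNDS = {          # subsurface parameters, continuous real
    "adsorption": (0.0, 1.0),
    "ipv": (0.0, 1.0),
    "rrf": (1.0, 2.0),
}
DISCRETE_LEVELS = {            # surface parameters, discrete real
    "injection_pressure": [1000.0, 1500.0, 2000.0],
    "injection_time": [1.0, 2.0, 3.0, 4.0, 5.0],
    "injection_rate": [100.0, 200.0, 300.0],
}

def sample_designs(n_runs, seed=0):
    """Draw n_runs parameter sets: uniform draws for continuous
    parameters, a random pick among fixed levels for discrete ones."""
    rng = random.Random(seed)
    designs = []
    for _ in range(n_runs):
        run = {}
        for name, (low, high) in CONTINUOUS_BOUNDS.items():
            run[name] = rng.uniform(low, high)
        for name, levels in DISCRETE_LEVELS.items():
            run[name] = rng.choice(levels)
        designs.append(run)
    return designs

designs = sample_designs(500)  # one parameter set per simulation run
```

Each of the 500 dictionaries would correspond to one simulator run whose oil recovery becomes the output value.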
Sensitivity analysis of the effect of polymer flooding parameters on oil recovery using the X-Gradient Boosting algorithm requires several steps: EDA (Exploratory Data Analysis) and data processing, data normalization, machine learning modeling, feature importance, and parameter ranking. In the EDA stage, the main objective is to examine the distribution of the data and ensure that the dataset does not suffer from multicollinearity (Komorowski et al., 2016). Multicollinearity refers to a condition in which the independent parameters are correlated with one another (Daoud, 2018; Davino et al., 2022; Farrar & Glauber, 1964; Kim, 2019; Shrestha, 2020). Even low multicollinearity values can sometimes cause issues with data correlation, and values that are too high create problems that need to be addressed, such as an increase in standard error, inaccurate predictions by the algorithm, and the presence of several variables that are not statistically significant (Liang & Zhao, 2019). One method for identifying multicollinearity is the VIF (Variance Inflation Factor), where a VIF value of 1 indicates that a parameter is uncorrelated with the other independent parameters, according to the VIF interpretation (Miles, 2014; Thompson et al., 2017) in Table 5. EDA and data processing were conducted on the 500 datasets (parameter designs) in this research and gave a VIF value of 1, indicating that the independent parameters are not correlated with one another, that is, no multicollinearity is present, as shown in Figure 3.
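As an illustration of the VIF check, the sketch below uses the identity that the VIF of each parameter equals the corresponding diagonal entry of the inverse of the correlation matrix of the independent parameters. The data are synthetic and the helper names are hypothetical; this is not the code used in the study.

```python
import random
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        standardized.append([(v - mu) / sigma for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    size = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

def vif(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Synthetic data: x1 and x2 are independent, x3 is nearly a copy of x1.
rng = random.Random(0)
x1 = [rng.random() for _ in range(500)]
x2 = [rng.random() for _ in range(500)]
x3 = [a + 0.05 * rng.random() for a in x1]

independent_vif = vif([x1, x2])      # both values close to 1
collinear_vif = vif([x1, x2, x3])    # x1 and x3 get large values
```

Independently sampled parameters, as in a randomized design, give VIF values close to 1, whereas a nearly duplicated parameter inflates the VIF sharply.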

Fig. 3. Heatmap Correlation Parameters
Data normalization scales the parameter values so that they can be processed more easily and quickly, without a heavy workload during data processing, thereby reducing the memory and power requirements of machine learning modeling (S. Jain et al., 2018; Patro & Sahu, 2015). In this study, the min-max normalization method is used, in which the parameter values are scaled to the range of 0 to 1; this is a commonly used method for normalizing datasets (Ekenel & Stiefelhagen, 2006; A. Jain et al., 2005; Khan et al., 2020; Wu et al., 2005).
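Min-max normalization maps each value x of a parameter to (x - min) / (max - min). A minimal sketch follows; the injection rate values are hypothetical and serve only as an example.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly to the range 0 to 1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

# Hypothetical injection rates, for illustration only.
injection_rate = [100.0, 150.0, 250.0, 300.0]
scaled = min_max_normalize(injection_rate)
print(scaled)  # [0.0, 0.25, 0.75, 1.0]
```

Each input parameter would be scaled separately, so that parameters with very different units share the same 0 to 1 range.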
Machine learning modeling was performed using the XGBoost algorithm with hyper-parameter tuning to obtain the best predictive model. For tuning, the data were divided into three parts (training, validation, and test data) to find the best model and prevent overfitting (Gupta et al., 2019; Han et al., 2020; Kirori, 2019). This study also used RSCV (Randomized Search Cross Validation), which speeds up the tuning of the XGBoost algorithm and is more efficient than grid search (Bergstra & Bengio, 2012; Putatunda & Rama, 2018). Three training-to-testing data ratios were used, namely 0.7:0.3, 0.8:0.2, and 0.9:0.1, and 3-fold cross validation was performed with 50 RSCV hyper-parameter combinations, resulting in 150 model fits. The best R² values obtained for the various models are as follows. Based on Table 4 and Figures 4-6, X-Gradient Boosting has successfully performed predictive modelling: the R² values approach 1, which is sufficient to indicate that the model is sound, so further analysis such as feature importance and parameter ranking can be carried out (Mousavi et al., 2020). The training and testing residual plots also show that the results are well fitted, as the points are randomly scattered around the line on the plot (Fox & Weisberg, 2018). The actual oil recovery data (CMG model) were then compared with the XGBoost oil recovery predictions to check how closely the predictions match the actual data. The results of this validation can be seen in Figure 7.
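The tuning loop described above (50 randomly sampled hyper-parameter combinations, each scored by 3-fold cross validation, hence 150 fits) can be sketched with the standard library alone. Everything specific here is an assumption for illustration: the data are synthetic, a k-nearest-neighbour regressor stands in for XGBoost, and the search space is hypothetical.

```python
import random

def r2_score(actual, predicted):
    """Coefficient of determination."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def knn_predict(train, x, k):
    """Stand-in model (not XGBoost): mean target of the k nearest points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def randomized_search(data, n_iter, n_folds, seed=0):
    """Sample n_iter candidates and score each by n_folds-fold CV."""
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    best_score, best_params, n_fits = float("-inf"), None, 0
    for _ in range(n_iter):
        params = {"k": rng.randint(1, 30)}  # hypothetical search space
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [data[i] for i in order if i not in held]
            valid = [data[i] for i in held_out]
            preds = [knn_predict(train, x, params["k"]) for x, _ in valid]
            scores.append(r2_score([y for _, y in valid], preds))
            n_fits += 1
        mean_score = sum(scores) / n_folds
        if mean_score > best_score:
            best_score, best_params = mean_score, params
    return best_params, best_score, n_fits

# Synthetic data: y = 2x plus noise, split 0.7:0.3 into train and test.
rng = random.Random(1)
data = [(x, 2.0 * x + rng.gauss(0.0, 0.1))
        for x in (rng.random() for _ in range(120))]
n_train = len(data) * 7 // 10
train_set, test_set = data[:n_train], data[n_train:]

best_params, best_score, n_fits = randomized_search(train_set, 50, 3)
print(n_fits)  # 150
```

The held-out test set is not touched during the search, so it remains available for the final train versus test comparison of R².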
The predictions resulted in an RMSE (Root Mean Squared Error) of 0.56 and a MAPE (Mean Absolute Percentage Error) of 2.03%, which indicates good prediction according to the standard range, where MAPE < 10% is categorized as good prediction (Chang et al., 2007). Feature importance was computed from the X-Gradient Boosting model using the decrease in Mean Squared Error (MSE) attributed to each parameter: the larger the decrease in MSE, the more the parameter affects the output value (oil recovery) (Liang & Zhao, 2019; Yang & Guan, 2022). Parameters were ranked by this feature importance, and those with a significant MSE decrease are categorized as the crucial or most influential parameters in this study. The ranking of the input parameters is presented in Table 7. Based on Table 7 and Figure 7, the X-Gradient Boosting Algorithm has successfully ranked the polymer flooding parameters according to their influence on oil recovery. The results show that Injection Time and Injection Rate are the most significant parameters compared with the other four parameters, namely Injection Pressure, Adsorption, RRF (Residual Resistance Factor), and IPV (Inaccessible Pore Volume). This is in line with the study by Hidayat & ALMolhem (2019), which stated that injection duration or injection time is a parameter that influences the increase in oil recovery.
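The error metrics and the ranking step above can be sketched as follows. The actual and predicted recovery values are hypothetical and only illustrate the formulas; the importance levels are the ones reported in the Conclusion of this study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the units of the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical oil recovery values (%), for illustration only.
actual = [20.0, 25.0, 40.0]
predicted = [21.0, 24.0, 40.0]
errors = (rmse(actual, predicted), mape(actual, predicted))

# Importance levels as reported in the Conclusion of this study.
importance = {
    "Injection Time": 0.452632,
    "Injection Rate": 0.430075,
    "Injection Pressure": 0.064662,
    "Adsorption": 0.025564,
    "RRF": 0.021053,
    "IPV": 0.006014,
}
ranking = sorted(importance, key=importance.get, reverse=True)
top_two_share = importance[ranking[0]] + importance[ranking[1]]
```

With the reported values, the importance levels sum to 1, and injection time and injection rate together account for about 88% of the total.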

Conclusion
Based on the results and discussion, this study concludes that injection time and injection rate are the most influential parameters on the performance of polymer flooding in terms of oil recovery, with importance levels of 0.452632 and 0.430075, respectively. Focusing on these two factors is expected to optimize the recovery factor when polymer flooding is implemented in the field. Injection pressure has an importance level of 0.064662, while adsorption, RRF, and IPV have importance levels of 0.025564, 0.021053, and 0.006014, respectively. The study also produced an accurate predictive model using X-Gradient Boosting with three training-to-testing data ratios. The ratio of 0.7:0.3 resulted in a train R² of 0.9886 and a test R² of 0.9645, the ratio of 0.8:0.2 resulted in a train R² of 0.9891 and a test R² of 0.9579, and the ratio of 0.9:0.1 resulted in a train R² of 0.9890 and a test R² of 0.9660. In this case, the X-Gradient Boosting algorithm proves to be a robust and accurate method for conducting sensitivity analysis in polymer flooding.