Production Dynamic Prediction Method of Waterflooding Reservoir Based on Deep Convolution Generative Adversarial Network (DC-GAN)
1Cooperative Innovation Center of Unconventional Oil and Gas (Ministry of Education & Hubei Province), Yangtze University, Wuhan, 430100, China
2Key Laboratory of Drilling and Production Engineering for Oil and Gas, Wuhan, 430100, China
3School of Petroleum Engineering, Yangtze University, Wuhan, 430100, China
*Corresponding Author: Xiang Rao. Email: email@example.com
Received: 29 September 2021; Accepted: 22 December 2021
Abstract: Rapid production dynamic prediction of waterflooding reservoirs based on well location deployment is the basis of production optimization for such reservoirs. Building geological models with traditional numerical simulation software is complicated, the simulations are often computationally inefficient, and the simulator must be called repeatedly during model optimization; machine learning methods have therefore been used for fast reservoir simulation. However, the traditional artificial neural network (ANN) has many degrees of freedom, converges slowly, and requires a complex network model. This paper predicts the production performance of waterflooding reservoirs with a deep convolutional generative adversarial network (DC-GAN) model and establishes a dynamic mapping relationship between well location deployment and output oil saturation. The network structure is based on an improved U-Net framework. A deep convolutional network and a deconvolution network extract the features of the input well deployment images and strengthen the stability of the adversarial model. The training speed and accuracy of the proxy model are improved, and the oil saturation of waterflooding reservoirs is dynamically predicted. The results show that the trained DC-GAN has significant advantages in predicting oil saturation from the well location deployment map. The oil saturation map given by the trained DC-GAN is compared with the map generated by the numerical simulator using cosine similarity. In summary, DC-GAN is an effective method for constructing a proxy model that quickly predicts the production performance of waterflooding reservoirs.
Keywords: Waterflooding reservoir; well location deployment; dynamic prediction; DC-GAN
The development of new methods for oilfield production optimization is an urgent problem in oil production. The layout of injection and production wells is the key to optimizing reservoir development, and well location deployment and well location optimization directly determine the final development performance of the oilfield. Zhou et al. took Jilin’s two wells ultra-low permeability oilfield as the research object, used numerical simulation software to simulate and predict the diamond and rectangular well patterns of the reservoir, and selected the best well pattern. Cao et al. selected ultra-low permeability reservoirs in the Changqing Oilfield as the research object; by adjusting the rhombic inverted nine-point injection-production well pattern in this block, they analyzed the feasibility of well pattern conversion and refinement and improved the oil recovery rate. According to the characteristics of ultra-low permeability and tight reservoirs in the Ordos Basin, Zhao et al. used numerical simulation software to study horizontal wells and well pattern forms. Zhang et al. optimized the pattern of fractured horizontal wells in low permeability reservoirs. Cai et al. used numerical simulation and reservoir engineering methods to study the pattern type and well spacing optimization of combined horizontal and vertical well patterns. Zhao et al. combined seepage theory with reservoir numerical simulation, derived the productivity formula of the staggered well pattern in low permeability reservoirs, and obtained the relationship curve between the well pattern form factor and dimensionless production. Other studies have also combined numerical simulation with well location optimization [7–14].
However, the basis of reservoir production optimization is the ability to make rapid production dynamic predictions based on well location deployment, so as to obtain the optimal production mode and well layout in oilfield development planning [15,16]. Geological modeling and simulation with the traditional numerical simulation method are very time-consuming, and most optimization algorithms need to call the numerical simulator repeatedly and iteratively, which greatly reduces optimization efficiency. Therefore, establishing an intelligent model that rapidly predicts water flooding reservoir production dynamics under different well location deployments is of great significance for well pattern optimization. With the rapid development of artificial intelligence technology, intelligent optimization algorithms have gradually been applied to reservoir development and have achieved remarkable results [17–23]. Gu et al. [24,25] proposed a new method of remaining oil distribution prediction based on machine learning. Using existing reservoir numerical simulation results, they trained support vector machine and long short-term memory (LSTM) neural network models to predict the remaining oil distribution, and built a remaining oil prediction model for simple and rapid prediction of remaining oil in the reservoir plane. Gu et al. proposed a prediction model based on an artificial neural network (ANN) to address the problems of traditional water flooding reservoir production prediction methods. The model was trained with the Bayesian regularization algorithm and used a nonlinear autoregressive network with external input as the structure of the oil production prediction model. Negash et al. used the embedded discrete fracture model to generate sample data for the development of water injection huff and puff in fractured horizontal wells and, based on the artificial neural network model, constructed a proxy model for production performance prediction of fractured horizontal wells. Rao et al. established a production prediction model for the extra-high water cut stage of water flooding reservoirs. Although ANN has been widely used to build proxy models for production prediction, it still suffers from slow convergence and low prediction accuracy.
In 2020, Wang et al. proposed an automatic well test interpretation method for radial composite reservoirs based on a convolutional neural network (CNN). Their article points out and verifies that CNN can effectively avoid overfitting, quickly and accurately extract salient features from large-scale data to optimize the network, and improve the accuracy of model prediction. In the same year, Li et al. proposed a method for predicting the gas channeling direction based on DC-GAN, in which the DC-GAN was used to establish the dynamic mapping between the permeability field and the gas saturation distribution. That research shows that DC-GAN performs well in extracting permeability characteristics and in mapping between the input and output of high-dimensional models [31–33]. Introducing the deep convolution network into the generative adversarial network not only extracts features effectively but also speeds up convergence and improves the trained model [34–38]. In addition, a large body of literature confirms the strong potential of DC-GAN [39–43].
Based on the above literature review, the DC-GAN method can effectively avoid overfitting, extract data features quickly and accurately, and perform well in dynamic prediction. The authors therefore use the DC-GAN model to predict the dynamic production of water flooding reservoirs. Based on the improved U-Net framework, the dynamic mapping relationship between the well location deployment and the output oil saturation under waterflooding is established, a proxy model that can quickly predict the production performance of water flooding reservoirs is constructed, and the accuracy and efficiency of dynamic oil saturation prediction for waterflooding reservoirs are significantly improved.
This paper aims to establish a prediction model based on DC-GAN that reflects the dynamic mapping relationship between well location deployment and output oil saturation in waterflooding reservoirs. Because production parameters such as well location deployment and geological parameters such as oil saturation are mostly defined at the whole-reservoir scale, the amount of data to compute is large and the geological model is uncertain. ANN and other machine learning methods that train directly on the raw data therefore struggle to achieve good training results and lead to extremely large and complex network structures. The idea of this paper is instead to convert both kinds of data into images. Through DC-GAN, the features of the input parameters are extracted and the mapping relationship is established; the proxy model is then optimized through adversarial (GAN) training.
The quality of the samples is directly related to the accuracy and stability of the neural network model. Therefore, this paper uses a traditional numerical simulator, the MATLAB Reservoir Simulation Toolbox (MRST), to generate the sample images. MRST is a well-known open-source simulator developed by SINTEF, a Norwegian research institute; it reduces the difficulty of producing the input images and ensures the efficiency and accuracy of the preprocessing. The basic physical properties used in this numerical example are listed in Table 1, and the data preprocessing process is shown in Fig. 1.
Input image: The number of grid cells (the pixels of the image) and the grid size are determined according to the actual reservoir size. The range of the number of water injection wells is set according to the average well spacing on site, and the number of water injection wells is randomly generated within this range. The number of production wells is then determined from the ratio of production wells to the randomly generated water injection wells. Next, the grid position of each well is randomly determined, with the requirement that each grid cell contains at most one well (we set 1 for water injection wells, −1 for production wells, and 0 for cells without a well; in line with actual production practice, the number of production wells should be much larger than the number of water injection wells). Finally, the corresponding well location deployment picture (Fig. 2a) is obtained. The liquid production rate of the production wells is determined from the specified injection rate of the water injection wells (200 m³/day) and the injection-production balance, and the information for each well location is passed to the numerical simulator for calculation.
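The sampling procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' preprocessing code: the function names, the grid size, the injector range, the producer-to-injector ratio, and the equal split of the produced liquid among producers are all assumptions introduced here.

```python
import random

def make_well_map(nx, ny, inj_range, producers_per_injector, seed=None):
    """Return (grid, n_inj, n_prod) with 1 = injector, -1 = producer, 0 = no well.

    inj_range and producers_per_injector are hypothetical parameters standing in
    for the site-specific well-spacing information mentioned in the text.
    """
    rng = random.Random(seed)
    n_inj = rng.randint(inj_range[0], inj_range[1])   # random number of injectors
    n_prod = n_inj * producers_per_injector           # producers follow from the ratio
    n_wells = n_inj + n_prod
    if n_wells > nx * ny:
        raise ValueError("more wells than grid cells")
    # Sampling without replacement guarantees at most one well per cell.
    cells = rng.sample(range(nx * ny), n_wells)
    grid = [[0] * nx for _ in range(ny)]
    for k, cell in enumerate(cells):
        row, col = divmod(cell, nx)
        grid[row][col] = 1 if k < n_inj else -1
    return grid, n_inj, n_prod

def producer_liquid_rate(n_inj, n_prod, injection_rate=200.0):
    """Per-producer liquid rate under injection-production balance.

    Assumes (for illustration only) that the total injected volume is
    shared equally by all producers.
    """
    return n_inj * injection_rate / n_prod

if __name__ == "__main__":
    grid, n_inj, n_prod = make_well_map(20, 15, (2, 4), 4, seed=0)
    print(n_inj, n_prod, producer_liquid_rate(n_inj, n_prod))
```

Each generated grid corresponds to one input image; in the workflow described in the text, the same well list would then be handed to the numerical simulator to produce the matching oil saturation map.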
Output image: As mentioned above, the traditional numerical simulator is used to obtain the oil saturation map at a given time (as shown in Fig. 2b) for the corresponding well location deployment map (input image) and the known geological parameters. Water-drive reservoirs are generally assumed to be in injection-production balance, so the formation pressure changes little and, consequently, the pore volume changes little. Therefore, the cumulative oil production of the whole block can be estimated from the distribution of oil saturation. The formula is as follows:
where $S_o$ refers to oil saturation, $V_p$ refers to pore volume, and $\phi$ refers to porosity.
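The formula itself is not reproduced in this text. As a hedged sketch only, a volumetric balance consistent with the quantities named above (oil saturation, pore volume, porosity) could take the following form. The initial oil saturation $S_{o,i}^{0}$ of cell $i$, the bulk cell volume $V_{b,i}$, the cell count $N$, and the neglect of the oil formation volume factor are assumptions introduced for illustration, not taken from the paper.

```latex
N_p(t) \approx \sum_{i=1}^{N} \left( S_{o,i}^{0} - S_{o,i}(t) \right) V_{p,i},
\qquad V_{p,i} = \phi_i \, V_{b,i}
```

Under the stated assumption of a nearly constant pore volume, the oil produced up to time $t$ is simply the decrease in oil volume held in the pores, summed over all grid cells.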
This oil saturation map serves as the target output image: the image that the DC-GAN generates from the input well deployment map during sample training is checked against it.
A generative adversarial network (GAN) is a deep learning model. It forces generated images to be statistically almost indistinguishable from real images, so that it can generate quite realistic synthetic images and predict the required data from new inputs. The GAN is composed of a generator (G) and a discriminator (D). In the original GAN, the generator takes a random vector as input and decodes it into a synthetic image, which is then passed to the discriminator. The discriminator takes an image as input and predicts whether it is a real image from the training set or a false image produced by the generator. In this paper, the input of the generator is the well location deployment picture itself, and the discriminator determines whether its input is a real oil saturation picture or a picture generated by the generator. The purpose of the generator network is to deceive the discriminator network. As training proceeds, it therefore generates increasingly realistic images, that is, images that cannot be distinguished from real images. At the same time, the discriminator keeps adapting to the gradually improving generator and improves its own ability. This mutual competition between the generator (G) and the discriminator (D) is known as adversarial training.
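The generator G represents the mapping $G: X \rightarrow Y$ from the input image $x$ to the output image, and the discriminator D identifies whether a sample is a real image $y$, scored as $D(y)$, or an image produced by the generator, scored as $D(G(x))$. D attempts to maximize the probability of classifying real and generated samples correctly, $\log D(y) + \log(1 - D(G(x)))$, while G attempts to minimize the probability that D identifies its output as false, $\log(1 - D(G(x)))$.
The formula of GAN is:
$$\min_G \max_D V(D, G) = \mathbb{E}_{y \sim p_{data}(y)}\left[\log D(y)\right] + \mathbb{E}_{x \sim p_x(x)}\left[\log\left(1 - D(G(x))\right)\right]$$
The distance between the two distributions is measured by $V(D, G)$. The greater the value of V, the greater the distance and the greater the difference between the two distributions. D tries to make the distance as large as possible, while G tries to make it as small as possible, which forms the confrontation.
Here $\mathbb{E}_{y \sim p_{data}(y)}$ and $\mathbb{E}_{x \sim p_x(x)}$ denote the expectation over the real data and the expectation over the samples produced by the proxy model, respectively.
The loss function is expressed as (written as the quantity each network minimizes, which follows directly from $V(D, G)$):
$$L_D = -\mathbb{E}_{y \sim p_{data}(y)}\left[\log D(y)\right] - \mathbb{E}_{x \sim p_x(x)}\left[\log\left(1 - D(G(x))\right)\right], \qquad L_G = \mathbb{E}_{x \sim p_x(x)}\left[\log\left(1 - D(G(x))\right)\right]$$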
In adversarial training, we continuously improve the ability of generators and discriminators by iterative computation. The network construction idea of our GAN is shown in Fig. 3 below.
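To make the two competing objectives concrete, the following minimal sketch evaluates the discriminator loss −[log D(y) + log(1 − D(G(x)))] and the generator loss log(1 − D(G(x))) from a batch of discriminator outputs. It assumes the standard GAN objective described above; the function names and the small constant `eps` are illustrative assumptions, not part of the paper.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Batch mean of -[log D(y) + log(1 - D(G(x)))].

    d_real: discriminator probabilities for real saturation maps, D(y)
    d_fake: discriminator probabilities for generated maps, D(G(x))
    eps guards against log(0) and is an implementation assumption.
    """
    terms = [-(math.log(r + eps) + math.log(1.0 - f + eps))
             for r, f in zip(d_real, d_fake)]
    return sum(terms) / len(terms)

def generator_loss(d_fake, eps=1e-12):
    """Batch mean of log(1 - D(G(x))), the quantity the generator minimizes."""
    return sum(math.log(1.0 - f + eps) for f in d_fake) / len(d_fake)

if __name__ == "__main__":
    # A confident discriminator has a low loss; the generator loss is then near 0.
    print(discriminator_loss([0.9, 0.95], [0.1, 0.05]), generator_loss([0.1, 0.05]))
    # At D = 0.5 everywhere, the discriminator can no longer tell the samples apart.
    print(discriminator_loss([0.5], [0.5]), generator_loss([0.5]))
```

In an iterative training loop, the discriminator parameters would be updated to reduce the first quantity and the generator parameters to reduce the second, one after the other.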
A convolutional neural network (CNN) is a feedforward neural network composed of an input layer, convolution layers, pooling layers, and fully connected layers. The convolution layer is the core building block of a convolutional neural network and accounts for most of the computation in the network. A network can contain several convolution layers, whose main purpose is to detect features. A pooling layer is generally placed between convolution layers; it compresses the amount of data and the number of parameters, often by an order of magnitude, and thereby helps to avoid overfitting. An activation function follows each convolution layer, which adds nonlinearity to the network and improves its performance and generalization. CNN can effectively reduce the dimension of large images to a small amount of data while retaining the image features. CNN also has translation invariance, so it uses data efficiently when processing pictures and can obtain a generalized data representation from only a few training samples. Therefore, the convolutional neural network is capable of efficient and stable feature extraction.
The input and the convolution kernel of a convolution layer are usually multi-dimensional arrays. The convolution operation can be regarded as the convolution kernel sliding over the input data. As the kernel moves, the operation takes the input feature map, extracts a patch at each position, computes the weighted sum of the pixels in that patch, and applies this same transformation to all patches to generate the output feature map. Unlike regular networks, the neurons in each layer of a convolutional neural network are arranged in three dimensions: width, height, and depth. Here, depth refers to the third dimension of the layer's data volume (the number of channels), not to the number of layers of the network. The dimension of the input well location deployment map in this paper is expressed as (width, height, depth). The neurons in a layer connect only to a small region of the previous layer rather than to all of its neurons, as shown in Fig. 4.
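The sliding-window operation described above can be illustrated with a minimal single-channel example. This is a didactic sketch only: the function name is hypothetical, the kernel is not flipped (the cross-correlation convention that deep learning frameworks call convolution), and no padding or stride is applied.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the map of weighted patch sums.

    image, kernel: lists of lists of numbers (single channel).
    No padding and stride 1, so the output shrinks by (kernel size - 1).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            # Weighted sum over the patch whose top-left corner is (r, c).
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

if __name__ == "__main__":
    # Toy 4 x 4 well map: one injector (1) and one producer (-1).
    well_map = [[1, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 0, -1]]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(conv2d_valid(well_map, ones))  # [[1.0, 0.0], [0.0, -1.0]]
```

Each output value depends only on a 3 × 3 neighborhood of the input, which is the local connectivity referred to in the text.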
The structures of DC-GAN and GAN are similar. A plain GAN is difficult to train directly on mappings between images, whereas CNN is an effective way to process images. DC-GAN therefore adds image processing within the GAN framework by replacing the multi-layer perceptrons of the original generative model G and discriminative model D with two convolutional neural networks, which enables the adversarial network to learn mappings between images effectively. In constructing the generator, the deep convolutional network replaces the traditional nonlinear mapping. The input well location deployment map passes through a series of operations (convolution, bias addition, normalization, activation, and so on) to form the intermediate feature images. A deconvolution network then performs up-sampling to form the output oil saturation map after multi-dimensional mapping, and points in the latent space are sampled from a normal distribution to add randomness and further improve the robustness of GAN training.
The network model in the generator is constructed based on an improved U-Net framework (as shown in Fig. 5). Dropout is used in the discriminator to reduce neuronal connections and avoid overfitting. Generally speaking, sparse gradients hinder the training of GAN, and both the max pooling operation and the ReLU activation function produce sparse gradients. We therefore use strided convolution instead of the max pooling layer for downsampling. The hidden layers of the generator use ReLU, and its last layer uses tanh as the activation function. The hidden layers of the discriminator use LeakyReLU in place of ReLU, and its last layer uses softmax as the activation function. LeakyReLU is similar to ReLU, but it allows small negative activation values, thereby relaxing the sparsity limitation.
We use PyTorch to plot several common activation functions (see Fig. 6 for details). Our innovation is that the input sample changes directly from the original noise to the well location deployment map, which simplifies the steps and improves the calculation rate. At the same time, the input of the generator changes from the original random vector to the feature vector output by the convolution layers mentioned above, and the new oil saturation map is predicted from the characteristics of the well location deployment map.
We introduce deconvolution layers in the model. The size of each deconvolution layer corresponds to the size of the convolution layer it is connected to, and the convolution parameters are the same. The left side of the network extracts the characteristics of the well location deployment map by convolution, and the right side recovers the oil saturation information through the deconvolution layers. The left side takes the dynamic well location deployment map as input, and the right side outputs the water-drive oil saturation map, so the generator performs an image-to-image transformation. The discriminator acts like a classifier that gives the probability that the generator output is real data. The two networks are then trained against each other. Only in this way can the trained generator produce output that corresponds to the given input sample, unlike a traditional adversarial network, which merely reproduces one type of image and so cannot serve the purpose of predicting saturation.
Similarities: (1) The ‘U’ structure of the U-Net framework is adopted, and each blue box represents a multi-channel feature map. During down-sampling, each module has two effective convolutions and one max pooling calculation. The convolution kernel of the convolution layer is 3 × 3, and the kernel of the max pooling layer is 2 × 2. During up-sampling, each module performs one deconvolution and two convolutions, and the kernel of the deconvolution layer is 2 × 2.
Differences: (1) The original U-Net framework does not pad the boundaries when performing convolution operations, whereas in this paper we use zero padding at the boundaries to ensure that the size of the output feature map remains unchanged.
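The effect of padding on the feature map size can be checked with the usual size formulas for convolution and transposed convolution. The sketch below is illustrative: the function names, the 64-pixel example width, and the stride of 2 for the 2 × 2 pooling and deconvolution layers are assumptions (the stride is the common U-Net choice, but it is not stated in the text).

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling layer applied to width n."""
    return (n + 2 * padding - kernel) // stride + 1

def deconv_output_size(n, kernel, stride=1, padding=0):
    """Output width of a transposed convolution (no output padding, no dilation)."""
    return (n - 1) * stride - 2 * padding + kernel

if __name__ == "__main__":
    n = 64  # hypothetical input width in pixels
    print(conv_output_size(n, 3))               # 62: unpadded 3 x 3 convolution shrinks the map
    print(conv_output_size(n, 3, padding=1))    # 64: one pixel of zero padding keeps the size
    print(conv_output_size(n, 2, stride=2))     # 32: 2 x 2 pooling with stride 2 halves the size
    print(deconv_output_size(32, 2, stride=2))  # 64: 2 x 2 deconvolution with stride 2 doubles it
```

With padding, every down-sampling step halves the map exactly and every up-sampling step doubles it, so the corresponding encoder and decoder feature maps have matching sizes.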
(1) ReLU: Rectified Linear Unit. From Fig. 6c, it can be seen that values greater than 0 pass through this function unchanged, while values less than 0 are set to 0: ReLU(x) = max(0, x). ReLU converges faster than the Sigmoid and Tanh functions and involves no exponential operation, so it requires little computation.
(2) LeakyReLU: It is an improved version of ReLU that addresses the dead-neuron (necrosis) problem of ReLU by applying a small slope $\alpha$ to negative inputs, that is, LeakyReLU(x) = max($\alpha$x, x) with 0 < $\alpha$ < 1, so that negative inputs are also involved in gradient descent.
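The activation functions discussed above are simple enough to write out directly. The following sketch uses plain Python rather than PyTorch; the slope `alpha=0.01` is an assumed illustrative value (it matches the default of PyTorch's `LeakyReLU`), since the value used in the paper is not given.

```python
import math

def relu(x):
    """ReLU(x) = max(0, x): negative inputs are set to zero."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """LeakyReLU: negative inputs are scaled by a small slope instead of zeroed.

    alpha = 0.01 is an assumed value for illustration.
    """
    return x if x > 0 else alpha * x

def tanh(x):
    """Hyperbolic tangent, bounded in (-1, 1)."""
    return math.tanh(x)

if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, relu(x), leaky_relu(x), round(tanh(x), 4))
```

For negative inputs ReLU has zero gradient, whereas LeakyReLU keeps a gradient of `alpha`, which is why it avoids the sparse-gradient issue mentioned in the text.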
We use the traditional numerical simulator to generate the input well location deployment samples and the output oil saturation samples. Following the technique of , which compared suitable numbers of image samples, we take samples at five time points (60, 120, 180, 240, and 300 days), with 6000 samples at each time point. The reservoir model is in injection-production balance, and the well locations are stochastically generated. The true oil saturation map (generated by the numerical simulator) is compared with the “false” oil saturation map generated by the DC-GAN model to verify the accuracy of the model, as shown in Fig. 7.
The results show that the DC-GAN proxy model is faster than the traditional numerical simulator. It is worth noting that this holds even for a small number of grid cells; when the reservoir numerical model has a large number of grid cells, the reduction in computational time achieved by the proxy model will be even more pronounced. To some extent, this shows that the DC-GAN proxy model trains quickly, has great development potential, and is an efficient intelligent algorithm.
We take the training of the samples at 60 days as an example to show how the generator and discriminator losses change during DC-GAN training, as shown in Table 3. The generator aims to make the generated samples as close as possible to the real samples, that is, from the generator's point of view, the closer D(G(x)) is to 1, the better. The role of the discriminator is to make D(y) as close to 1 as possible and D(G(x)) as close to 0 as possible. However, for both real and generated samples, the ideal state is reached when the discriminator outputs a probability of 0.5, because it is then impossible to distinguish whether a sample is real or generated.
As shown in Table 3, we record the results at six iteration counts and find that as the number of iterations increases, the losses of the generator G and the discriminator D gradually decrease. D(y), which judges whether the data come from the real model or from the DC-GAN proxy model, gets closer and closer to 0.5, which is the ideal value. D(G(x)) also gets closer to 0.5, indicating that the stability of the model is gradually enhanced.
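The 0.5 target can be motivated by the standard GAN analysis, sketched here under the assumption that the objective is the usual $V(D,G) = \mathbb{E}[\log D(y)] + \mathbb{E}[\log(1 - D(G(x)))]$. The symbols $p_{data}$ and $p_g$ (the densities of real and generated saturation maps) are notation introduced for this sketch.

```latex
V(D,G) = \int \Big[ p_{data}(y)\,\log D(y) + p_g(y)\,\log\big(1 - D(y)\big) \Big]\, dy
\;\;\Longrightarrow\;\;
D^{*}(y) = \frac{p_{data}(y)}{p_{data}(y) + p_g(y)}

p_g = p_{data}
\;\;\Longrightarrow\;\;
D^{*}(y) = \tfrac{1}{2}, \qquad V(D^{*}, G) = -\log 4
```

For a fixed generator, the integrand $a \log D + b \log(1 - D)$ is maximized at $D = a/(a+b)$, which gives $D^{*}$. When the generated maps follow the same distribution as the simulated ones, the best possible discriminator can do no better than output 0.5 for every sample.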
The relationship between the training loss values and time obtained from DC-GAN training is shown in Fig. 8, and the relationship between the training loss values and the number of iterations at different production times is analyzed. To better observe how the iteration reaches equilibrium, we set the maximum number of iterations to 3000 for each production time. Each training iteration includes an independent training step for the generator and for the discriminator.
For most models, the Nash equilibrium state is reached, and the model accuracy stabilizes, before 300 training iterations. In the model for 300 days of production, the oil saturation is more complex and variable because of its distribution pattern and because water has broken through in most of the production wells. The characteristic parameters are therefore more complex than at earlier times, and more training time is required to reach equilibrium.
Two vectors can be pictured as two lines in space that start from the origin ([0, 0, …]) and point in different directions, with an angle between them. If the angle is 0 degrees, the vectors point in the same direction and the lines overlap. If the angle is 90 degrees, the lines form a right angle and the directions are completely different. If the angle is 180 degrees, the vectors point in opposite directions. Therefore, the similarity of two vectors can be judged by the angle between them: the smaller the angle, the more similar the vectors.
Assuming A and B are two n-dimensional vectors, A = [A1, A2, …, An] and B = [B1, B2, …, Bn], the cosine of the angle θ between A and B is
$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$
The closer the cosine value is to 1, the closer the angle is to 0 degrees and the more similar the two vectors are; this measure is called cosine similarity. By comparing six groups of true and “false” (generated) oil saturation distribution maps, the corresponding cosine similarity values are obtained, as shown in Table 4.
The numbers in the table represent the cosine similarity between the two images, and it can be seen that the similarity between the images is generally above 80% and even up to 97%, which shows the high accuracy of the training model.
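The comparison described above can be reproduced for any pair of maps with a few lines of code. The sketch below is illustrative rather than the authors' evaluation script: it assumes each saturation map is stored as a list of rows and flattened into a vector before the cosine is taken, and the function names and the toy 2 × 2 maps are hypothetical.

```python
import math

def flatten(image):
    """Turn a 2-D map (list of rows) into a single vector."""
    return [value for row in image for value in row]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

if __name__ == "__main__":
    # Toy "simulated" and "generated" saturation maps, values chosen arbitrarily.
    simulated = [[0.80, 0.75], [0.60, 0.30]]
    generated = [[0.78, 0.77], [0.58, 0.35]]
    print(cosine_similarity(flatten(simulated), flatten(generated)))
```

Because oil saturation values are non-negative, the similarity between two saturation maps lies between 0 and 1, and values close to 1 indicate nearly identical spatial patterns.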
Under the same geological parameters and for the same well location deployment map, taking 300 days as an example, the oil saturation map produced by the trained DC-GAN model is plotted and compared with the oil saturation map obtained by the reservoir numerical simulator, as shown in Fig. 9. Judging by the similarity of the images, the training results are consistent with the traditional numerical simulation results. The oil saturation map produced by the trained DC-GAN model accurately captures the variation characteristics of the oil saturation distribution and effectively predicts the dynamic mapping relationship between well location deployment and oil saturation in water flooding reservoirs, which reflects the reliability of the DC-GAN model.
DC-GAN is a deep learning model with great advantages in extracting image features and in high-dimensional mapping. Based on the DC-GAN framework, convolution and deconvolution layers are used to extract the features of well location deployment images and to output oil saturation images. The discriminator distinguishes the real oil saturation map from the oil saturation map generated by the DC-GAN model, and the model is continuously trained on the discrimination results, so that an efficient proxy model for the dynamic mapping from well location deployment to oil saturation distribution is established. The numerical example shows that the model can predict the distribution of oil saturation under water flooding with high efficiency and high accuracy. Comparing the model results at different time points with the oil saturation images from the numerical simulator by cosine similarity shows good image similarity, which also verifies the good performance of the DC-GAN-based proxy model. In addition, the work done in this paper may provide a reference for applying DC-GAN to more areas of reservoir engineering.
Funding Statement: The authors acknowledge the support of the National Natural Science Foundation of China (No. 52104017), the Open Foundation of Cooperative Innovation Center of Unconventional Oil and Gas (Ministry of Education & Hubei Province) (No. UOG2022-14), and the open fund of the State Center for Research and Development of Oil Shale Exploitation (33550000-21-ZC0611-0008).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.