Research on Denoising of Cryo-EM Images Based on Deep Learning

Abstract: Cryo-EM (cryogenic electron microscopy) is a technology that can determine the three-dimensional structure of biological macromolecules. Under current conditions, the projection images of biological macromolecules collected by cryo-EM have low contrast, a low signal-to-noise ratio, and blurring, so single particles are not easy to distinguish from the background, and the corresponding processing technology lags behind. Effective denoising of cryo-EM images that preserves the contours and functional-structure signals of the macromolecules is therefore important for improving cryo-EM image quality and the resolution of the reconstructed three-dimensional structure. This paper studies a denoising method based on GANs (generative adversarial networks) and proposes an improved discriminant model based on the Wasserstein distance and an improved image denoising model that adds a gray constraint. Our model turns the discriminant model's training from a binary classification task into a regression task, which makes GAN training more stable and parameter passing more reasonable. We also propose an improved generative model that adds a gray constraint. The experimental results show that our model increases the peak signal-to-noise ratio of simulated cryo-EM images by 10.3 dB and improves the SSIM (Structural Similarity Index) of the denoised images by 0.43. Compared with traditional image denoising algorithms such as BM3D (Block Matching 3D), our model better preserves the structure and texture signal of the original image and runs faster.


Introduction
With the development of cryo-EM, the accuracy required of three-dimensional models of cryo-EM particles is also increasing. However, cryo-EM images have a low signal-to-noise ratio because of factors such as electron beam intensity, the medium, humidity, temperature, exposure time, and particle motion. This severely hampers the reconstruction of high-resolution 3D models. To address the low precision of reconstructed 3D models, we must denoise the original cryo-EM images, reducing their noise level while preserving as much as possible the original contours, textures, and other details of the particles, and thereby improve the visual quality of the images.
Currently, the most common denoising method for cryo-EM images is classification-based two-dimensional projection image averaging [1]. This method first selects all single particles, manually or automatically, from the cryo-EM images. Then, assuming that these single particles have only a limited number of orientations, it classifies the single-particle images and averages the images within each class to obtain particles with a clear orientation. This method is relatively simple and has certain limitations: in actual acquisition, the noise in the original cryo-EM images affects the extraction of single particles. Another method denoises the image using the Radon transform [2]. Unlike the two-dimensional classification averaging method above, it does not need to assume the orientation of a single particle. Instead, it applies a two-dimensional Radon transform directly to the single-particle images and, based on special properties of projection images, replaces the continuous Radon transform with the two-dimensional discrete Radon transform. This method depends heavily on determining the equivalent line of a single-particle projection image, and in actual extraction the presence of noise seriously affects the extraction of single particles. A third method is the LANL filtering algorithm, which denoises based on the rotational symmetry of bio-macromolecules. This algorithm is suited to samples with an icosahedral symmetrical structure, but not all cryo-EM samples are symmetrical in structure, so the algorithm is not general.
We use a GAN [3] model to denoise cryo-EM images and improve their signal-to-noise ratio. We use the high-resolution electron microscope data provided by Warwick University as target examples and simulate cryo-EM images by adding noise to the dataset. Our model turns the discriminant model's training from a binary classification task into a regression task, which makes GAN training more stable and parameter passing more reasonable. We also propose an improved generative model that adds a gray constraint. To verify the effectiveness of the method under different noise intensities, we tested it on images with noise intensities σ in the range [10, 80].

Related Works
Image denoising has been an active research topic in computer vision for the last couple of decades, and there is a rich literature on traditional signal-processing-based methods. BM3D [4] is one of the best image denoising methods in that domain. In this paper, we present an improved GAN model for denoising cryo-EM images.
Recently, GANs have made great breakthroughs in the field of image denoising and perform well on images with low SNR. A GAN consists of two parts: a generator and a discriminator. The generator produces the denoised image, and the discriminator judges the quality of the denoising. The training process is shown in Fig. 1.
Finally, we feed the noisy image to the trained generator model, which outputs the denoised image.
The appropriate GAN model differs from problem to problem. With limited data and computing resources, attention must be paid to the soundness of the overall model structure and to the convergence of its weights.
The traditional GAN has some problems: (1) Mode collapse. The GAN model can degenerate during training, so that the generated images do not meet expectations. (2) Difficulty converging. GAN training seeks a Nash equilibrium of a high-dimensional non-convex function, whereas gradient descent can only be relied on to find such a point for convex functions. (3) Vanishing gradients. The traditional GAN loss function uses the JS divergence and KL divergence to measure the distance between the two distributions; when the discriminator trains too quickly, it is difficult for the generator to obtain sufficient gradients, and the gradient vanishes.
To address these problems, researchers have proposed different improvements. WGAN [5] can effectively address mode collapse and convergence problems, and residual learning [6] can effectively address vanishing gradients. We refer to the pix2pix [7] model for design and improvement. We optimize the model structure and add residual learning to the generator and discriminator models to aid gradient propagation and counter vanishing gradients. We also improve the generative model by adding a gray constraint that reduces the grayscale difference between the result image and the target image.
To judge a denoising algorithm, it is necessary to observe whether it removes as much noise as possible, preserves the original signal, and has low time and space complexity. We usually use PSNR (Peak Signal to Noise Ratio, see Eq. (2)), MSE (Mean Square Error, see Eq. (3)), SSIM (Structural Similarity Index, see Eq. (4)) [8], and the time complexity of the algorithm to evaluate its quality.
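$$\mathrm{PSNR}(x, y) = 10 \times \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}(x, y)}\right) \tag{2}$$
$$\mathrm{MSE}(x, y) = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left(x(i, j) - y(i, j)\right)^2 \tag{3}$$
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{4}$$
Here $x$ and $y$ are the two $m \times n$ images being compared, $\mathrm{MAX}$ is the maximum possible pixel value, $\mu_x$ and $\mu_y$ are the image means, $\sigma_x^2$ and $\sigma_y^2$ are the image variances, $\sigma_{xy}$ is the covariance, and $c_1$ and $c_2$ are small stabilizing constants.
The PSNR and MSE indicators in image processing are based on statistical features of image gray values, while SSIM is based on the correlation between adjacent pixels. The closer the original image and the denoised image are in structure, the closer the value of SSIM is to one.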
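As a rough illustration of Eqs. (2) to (4), the three indicators can be computed with NumPy as sketched below. This is not the evaluation code used in the paper. The function names are hypothetical, the SSIM here is computed from whole-image statistics rather than a sliding window, and the constants k1 = 0.01 and k2 = 0.03 and the peak value of 255 are commonly used defaults assumed for the example.

```python
import numpy as np

def mse(x, y):
    """Mean square error between two images of the same shape (Eq. (3))."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def psnr(x, y, max_value=255.0):
    """Peak signal-to-noise ratio in dB (Eq. (2)); infinite for identical images."""
    err = mse(x, y)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

def ssim_global(x, y, max_value=255.0, k1=0.01, k2=0.03):
    """SSIM from whole-image statistics (Eq. (4)); k1 and k2 are assumed defaults."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)
```

Windowed SSIM implementations average this quantity over local patches, so their values will generally differ from this global variant.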

Methods
In this section, we present our proposed GAN model and dataset. Training a GAN model for a specific task generally involves three steps: (1) network architecture design; (2) obtaining the dataset; (3) learning the model from the dataset. For network architecture design, we modify the pix2pix network to make it suitable for image denoising and adjust the depth and the loss function of the model based on WGAN. For the dataset, we use the high-resolution electron microscope data provided by Warwick University as target examples and simulate cryo-EM images by adding noise to the dataset. For model learning, we use residual learning and adjust the optimization function in the generator and discriminator of our GAN network.

Model
We set the size of the convolutional filters [9] to 4 × 4 and the stride to 2. We adjust the size of the convolutional filters to avoid checkerboard artifacts in the result. The generator's input and output are (256, 256) grayscale images. Our discriminator is also a convolutional neural network with a structure similar to the generator. Its inputs are (256, 256) grayscale images and its output is a result vector.
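To make the effect of the 4 × 4 filters with stride 2 concrete, the following sketch computes how the spatial size of a 256 × 256 input shrinks through a stack of such convolutions. The padding of 1 and the depth of 8 layers are assumptions borrowed from the usual pix2pix encoder and are not stated in the text; the function names are hypothetical.

```python
def conv_output_size(size, kernel=4, stride=2, padding=1):
    """Spatial size after one convolution, using the usual floor formula."""
    return (size + 2 * padding - kernel) // stride + 1

def encoder_sizes(size=256, layers=8):
    """Feature map sizes through a stack of identical stride-2 convolutions."""
    sizes = [size]
    for _ in range(layers):
        sizes.append(conv_output_size(sizes[-1]))
    return sizes

print(encoder_sizes())  # sizes from the 256 x 256 input down to the bottleneck
```

Under these assumptions each layer halves the feature map exactly, and the kernel size is a multiple of the stride, which is a commonly cited way to reduce checkerboard artifacts in the corresponding upsampling layers.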
The generator and discriminator of our GAN network use the Adam [10] optimization algorithm and the SGD optimization algorithm, respectively. Adam is an extension of stochastic gradient descent: it computes individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradients, so that the parameter updates are more stable.
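The difference between the two optimizers can be sketched as single update steps in NumPy. This is an illustration of the standard Adam and SGD update rules, not the training code of the paper. The hyperparameter values are the defaults from the Adam paper and are assumptions here, since the text does not list the values used.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

def sgd_step(param, grad, lr=1e-3):
    """One plain stochastic gradient descent update."""
    return param - lr * grad
```

On the first step the Adam update has magnitude close to lr for every parameter regardless of the gradient scale, which is one way to see the per-parameter adaptive learning rate at work.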
Cryo-EM images differ from ordinary camera images. They are saved in MRC files in 32-bit floating-point format, so preprocessing is needed before denoising. We use the Wasserstein distance to improve our model so that it better preserves the texture and edge information of the data. Following the principle of WGAN, we clip the parameters of the network to the range [-c, c], with c set to 0.001, and we remove the sigmoid activation function. Because of the strong correlation between pixels in cryo-EM images, we improve the generative model by adding a gray constraint. Our discriminant model loss function is given by Eq. (5), and the generator model loss by Eq. (6).
The coefficient of the constraint term is set to 0.1. The losses are expressed in terms of the expected result and the noise image. The noise image is defined in Eq. (7), where v is noise.
$$\text{noise image} = \text{expected result} + v \tag{7}$$
The constraint term represents the pixel difference between the generated image and the target image, which we define as shown in Eq. (8).
In summary, our model has two characteristics: (1) it improves model stability by using the Wasserstein distance; (2) it reduces the difference between generated and target images by adding a gray constraint. This allows our model to preserve as much of the texture and edge information of the cryo-EM image as possible while eliminating the noise.
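The loss design described in this section can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the critic and generator terms follow the standard WGAN formulation, and the gray constraint is assumed to be a mean absolute pixel difference because the exact form of Eq. (8) is not reproduced here. All function and constant names are hypothetical; only the values 0.1 and 0.001 come from the text.

```python
import numpy as np

LAMBDA_GRAY = 0.1  # coefficient of the gray constraint term (value from the text)
CLIP_C = 0.001     # weight clipping bound c (value from the text)

def critic_loss(scores_real, scores_fake):
    """Standard WGAN critic loss: lower when real scores exceed generated scores."""
    return float(np.mean(scores_fake) - np.mean(scores_real))

def gray_constraint(generated, target):
    """Assumed form of the pixel difference: mean absolute gray-level difference."""
    generated = np.asarray(generated, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(generated - target)))

def generator_loss(scores_fake, generated, target, lam=LAMBDA_GRAY):
    """Adversarial term plus the weighted gray constraint."""
    return float(-np.mean(scores_fake)) + lam * gray_constraint(generated, target)

def clip_weights(weights, c=CLIP_C):
    """Clip every weight array to [-c, c], as done after each critic update in WGAN."""
    return [np.clip(w, -c, c) for w in weights]
```

Because the critic outputs an unbounded score instead of a probability, no sigmoid is applied before these losses, which matches the regression-style training described above.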

Noise Analysis
In image processing, most noise is caused by interference in electronic devices and similar sources, so the noise added to images in simulation is usually modeled as Gaussian noise or Poisson noise. At present, most denoising algorithms target Gaussian white noise, whose probability density function is a normal distribution and whose power spectral density is constant. The Gaussian probability density function is shown in Eq. (9).
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \tag{9}$$
Here $\mu$ is the mean and $\sigma$ is the standard deviation of the noise.
The MRC files we used store data as 32-bit floating-point values. For the calculation, we scaled the data to [0, 255]. Probability density plots of the pixel values for images of different cryo-EM particles are shown in Fig. 2.
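The scaling and density estimation described here might look like the sketch below. Linear min-max scaling and the 256-bin histogram are assumptions, since the text does not say how the data were mapped to [0, 255] or how the densities were estimated. The function names are hypothetical.

```python
import numpy as np

def scale_to_0_255(data):
    """Linearly map floating-point image data to the range [0, 255]."""
    data = np.asarray(data, dtype=np.float64)  # promote 32-bit data for precision
    lo, hi = data.min(), data.max()
    if hi == lo:
        return np.zeros_like(data)             # constant image: nothing to scale
    return (data - lo) / (hi - lo) * 255.0

def pixel_density(scaled, bins=256):
    """Empirical probability density of pixel values over [0, 255]."""
    density, edges = np.histogram(scaled, bins=bins, range=(0.0, 255.0), density=True)
    return density, edges
```

The returned density integrates to one over the bins, so curves from different particle images can be compared on the same axes.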

Results
In this section, we present our experimental findings. We made the necessary changes on top of pix2pix and trained the model using an open machine learning dataset from Warwick University.

MRC Image Reading and Image Simulation
We train the generator and the discriminator. Our goal is for the generator to improve the quality and SNR of the image, and for the discriminator to accurately distinguish between the generated image and the real image. The MRC [11] images we used are projection images of CPV (cytoplasmic polyhedrosis virus) taken with FEI equipment. We use the mrcfile package to read the MRC files; it extracts the image data from the MRC file format into matrix data for research and calculation. Part of an MRC file image is shown in Fig. 3.

Figure 3: Cryo-EM images
The training dataset we used is an open machine learning dataset of electron microscope images from Warwick University. We simulated the electronic noise in cryo-EM images by adding Gaussian noise. A comparison between a cryo-EM image and a simulated image is shown in Fig. 4. We train the model to denoise the noisy electron microscope images and continue training until the desired result is obtained. The trained generator is then used to denoise the cryo-EM images.
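The simulation step can be sketched as adding zero-mean Gaussian noise of a chosen intensity σ to clean images scaled to [0, 255]. This is an illustrative sketch only. Clipping the result back to [0, 255] and the particular list of σ values are assumptions (the text gives only the range [10, 80] and the example σ = 40), and the function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng=None):
    """Return the image plus zero-mean Gaussian noise, clipped to [0, 255]."""
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float64)
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 255.0)  # clipping is an assumption of this sketch

def make_training_pairs(clean_images, sigmas=(10, 20, 40, 80), seed=0):
    """Build (noisy, clean, sigma) triples for every image and noise intensity."""
    rng = np.random.default_rng(seed)
    return [(add_gaussian_noise(img, s, rng), img, s)
            for img in clean_images for s in sigmas]
```

Each triple pairs a noisy generator input with its clean target, which is the paired format that a pix2pix-style model trains on.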

Results and Analysis
In image denoising, BM3D is a commonly used algorithm. It denoises images using image blocks and is one of the most effective denoising methods. In contrast, our algorithm uses a GAN trained on a large number of original images and their noisy counterparts, and denoises cryo-EM images using the learned knowledge. In the denoising experiments, our method is significantly faster, and the quality of the images is also improved. We also give our denoising results at different noise intensities, as well as experimental comparisons with mainstream denoising methods. The comparison of the noise addition results (σ = 40) is shown in Fig. 5. A comparison of our method with BM3D, K-SVD [12], and Gaussian blur is shown in Fig. 6. In the table, we show the denoising effect of the different methods on the same noisy image and mark the best results. Our method achieves the best results on all indicators.
In training the generator and discriminator models, we tried different learning rates and different optimization algorithms. To display the data clearly, we give the training loss curves of the generator and discriminator. Each experiment with different parameters was recorded, with one record every 500 steps. The loss statistics are shown in Fig. 7, and the detailed parameters of the experiments are shown in Tab. 2. To observe the differences between the data more intuitively, we give statistical histograms of the denoising results of each experiment in Fig. 8. We found that the model performs better on noisy images when the training speed of the generator is slightly higher than that of the discriminator. We show the effect of our method at different noise intensities (see Fig. 9) with their histograms (see Fig. 10), and we chose the best experimental model parameters for denoising the MRC file of the cryo-EM image (see Fig. 11).

Conclusion
This paper introduces a deep-learning-based method for denoising cryo-EM images. We improved a GAN model, trained and tuned it on a large amount of simulated cryo-EM data, and selected the best GAN model for denoising the original cryo-EM images. The experimental results show that the GAN model presented in this paper denoises the simulation dataset well. It also has a certain denoising effect on cryo-EM images, reducing the noise of the image while preserving as much of the original signal as possible.

Conflicts of Interest:
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.