Open Access
ARTICLE
Multi-View Deep Fuzzy Clustering for Data Representation Learning
1 School of Software, Dalian University of Technology, Dalian, China
2 School of Computer Science and Technology, Dalian University of Technology, Dalian, China
* Corresponding Author: Zhikui Chen. Email:
(This article belongs to the Special Issue: Multimodal Learning for Big Data)
Computers, Materials & Continua 2026, 88(1), 32. https://doi.org/10.32604/cmc.2026.076717
Received 25 November 2025; Accepted 03 March 2026; Issue published 08 May 2026
Abstract
With the increasing development of ocean information technology, multi-view fuzzy clustering is attracting increasing attention in pattern mining for massive multi-view ocean data of heterogeneous distributions, owing to its superior performance. However, previous multi-view fuzzy clustering methods cannot fully consider informative topologies hidden in data distributions, which are crucial to recognize partitions of data. Moreover, they fail to capture invariant structures of multi-view ocean data in learning clustering-specific fusion representations. In addition, they do not take into consideration consistencies contained in the manifolds of data generation in mining soft patterns. To address those challenges, the deep multi-view generative fuzzy contrastive clustering (DMGFCC) is proposed within a Siamese architecture, which captures soft patterns of data via clustering-specific fusion representations of invariant structures in informative topologies. To be specific, a multi-view Siamese generative adversarial architecture is designed to capture the joint distribution of data as well as invariant structures, which is composed of the view-specific generator network providing pairwise implicit constraints, the view-specific discriminator network distilling knowledge of real data, and the view-specific cluster network capturing fuzzy patterns of fusion information. Furthermore, a generative adversarial dual contrastive clustering loss is devised, which consists of a generative adversarial loss fitting data distributions and a dual contrastive clustering loss learning soft patterns with consistencies of data manifolds. Finally, extensive experiments are conducted on four benchmark datasets, and the results demonstrate competitive performance compared with 11 representative methods.
With the increasing development of ocean information technology, information from the ocean has demonstrated significant potential across a wide range of fields, such as ocean exploration and ocean current prediction [1–4]. Large amounts of data are inevitably produced and collected from ocean sensors, and these are often classified as multi-view data. For example, underwater equipment may collect text, image, sound, and video information to derive meaningful insights into marine environments. Such multi-view data describe a richer picture of the real world through consistent and complementary knowledge in heterogeneous views. However, their heterogeneity also poses great challenges to the effective mining of hidden patterns, which becomes a crucial task in future ocean computational intelligence.
Multi-view fuzzy clustering, as a fundamental technique of unsupervised learning, captures robust patterns of multi-view data from the consistent and complementary knowledge of heterogeneous views [5,6]. It extracts fuzzy knowledge from heterogeneous data by softening crisp partition boundaries into possibilistic memberships that measure similarities of data. Early multi-view fuzzy clustering algorithms were based on fuzzy c-means clustering and possibilistic c-means, and can be roughly divided into collaborative multi-view fuzzy clustering (Co-MvFCM) and multi-kernel multi-view fuzzy clustering (Mk-MvFCM). Co-MvFCM mines consensus patterns via mutual links of intra-view soft partitions rooted in fuzzy prototype clustering schemes. For instance, Jiang et al. [7] utilized soft partitions to capture view-specific data structures and mine view-common fuzzy patterns via inter-view mutual links with entropy maximizing. Mk-MvFCM utilizes the collaborative learning of intra-view local structures of the kernel space to explore common patterns of views. For example, Zeng et al. [8] integrated the view-specific local partition with the view-common global partition to mine consensus fuzzy patterns on the basis of a common latent space of multiple kernels. These early multi-view fuzzy clustering methods usually operate in shallow feature spaces, which cannot well capture intrinsic patterns hidden in inter-view nonlinear correlations and intra-view deep semantics.
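As a concrete anchor for the fuzzy c-means scheme these methods build on, the following is a minimal single-view numpy sketch of the alternating membership and prototype updates of standard FCM; it illustrates the soft-partition idea only and is not the implementation of any specific cited method:

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Membership update of fuzzy c-means: u_ik is inversely proportional to
    d_ik^(2/(m-1)), normalized so each sample's memberships sum to one."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centers(X, U, m=2.0):
    """Prototype update: mean of the data weighted by fuzzified memberships."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),   # cluster around (0, 0)
               rng.normal(3.0, 0.1, (20, 2))])  # cluster around (3, 3)
centers = X[[0, 20]].copy()                     # one seed point per region
for _ in range(20):
    U = fcm_memberships(X, centers)
    centers = fcm_centers(X, U)
```

Co-MvFCM variants essentially run such updates per view and couple the per-view membership matrices through consensus constraints.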
Recently, deep multi-view clustering methods have attracted much attention, which leverage hierarchical nonlinear transformations to merge consistent and complementary knowledge of inter-view correlations and intra-view semantics for pattern mining of multi-view data. For example, deep canonical correlation analysis utilizes the maximization of correlation between views to mine a view-common feature subspace with complementary information for clustering pattern recognition [9]. Deep multi-view subspace clustering learns a data-driven self-expression coefficient matrix to integrate complementary information between views, which uses the linear dependence constraint to uncover the subspace of each class [10]. Deep multi-view matrix factorization discovers a set of subspace bases with the non-negative constraint to capture a consensus feature space, and then leverages k-means to learn clustering patterns [11]. Deep multi-view spectral clustering learns a common eigenvector matrix of views by reducing the discrepancy between the common eigenvector matrix and private eigenvector matrices [12].
Although those cutting-edge deep multi-view clustering methods have achieved promising performance, most of them mine data patterns on the basis of local structure information rooted in point-to-point mappings of data reconstructions, instead of informative topologies hidden in data distributions that are crucial to recognize partitions of data. Furthermore, the deep multi-view clustering methods usually utilize a single-flow computing architecture which cannot ensure invariant structures of data in clustering-specific fusion representation learning of consistent and complementary knowledge in views. In addition, they do not take into consideration consistencies contained in the manifolds of data generation in an unsupervised manner. Thus, the deep multi-view clustering methods cannot well fit the joint distribution in data of complex distributions.
To address those challenges of pattern mining in ocean multi-view data with complex heterogeneous distribution, the deep multi-view generative fuzzy contrastive clustering (DMGFCC) method is proposed, which captures fuzzy patterns from complementary knowledge hidden in multi-view data as well as clustering-specific fusion representations in an end-to-end manner. Specifically, a multi-view Siamese generative adversarial architecture is designed with two symmetrical sister networks, which can explicitly capture data-invariant structures of multi-view data in clustering-specific fusion representation learning.
In the Siamese architecture, each sister network, which learns distributions of data generation and data partition in an end-to-end paradigm, is composed of the view-specific generator network, the view-specific discriminator network, and the view-specific cluster network. The view-specific generator network fits a probabilistic generation function between noisy inputs of category-static information and observable samples, which produces multi-view data with pair-wise constraints. The view-specific discriminator network models a conditional discriminant function to provide supervision information for joint distribution fitting of multi-view data via distinguishing real data from generated data. The view-specific cluster network utilizes pair-wise indicator features to measure the implicit constraints of multi-view data. Furthermore, a generative adversarial dual contrastive clustering loss composed of a generative adversarial loss and a dual contrastive clustering loss is introduced. The generative adversarial loss assists the view-specific generator network and the view-specific discriminator network to produce data with category knowledge that are subject to the real distribution. The dual contrastive clustering loss helps the view-specific generator network and the view-specific cluster network to capture fuzzy patterns of multi-view data, which can encourage consistencies of data manifolds in the generative clustering with implicit pair-wise knowledge.
The main contributions of this paper are listed as follows:
• The deep multi-view generative fuzzy contrastive clustering (DMGFCC) is designed within a Siamese architecture composed of the view-specific generator network, the view-specific discriminator network, and the view-specific cluster network, which can capture fuzzy informative patterns as well as clustering-specific fusion representations.
• The generative adversarial dual contrastive clustering loss is devised to train model parameters, which consists of a generative adversarial loss fitting the joint distributions between data and categories and a dual contrastive clustering loss capturing fuzzy patterns of multi-view data by fusion representations.
• Extensive experiments are conducted on four benchmark datasets to validate model performance. The results illustrate that DMGFCC achieves competitive performance compared with the representative methods, especially on complex large datasets.
The rest of this paper is organized as follows: Section 2 gives a detailed review of related work. Section 3 describes the proposed multi-view Siamese generative adversarial architecture. Section 4 introduces the generative adversarial dual contrastive clustering loss, and Section 5 presents the experimental results. Finally, Section 6 concludes this work.
2 Related Work
2.1 Shallow Multi-View Fuzzy Clustering
Existing shallow multi-view fuzzy clustering methods can be roughly classified into two categories, i.e., collaborative multi-view fuzzy clustering (Co-MvFCM) and multi-kernel multi-view fuzzy clustering (Mk-MvFCM).
Co-MvFCM mines fuzzy patterns of multi-view data via knowledge transfer of view-specific complementary information. For example, Cleuziou et al. [5] conduct fuzzy c-means clustering on views and reduce the inter-view disagreements between view-specific membership degrees to infer a stable consensus partition of data, i.e., the geometric mean of view-specific membership degrees. Jiang et al. [7] design view weights measuring the importance of views in the convergence process of fuzzy clustering to focus on the contribution of key views in generating the consensus data partition. Wang and Chen [13] leverage a min-max mechanism with variable view weights indicating the current view of highest cost to perform fuzzy clustering on each view in turn, constructing consensus patterns with low costs. Yang and Sinaga [14] construct a view-level and feature-level weight scheme by discovering the core features in key views during fuzzy clustering, to mine consensus fuzzy patterns in a focused manner. Zhang et al. [15] design a cross-view anchor graph indicating the similarity relationships of multi-view data for latent information learning, obtaining the clustering result directly in one-step clustering. Yin et al. [16] utilize a local structure-preserving mechanism to balance the global and local information during discriminative clustering, where the intra-cluster compactness and inter-cluster separability are considered simultaneously.
Mk-MvFCM captures soft patterns of multi-view data via utilizing the nonlinear kernel to compress mismatches between data manifolds and distance metrics in similarity measurements. For instance, Tzortzis and Likas [17] design the linear combination of multiple kernels to extract representations of multi-view data on which the fuzzy clustering patterns are mined. Guo et al. [18] propose the cooperation fuzzy clustering scheme which alternatively performs the multi-kernel fusion and pattern recognition to model consensus clustering partition. Ye et al. [19] utilize a co-regularized kernel fuzzy clustering algorithm to mine the consensus partition of multi-view data via maximizing adaptive similarities between view-common and view-specific clustering indicators. Zeng et al. [8] leverage the multikernel framework to fit the mapping from data space to a common kernel space via endowing each view with multiple kernels, where fuzzy clustering of views is conducted to produce the global partition.
2.2 Deep Multi-View Fuzzy Clustering
Deep multi-view fuzzy clustering introduces various deep neural networks into the pattern mining of fuzzy clustering, which enables the fitting of latent complex distributions. Trosten et al. [20] deploy deep networks with clustering and contrastive heads to extract the view-weighted deep representations of multi-view data for predicting fuzzy clustering partitions, where contrastive learning is utilized to improve the separation between representations. Gao et al. [21] leverage the DCCA (deep canonical correlation analysis) autoencoder architecture to enhance the view-common self-expression matrix via the canonical correlation maximization between deep representations of views, where the self-expression matrix is utilized in spectral clustering to produce fuzzy clustering partitions. Mao et al. [22] design mutual information-based fuzzy clustering networks via implementing the inter-view mutual information maximization and intra-view mutual information minimization to capture view-common class information and reduce view-specific details, respectively, which can devise an unadulterated consensus partition of multi-view data. Cheng et al. [23] construct the multi-view graph convolution networks to induce consensus clustering assignment via graph relationships into embedding representations for fuzzy clustering. Yin et al. [24] propose a variational autoencoder combined with a Gaussian mixture prior distribution via modeling the generation process of multi-view data to predict the conditional posterior distributions of clusters. Shi et al. [25] utilize an entropy regularized self-weighted autoencoder with consensus membership to tune the membership uniformity and reduce cluster assignment discrepancy among views of data, for consistent fuzzy clustering partitions.
3 Deep Multi-View Generative Fuzzy Contrastive Clustering
The deep multi-view generative fuzzy contrastive clustering (DMGFCC) is devised within a Siamese architecture that utilizes invariant structures and informative topologies of data distributions to mine soft patterns. Furthermore, it learns clustering-specific fusion representations of multi-view data via contrastive learning of consistent and complementary knowledge, which can guarantee the intra-cluster compactness and the inter-cluster separation of data manifolds in pattern mining. As shown in Fig. 1, DMGFCC is constructed as dual computing flows of symmetrical sister networks that are composed of the view-specific generator network, the view-specific discriminator network, and the view-specific cluster network. The view-specific generator network fits the joint distribution between category variables and sample variables to capture informative topologies for data generation with implicit constraints. The view-specific discriminator network assists the view-specific generator network to fit real distributions of data via the adversarial game theory between real data and fake data. The view-specific cluster network utilizes the inter-view and intra-view contrastive learning of category-invariant structures to explore fuzzy patterns hidden in data manifolds as well as clustering-specific representations.

Figure 1: The architecture of DMGFCC. Top: Overall architecture. Bottom-Left: The view-specific generator network. Bottom-Center: The view-specific discriminator network. Bottom-Right: The view-specific cluster network.
3.1 The View-Specific Generator Network
The view-specific generator network learns a generative mapping function of data joint distributions, which is responsible for information transfer between the class space and the data space in each view. It captures knowledge of informative topologies via synthesizing pair-wise multi-view data with the help of intra-cluster invariant structures and view-specific perturbations. To be specific, the view-specific generator network of the
where
In the Siamese network, the view-specific generator network uses the generative mapping function of the joint distribution of data generation to generate pairwise multi-view data belonging to one out of the predefined K categories, which can deceive the view-specific discriminator networks. For instance, if fed with two sets of hidden vectors with the same category-static vectors, the view-specific generator network synthesizes data belonging to the same clusters. Otherwise, it produces data of different clusters. Consequently, the generator network can promote the intra-cluster consistencies and inter-cluster inconsistencies of data manifolds in the view-specific cluster networks.
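The pairing mechanism can be illustrated with a small numpy sketch. The latent layout used here (a one-hot category-static code concatenated with view-specific Gaussian noise) is an assumption for illustration only, not necessarily the paper's exact input format:

```python
import numpy as np

def pairwise_latents(n_pairs, n_clusters=10, noise_dim=64, seed=0):
    """Build pairwise generator inputs: each pair shares one category-static
    one-hot vector but carries independent view-specific noise perturbations,
    so the two generated samples of a pair belong to the same cluster."""
    rng = np.random.default_rng(seed)
    cats = rng.integers(0, n_clusters, size=n_pairs)
    onehot = np.eye(n_clusters)[cats]            # shared category-static code
    z1 = rng.normal(size=(n_pairs, noise_dim))   # view-1 perturbation
    z2 = rng.normal(size=(n_pairs, noise_dim))   # view-2 perturbation
    return np.hstack([onehot, z1]), np.hstack([onehot, z2]), cats

h1, h2, cats = pairwise_latents(8)
```

Feeding `h1` and `h2` to the two view-specific generators would yield a positive pair; sampling a different category index yields a negative pair for the cluster networks.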
3.2 The View-Specific Discriminator Network
The view-specific discriminator network aims to model a statistical decision function that can accurately distinguish synthetic multi-view data from real multi-view data, providing supervision information for the view-specific generator network in joint distribution fitting. The view-specific discriminator network
where
In the Siamese architecture, the view-specific discriminator network utilizes the adversarial game over structure information hidden in synthetic data and real data, assisting the view-specific generator network to produce multi-view data with pairwise constraints in joint distribution fitting. For instance, if the view-specific discriminator network captures divergences of intrinsic structures between synthetic data and real data, i.e., a low probability output of the view-specific discriminator network (Eq. (2)) for fake data, the synthetic data do not follow the intrinsic distribution of real data. In other words, there are significant divergences between synthetic data and real data. Then, the view-specific generator network utilizes that divergence information of data structures to enhance the quality of fake data of predefined categories by optimizing parameters towards deceiving the view-specific discriminator network.
3.3 The View-Specific Cluster Network
The view-specific cluster network learns a semantic mapping from the data space to the pattern space, where samples are grouped into each cluster via soft memberships. It captures fuzzy patterns of complementary information in multi-view data, utilizing indicator features in which each element denotes the membership probability. That is, data are transformed into vectors with the dimension being the number of pre-defined clusters, where each dimension denotes a kind of pattern. To this end, the indicator feature is stacked at the end of the view-specific cluster network, computed via:
where
in which
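A common way to realize such an indicator feature is a softmax head over K logits, so each element is a valid membership probability. The following numpy sketch assumes that standard choice:

```python
import numpy as np

def indicator_features(logits):
    """Softmax head producing indicator features: each element is the soft
    membership probability of one of the K predefined clusters."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# two samples, K = 3 predefined clusters
logits = np.array([[4.0, 0.5, 0.1],
                   [0.2, 3.5, 0.3]])
Q = indicator_features(logits)
```

Each row of `Q` sums to one, and its argmax gives the hard cluster assignment when one is needed for evaluation.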
Afterwards, the view-specific cluster network learns the fusion representations of multi-view data via the average of the indicator features across views, which puts constraints on the mining of consensus patterns. The fusion representation is computed via:
In the Siamese architecture, the view-specific cluster network utilizes pairwise data with implicit constraints to extract clustering-specific fusion representations of inter-view complementary information as well as fuzzy consensus patterns of intra-cluster invariant structures. For instance, when data from the same cluster are input into the view-specific cluster networks, the indicator features activate the same elements; otherwise, they activate different elements. By optimizing the implicit constraints, the view-specific cluster network ensures the inter-view complementarity and consistency of multi-view data.
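Under the averaging scheme stated above, the fusion representation can be sketched as follows; the final renormalization is an assumption added for numerical safety and is a no-op when every view's row already sums to one:

```python
import numpy as np

def fuse_indicators(view_indicators):
    """Clustering-specific fusion representation: element-wise average of the
    per-view indicator features, renormalized to remain a soft assignment."""
    F = np.mean(np.stack(view_indicators, axis=0), axis=0)
    return F / F.sum(axis=1, keepdims=True)

# two views, two samples, K = 2 clusters
q1 = np.array([[0.8, 0.2], [0.1, 0.9]])
q2 = np.array([[0.6, 0.4], [0.3, 0.7]])
F = fuse_indicators([q1, q2])
```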
4 The Generative Adversarial Dual Contrastive Clustering Loss
In this section, a generative adversarial dual contrastive clustering loss (GADCCL) is designed to supervise the learning of DMGFCC. It utilizes invariant semantics hidden in consistent and complementary information of multi-view data to learn fuzzy patterns, as well as clustering-specific fusion representations. Moreover, GADCCL leverages invariant structures of data endowed by the inter-view and intra-view Siamese architectures to preserve consistencies of manifolds in multi-view data fuzzy clustering. GADCCL is composed of a generative adversarial loss fitting joint distributions between classes and samples in views and a dual contrastive loss mining fuzzy multi-view consensus patterns of views. GADCCL is computed as follows:
where
4.1 The Generative Adversarial Loss
The generative adversarial loss
where
The generative adversarial loss ensures that the deep fuzzy multi-view Siamese network produces inter-view and intra-view pairwise data. That is, it endows data with pair-wise implicit knowledge which facilitates the intra-cluster consistencies in fuzzy pattern mining, as well as for inter-view complementarities in representation learning for the view-specific cluster networks.
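For reference, the standard binary cross-entropy GAN objective, which per-view adversarial losses of this kind instantiate, looks as follows in numpy. This is a generic sketch of the adversarial game, not the paper's exact loss:

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy between predicted probabilities p and targets y."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def discriminator_loss(d_real, d_fake):
    """D learns to score real data as 1 and generated data as 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """Non-saturating G objective: push D's score on fakes towards 1."""
    return bce(d_fake, np.ones_like(d_fake))
```

A well-trained discriminator drives `discriminator_loss` towards zero, while the generator reduces `generator_loss` by producing samples the discriminator accepts.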
4.2 The Dual Contrastive Loss
The dual contrastive loss
4.2.1 The Max-Min Contrastive Learning Loss
The max-min contrastive learning loss is designed based on the triplet form of contrastive learning. It is able to excavate structures hidden in the inter-view and intra-view consistent and complementary information in the local instance perspective. To be specific, given the multi-view dataset
where
To accurately measure similarities of samples in fuzzy pattern mining of the view-specific cluster networks, the concept distance between indicator features of samples, outputs of the view-specific cluster networks, is computed by the cosine similarity as follows:
where
Afterwards, the contrastive loss is re-computed in the following form:
At the same time, each indicator feature
where samples of the same cluster maximize the probability
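The cosine-similarity triplet form described above can be sketched as follows; the margin value and the mean reduction are illustrative assumptions:

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two batches of indicator features."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return (a * b).sum(axis=-1)

def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Max-min (triplet) contrastive loss on indicator features: pull the
    positive pair together and push the negative pair apart in cosine space."""
    pos = cosine_sim(anchor, positive)
    neg = cosine_sim(anchor, negative)
    return np.maximum(0.0, margin - pos + neg).mean()

a = np.array([[1.0, 0.0]])   # anchor indicator feature
p = np.array([[1.0, 0.0]])   # same-cluster (positive) feature
n = np.array([[0.0, 1.0]])   # different-cluster (negative) feature
loss_good = triplet_margin_loss(a, p, n)
loss_bad = triplet_margin_loss(a, n, p)
```

When the triplet is correctly ordered the hinge is inactive (`loss_good` is zero), and a violated triplet yields a positive penalty.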
4.2.2 The Mutual Information Contrastive Learning Loss
The mutual information contrastive learning loss is derived from the entropy form of contrastive learning. It utilizes the implicit pairwise constraints to maximize the mutual information of intra-view invariant structures and inter-view consistent and complementary information in the global distribution perspective. Specifically, given the multi-view dataset
where
In the mutual information contrastive learning loss,
At the same time, the true condition probability of the
where
Thus, the
where
Similarly, the inter-view mutual information contrastive loss between the
The inter-view mutual information contrastive loss promotes the fusion of consistent and complementary information in soft pattern mining.
Thus, mutual information contrastive learning loss is summed as follows:
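A common estimator of the mutual information between the cluster variables of two views builds the joint distribution from the soft assignment matrices. The following numpy sketch assumes that standard construction, not necessarily the paper's exact formulation:

```python
import numpy as np

def assignment_mutual_information(P1, P2):
    """Mutual information between the cluster variables of two views,
    estimated from soft assignments P1, P2 of shape (N, K):
    joint J = P1^T P2 / N, MI = sum_{k,l} J_kl * log(J_kl / (p_k * q_l))."""
    J = (P1.T @ P2) / P1.shape[0]       # joint distribution over cluster pairs
    p = J.sum(axis=1, keepdims=True)    # marginal of view 1
    q = J.sum(axis=0, keepdims=True)    # marginal of view 2
    mask = J > 1e-12                    # skip empty cells (0 * log 0 = 0)
    return float((J[mask] * np.log(J[mask] / (p @ q)[mask])).sum())

# perfectly aligned hard assignments over two balanced clusters
P_a = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mi_aligned = assignment_mutual_information(P_a, P_a)
# a maximally uncertain second view carries no shared information
P_u = np.full((4, 2), 0.5)
mi_indep = assignment_mutual_information(P_a, P_u)
```

Maximizing such a term across view pairs encourages the views to agree on cluster structure while the entropy of the marginals discourages degenerate assignments.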
4.2.3 The Fusion Clustering Loss
The fusion clustering loss aligns complementary information from view-specific clustering representations to further guide learning of the Siamese cluster networks. To be specific, given the view-specific fusion indicator feature
where
in which
where
The details of DMGFCC are outlined in Algorithm 1.

5 Experimental Results
In this section, extensive experiments are conducted on four multi-view benchmark datasets to validate the performance of DMGFCC, compared with 11 representative methods. All the experiments are implemented in Python.
Four multi-view benchmark datasets are utilized to assess the performance of all the 11 clustering methods. The statistics are listed in Table 1 with the following descriptions.

• MNIST-USPS, a benchmark multi-view image dataset, is composed of 7291 samples, where two views with similar distributions are extracted from MNIST (Modified National Institute of Standards and Technology) and USPS (United States Postal Service), respectively.
• MNIST-EDGE, a benchmark multi-view image dataset, consists of 54,000 samples and is highly complex in volume and variety. Each sample uses the original digital image and the edge digital image as two views.
• MNIST-INVERSE is a two-view image dataset composed of 60,000 samples, where each sample is represented by the original digital image and the digital image with inverse pixels. This dataset also has a complex distribution.
• EDGE-INVERSE is composed of 54,000 samples with two views, in which the edge digital image and the inverse digital image are used as views to represent samples. This dataset has a more complex distribution than the other three datasets.
Five clustering metrics are used to fully validate the performance of DMGFCC. For all metrics, a higher value indicates better performance, and the detailed definitions are listed as follows:
• Accuracy (ACC) is defined as the ratio of the number of samples with correct assignment to the number of samples, by comparing clustering-inference assignment with ground-truth assignment in the following form:
where N is the number of samples,
• Normalized Mutual Information (NMI) measures correlations between clustering-inference assignment and ground-truth assignment via the entropy theory, defined as follows:
where C and G denote the clustering-inference assignment and the ground-truth assignment, respectively.
• Adjusted Rand Index (ARI) measures similarities between clustering-inference assignment and ground-truth assignment based on consistencies of the two assignments, as follows:
where
• F1-score (F1) is defined as the harmonic mean of the precision and the recall of clustering-inference assignment with the following form:
where
• Purity measures consistencies between clustering-inference assignment with ground-truth assignment via the ratio of the number of samples with correct assignment to the number of samples, as follows:
where
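The label-permutation-invariant metrics ACC and Purity defined above can be computed as follows. This is a standard implementation sketch using the Hungarian algorithm from scipy to find the best cluster-to-class mapping; NMI, ARI, and F1 are available in common libraries:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """ACC: maximize correct assignments under the best one-to-one mapping
    of cluster labels to classes (Hungarian algorithm)."""
    k = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((k, k), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                       # count co-occurrences
    row, col = linear_sum_assignment(-cost)   # negate to maximize matches
    return cost[row, col].sum() / len(y_true)

def purity(y_true, y_pred):
    """Purity: each predicted cluster votes for its majority class."""
    total = 0
    for c in np.unique(y_pred):
        labels = y_true[y_pred == c]
        total += np.bincount(labels).max()
    return total / len(y_true)
```

Note that ACC enforces a one-to-one label mapping, while Purity lets several clusters map to the same class, so Purity is never lower than ACC.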
Eleven representative multi-view fuzzy clustering methods are selected for comparison with DMGFCC, which can be divided into two groups, i.e., shallow multi-view fuzzy clustering methods and deep multi-view fuzzy clustering methods. Specifically, the shallow multi-view fuzzy clustering methods include CoFKM [5], WV-Co-FCM [7], CoMK-FC [8], OMVFC-LICAG [15], and DFMKLS [16]. The deep multi-view fuzzy clustering methods include MAGCN [23], DEC [26], IDEC [27], BMVC [28], DEMVC [29], and DSwMFC [25].
Table 2 showcases the results of clustering experiments conducted on four multi-view benchmark datasets. In Table 2, DMGFCC on v1 and DMGFCC on v2 represent the clustering results that are achieved by comparing the predicted 10-dimensional one-hot labels on the first view and the second view with the ground-truth ones, respectively.

As shown in Table 2, DMGFCC achieves the state-of-the-art performance in comparison with the 11 methods. In detail, DMGFCC attains comprehensive performance improvements over the best results of the other methods on the five metrics across the four datasets. For example, on the MNIST-USPS dataset, DMGFCC reaches an ACC of 0.9967, an NMI of 0.9909, an ARI of 0.9925, an F1 of 0.9934, and a Purity of 0.9967, surpassing the second-best results by 0.0136, 0.0014, 0.0111, 0.0052, and 0.0127, respectively. These improvements demonstrate that the design of the multi-view Siamese generative adversarial architecture and the generative adversarial dual contrastive clustering loss enables performance that the other methods cannot achieve. Meanwhile, DMGFCC on v1 and DMGFCC on v2 report slightly lower clustering performance, which demonstrates that DMGFCC indeed depends on the fusion mining of multi-view data patterns. Furthermore, most of the deep multi-view fuzzy clustering methods achieve better performance than the shallow ones, which demonstrates that capturing deep relations hidden in multi-view data benefits the pattern mining of multi-view fuzzy clustering.
Furthermore, to illustrate the statistical performance of DMGFCC, the Nemenyi test is conducted according to the average ranks of the clustering numerical results. Fig. 2 shows the Nemenyi statistical test result of all methods on the four datasets and the five metrics, from which two observations can be made. (1) DMGFCC achieves the first rank among all methods, which demonstrates its comprehensive optimality and stable performance across datasets and metrics, providing rigorous empirical evidence for multi-view clustering method selection. (2) Under the significance level

Figure 2: The Nemenyi statistical test result of all methods.
The t-SNE (t-distributed Stochastic Neighbor Embedding) algorithm is conducted on the data points extracted by DMGFCC from four multi-view datasets, to visualize the clustering performance of DMGFCC. The amount of data points used in t-SNE algorithm is 2000 for each dataset, and data points from different ground-truth classes are labeled with different colors.
As shown in Fig. 3, the data points, which are endowed with the property of intra-cluster compactness and inter-cluster separation by DMGFCC, lie in the two-dimensional visualization space in a well-separated manner. Data points with the same colors gather together, while data points with different colors stay away from each other.

Figure 3: The t-SNE visualizing results of DMGFCC on the four datasets.
In Fig. 4, the confusion matrices on the four multi-view datasets are displayed. In the confusion matrices, which are calculated from the ground-truth labels and the clustering-inference labels, diagonal elements close to one indicate good performance on each class. As shown in Fig. 4, all four confusion matrices are close to identity matrices with diagonal elements close to one, which also demonstrates the superior performance of DMGFCC.

Figure 4: The confusion matrices of DMGFCC on the four datasets.
To evaluate the contribution of each loss component in DMGFCC, the loss ablation experiments are conducted on the four datasets. As depicted in Table 3, there exist four loss ablation variants where DMGFCC w/o

To explore the sensitivity of DMGFCC to trade-off hyper-parameters, i.e.,

Figure 5: The hyper-parameter sensitivities of DMGFCC on the four datasets.
To verify the convergence of DMGFCC, Fig. 6 shows the normalized loss curves on the four datasets over 0–80 training epochs. On the MNIST-USPS dataset, the loss curve decreases rapidly during epochs 0–16 and then fluctuates slightly around a loss value of 0.07. On the MNIST-INVERSE dataset, the loss curve exhibits three trends: a rapid decrease in epochs 0–13, a slow decrease in epochs 14–37, and slight fluctuations around the loss value 0.18 in the following epochs. The loss curves on the MNIST-EDGE and EDGE-INVERSE datasets are almost sandwiched between the two curves above. In general, DMGFCC converges after 40 epochs of training on all four multi-view benchmark datasets, and the different convergence behaviors may be related to the dataset sizes.

Figure 6: The convergence analysis results of DMGFCC on the four datasets.
As shown in Algorithm 1, the time complexity and the space complexity of DMGFCC are
The time complexity. In Algorithm 1, for each training epoch, the first line (Line 1 for short) samples pairwise input vectors for each view, which takes
The space complexity. In the Siamese architecture of DMGFCC, view-specific generator, discriminator, and cluster networks take
In the experiments,
6 Conclusion
In this paper, the deep multi-view generative fuzzy contrastive clustering (DMGFCC) is proposed within a Siamese architecture to capture soft patterns of data via clustering-specific fusion representations of invariant structures in informative topologies. Specifically, a multi-view Siamese generative adversarial architecture is designed to capture the joint distribution of data as well as invariant structures, which is composed of the view-specific generator network providing pairwise implicit constraints, the view-specific discriminator network distilling knowledge of real data, and the view-specific cluster network capturing fuzzy patterns of fusion information. Furthermore, a generative adversarial dual contrastive clustering loss consisting of a generative adversarial loss and a dual contrastive clustering loss is devised to supervise the learning of architecture parameters. Finally, experimental results on four benchmark datasets demonstrate the competitive performance of DMGFCC compared with the 11 representative methods. In the future, more multi-view fuzzy clustering schemes will be explored.
Acknowledgement: Not applicable.
Funding Statement: This work was partly supported by the National Natural Science Foundation of China under Grant 62476038.
Author Contributions: The authors confirm contribution to the paper as follows: Conceptualization, data curation, Jing Gao and Peng Li; methodology, software, validation, formal analysis, writing—original draft preparation, visualization, Jianing Zhang; investigation, resources, supervision, project administration, funding acquisition, Zhikui Chen; writing—review and editing, Zhikui Chen, Jing Gao and Peng Li. All authors reviewed and approved the final version of the manuscript.
Availability of Data and Materials: Not applicable.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.