Intelligent Automation & Soft Computing
Deep Contextual Learning for Event-Based Potential User Recommendation in Online Social Networks
Research Scholar, Department of Electrical and Electronics Engineering, PSG College of Technology, Coimbatore, 641004, India
*Corresponding Author: T. Manojpraphakar. Email: email@example.com
Received: 11 November 2021; Accepted: 31 December 2021
Abstract: Event recommendation helps people identify recent and upcoming social events. Profile (user) recommendation identifies the group of users likely to subscribe to and participate in an event, but it intrinsically faces cold-start issues. Existing models exploit multiple contextual factors to mitigate cold-start issues when recommending profiles for an event. However, these solutions do not incorporate correlation and covariance measures among the contextual factors. Moreover, recommending similar profiles to groups of events has not been well analyzed in the existing literature. The proposed prototype model, Correlation Aware Deep Contextual Learning (CADCL), addresses these issues. CADCL explores correlations across different perspectives of user and event features to alleviate the sparsity problem. Latent Dirichlet Allocation (LDA) is employed to extract latent contextual information and increase the relevancy rate in a hidden layer of the deep learning architecture. Finally, the decision to recommend a profile for an event is made on the basis of the influence weight applied to the correlation. Experimental analysis of the proposed architecture on the Meetup dataset is cross-validated, and the performance metrics (Precision 0.99%, Recall 0.88% and F Measure 0.93%) are shown to be better than those of current state-of-the-art approaches.
Keywords: Event recommendation; event-based online social network; profile recommendation; latent Dirichlet allocation; influence weight
1 Introduction
Widely used event-based online social networks such as Meetup provide an online platform on which users can create, identify and share offline social events of interest. Each event posted in an online social network carries various attributes set by the organizer, including the location and timestamp. The timestamp covers the beginning and end of the event, and is accompanied by textual content outlining the event. The large number of events published continuously at various times and locations makes it difficult to identify attractive events for users, and it leads to cold-start problems [2,3]. Many state-of-the-art approaches have exploited multiple contextual factors, such as spatial, temporal and textual factors based on content, preference, effects and social details, to mitigate cold-start issues in online social event recommendation.
Existing state-of-the-art techniques focus on recommending upcoming events to individuals, but they fail to recommend events to a group of users who would like to take part together, e.g., friends attending an exhibition or families travelling together. Further challenges lie in identifying events that suit all group members given their particular event interests or user preferences, yet traditional techniques [5,6] fail to incorporate these preferences into the recommendation. Fig. 1 represents the general context-aware recommendation of the dataset.
Preference-based recommendation has been modelled as group recommendation, since it represents similar users’ preferences as a group by aggregating over the group members. In addition, model-based approaches extract user interactions to form a group through a generative process and achieve better performance, but these techniques fail to consider the cold-start problem. Contextual information extracted over different periods is used, together with user feedback, to build a recommendation list, as represented in Fig. 1. Although event recommendation based on generative and aggregation methods performs better and takes contextual factors into account, it never considers the correlation among those contextual factors.
In this research work, a new architecture entitled Correlation Aware Deep Contextual Learning is modelled, which utilizes the correlation of contextual information across different perspectives of user and event features. To extract the user and event features, the Latent Dirichlet Allocation method is incorporated. Additionally, the influence weight of the user and event in a particular context is ranked over multifaceted information using knowledge level, hierarchy level and participation level in the relevant event, within the objective function of a deep belief network.
The remainder of the article is organized as follows. Section 2 reviews closely related work on the decision-making process for recommending participants to an event. Section 3 presents the correlation-aware deep influence prediction framework for recommending profiles to social events using a deep belief network. Section 4 reports the experimental validation of the proposed framework against state-of-the-art approaches using performance metrics on benchmark datasets. Finally, the article is concluded.
2 Related Work
This section examines event recommendation models based on deep learning in detail, in terms of their architectures for feature representation and their similarity measures over data points. Deep learning techniques that perform well on model evaluation are reviewed, and techniques that perform close to the proposed model are described as follows.
Li et al. (2016) proposed Deep Influence Predict (DIP), which explores the features of the Meetup dataset using a deep learning architecture, the Recurrent Neural Network (RNN). The RNN identifies target or potential users from the patterns and behaviours of profiles in social networking service applications through effective feature extraction in the data layers. Those layers learn multiple levels of representation and abstraction of the latent data from individual participation records on similar topics.
Wang et al. (2011) stated that dynamic mutual influence allows potential event participants to be integrated into the discrimination process. Initially, network pruning is applied to the dataset on various social factors. The model treats group-oriented event participation as a two-stage decision-making process: the first stage captures the users’ preferences from their profiles, and the second stage captures their latent social connections from their mutual information, with multifaceted interests and adoption of information.
De Campos et al. (2020) presented an automatic model that extracts and exploits various sources of information for profile-based expert recommendation and document filtering from a machine learning perspective, by clustering experts’ textual sources. It builds profiles and captures the different hidden topics in which the experts are interested. The experts are then represented through multi-faceted profiles for hierarchical representation.
Minghua et al. (2019) discussed how event-based social networks (EBSNs) provide a convenient platform for identifying participants for online events. Their semantic-enhanced and context-aware hybrid collaborative filtering model is used for event recommendation; it combines semantic content analysis and contextual event influence for user neighbourhood selection, exploiting a latent topic model to analyse event descriptions.
Joseph Ogundele et al. (2018) discussed user preferences mined from the influences of geographical locations, event categories, and social and temporal preferences under the assumption that each of these influences has the same weight for all users. Their personalized event recommendation framework employs a multi-criteria decision-making approach: using the personalized weight of each criterion, dominance intensity measures (i.e., dominating and dominated measures) are computed for the alternatives (i.e., candidate events) under each criterion, and the alternatives are ranked by the estimated dominance intensity measures to recommend the k top-ranked events.
3 Proposed Model
This section gives the detailed design of the proposed technique, the Correlation Aware Deep Contextual Learning architecture. It is applied to high-dimensional data containing profiles and events, with parametric tuning of the deep learning layers, to obtain user predictions for an event and for a group of events.
3.1 Dataset Pre-Processing
A large variety of datasets in the form of high-dimensional data is curated. Data pre-processing, consisting of missing value prediction and dimensionality reduction, is applied so that an effective recommendation list can be determined from a quality attribute set.
3.1.1 Missing Value Imputation
Missing value imputation is performed using factor analysis. Factor analysis determines the maximum common variance in a particular data field and follows Kaiser’s criterion of using eigenvalues. It uses the specified field’s variance score to fill in the field’s missing values. The missing value can also be estimated using the maximum likelihood method based on the correlation of the data field.
3.1.2 Dimensionality Reduction
Dimensionality reduction uses principal component analysis (PCA) to reduce a high-dimensional dataset to a lower-dimensional one. PCA is a linear transformation technique that removes correlation among features. It projects the large dataset onto a subspace with fewer dimensions, using a dimensional transformation matrix.
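As an illustration of the projection step, the following is a minimal sketch of PCA computed through a singular value decomposition with NumPy. The function name `pca_reduce`, the random toy matrix and the choice of two components are assumptions made for this example; they are not taken from the original implementation.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # centre each feature
    # Rows of Vt are the principal directions, ordered by explained variance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                            # transformation matrix, shape (d, k)
    return Xc @ W, W

# Toy data (assumption): 50 records with 6 features, reduced to 2 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
Z, W = pca_reduce(X, 2)
print(Z.shape, W.shape)
```

The columns of the reduced matrix are uncorrelated, which is the sense in which PCA removes correlation among the original features.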
3.2 Event Vector Generation
An event is described as a personalized, unique activity with attributes such as content, location, spatial-temporal patterns and host. This representation is intended to eliminate textual content sparsity and to improve the performance of group event recommendation for similar user clusters. In a text-based context, the information collected about the event is represented as a vector. The vector encodes implicit feedback and diverse context in discrete timeslots. Every event in the dataset is represented as a 6 × 21-dimensional vector. The contextual information is organized as a hierarchical tree model over the various contexts.
The vector contains the contextual information of the event organizer and the event participants, along with information on the relationships between users within groups and their group memberships. The event vector is generated on the basis of online and offline user relationships, derived from users’ shared participation in the available groups. Additionally, a weight is computed from the online and offline social relationships of users. User vectors are generated to gather the various social dimensions across multiple events. The online social user graph for a particular event is modelled and combined into the same latent space through event-user, event-character, event-locale and event-spatial factors.
3.3 Bayesian Probability Generation Representation
Bayesian probabilistic representation is a form of vector classification that uses a mathematical model to compute the relationships among multiple instances and variables of the given vector in order to generate the recommendation class. The recommendation class is treated as a set of probability distributions over the vector variables, and the recommendation method produces the users for an event on the basis of this probabilistic model. In addition, it employs existing event-based learning paradigms to observe probability distributions from existing vector data, and these distributions are used to determine the recommendation list of users for a group of events.
The model is employed to extract a group of events on the basis of location preferences and textual content preferences. The Bayesian probabilistic generative model improves the quality of groups’ event recommendations, especially with respect to textual content preferences. Users’ interest in content topic and location is estimated separately for each group. The model assumes that users are independent of any event and dependent on the topic and location; this is represented by the combined probability distribution over U, T and E through the following computation
The combined probability distribution over U and E is
Latent Dirichlet Allocation (LDA) is used to extract the latent information from the contextual information of users and events. LDA is a Bayesian probabilistic generative representation of multidimensional data, and it models the context factor. The fundamental idea is to represent the data as a random mixture over latent topics in the hidden layer of the Deep Belief Network, where a topic of the event is characterized by a distribution of the data over various dimensions. The Dirichlet is a flexible distribution over finite-dimensional data. It facilitates data inference and parameter estimation on the joint distribution of a topic mixture θ, given the event parameters α and β, where a set of N topics is denoted z and a set of N words is denoted w, as given by the equation
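For reference, the joint distribution in the standard LDA formulation of Blei et al., written with the symbols θ, z, w, α and β used in this section, is sketched below. It is given as an assumption about the intended formula, with N the number of data points in a record.

```latex
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```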
C(E|Tn, β) is a finite-dimensional probability conditioned on the topic, given by
A k-dimensional latent Dirichlet variable of a random event can take values in the (k-1)-simplex for user recommendation, with the following probability density in the unigram model for the representation of the topic-specific field. The aim is to observe and collect latent topics from body-text variables such as the title, tags and brief description of events in a defined vector, so that each event is represented as a probability distribution. Fig. 2 represents Latent Dirichlet Allocation. The probability distribution of latent topics on the vector is given by
where
• The specific feature is represented as wn
• The finite parameter of the Latent Dirichlet Allocation per-record distribution is α
• The linear parameter of the Latent Dirichlet Allocation per data point is β
• The topic distribution is given as θ
• The topic for data point j in record i
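For reference, the density of a k-dimensional Dirichlet variable on the (k-1)-simplex, as used in the standard LDA formulation, is sketched below. The component notation θ_i and α_i is an assumption of this sketch, with each θ_i non-negative and the θ_i summing to one.

```latex
p(\theta \mid \alpha)
  = \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)}
    \, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1}
```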
The LDA model is represented as a Bayesian probabilistic generative graphical model for generating group-based recommendations for the available user clusters. The LDA representation for group event recommendation has multiple levels. The allocation parameters α and β are two Dirichlet parameters that are assumed to be sampled in the process of generating a high-dimensional record for the events. The variable θ is a field-level variable that is sampled per column. Finally, the variables z and w are data-point-level variables sampled for each data point representation in any vector.
De Finetti’s theorem is employed to represent the joint distribution of an unbounded exchangeable sequence of multidimensional attributes of the dataset: the joint distribution behaves as if a random parameter were drawn and the random variables were then independent and identically distributed conditioned on that parameter. The topics are modelled using the following representation
Integrating over θ and summing over z yields the marginal distribution of the data points in the vector for event categorization. The similarity between events in a group is measured by the cosine similarity between the vectors of the event group, given by
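The cosine similarity between two event vectors can be sketched as follows. The function name, the zero-vector convention and the toy topic vectors are assumptions made for illustration only.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two event vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0          # convention chosen here for all-zero vectors
    return float(a @ b / denom)

# Toy topic distributions for two events (assumed values)
event_a = [0.7, 0.2, 0.1]
event_b = [0.6, 0.3, 0.1]
print(cosine_similarity(event_a, event_b))
```

Values close to 1 indicate events with nearly identical topic profiles, while orthogonal vectors give 0.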
Finally, the marginal probability distribution of the data items is obtained as the probability mp(D|α, β), given by
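For reference, the marginal distributions in the standard LDA formulation are sketched below: first for a single record w, then for the whole dataset D as a product over records. The record index d and the record count M are assumptions of this sketch.

```latex
p(\mathbf{w} \mid \alpha, \beta)
  = \int p(\theta \mid \alpha)
    \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta,
\qquad
p(D \mid \alpha, \beta) = \prod_{d=1}^{M} p(\mathbf{w}_d \mid \alpha, \beta)
```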
Here, the data points treated as features are generated by topics through the marginal probability function, and those topics are exchangeable within the vector by De Finetti’s theorem. Inference for the recommendation is computed from the marginal posterior distribution of the latent variables of the data matrix. Marginalizing over the hidden variables, with the coupling between θ and β in the computation over latent topics of the data distributions, is represented as
The marginal posterior distribution of the latent topics is intractable for exact inference of the event and its users, so it is computed using a large class of approximate inference algorithms; this LDA model considers the Laplace approximation and the variational approximation over a set of variational parameters on user preferences. The variational parameters are selected by an optimization procedure that finds the tightest possible lower bound in the relationship of user activities. Their values are obtained by minimizing the Kullback-Leibler (KL) divergence between the variational distribution and the true posterior p(θ, z|w,α,β) of the event. This minimization can be carried out using an iterative fixed-point method.
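To make the quantity being minimized concrete, the following sketch computes the KL divergence between two discrete distributions. It illustrates the divergence only, not the variational fixed-point updates; the function name and the small `eps` guard are assumptions of this example.

```python
import numpy as np

def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) for two discrete distributions over the same support."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    q = q / q.sum()                 # normalize defensively
    p = p / p.sum()
    mask = q > 0                    # terms with q_i = 0 contribute nothing
    return float(np.sum(q[mask] * np.log(q[mask] / np.maximum(p[mask], eps))))

# Toy variational distribution versus a uniform reference (assumed values)
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments.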
3.4 Influence Weight
Influence weight is computed from the bidirectional relationship between user activities and interactional context on the multinomial vector. The various types of context in the extracted parameters have different impacts on users’ preferences. To capture these impacts, an influence weight model based on the spatial, temporal and location contextual factors is used.
3.4.1 Long Term Model
The long-term model is a weighted sum of a user’s preference attributes over the available features of the event. The context-aware influence of a user on an event is computed as a combination of the user’s social impact among the users in the group and the user’s uniqueness with respect to the user features of the event. A user is selected for event participation on the basis of the neighbors available in the group, together with the weighted combination of event features reflecting long-term interest.
3.4.2 Short Term Model
An event-user rating matrix is used to compute a user’s short-term interest in a particular event from its event rating scores. An influence weight is then computed for each user from the event features together with the user-event rating matrix. These computations yield the similarities between users’ features that drive user recommendation.
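The long-term and short-term interest models can be sketched as follows. The article does not give the exact formulas, so the scaling of ratings to [0, 1], the mixing factor `mix` and all function names are hypothetical choices for this example.

```python
import numpy as np

def long_term_interest(preference_weights, event_features):
    """Weighted sum of a user's preference weights over event features."""
    w = np.asarray(preference_weights, dtype=float)
    x = np.asarray(event_features, dtype=float)
    if w.shape != x.shape:
        raise ValueError("weights and features must have the same shape")
    return float(w @ x)

def short_term_interest(ratings, user, event, max_rating=5.0):
    """Rating the user gave the event, scaled to [0, 1] (assumed scaling)."""
    return float(ratings[user, event]) / max_rating

def influence_weight(long_term, short_term, mix=0.5):
    """Hypothetical convex combination of the two interest scores."""
    return mix * long_term + (1.0 - mix) * short_term

# Toy user-event rating matrix: 2 users, 3 events (assumed values)
ratings = np.array([[5.0, 0.0, 3.0],
                    [0.0, 4.0, 0.0]])
lt = long_term_interest([0.5, 0.3, 0.2], [1.0, 0.0, 1.0])
st = short_term_interest(ratings, user=0, event=2)
score = influence_weight(lt, st)
print(lt, st, score)
```

Keeping the two scores separate until the final combination makes it possible to tune how strongly recent ratings override long-standing preferences.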
3.5 Correlation Aware Deep Contextual Learning
In this model, a deep learning architecture, the deep belief network, is applied to the outputs of the influence weight and the latent features of LDA. It treats the recommendation of a group of events as a ranking problem and optimizes the model using a fitness function for better results.
3.5.1 Ranking Function
It uses a listwise approach that optimizes the ranking of the recommendation list using the influence weight, producing the optimal ordering of the list. Fig. 3 represents the ranking of the recommendation list.
The optimization criterion for the recommendation list, combined with the generic optimization criterion for personalized ratings of inbound relationships of online and offline social events, is
where Θ is the optimal parameter set of LDA,
Eui is the set of positive events, which includes the events in which ui participated,
Evi is the set of negative events, which includes the remaining events attended by vi.
3.5.2 Visible Units
Visible units use the latent variables of events, with the weights learned by contrastive divergence as an approximation to maximum likelihood. In this layer, gradient descent is used for weight updates. Tab. 1 provides the parameterized components of the deep belief network for generating the recommendation list of profiles for a group of events.
The visible layer uses the partition function for the visible vector obtained from the LDA process. This function normalizes the values before ranking. A new visible layer is generated from the normalization, the current weights of the user preferences and the bias values of the ranking function. The procedure applies hyperparameter tuning to the parameter list for Gibbs sampling, which achieves better convergence. Fig. 4 represents the proposed architecture of the work.
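As an illustration of how one layer of a deep belief network is typically trained, the following sketch performs single-step contrastive divergence (CD-1) with one Gibbs step on a binary restricted Boltzmann machine. This is the textbook procedure, assumed here rather than taken from the article; the layer sizes, learning rate and toy data are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr, rng):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    # Positive phase: hidden probabilities and a sample given the data
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # One Gibbs step: reconstruct the visibles, then recompute the hiddens
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    n = v0.shape[0]
    # Gradient estimate: data statistics minus reconstruction statistics
    W = W + lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    b_vis = b_vis + lr * (v0 - v1_prob).mean(axis=0)
    b_hid = b_hid + lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_vis, b_hid, v1_prob

# Toy batch (assumption): 8 binary visible vectors with 6 units, 4 hidden units
rng = np.random.default_rng(0)
v = (rng.random((8, 6)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((6, 4))
b_vis, b_hid = np.zeros(6), np.zeros(4)
for _ in range(100):
    W, b_vis, b_hid, recon = cd1_step(v, W, b_vis, b_hid, 0.1, rng)
print(W.shape, recon.shape)
```

In a deep belief network, layers of this kind are stacked and trained greedily, with the hidden activations of one layer serving as the visible input of the next.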
After an adequate number of sampling iterations, the approximated marginal posteriors can be used to determine convergence. They represent the attribute similarities between users and events in terms of social affinities. Algorithm 1 gives the procedure of correlation-aware deep contextual learning based on a deep belief network.
The outcome of the algorithm contains the events handled by an event organizer whose content is more closely related than that of events handled by other organizers. In addition, the correlation between textual content and event organizer is modelled to mitigate textual content sparsity, which improves the performance of group event recommendation for similar user clusters. Finally, location-based similarity is computed to improve the efficiency of the model, giving better results on the similarity between two events held at the same locale.
4 Experimental Results
Experimental analysis of the deep learning architecture, with parametric tuning on high-dimensional data for recommendation list generation based on the correlation of events and profiles, is carried out on the high-dimensional Meetup dataset. The performance of the proposed group recommendation model on profiles is evaluated using precision, recall and F-measure on the real-world dataset, in addition to computing cosine similarity and Euclidean distance on the obtained list. The proposed model is tested and evaluated using Java technology. On this platform, the handling of configuration files for recommendation is flexible enough to train and validate the system.
In this work, 60% of the collected Meetup dataset is used to train the learning architecture that generates the recommendation list, 20% is used to validate the proposed model during cross-validation, and the remaining 20% is used to test the model. In other words, 80% of the data is used for model development, of which 60% of the full dataset trains the model and 20% validates the trained model.
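The 60/20/20 partition described above can be sketched as a shuffled index split. The function name, the fixed seed and the sample count of 1000 are assumptions for illustration.

```python
import numpy as np

def split_60_20_20(n_samples, seed=0):
    """Shuffle indices and split them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = (n_samples * 60) // 100      # integer arithmetic avoids rounding drift
    n_val = (n_samples * 20) // 100
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]           # remainder goes to the test set
    return train, val, test

train, val, test = split_60_20_20(1000)
print(len(train), len(val), len(test))
```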
4.1 Dataset Description
Extensive experiments on the Meetup and Twitter datasets are carried out to measure the quality of the user recommendation list and the performance of its ranking function. The dataset is divided into parts for training, validation and testing: model training uses 60%, validation uses 20% and testing uses 20%. The datasets are described in detail as follows.
4.2 Meetup Dataset
Meetup is one of the most popular EBSNs. It provides an online platform for users to create, identify and share online and offline events for groups of users, and its data is among the most frequently used benchmark datasets for recommendation applications.
4.3 Twitter Dataset
The Twitter dataset consists of tweets extracted from the popular social media platform via Twitter’s application programming interface (API). It contains around 27000 rare-event-related tweets covering various contextual factors and is considered a benchmark dataset for recommending potential users for events.
The proposed technique is evaluated against traditional deep influence prediction using the following performance indicators. In this paper, the proposed model is evaluated through 10-fold cross-validation to calculate the performance of the recommendation model, which is generated on the above datasets using activation and ranking functions. The performance of the proposed deep learning model depends on the activation function, hidden layer, visible layer, loss function and hyperparameter tuning of the model.
4.4.1 Precision
Precision is a measure of positive predictive value. It is expressed as the proportion of relevant instances in each group recommendation list generated by the model. Fig. 5 illustrates the performance of the proposed architecture in terms of precision, with the Meetup dataset classified into high-deficient and low-deficient profiles.
Performance metrics are used to determine the feasibility of the current architecture on the recommended profile list. Efficiency is achieved through hyperparameter tuning.
Here, a true positive is a recommended data point that is actually relevant in the feature set, and a false positive is a recommended data point that is not relevant in the feature set.
4.4.2 Recall
Recall is the fraction of similar data points that have been extracted out of the total number of similar data points in the recommendation list. It is the proportion of relevant user events that are successfully classified into the correct classes.
Here, true positive is the number of similar data points in the extracted feature set, and false negative is the number of similar data points missing from the extracted feature set. Fig. 6 represents the performance of the architecture on the recall measure along with state-of-the-art approaches.
The quality of the prediction depends on the activation function in the deep learning architecture. The encoder computes the feature map to create the subspace. The F measure is a reliable measure of the quality of the recommendation model.
4.4.3 F Measure
Accuracy is the number of correct class predictions for the incoming data divided by the total number of predictions for the entire data category.
Accuracy is given by,
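The three metrics used in this section can be sketched from their standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and the F measure as the harmonic mean of precision and recall. The function name and the toy user identifiers are assumptions for illustration.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F measure of a recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)      # recommended and relevant
    fp = len(recommended - relevant)      # recommended but not relevant
    fn = len(relevant - recommended)      # relevant but not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy lists (assumed values): 4 recommended users, 5 truly relevant users
p, r, f = precision_recall_f1(["u1", "u2", "u3", "u4"],
                              ["u1", "u2", "u3", "u5", "u6"])
print(p, r, f)
```

As a consistency check, applying the harmonic mean to the reported precision of 0.99 and recall of 0.88 gives approximately 0.93, which agrees with the reported F measure.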
Although different data points may affect the generated results differently, they are likely to have the same impact on the correlation of the context factors. Fig. 7 shows the current model’s performance in terms of F measure compared with accurate event profile recommendation. However, beyond a certain point, performance on this data vector degrades because of the curse of dimensionality. In contrast, for the Meetup dataset.
The Gibbs sampling function acts as a normalization function that maps the input data category into a new visible distribution over event participation. The generative Bayesian probabilistic function is then used to reproduce the original vector using conditional probability and pairwise similarity. Tab. 2 lists the training parameters.
The evaluation results for the Meetup dataset with high-deficient profiles are given in Tab. 3. The performance of the objective function shows that latent users and their social connections with the event yield the maximum number of participants for any event description.
The performance on the Twitter dataset of recommending events to different user profiles, based on the various user and event perspectives, is computed for several methodologies. Because the dataset is sparse, the groups are generated from correlation similarities of the users’ latent features. Fig. 8 shows the precision, recall and F-measure results of the various methodologies on the Twitter dataset in the cold-start scenario.
We also examine the impact of the different components of the proposed technique, which greatly improve topic generation and execution time on the Twitter dataset. For each topic, we present the top ten words with the highest generation probabilities. The ten topics are observed to be distinct, and each of them is semantically coherent. Fig. 9 represents the performance of topic generation and execution time. Tab. 4 summarizes the performance of the various methodologies on the Twitter dataset.
From the figures and tables, the trend of the comparison results is similar to that on the dataset presented for evaluation. Moreover, the recommendation performance of each method does not change significantly with group size. Finally, Gibbs sampling methods are used to speed up the learning of our model.
A Correlation Aware Deep Contextual Learning framework has been designed and implemented to enhance the prediction of group user participation in events, especially during the pandemic period, by exploring the social behaviors and preferences of users over various latent factors in the contextual information. Latent variables are identified using Latent Dirichlet Allocation to increase the relevancy rate in the hidden layer. The influence weight, inferred with Gibbs sampling, is computed from the long-term and short-term interest representation models to jointly represent user impact on the event in the visible layer. The interest model ranks multifaceted information by knowledge level, hierarchy level and participation level in the relevant events in the activation layer of the deep belief network. The proposed architecture obtains highly relevant results with good scalability. Specifically, we designed a deep learning architecture for prediction based on users’ preferences and their latent social connections. The proposed model effectively uses the objective function in the decision-making process for event participation.
5 Conclusion
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.