Sentiment Analysis in Social Media for Competitive Environment Using Content Analysis

The education sector has witnessed several changes in the recent past. These changes have forced private universities into fierce competition with each other to enrol more students. This competition has led private universities to adopt marketing practices similar to those of commercial brands. To gain a competitive advantage, universities must observe and examine students' feedback on their own social media sites along with the social media sites of their competitors. This study presents a novel framework that integrates several analytical approaches, including statistical analysis, sentiment analysis, and text mining, to perform a competitive analysis of universities' social media sites. These techniques enable local universities to use social media to identify the topics most discussed by students and, based on the most unfavorable comments received, the major areas for improvement. A comprehensive case study was conducted using the proposed framework for a competitive analysis of a few top-ranked international universities as well as local private universities in Lahore, Pakistan. Experimental results show that diversity of shared content, frequency of posts, and schedule of updates are the key areas for improvement for the local universities. Based on the competitive intelligence gained, this paper includes several recommendations that would enable local universities in general, and Riphah International University (RIU) Lahore in particular, to promote their brand and increase their attractiveness to potential students using social media, and to launch successful marketing campaigns that reach a large audience at significantly reduced cost, resulting in increased enrolment.


Introduction
Several changes in the education sector have been witnessed in the last couple of decades, such as the decentralization of higher education institutes, the autonomy of Government-funded universities, and the emergence of a large number of new private universities, to name a few. These trends have led higher education institutes, especially private universities, to compete with each other for potential students. Due to these waves of change in the education sector, universities are gradually inclining towards a business-oriented model and have started to adopt marketing and advertising practices to gain a competitive advantage [1][2][3].
From a business perspective, there are similarities between private universities and service provider companies. Potential students behave like customers when choosing a university. They compare different characteristics and features of universities in terms of academic environment and facilities. Parents are also involved in this decision. In most cases, they view universities as a product or service brand. Therefore, universities must establish themselves in the same way as a commercial brand [3,4].
A brand can be defined as the name, symbol, design, sign, or a blend of all of them, used to identify the products or services of an individual or organization and to differentiate them from those of competitors [5]. For a higher education institute, a brand indicates its distinguishing features, shows its capability to satisfy students, creates confidence in the institute's capability to provide particular types and levels of higher education, and helps students make enrolment decisions [2]. A brand communicates essential information about university reputation, offered services, and quality of services to students for decision making [4].
Consumers now have all the facilities and power to share their own opinions about a brand with other consumers. Consumers are now acting as brand ambassadors, and they create their own versions of brand stories [5]. These stories are real-time, digital, and dynamic, and much more influential than the stories broadcast through traditional channels [6].
This study is designed to understand the most influential factors for students when choosing a private university in general and Riphah International University (RIU) Lahore in particular. Due to the increasing usage of social media by universities to reach potential students, a performance analysis of RIU Lahore on social media, especially Facebook, is conducted. Through a benchmarking process, RIU Lahore's performance is evaluated in comparison with other private universities in Lahore and highly reputed universities worldwide. The knowledge and insight gathered in the problem identification process are used to establish a detailed guideline, enabling RIU Lahore to communicate its brand more effectively and attract more potential students. This paper is structured as follows. Section 2 presents a literature review related to the idea of branding in universities using social media and related work on social media analysis. Section 3 explores the research questions, and Section 4 shows the research design. Section 5 presents the results obtained in content analysis, sentiment analysis, and content strategies. Section 6 presents the recommendations and, finally, Section 7 offers conclusions and future directions.

Related Work
In a survey conducted by Gallup in October 2017, 53% of owners of small businesses in the United States (US) said they have an active presence on Facebook. Over half (54%) of small business owners think social media helps them get new customers, and 53% think it helps them market their products to new and existing customers. Moreover, 51% said they are going to increase their presence on the internet by using social media sites [7]. In a 2014 survey, 92% of marketers said that marketing on social media was critical for their business, while 80% indicated an increase in their website traffic due to social media marketing [8]. The above statistics show the significance of social media for businesses regardless of their size and nature.
Almost all higher education institutes in the USA already had a presence on Facebook by 2012. Universities use it to communicate with prospective students, parents, alumni, and many others [9]. Several universities have already used Facebook to provide a virtual tour of their campus. Professors at Stanford have been conducting interactive sessions with students on Facebook. The University of Massachusetts (UoM) Dartmouth's latest poll about social media usage for marketing by higher education institutes indicates that 100% of participating colleges and universities use one or more types of social media. According to the survey, 98% of institutes have a Facebook page, 84% have a Twitter account, and 86% maintain a YouTube channel. In addition, 66% of institutes run a blog, 41% are podcasting, and 47% of admission staff use LinkedIn.
In the current dynamic and challenging business atmosphere, it is essential for businesses to regularly examine and systematically interpret their competitors' plans, products, and services to remain competitive [10,11]. Since companies at large are using social media to engage existing and potential customers, businesses need to observe and analyze their competitors' social media sites as well as their own. Dey, Haque, Khurdiya, and Shroff argue that, along with competitors' information, social media also provides an opportunity to compare customer behaviors regarding competing companies [11]. Social media competitive analytics has become an essential skill for companies to get prompt feedback from customers and generate analytical reports that help them attract and retain customers [11][12][13][14][15][16][17][18].
Several studies have already been performed on social data analysis in social media, for instance [19][20][21][22][23]. Yanga et al. found it difficult to translate social media data into business intelligence; hence they proposed a business decision-making system based on social media analytics to improve business [23]. Singh et al. proposed a social media analytic technique employing hierarchical clustering and text mining to help with food supply chain management [24]. Ramanathan et al. established a theoretical model to enable retailers to better serve their customers using insights from social site data [25]. Ramasamy et al. employed support vector machines to classify Twitter data into positive, negative, and neutral sentiments to improve the decision-making process for business [26]. To address early risk assessment challenges, a supervised text classification approach was introduced by Burdisso et al. in 2019 [27].
Despite numerous studies on social media analytics, there is still very limited research on social media competitive analytics [13]. Competitive analysis has recently attracted the attention of the scientific community. For example, a study by He, Zha, and Li performed a competitive analysis of the social media of the three largest pizza companies in the US and found that business value can be effectively determined from social media content [10]. Another study performed a competitive analysis of the Facebook data of three famous pharmacy chains in the US [15]. In this study, a framework is proposed to enable organizations to systematically perform a comprehensive analysis of their own social media platforms as well as those of their competitors. The proposed framework employs statistical analysis, text mining, and sentiment analysis techniques to gain valuable insight from highly unstructured data from various social media platforms.
5606 CMC, 2022, vol.71, no.3

Research Questions
This study analyzed the Facebook sites of five of the top-ranked universities in the world and five local private universities in Lahore, Pakistan. Text mining and sentiment analysis were performed along with content analysis to answer the following research questions:
• How can RIU Lahore use social media to attract potential students?
• How do private universities in Lahore use social media to attract potential students?
• What social media parameters are used by the highly reputed international benchmark universities to get the attention of prospective students?
Once the answers to the above questions are found, good applicable practices can be suggested in view of available resources of these local universities, which can improve their performance on social media and can be beneficial for attracting more and more students for admissions in the future. This can also result in the reduction of the cost of the admission campaign.

Research Design
A framework for social media analytics is depicted in Fig. 1, which presents a few possible techniques, including text mining, content analysis, sentiment analysis, and statistical analysis of social media data [14]. This framework includes techniques from computer science, statistics, social science, and computational linguistics. Topic modelling, text classification, and a few other algorithms and techniques such as n-grams can be used in addition to the proposed techniques [15,16]. The framework is extensible, as new methods and algorithms can be introduced as technology evolves.

Data Collection Techniques
Several methods can be used to collect data from social media sites. A web-crawling application may be used for this purpose. Moreover, social networks like Facebook, Twitter, and YouTube provide Application Programming Interfaces (APIs). Based on these APIs, organizations can build an application for customized and appropriate data collection. Although blogs do not offer an API for data collection, they provide Resource Description Framework Site Summary (RSS) feeds for data tracking. Where RSS is not available, Hypertext Markup Language (HTML) parsing, manual copying, and web crawling can be applied for data collection, though these techniques are cumbersome and slow.
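As a minimal sketch of the RSS route described above, the snippet below extracts posts from a feed using only Python's standard library. The feed text is an invented inline sample (no real blog is queried), and `parse_rss_items` is a hypothetical helper name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; a real pipeline would download this text from a blog's RSS URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example University Blog</title>
    <item>
      <title>Admissions open</title>
      <pubDate>Mon, 02 Dec 2019 09:00:00 GMT</pubDate>
      <description>Apply now for the spring term.</description>
    </item>
    <item>
      <title>Sports gala</title>
      <pubDate>Tue, 03 Dec 2019 15:30:00 GMT</pubDate>
      <description>Photos from the annual sports gala.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(feed_xml):
    """Return one dict per <item> with its title, date, and description."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "published": item.findtext("pubDate", default=""),
            "text": item.findtext("description", default=""),
        })
    return items


posts = parse_rss_items(SAMPLE_FEED)
for post in posts:
    print(post["published"], "|", post["title"])
```

The same record shape (title, date, text) could also be filled from an API response or from parsed HTML, so later analysis steps would not need to know which collection method was used.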

Data Storage
A repository is required to store the data extracted from social media channels before performing any analysis. Because organizations may remove information from their social networking sites, it is vital to keep longitudinal data of social media sites in a repository. The proposed framework can facilitate the development of a repository capable of storing heterogeneous longitudinal data.
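One possible shape for such a repository is sketched below with SQLite from Python's standard library. The table layout, column names, and sample rows are assumptions made for illustration; the study itself stored its data in Excel files. Each collection run is kept as a separate snapshot, so a post's history is preserved over time.

```python
import sqlite3

# In-memory database for illustration; a real repository would use a file or a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE posts (
           university   TEXT,
           platform     TEXT,
           post_id      TEXT,
           content_type TEXT,     -- photo, video, link, or status
           created_at   TEXT,     -- ISO 8601 timestamp of the post
           likes        INTEGER,
           shares       INTEGER,
           comments     INTEGER,
           collected_at TEXT,     -- date of the collection run (snapshot)
           PRIMARY KEY (platform, post_id, collected_at)
       )"""
)

# Invented rows: the same post captured in two collection runs.
rows = [
    ("RIU Lahore", "facebook", "p1", "photo", "2019-12-02T09:00:00", 40, 3, 5, "2019-12-03"),
    ("RIU Lahore", "facebook", "p1", "photo", "2019-12-02T09:00:00", 55, 4, 8, "2020-01-03"),
]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Longitudinal view: how the like count of one post changed between snapshots.
history = conn.execute(
    "SELECT collected_at, likes FROM posts WHERE post_id = ? ORDER BY collected_at",
    ("p1",),
).fetchall()
print(history)
```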

Data Analysis
This framework provides a guideline for the development of software systems for social media analytics capable of collection, storage, and analysis of longitudinal data from multiple social media platforms of competitors [15,17]. The framework can be utilized to generate periodic competitive intelligence reports providing business insights related to marketing, supply chain, customer service and many other business functions. For instance, businesses can improve their marketing strategies to reflect consumer demands on the basis of sentiment analysis of customer comments on their respective social media platforms resulting in a competitive advantage over their competitors. Companies can make better-informed decisions on the basis of understanding tendencies, problems and patterns detected from social media data analysis resulting in better customer experience and more value to the business [18,19].

Procedure
To answer the proposed research questions, the social media analytics framework presented earlier was chosen. The first step was to collect data from the Facebook pages of the selected universities by using the Facebook Graph API. This data consists of Facebook posts, reactions, and comments. In addition, the number of followers was collected manually from each Facebook page. Data was collected for one complete year, from 1st December 2019 to 30th November 2020.
After the extraction, the data was stored in multiple Excel files for analysis. The first analysis performed was statistical analysis which included correlation analysis as well as descriptive analysis to understand the associations among the data gathered from the Facebook pages of selected universities. The next step was text mining to uncover business insights and patterns in the textual data. Various tools were used for text mining in this study, including Leximancer, 'WordStat', and 'RapidMiner' [19].
Brand reputation can be effectively observed with the help of sentiment analysis [20]. It provides companies with an understanding of customer perception towards their products and services. Several tools are available for sentiment analysis, though, for this study, a popular tool, SentiStrength (http://sentistrength.wlv.ac.uk) was used. SentiStrength enables researchers to extract the polarity of sentiment from a given text. Hence it can be used to find the positive, negative, and neutral sentiments in the comments posted by users on social media [20][21][22]. The results generated from these tools were carefully reviewed in order to identify new patterns, discuss findings, and provide recommendations.
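To make the idea of polarity extraction concrete, the following sketch labels comments with a tiny hand-made word list. It is a simplified stand-in and not the SentiStrength algorithm; the lexicon, the `polarity` function, and the sample comments are all invented for illustration.

```python
import re

# Toy lexicon invented for illustration; real tools use far larger, weighted lexicons.
POSITIVE = {"good", "nice", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "worst", "expensive", "hate"}


def polarity(comment):
    """Label a comment positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "Great campus and nice teachers",
    "Fee structure is too expensive",
    "When does admission open?",
]
labels = [polarity(c) for c in comments]
print(labels)
```

A word-counting approach like this ignores negation, sarcasm, and emphasis, which is one reason dedicated sentiment tools are preferred for real comment data.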

Growth Rate
Before getting into the details of the growth rate, it is pertinent to mention the current number of followers each institute has on different social media platforms. Tab. 1 shows the total number of followers of each university on four platforms: Facebook, Twitter, YouTube, and Instagram. Harvard University tops the list, followed by Oxford and Cambridge. These numbers are generally important for assessing the popularity of a brand. However, they do not affect the analysis and conclusions, as all the calculations presented here are based on proportions. Hence the results of the competitive analysis remain unbiased. Fig. 2 displays the followers' growth rates on Facebook, as this study is limited to the detailed analysis of Facebook data only. The followers' growth rate assesses the effectiveness of the social media strategy adopted by a brand. An increase in this metric indicates that a brand is gaining new consumers. The line chart of Fig. 2 compares RIU Lahore with the international universities in terms of followers' growth rate during the observation period. Massachusetts Institute of Technology (MIT) got the highest score in this metric, followed by Oxford and Cambridge. Harvard University (HU) had the lowest growth rate despite having the largest number of followers; MIT does not follow this trend and displays significant growth in followers. Interestingly, RIU Lahore is ahead of the National University of Singapore (NUS) and Harvard in this metric, which indicates increasing consumer interest in RIU Lahore's brand.
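The paper does not spell out the growth-rate formula, so the sketch below assumes the common definition: the percentage change in followers between the start and the end of the observation period. The function name and the follower counts are invented for illustration.

```python
def follower_growth_rate(start_followers, end_followers):
    """Percentage change in followers over the observation period (assumed definition)."""
    if start_followers <= 0:
        raise ValueError("start_followers must be positive")
    return (end_followers - start_followers) / start_followers * 100.0


# Invented counts: a very large page and a small page.
pages = {
    "Large page": (1_000_000, 1_020_000),
    "Small page": (20_000, 23_000),
}
rates = {name: follower_growth_rate(s, e) for name, (s, e) in pages.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

Because the metric is proportional, the small page in this example shows a higher growth rate even though it gained far fewer followers in absolute terms, which mirrors how a page with fewer followers can outrank a much larger one on this metric.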

Types of Posted Content
The proportion of each type of content in the total shared content is shown in Fig. 3. All benchmark universities except HU posted photos, videos, statuses, and links; Harvard did not post any status during the whole period of observation. It is visible from the chart that photos, links, and videos were most frequently posted. Status is the least posted type of content for all benchmark universities. Two significant trends are visible among the local universities. Firstly, all local universities used all four types of content to engage their audience. Secondly, the most frequently posted type of content was the photo. Videos were the second most frequent type of content except in the case of FAST, where statuses took the second position, followed by videos and links. It is pertinent to mention here that, in contrast to the international universities, the local universities gave the least importance to links; instead, they posted statuses more frequently.
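The proportions discussed above can be computed directly from a list of post types. The sketch below uses an invented list of eight posts; `content_share` is a hypothetical helper name.

```python
from collections import Counter

# Invented sample: the content type of each post on one page.
post_types = ["photo", "photo", "video", "link", "photo", "status", "video", "photo"]


def content_share(types):
    """Return the percentage of posts belonging to each content type."""
    counts = Counter(types)
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}


shares = content_share(post_types)
for kind, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kind}: {pct:.1f}%")
```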

Frequency of Posts
A greater number of updates on Facebook demonstrates that the institute has content available to engage its current and potential students, which leads to a vibrant and active image of the institute in students' perception. Fig. 4 shows the total number of posts by RIU Lahore compared with the selected international benchmark universities during the observation period. The highest number of posts was made by NUS (1257), while the lowest was that of RIU Lahore, with 304 posts in one year. Another analysis was performed to guide RIU Lahore in choosing the right time for updates, as practiced by the international universities.
The line chart in Fig. 5 reveals that most of the universities post content on their Facebook pages during normal working hours, between 6 AM and 6 PM. The only exception is NUS, which posts almost as often at night. RIU Lahore also posts at night, but at about one-third of its daytime rate. This shows that RIU Lahore is following the trend of the majority of international universities. Some vital statistics related to post frequency were revealed during the analysis. All international universities post more than once a day, which shows the availability of content to share with the audience. Here again, NUS tops the list, posting more than three times a day. RIU Lahore is far behind these universities in this regard and needs to post more frequently to engage its existing and potential students.
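The frequency figures above follow from simple arithmetic on the reported post totals and the observation window. The sketch below uses the two totals reported in the text (1257 and 304) and the stated dates; the timestamps used for the daytime share are invented, and the 6 AM to 6 PM boundary is taken from the description of Fig. 5.

```python
from datetime import date, datetime

# Observation window stated in the study (inclusive of both end dates).
start, end = date(2019, 12, 1), date(2020, 11, 30)
days = (end - start).days + 1  # includes 29 February 2020

# Post totals reported in the text.
totals = {"NUS": 1257, "RIU Lahore": 304}
posts_per_day = {name: n / days for name, n in totals.items()}
posts_per_week = {name: 7 * rate for name, rate in posts_per_day.items()}


def is_daytime(ts):
    """True when a post falls between 6 AM (inclusive) and 6 PM (exclusive)."""
    return 6 <= ts.hour < 18


# Invented timestamps for one page.
sample = [
    datetime(2020, 3, 2, 9, 15),
    datetime(2020, 3, 2, 14, 0),
    datetime(2020, 3, 3, 22, 30),
    datetime(2020, 3, 4, 11, 45),
]
day_share = sum(is_daytime(t) for t in sample) / len(sample)

for name in totals:
    print(f"{name}: {posts_per_day[name]:.2f} per day, {posts_per_week[name]:.1f} per week")
print(f"Daytime share of sample posts: {day_share:.0%}")
```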

Engagement
Engagement is a powerful metric for companies to assess the success of brands on their social media platforms. Engagement is the sum of likes, shares and comments. Fig. 6 presents the total engagement to demonstrate the overall response of the audience towards the shared content on the Facebook page by RIU Lahore and all other universities under observation in this study.
It is evident that RIU Lahore is far behind international universities, which is obvious due to a low number of followers and posts as compared to these international universities. When compared with local competitor universities, RIU Lahore got the second-highest sum of engagement which is a good sign for RIU Lahore's efforts in social media.
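Total engagement as defined above (likes plus shares plus comments) can be aggregated per page as in the following sketch. The post records and page names are invented for illustration.

```python
# Invented post records; each holds the three interaction counts for one post.
posts = [
    {"university": "Page A", "likes": 120, "shares": 10, "comments": 15},
    {"university": "Page A", "likes": 80, "shares": 5, "comments": 7},
    {"university": "Page B", "likes": 300, "shares": 40, "comments": 60},
]


def engagement(post):
    """Engagement of a single post: likes + shares + comments."""
    return post["likes"] + post["shares"] + post["comments"]


total_by_page = {}
for post in posts:
    page = post["university"]
    total_by_page[page] = total_by_page.get(page, 0) + engagement(post)

print(total_by_page)
```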
The pie chart of Fig. 7 shows engagement by the type of posts made by all universities in the period of observation. The maximum engagement was received by photos, followed by videos, while the share of links in total engagement is only 18% and the share of statuses is close to 0%. These findings can be used as a guideline when choosing content to share on Facebook pages. The highest engagement rate was found on the HU page. As far as the type of content is concerned, maximum engagement was found on the links shared on Harvard's Facebook page, followed by videos and photos, while engagement on statuses was very low compared with other types of content. The second-highest rate of response was that of the University of Central Punjab (UCP), and among the content shared by UCP, photos got the highest response, followed by videos and statuses, respectively. RIU Lahore is doing well in this aspect, as its total engagement is greater than that of three competitor universities in Lahore, i.e., the University of Lahore (UOL), FAST, and the University of South Asia (USA). Considering the type of content, RIU Lahore got its maximum response on photos. Fig. 8 shows the engagement by types of posts of all universities.
Two very important metrics are the page engagement rate and the post engagement rate, which show how well the social media efforts are resonating; both are calculated on the basis of total engagement and total followers. RIU Lahore's policy is getting a positive response from the audience, as shown in Fig. 9. Likewise, it got the highest average post engagement rate among all universities in this comparison, which is evidence of the popularity of the shared content among the audience. Although RIU Lahore's number of followers and total engagement are meagre compared with the international universities, it still got the highest score in these two metrics because they are averages relative to the number of followers; hence a page with a small number of followers can be compared with a page having a huge number of followers without affecting the results.
When the same test is performed to check the correlation between likes and shares, a moderate positive correlation (0.378268) is found. This suggests that audience members who liked the posts also tended to share them. Likes and comments are more strongly correlated (0.618045) with each other, which suggests that people who liked the posts also tended to comment on them. A positive correlation between shares and comments (0.425139) indicates that audience members who shared the posts often commented on them too. Fig. 10 shows the correlation for likes, shares, and comments.
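The two kinds of calculation in this section can be sketched as follows. The text only states that the engagement rates are based on total engagement and total followers, so the exact formulas below (engagement per follower, and average engagement per post per follower, both as percentages) are assumptions. The correlation is the standard Pearson coefficient, and all numbers are invented.

```python
from math import sqrt


def page_engagement_rate(total_engagement, followers):
    """Total engagement as a percentage of followers (assumed definition)."""
    return 100.0 * total_engagement / followers


def avg_post_engagement_rate(total_engagement, num_posts, followers):
    """Average engagement per post as a percentage of followers (assumed definition)."""
    return 100.0 * (total_engagement / num_posts) / followers


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented page totals.
page_rate = page_engagement_rate(total_engagement=900, followers=20_000)
post_rate = avg_post_engagement_rate(total_engagement=900, num_posts=30, followers=20_000)

# Invented per-post counts.
likes = [10, 20, 30, 40, 50]
comments = [1, 3, 2, 5, 4]
r_likes_comments = pearson(likes, comments)

print(f"page rate {page_rate:.2f}%, post rate {post_rate:.3f}%, r = {r_likes_comments:.2f}")
```

A coefficient near 1 indicates that the two counts rise together across posts; it describes association at the post level and does not by itself show that the same individuals performed both actions.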

Text Mining
Text mining performed on the audience comments revealed the usage of several positive words by the audience of RIU. The most frequent word is "Riphah", signifying that many stories are circulating around the brand name of RIU and underlining the importance of university image from the students' perspective.
Another widespread word is "Lahore", which suggests that the city's image is influential in choosing a university, as shown in Fig. 11. A few other persistent words are "Program", "Admission", and "apply", showing frequent discussions on topics related to admission and programs of study. Although not very frequent, positive words do exist in the word cloud, including "Good", "Nice", and "Great". This is a positive sign for RIU Lahore, as it demonstrates the positive outcome of the strategy being implemented on Facebook.
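The word-frequency step behind a word cloud can be illustrated with the short sketch below. It is not the procedure of the tools used in the study; the comments, the stop-word list, and the `top_words` helper are invented for illustration.

```python
import re
from collections import Counter

# Small invented stop-word list; real text-mining tools ship much longer ones.
STOP_WORDS = {"the", "a", "an", "is", "in", "for", "to", "at", "of", "and",
              "i", "how", "can", "which"}


def top_words(comments, n=5):
    """Return the n most frequent non-stop-words across all comments."""
    counter = Counter()
    for comment in comments:
        for word in re.findall(r"[a-z]+", comment.lower()):
            if word not in STOP_WORDS:
                counter[word] += 1
    return counter.most_common(n)


# Invented comments.
comments = [
    "How can I apply for admission at Riphah Lahore?",
    "Riphah is a great university in Lahore",
    "Admission open for which program at Riphah?",
]
top = top_words(comments, n=3)
print(top)
```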

Sentiment Analysis
The results of sentiment analysis cover all of the comments posted by the audience in response to the shared content. The results are divided into positive, negative, and neutral categories. Fig. 8 presents the comparison of comment polarity among all universities. It is evident from the chart that NUS is at the top with 48% positive comments, followed by the University of Oxford. There is only a slight difference between the results of Harvard (40%) and Cambridge (39%). MIT is behind all other international universities, with a positive score of 35%. Regarding the local private universities and RIU Lahore, there is no significant variation in the results. RIU Lahore, UCP, and USA got almost the same percentages of positive, negative, and neutral comments. In the case of UOL, the result is slightly different, as it got 24% positive comments and 6% negative comments. Nevertheless, its proportion of neutral comments is almost the same as that of the other local universities. FAST's results show a considerably different trend. Positive comments for FAST were the lowest at 18%, while negative comments rose to 10%. Its proportion of neutral comments is almost the same as that of the other local universities. Fig. 12 shows the comments on the posts of all universities.
To get a simplified result as a ratio, the number of positive comments was divided by the number of negative comments. Harvard University has the maximum number of comments, followed by the University of Oxford and UCP. RIU Lahore did not receive a large number of comments compared with the international universities: it received a total of 338 positive comments and 32 negative comments during the observation period. However, RIU Lahore's ratio of positive to negative comments is the highest, at roughly 10:1, which means that there were about ten times more positive comments than negative comments during the observation period.

Recommendations
For the overall improvement of its image, RIU Lahore is recommended to improve its underperforming areas, within its capacity, in comparison with the competitors and benchmark universities. The analysis of Facebook data suggests that the content shared by RIU Lahore lacks diversity when compared with local competitors and international benchmark universities. A huge percentage (81%) of the content consists of photos only, while the remaining 19% consists of videos, links, and statuses. Analysis of other universities shows that their shared content comprised more videos (20%) and links (26%). With more diverse content, RIU Lahore can increase the engagement levels on its Facebook page, as different audiences have different preferences in terms of content types.
RIU Lahore posted 304 times in the observation period, which is approximately six posts per week (304 posts over about 52 weeks), or fewer than one post per day. RIU Lahore needs to research and create more content in order to increase its post frequency to at least seven posts per week.
A survey was conducted in the local private universities during this study, which reveals current students' preferences when they decided to choose their current university. The factors considered most by the students in descending order are: university image, facilities, student life, finances, people, convenience, program of study, miscellaneous, and promotion. However, the contents shared by RIU Lahore on its Facebook page do not significantly match with these preferences of its students, which results in loss of engagement. Therefore, importance should be given to the content that promotes the university's positive image among the students by highlighting the profiles of its professors, the international and local ranking of the university, and the quality of teaching. Moreover, the academic environment, facilities for sports and extracurricular activities provided should be highlighted as well.
Another interesting area for improvement is the schedule for updates on social media. Competitive analysis has revealed that most international universities update their Facebook pages between 6 AM and 6 PM, whilst most of the content posted by local universities during the observation period was posted between midnight and noon. The reason for this trend is that most students visit the university's Facebook page for updates regarding events and activities while they are at the university. The right time for posting is when most users are visiting the Facebook page of RIU Lahore, so that the shared content is delivered to the audience straight away. Specific tools like Facebook Insights are available to find out the time when most of RIU Lahore's users are active.
In order to assess the effectiveness of the current social media policy of RIU Lahore, the comments on the shared content were carefully analyzed using sentiment analysis and text mining techniques. The polarity of comments is almost the same as that of the competitor local universities in Lahore. However, when compared to the international benchmark universities, the percentage of positive comments is significantly lower. Interestingly, the ratio of positive comments to negative comments is very high in the case of RIU Lahore, which implies that the audience has positively received the content shared by RIU Lahore.
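The ratio mentioned above can be reproduced from the comment counts reported for RIU Lahore (338 positive and 32 negative). The helper name below is hypothetical; the number of neutral comments is not reported in the text, so the sketch is limited to the polar comments.

```python
def positive_negative_ratio(positive, negative):
    """Positive comments per negative comment; infinite when there are no negatives."""
    if negative == 0:
        return float("inf")
    return positive / negative


# Counts reported for RIU Lahore during the observation period.
positive, negative = 338, 32

ratio = positive_negative_ratio(positive, negative)
positive_share_of_polar = positive / (positive + negative)

print(f"ratio = {ratio:.1f}:1")
print(f"positive share among polar comments = {positive_share_of_polar:.1%}")
```

A high ratio can coexist with a modest percentage of positive comments overall when most comments are neutral, which is consistent with the contrast drawn in the paragraph above.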
RIU Lahore may use commercial software available for social media analysis to track the success of its strategy in real time. Furthermore, RIU Lahore should establish a schedule for posting on Facebook. The choice of posting time is vital for getting more response from the audience. Greater engagement levels can be achieved if posting is done when most of the followers are usually online, reducing the chances of a post being lost in the followers' news feed. In order to find out the time of maximum engagement, Facebook Insights may be employed.
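Where a dedicated insights tool is not at hand, a rough estimate of the best posting hour can be derived from historical post data, as in the sketch below. The records are invented, `mean_engagement_by_hour` is a hypothetical helper, and using past engagement as a proxy for when followers are online is an assumption of this sketch.

```python
from collections import defaultdict
from datetime import datetime

# Invented records: (time the post was published, engagement it received).
posts = [
    (datetime(2020, 3, 2, 9, 0), 40),
    (datetime(2020, 3, 3, 9, 30), 60),
    (datetime(2020, 3, 4, 14, 0), 120),
    (datetime(2020, 3, 5, 14, 20), 80),
    (datetime(2020, 3, 6, 22, 0), 30),
]


def mean_engagement_by_hour(records):
    """Average engagement of posts grouped by the hour of day they were published."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for published, engagement in records:
        sums[published.hour] += engagement
        counts[published.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sums}


by_hour = mean_engagement_by_hour(posts)
best_hour = max(by_hour, key=by_hour.get)
print(by_hour, "best hour:", best_hour)
```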
Last but not least, repetition of the survey and the social media competitive analysis is recommended. The survey results indicate that students' preferences may change with the passage of time; hence, the survey should be repeated every three years to identify these changes. As far as social media analysis is concerned, based on the recommendations of marketing experts, it is suggested to repeat it after short intervals, which may be three to six months. This repetition aims to have a more precise valuation of current social media performance and collect insights for short term marketing campaigns and strategies.

Conclusion and Future Work
Effective usage of social media has become an important strategy for gaining a competitive advantage in the business community. Our comprehensive case study reveals that universities are using social media tools not only to advertise their brand name but also to establish good relationships with existing and potential students. By keeping track of competitors' Facebook pages, universities can learn new techniques to engage their students and devise new marketing strategies. The proposed framework will help universities in conducting social media competitive analysis. An in-depth case study was conducted, which included five top-ranked international universities and five local private universities, employing the techniques of the proposed framework. The outcomes of this study demonstrate that the applied framework and techniques are very effective in attaining competitive intelligence.
This study can serve as a sound basis and stimulus for further research on similar topics. However, some aspects of this research may be further developed and improved. The survey of influential factors considered by students in choosing a university was conducted only in private universities of Lahore. There are several Government universities in Lahore as well. Due to time and resource constraints, Government universities were not included in this survey. In future studies, students from Government universities can be included to further improve the results. This research was also limited to only one social media platform, Facebook. Future research may study other social media channels like Twitter, Instagram, and YouTube. Furthermore, the difference in consumer behavior on different social media platforms of the same organization can be studied. Further research may also explore other potential uses of social media for brand promotion, such as increasing the engagement of current followers on different social media platforms.