Psychological Stress Detection from Social Media Data using a Novel Hybrid Model

Psychological stress is a recognized threat to an individual's health, and it is increasingly visible in social media data. It is therefore necessary to predict and manage stress before it turns into a serious problem. Conventional stress detection methods rely on psychological scales and physiological devices that require the full participation of the subject, which makes them time-consuming, complex, and expensive. With the growth of social networks, people habitually share their personal moods on social media platforms, where they also influence other users, which can contribute to stress. The novel hybrid model developed here, Psychological Stress Detection (PSD), automatically detects a user's psychological stress from social media data. It comprises three modules, a Probabilistic (Naïve Bayes classifier) module, a Visual (Hue, Saturation, Value) module, and a Social module, which leverage text, image posts, and social interaction information respectively. We define a set of stress-related textual features 'F = {f1, f2, f3, f4}', visual features 'vF = {vf1, vf2}', and a social feature 'sf' to predict stress from social media content. Experimental results show that the proposed PSD model improves detection when compared with the TensiStrength and Teenchat frameworks, achieving a precision of 95%. The PSD model can assist in developing stress detection tools for mental health agencies.


Introduction
Psychological stress is considered the biggest threat to an individual's health. Three of every hundred people in metropolitan areas are estimated to suffer from stress. Long-lasting stress may lead to many severe physical and mental problems, such as depression, insomnia, and even suicide. Stress and suicide are closely interlinked; at its worst, anxiety can lead to suicide [1]. According to the World Health Organization (WHO), over 56 million Indians suffer from depression. The corporate sector in India has also reported an increase in stress over the last two years: a 2015 survey by the workplace solutions provider Regus reported that 57% of corporate India is under stress and that about one in five people in the country need counseling, either psychological or psychiatric. A mental health assessment published by the National Institute of Mental Health and Neurosciences (NIMHANS) stated that 5% of the population suffered from depression as of 2016. Thus, recognizing stress or depression at an early stage is critical for reducing suicidal deaths and deliberate self-harm across the spectrum.

A range of sensors and non-textual methods have been developed to detect stress, as sensors can offer data regarding the intensity and quality of a person's internal affective experience [7]. For example, H. Kurniawan presents stress detection from speech and galvanic skin response signals [3], [6], and S. Greene describes a stress detection method that uses a sensor to monitor Heart Rate Variability (HRV) [3], [5], [7]. These sensors are expensive, and most traditional stress detection methods are based on one-on-one interviews conducted by psychologists, self-report questionnaires [13], or wearable sensors [16]. These methods are usually labor-intensive, time-consuming, and reactive. The question is whether there are appropriate and practical methods for stress recognition.

The rapid growth of social media is changing people's lives, as well as research in healthcare and wellness. The Global Social Media research report 2018 [4] shows that there are 3.196 billion active social media users worldwide; among them, Facebook is the most popular social network with 2.01 billion monthly active users, and Twitter is the fastest growing social network with 328 million active users. With the quick advancement of social networking sites, individuals are eager to use them as a platform to express their moods and day-to-day activities. People share thoughts, express emotions, and record daily habits via textual and image posts, and also interact with friends; a sample is depicted in Figure 1. As these social media data reveal users' real-life situations and emotions in a timely fashion, they offer new prospects for assessing, modeling, and mining users' activity patterns through social networks. Such social data can find its theoretical basis in psychology research, and encoding emotional information in text is a common practice, especially in online interactions [2]. Thus, we can obtain linguistic and visual content from an individual's posts that may indicate stress-related symptoms, which makes it feasible to detect a user's psychological stress through their posts, posting behavior, and social interaction on micro-blogs or social media.

The rest of the paper is organized as follows. Section 2 gives an overview of the related work and describes the problem statement. Section 3 introduces the definitions of the proposed features, presents the hybrid Psychological Stress Detection (PSD) model for stress recognition, and describes the algorithmic structure and modules of the proposed PSD model. Section 4 presents experimental results. Lastly, Section 5 provides the conclusion and discusses future scope.
Sample posts (Figure 1):
"I am feeling very sad and don't feel like doing anything.:( depressed."
"I hate my manager and job schedule, very stressed."

Related Work and Problem Definition
There have been research efforts on harnessing social media data for developing mental and physical healthcare tools. K. Lee et al. [8] proposed to leverage social media data for real-time disease investigation, while [10] tried to bridge the vocabulary gap between health seekers and providers using community-generated health data. Psychological stress detection is related to the topics of sentiment assessment and emotion recognition. Our recent work [25] presented a comparative analysis of various psychological stress recognition approaches.

Studies on emotion detection in social networks: computer-aided recognition, analysis, and application of emotion, especially in social media, have drawn considerable attention in recent years [11], [12], [15]. The association between psychological stress and personality traits is also a thought-provoking issue [17], [18], [20]. Several studies on social-network-based emotion analysis use text-based linguistic features and typical classification approaches. As discussed above, a range of sensors and non-textual methods have been developed to detect stress, but they are expensive. With the development of social networks, people are willing to share their daily events and moods and to interact with friends through those networks, which makes it possible to leverage online social network data for stress detection. Some explorations [22], [27] that use users' posted content on social networks to identify stress have shown that leveraging social media data for healthcare, and in particular for stress detection, is feasible. However, these mechanisms mainly leverage textual content and consider only a single post of an individual.

In reality, data in social networks are typically composed of chronological and inter-connected items from diverse sources. They contain image posts in addition to textual posts, which makes them multimedia data. Moreover, a single post reflects only the instant emotion, whereas people's emotions or psychological stress states are usually more persistent and fluctuate over different time periods.

In recent years, extensive study has started to focus on emotion detection in social networks from sequential post series [21], [23], [24]. Our work leverages sequential (multimedia) post content over a specific sampling period to detect stress.

Feature Categorizations
To address the problem of psychological stress detection, we define a set of stress-related textual features 'F = {f1, f2, f3, f4}', visual features 'vF = {vf1, vf2}', and a social feature 'sf'. Table 1 shows the categorization of these features.

The textual feature is a set of four features, F = {f1, f2, f3, f4}. Each stores a word count for a particular sentence, obtained by comparing the sentence with the Linguistic Stress Lexicon explained in Table 1. 'f1' stores the number of stress and negative words present in the posted sentence, 'f2' stores the number of positive emotion words, 'f3' stores the negative emoticons, if present, and 'f4' stores the negating words. These features help the probabilistic module calculate the probability of each class and label the textual post with its class, as shown in Figure 4.

The visual feature is a set of two features, vF = {vf1, vf2}. They store the mean values of the brightness and saturation, which are calculated in the visual module algorithm shown in Figure 5. The image brightness and saturation values are compared with threshold values to categorize the image post as stress or non-stress, as shown in Figure 3.

The social feature 'sf' stores the mean number of likes that a user's posts receive. It gives the social attention degree: when a user's posts are negative or stressful, they get more attention from that user's friends via likes or comments. Thus, we compute the mean number of likes and categorize the posts accordingly, as shown in Figure 3.

The detailed architecture of the model is shown in Figure 2. Psychological stress detection poses many challenges. To overcome them, we first categorized the features as discussed in Section 3.1, and we now design modules that use these features for stress detection. The Probabilistic module fetches the textual features 'F = {f1, f2, f3, f4}' from the content of the user's post by comparing it with the Linguistic Stress Lexicon. The Visual module fetches the visual features 'vF = {vf1, vf2}' from the image post, and the Social module fetches the social attention feature 'sf', as discussed in Section 3.1.
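The textual features f1 to f4 described above can be sketched as simple lexicon counts. This is a minimal illustration only: the word lists and the function name `extract_textual_features` are hypothetical stand-ins, since the paper's Linguistic Stress Lexicon is not reproduced here, and f3 is assumed to be a count of negative emoticons.

```python
import re

# Hypothetical mini-lexicon; the actual Linguistic Stress Lexicon is not given in the text.
STRESS_NEGATIVE_WORDS = {"sad", "stressed", "depressed", "hate", "pain", "alone"}
POSITIVE_WORDS = {"happy", "glad", "love", "great", "relaxed"}
NEGATING_WORDS = {"not", "no", "never", "don't", "can't", "cannot"}
NEGATIVE_EMOTICONS = [":(", ":-(", "):", ")-:", ":,("]


def extract_textual_features(post: str) -> dict:
    """Return the counts f1..f4 for a single post."""
    lowered = post.lower()
    # Keep apostrophes so contractions such as "don't" stay one token.
    tokens = re.findall(r"[a-z']+", lowered)
    return {
        "f1": sum(t in STRESS_NEGATIVE_WORDS for t in tokens),  # stress/negative words
        "f2": sum(t in POSITIVE_WORDS for t in tokens),         # positive emotion words
        "f3": sum(lowered.count(e) for e in NEGATIVE_EMOTICONS),  # negative emoticons
        "f4": sum(t in NEGATING_WORDS for t in tokens),         # negating words
    }


features = extract_textual_features(
    "I am feeling very sad and don't feel like doing anything.:( depressed."
)
```

Counting emoticons on the raw string rather than on tokens is a design choice of this sketch: emoticons are made of punctuation, so a word tokenizer would discard them.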

Architecture
Figure 2 shows the architecture of our model. Three types of information serve as the initial inputs, i.e., textual features, visual features, and social features, whose detailed computation is described later. We address the solution through the following three key modules:
• The Probabilistic module fetches the textual features 'F = {f1, f2, f3, f4}' from the content of the user's post by comparing it with the Linguistic Stress Lexicon.
• The Visual module fetches the visual features 'vF = {vf1, vf2}' from the image post.
• The Social module fetches the social attention feature 'sf', as discussed in Section 3.1.
The steps involved in the architecture of the proposed system are as follows:
i. The first step captures the posts and messages sent between the users of the Social Networking Site (SNS). These messages are saved in the database.
ii. A text post is taken and converted to plain text by removing stop-words (such as articles and prepositions). Next, we check whether the user has also posted an image; if so, it is sent to the visual module to compute its mean brightness and saturation. Then the social attention feature is checked to calculate the mean of likes. Negative emoticons in the lexicon include "%-(", ")-:", "):", ")o:", ":(", ":*(", ":,(", ":-", ":-&", ":-(", ...
iii. This is the main step, in which the plain text is compared with the Linguistic Stress Lexicon to find the number of stress words and negative words present in the text; the result is stored in the feature set 'f1-f4'. Next, the image features are extracted and stored in 'vf1' and 'vf2'. Lastly, the number of likes is calculated and stored in the feature 'sf', as shown in Table 1.
iv. The textual (linguistic) features are fed into the probabilistic module to predict the class of the text post. The module uses maximum likelihood estimates for the detection. The visual features are fed into the visual module, where the brightness and saturation of the image are calculated as shown in Figure 5 and compared with the threshold values as shown in Figure 3. The social feature is fed into the social module to compute the mean of likes and compare it with a threshold value, as shown in Figure 3.
v. In the last step, the results are displayed to the user. A user's posts are tagged as stress only if they satisfy all the threshold values, as shown in Figure 3. If the user's status is stressed, a recommendation regarding stress management is sent to the user.
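The final tagging rule, in which a post is tagged as stress only when every module's check passes, can be sketched as follows. All threshold values, their directions (dark and desaturated images, a high mean of likes), the names `PostSignals` and `tag_post`, and the handling of posts without an image are assumptions made for illustration; the source does not state the values used in Figure 3.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the actual values from Figure 3 are not given in the text.
BRIGHTNESS_MAX = 0.5  # vf1 at or below this is treated as "dark"
SATURATION_MAX = 0.4  # vf2 at or below this is treated as "desaturated"
LIKES_MIN = 5.0       # sf at or above this is treated as "high attention"


@dataclass
class PostSignals:
    text_label: str               # output of the probabilistic module
    brightness: Optional[float]   # vf1 in [0, 1], None if the post has no image
    saturation: Optional[float]   # vf2 in [0, 1], None if the post has no image
    mean_likes: float             # sf


def tag_post(signals: PostSignals) -> str:
    """Tag a post as 'stress' only if all available checks pass."""
    text_ok = signals.text_label == "stress"
    if signals.brightness is None or signals.saturation is None:
        # Assumption: text-only posts skip the visual check.
        visual_ok = True
    else:
        visual_ok = (signals.brightness <= BRIGHTNESS_MAX
                     and signals.saturation <= SATURATION_MAX)
    social_ok = signals.mean_likes >= LIKES_MIN
    return "stress" if (text_ok and visual_ok and social_ok) else "non-stress"


label = tag_post(PostSignals("stress", brightness=0.3, saturation=0.2, mean_likes=8.0))
```

Requiring all three checks to pass trades recall for precision: a bright image attached to a stressful text keeps the post out of the stress class in this sketch.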

Probabilistic Module
To fetch the textual features from the posts, we use the Naïve Bayes classifier, which is based on Bayes' theorem. This classifier assumes conditional independence, i.e., that the value of an attribute for a given class is independent of the values of the other attributes. It is a simple yet efficient classifier for uncertain data. It works on the principle of Bayes' probability theorem: it holds a set of known results for some words and compares those results with the words in a particular sentence.

Let $X = \{x_1, x_2, \ldots, x_n\}$ be a set of 'n' attributes. In Bayesian terms, 'X' is considered the evidence and 'H' is some hypothesis, namely that the data X belongs to a specific class 'C'. We have to determine $P(H \mid X)$, the probability that the hypothesis H holds given the evidence X. By Bayes' theorem,

$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)} \qquad (1)$$

or, written for the words of a sentence,

$$P(C_i \mid word_1, \ldots, word_n) = \frac{P(word_1, \ldots, word_n \mid C_i)\,P(C_i)}{P(word_1, \ldots, word_n)} \qquad (2)$$

where $C_i$ is any one of the classes 'No-Stress', 'Stress', and 'Neutral', and $word_1, \ldots, word_n$ are the word counts of the given sentence; here $word_n = f_n$ with n = 4 (see Table 1). If a textual feature fetched from the given sentence matches the words in the Linguistic Stress Lexicon explained in Table 1, a class is assigned to that sentence according to the probability score of each class. The detailed computation of the probabilistic module is shown in the Probabilistic Module Algorithm (Figure 4).

As discussed in Section 3.1, the textual features F = {f1, f2, f3, f4} store the counts of stress and negative words (f1), positive emotion words (f2), negative emoticons (f3), and negating words (f4) in the posted sentence. These features help the probabilistic module label the posted sentence as stress or non-stress by calculating the probability of each class.
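The classification step of a Naïve Bayes text classifier can be sketched in a few lines. This is a generic multinomial Naïve Bayes, not the authors' Figure 4 algorithm: the training samples are invented, the function names are hypothetical, and add-one (Laplace) smoothing is substituted for plain maximum likelihood estimates so that unseen words do not produce zero probabilities.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """samples: list of (tokens, label). Returns priors, per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    priors = {c: n / len(samples) for c, n in class_counts.items()}
    return priors, word_counts, vocab


def classify(tokens, priors, word_counts, vocab):
    """Return the class with the highest log-posterior, log P(C) + sum of log P(word | C)."""
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for t in tokens:
            # Add-one smoothing (an assumption of this sketch, not stated in the source).
            score += math.log((word_counts[c][t] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)


# Invented toy training data for illustration only.
samples = [
    (["sad", "depressed", "pain"], "Stress"),
    (["hate", "job", "stressed"], "Stress"),
    (["happy", "great", "day"], "No-Stress"),
    (["love", "relaxed", "weekend"], "No-Stress"),
    (["meeting", "at", "noon"], "Neutral"),
]
priors, word_counts, vocab = train_naive_bayes(samples)
predicted = classify(["very", "sad", "stressed"], priors, word_counts, vocab)
```

The denominator $P(word_1, \ldots, word_n)$ is the same for every class, so the sketch omits it and compares only the numerators, in log space to avoid underflow.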

Visual Module
Based on previous work on color psychology theories [28], we combine the following features as the visual middle-level representation:
• Saturation: the mean value of saturation and its contrast. It describes the colorfulness of an image and its differences. Psychological experiments by S. R. Ireland [29] found that people under stress and anxiety prefer lower saturation than in normal states, revealing the correlation between stress and the saturation of images.
• Brightness: the mean value of brightness and its contrast. It illustrates the perception elicited by the luminance of an image and the related differences (e.g., low brightness makes people feel negative, while high brightness elicits mainly positive emotional associations).

To calculate image brightness, we first need to consider briefly what the average of the sum of the RGB channels represents. For humans, it is meaningless. Is pink brighter than green? That is, why should (0, 255, 0) give a lower brightness value than (255, 0, 255)? Likewise, is a mid gray (128, 128, 128) as bright as a mid green (128, 255, 0)? To take this into consideration, we use only the brightness of the color channel as defined in the HSV color space [14], which is simply the maximum value of a given RGB triplet. The rest is heuristics. Let max_rgb = max(RGB_i) for some point i. If max_rgb is lower than 128 (assuming an 8 bpp image), then point i is dark; otherwise it is light. Doing this for every point i, we get A points that are light and B points that are dark. If (A - B)/(A + B) >= 0.5, we say the image is light. Note that if every point is dark the value is -1, and conversely if every point is light the value is +1. The formula can be tweaked to accept images that are barely dark by adding a tolerance term, named fuzzy in the code (the name does no justice to the fuzzy field in image processing): the image is light if (A - B)/(A + B) + fuzzy >= 0.5.

The mathematical relationship between the RGB space and the HSI (hue, saturation, intensity) or HSV (hue, saturation, value) color space model follows.
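The light/dark heuristic and the mean brightness and saturation features can be sketched with the standard library alone. The sketch assumes the standard HSV conversion implemented by Python's `colorsys` module, where value is max(R, G, B) and saturation is (max - min)/max; the authors' own equations 3 and 4 may differ, for example if they use the HSI model. The function names are hypothetical, and pixels are passed as plain (R, G, B) tuples rather than loaded from an image file.

```python
import colorsys


def image_lightness(pixels, fuzzy=0.0, dark_cutoff=128):
    """Return ((A - B) / (A + B), is_light) for 8-bit (R, G, B) pixels.

    A counts light pixels (max channel >= dark_cutoff), B counts dark ones.
    """
    light = dark = 0
    for r, g, b in pixels:
        if max(r, g, b) < dark_cutoff:
            dark += 1
        else:
            light += 1
    score = (light - dark) / (light + dark)
    return score, (score + fuzzy >= 0.5)


def mean_brightness_saturation(pixels):
    """Return (vf1, vf2): mean HSV value and mean HSV saturation, each in [0, 1]."""
    total_v = total_s = 0.0
    count = 0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_v += v
        total_s += s
        count += 1
    return total_v / count, total_s / count


# Three pure-green pixels and one near-black pixel: A = 3, B = 1.
score, is_light = image_lightness([(0, 255, 0)] * 3 + [(10, 10, 10)])
vf1, vf2 = mean_brightness_saturation([(255, 0, 0), (0, 0, 0)])
```

Both functions raise `ZeroDivisionError` on an empty pixel list; a real pipeline would guard against that and would read pixels from the posted image with an imaging library.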

Social Module
Besides the text and image content of a post, additional features such as likes can also imply one's stress state to some degree. We define a post's social attention degree from these additional features as social attributes, since apparently stressful posts may attract more attention from friends. The number of comments and likes reveals how much attention a post attracts. To find the changes in the attention degree of one's posts, we first calculate the sample mean sf shown in equation 5 from all values $M_i$ and the total number of items $N$ in the sample:

$$sf = \frac{1}{N}\sum_{i=1}^{N} M_i \qquad (5)$$

The social feature 'sf' thus stores the mean number of likes that a user's posts receive and gives the social attention degree: when a user's posts are negative or stressful, they get more attention from that user's friends via likes or comments. We compute the mean number of likes and categorize the posts accordingly, as shown in Figure 3.
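The sample mean of likes and its comparison with a threshold can be sketched as below. The threshold value and the names `social_attention` and `is_high_attention` are hypothetical; the source does not give the threshold used in Figure 3.

```python
from statistics import mean

LIKES_THRESHOLD = 5.0  # hypothetical value, not taken from the paper


def social_attention(likes_per_post):
    """Return sf, the sample mean of likes over the posts in the sampling period."""
    return mean(likes_per_post) if likes_per_post else 0.0


def is_high_attention(likes_per_post, threshold=LIKES_THRESHOLD):
    """True when the mean number of likes reaches the threshold."""
    return social_attention(likes_per_post) >= threshold


sf = social_attention([2, 9, 4])
```

Returning 0.0 for a user with no posts is a choice of this sketch; it keeps the threshold comparison well defined without raising an error.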

Experimental Results
The proposed PSD model is compared with Mike Thelwall's TensiStrength [27], which uses a lexical approach with lists of terms related to stress and non-stress. TensiStrength's terms are not only synonyms for stress, anxiety, and frustration but also terms related to anger and negative emotions, because stress can be a response to negative events and can cause negative emotions. It also attempts to detect the opposite state to stress, non-stress, through a parallel approach. This method is based mainly on text data in social media, whereas other equally significant content, such as images and social actions, is ignored, and it can detect stress only from a single textual post. A single post reflects only the instant emotion expressed, whereas people's emotions or psychological stress states are usually more persistent and change over different time periods. Moreover, social media allows users to express their thoughts and feelings not only through textual posts but also through visual posts, i.e., images.

A scenario of a user's posts and its result analysis is shown in Table 2; based on these parameters we achieved a highest F1-score of 94% and an accuracy of 91.9%. To further test our model, we considered data from our application's database, which is real-time content; Table 3 shows the details of that data. The collected data consists of 1,920 posts. Based on these data we evaluated the PSD model, and it showed increased accuracy compared to J. Huang's Teenchat [22] and M. Thelwall's TensiStrength [27] systems for stress detection. Figure 7 shows a comparison of the efficiency of the models. From the results in Table 4, the PSD model achieved better detection performance, with an F1-score of 95.7%, whereas TensiStrength shows an F1-score of 90.10%, reported as 5.9% lower, and J. Huang's Teenchat [22] shows an F1-score of 80.20%, i.e., 15.5% lower than our PSD model.

Number of posts: 1920
Number of users: 20
Number of weeks: 8
Posts per week: 12
Precision: 95%

As shown, TensiStrength [27] achieved 89% accuracy, and it supports only the text parameter for stress detection. J. Huang's Teenchat [22] model achieved 91.5% accuracy based on all three parameters, but its CPU time is much higher than that of our PSD model, which decreases its efficiency. Our PSD model achieved 93% accuracy when all parameters are considered, and it also takes very little CPU time to process and detect stress from the social media posts of individual users.
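For reference, the reported metrics (precision, F1-score, accuracy) follow the usual definitions over a confusion matrix, sketched below. The counts in the example are invented for illustration and are not the paper's data; the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1-score, and accuracy from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives, true negatives,
    with 'stress' taken as the positive class.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Invented counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

Because F1 is the harmonic mean of precision and recall, it can differ noticeably from accuracy on imbalanced data, which is why the comparison reports both.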

Conclusion
Psychological stress is an increasing problem in the world, and it is essential to monitor and control it regularly. In this paper, we presented a model for detecting a user's stress status from the user's daily social media data, leveraging the content of posts along with the user's social interactions. The inefficiency of the conventional methods [5], [16], [19], [22], [27] motivates the adoption of a novel technique that is transparent and inexpensive. Similarly, an ontological approach has also been developed for stress detection, which works better on textual stress words [31]. The proposed strategy uses not only social media data (text and images) but also social interaction content to detect psychological stress, employing a novel hybrid model, i.e., Psychological Stress Detection (PSD). To fully leverage the content of users' posts, the hybrid PSD model combines the probabilistic module (Bayes classifier) with the visual module (HSV model). In our model, the user's stress states are monitored daily, weekly, bi-weekly, or monthly, which improves the accuracy of stress detection at different levels.
Experimental results show that the proposed model improves detection performance when compared to J. Huang's Teenchat [22] and TensiStrength [27], achieving 95% precision and 93% accuracy.

Future Scope
Recent studies show that, apart from cross-media content and social interactions, social connectivity also plays a vital role in detecting the stress states of an individual, which is a useful reference for future related studies. Some further directions for future related studies:
• Our research needs to be extended to the stressors responsible for stress, as it is important to know the type of stress the user is dealing with.
• Support for short forms and slang used in text posts should be included.
• Support for multiple languages should be included.

Fig 1: Sample post containing text, image, and social interactions (i.e., comments and likes)

Visual feature (vF): Rule 2 - Algorithm 1 (Figure 3) checks threshold values to tag visual posts as stress.
  Brightness: vf1 - Mean of image pixels, i.e., brightness
  Saturation: vf2 - Mean of saturation
Social feature (sf): Rule 3 - Algorithm 1 (Figure 3) checks threshold values.
  Social attention: sf - Number of likes and their mean value

Equations 1 and 2 (Probabilistic Module): $P(H \mid X)$ is the probability that the hypothesis H holds given the evidence, i.e., the data sample X [30]. According to Bayes' theorem, it is expressed as shown in equation 1 or 2, where $C_i$ is any one of the classes in the list, i.e., 'No-Stress', 'Stress', and 'Neutral'.

Equations 3 and 4 (Visual Module):
• Brightness or intensity formula
• Saturation formula
where vf1 is the brightness or intensity and vf2 is the saturation, as shown in equations 3 and 4. The detailed computation of the visual module is shown in the Visual Module Algorithm (Figure 5). The visual feature is a set of two features, vF = {vf1, vf2}, which store the mean values of the brightness and saturation calculated in the visual module algorithm. The image brightness and saturation values are compared with threshold values to categorize the image post as stress or non-stress, as shown in Figure 3.

Algorithm 1: Psychological Stress Detector (PSD) Algorithm
Require: pi, ui (social media posts P of users U).
Ensure: 's' (detected stress state, i.e., stress or non-stress)
Initialize P, U, s = {stress, non-stress}, F, vF, sf.
u = getusername()  // returns the current user name
While (i < U and P != NULL) do
    F = Stress(P)  // probabilistic module: calculates stress from textual posts and returns the stress state
    vF = getImageLightness(image)  // visual module: calculates and returns the image brightness
    sf = num_likes()  // social module: calculates and returns the number of likes and their mean value

Fig. 7: Comparison of efficiency of the models based on accuracy and F1-score.

Table 1: Set of knowledge-based predefined logical rules.

Table 2: Scenario of a user's posts and result analysis (Text + Emoji + Image).
Sample posts:
"My life is finished very alone without my parents sad feeling, in pain."
"Its sad when someone you know becomes you knew depressed to"
"This pain is never ending hate this feeling sad, dishearten."
"I am in pain, can't take it anymore hate life depressed."

Table 4: Comparison of efficiency and effectiveness using different models.

Table 3: Details of data taken for result analysis.