INTELLIGENT SYSTEMS AND APPLICATIONS IN ENGINEERING

Diagnosis and Severity of Depression Disease in Individuals with Artificial Neural Networks Method

Abstract: Depression is a disease that causes physiological and psychological problems. In individuals it causes sleep disturbance, constant fatigue, loss of appetite and an inability to carry out daily activities. Sociological, biological and psychological conditions are counted among the causes of depression. The first step in treating depression is to make the correct diagnosis. The Beck Depression Inventory (BDI) is a self-report scale consisting of 21 questions that evaluates the severity of depressive symptoms and the risk of depression. The purpose of the BDI is not to define a diagnosis of depression, but to objectively quantify the degree of depression. The aim of this study is to determine the most successful algorithm, among several machine learning algorithms including artificial neural networks, using a data set based on the BDI scale. Random Forest, Decision Tree, Naive Bayes and Neural Network methods were used in the prediction model for the diagnosis and severity of depression, and the most appropriate estimation algorithm for the problem was determined. The best result was obtained with the "Artificial Neural Network" algorithm: the training rate was 99.9% and the test rate was 98.5%, with loss rates of 0.1% for training and 1.5% for testing. The lowest rates were obtained with the "Decision Tree" algorithm, with 90.8% training and 87.1% test rates. In addition, different results were obtained with the Adam, SGD and L-BFGS-B optimizations used in the ANN algorithm, and the best test success percentage was obtained with the Adam technique.


Introduction
The mood of an individual tends to change from time to time because the environment he or she is in varies. The individual can therefore switch from one mood to another in a short period of time, depending on external stimuli. Although such short-term transitions between emotional states are perceived as normal, their persistence over a long period and reactions that are unusual for the person's age are indicators of psychological health problems [1]. Psychological health covers not only emotions and behaviours but also the ability to maintain appropriate relationships and communication with the environment [2]. Depression, a psychological illness marked by a change in mood, is one of the most common disorders found in psychological health screenings [3]. Today, depression is the most common psychological health problem worldwide. According to 2019 data from the World Health Organization (WHO), it affects three hundred and fifty million people worldwide [2,4]. Depression is a psychological illness whose symptoms include a troubled mood; stillness and slowness in movement and speech; feelings of worthlessness, reluctance and weakness; pessimistic thoughts and feelings; and stagnation and slowdown in physiological activities [5]. It causes feelings of guilt, loss of self-confidence, and appetite, concentration and sleep disorders, and it increases the general health burden of the individual [4]. Although depression can be seen at almost any age, its prevalence is higher in young people than in other age groups. Depression that starts at the end of youth in particular becomes chronic and can lead to psychosocial and occupational problems in later years [6,7]. Depression strongly triggers sadness and a loss of interest in activities that were previously enjoyed [8].
In addition, it causes various physical and emotional problems and reduces a person's ability and performance at home and at work [9]. Despite all these negative effects, depression is among the most treatable psychological health disorders today. The first step in treating depression is to make the correct diagnosis. Blood tests, brain Magnetic Resonance Imaging (MRI), brain mapping (quantitative EEG) and some psychological tests are the primary means of determining the level of depression and diagnosing it [10]. Research and clinical practice indicate that one of the most widely used tools in the world for diagnosing depression is the Beck Depression Inventory (BDI) [11,12]. The BDI [13] is a scale that measures the cognitive, motivational and emotional symptoms seen in depression. It asks how the person has felt in the last week and aims to reveal the state of depression. The scale consists of 21 questions, each scored between 0 and 3 points, so the total score ranges from 0 to 63 points. A total score of 0-9 points indicates minimal depression, 10-16 points mild depression, 17-29 points moderate depression, and 30-63 points severe depression. The BDI has been translated into many languages and has been observed to have a high level of validity and reliability. In our country, the BDI, for which validity and reliability studies were carried out by Kahveci (2016), Küçük (2019) and Öztürk and Uluşahin (2015), is still used in research and clinical applications [14,15,16,17,18,19,20]. The purpose of the BDI is to objectively quantify the degree of depression. The scale is also frequently used to measure the effects of different situations on each other and the relationships between them [19,20].
Deep learning techniques and artificial neural networks, which have proven successful in classification and diagnosis and are used in almost every field today, are among the methods applied in this area [21,22]. New methods based on machine learning are being developed especially for the evaluation of emotional state [23,24,25]. Examples include modelling behavioural hopelessness with an artificial neural network [26], modelling and classifying data on psychiatric diagnoses with artificial neural networks [27], and detecting bipolar disorders using deep learning methods [28]. A statistical classification model that combines an RBF artificial neural network with the C4.5 algorithm has been reported to play an important preventive role by predicting depression in South Korean caregivers [29]. In a study carried out on a large population in Henan, China, BDI scale results were predicted with support vector machine, artificial neural network and decision tree algorithms, and the support vector machine was found to have the smallest regression prediction error and error rate [30]. This study aims to develop a self-training algorithm model for the diagnosis of depression and its severity. Alternative models were created using the Random Forest, Decision Tree, Naive Bayes and Neural Network methods, and the results obtained from these models were compared. In addition, the results obtained with the optimization techniques used in the Neural Network method were compared.

Materials and Methods
This section gives the details of the method used in the study. First, the preparation of the data set is explained. Then alternative models are created using the Random Forest, Decision Tree, Naive Bayes and Neural Network methods, and the results obtained from these models are compared, together with the results of the optimization techniques used in the Neural Network method. The indicators and calculations used to evaluate the performance of the models are also included in this section.

Data
The data set used in this study contains "Beck Depression Inventory" data. The questionnaire consists of 21 questions with 4 answer choices each. Real and up-to-date "Beck Depression Inventory" data for the detection of depression were obtained from Kaggle, an open platform. The data set contains 1401 samples in total. Answer choices are listed from the most positive to the most negative, and each choice has a numerical weight. The first choice, the most positive answer, has a value of 0, and the remaining choices take the values 1, 2 and 3 towards the most negative. In other words, the most negative answer that can be given to a question is represented by the value 3. At the end of the test, the numerical weights of the answers to all 21 questions are added, and the evaluation is made on the resulting total: the degree of the disease is determined according to the ranges given in Table 1 [16].
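
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration and not the procedure used in the study: the function name bdi_severity is hypothetical, and the severity ranges are the ones stated in the Introduction (0-9 minimal, 10-16 mild, 17-29 moderate, 30-63 severe).

```python
def bdi_severity(answers):
    """Return (total score, severity label) for 21 BDI answers.

    Each answer is the weight of the chosen option: 0 (most positive)
    to 3 (most negative). The function name is hypothetical.
    """
    if len(answers) != 21:
        raise ValueError("the BDI has exactly 21 questions")
    if any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("each answer must be 0, 1, 2 or 3")

    total = sum(answers)  # ranges from 0 to 63

    # Severity ranges as stated in the Introduction of the paper.
    if total <= 9:
        label = "minimal"
    elif total <= 16:
        label = "mild"
    elif total <= 29:
        label = "moderate"
    else:
        label = "severe"
    return total, label
```

For example, a respondent who chooses the second option (weight 1) on every question has a total of 21 points, which falls in the moderate range.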

Algorithm Selection
The Orange data analysis program was used to create the models. This study aims to predict whether a person is depressed or not. For this purpose, the Random Forest, Decision Tree, Naive Bayes and Neural Network algorithms were used. In addition, the Adam, SGD and L-BFGS-B optimization methods were used in the Neural Network algorithm.

Random Forest is a classification method developed by Leo Breiman and Adele Cutler that is based on voting. It is formed by combining more than one decision tree. Each tree is built independently from a sample drawn from the data set with the bootstrap technique, and the class is determined by a vote among the individual trees. It is regarded as an advanced form of the bagging method [31]. The Random Forest classifier is much faster in the training phase than other algorithms, especially the boosting method. With its efficiency and accuracy, the Random Forest algorithm has taken its place as a very useful classifier [32].

Decision Trees are built on a certain number of subjects chosen to represent the universe (population). The resulting tree is then tested on subjects from that universe, and the appropriate nodes are reached with the control automation technique [33]. Building a decision tree includes stages such as defining the problem, structuring and drawing the tree, determining the probabilities of the event sequences, calculating the result to be achieved (working backwards), assigning the highest expected benefit to the relevant decision point (working backwards), submitting the proposal, and comparing.

Naive Bayes is a simple and effective statistical estimation technique that shows a high success rate in its application areas [34]. Its working principle is to assume that the attributes in the data depend only on the class, and to assign each sample the most appropriate label on that basis [35].
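
The bootstrap-and-vote idea behind Random Forest can be illustrated with a small sketch. It is not the Orange implementation: the base learner here is a one-threshold "stump" standing in for a full decision tree, the (score, label) pairs are made-up toy data, and all function names are hypothetical. Only the ensemble size of 20 is taken from the Random Forest model reported in this paper.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bootstrap technique)."""
    return [rng.choice(rows) for _ in rows]

def train_stump(rows):
    """Toy base learner standing in for a full decision tree (assumption).

    rows are (score, label) pairs. The stump predicts 1 when the score is
    at or above the midpoint between the two class means, which assumes
    that class 1 has the higher scores.
    """
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    if not pos or not neg:  # the bootstrap sample held a single class
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= threshold else 0

def majority_vote(votes):
    """Return the class that received the most votes."""
    return Counter(votes).most_common(1)[0][0]

def forest_predict(forest, x):
    """Let every tree vote and return the winning class."""
    return majority_vote([tree(x) for tree in forest])

# Hypothetical (total score, label) pairs, not taken from the study's data.
rows = [(2, 0), (5, 0), (8, 0), (12, 1), (20, 1), (35, 1)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
forest = [train_stump(bootstrap_sample(rows, rng)) for _ in range(20)]
```

Each stump sees a different bootstrap sample and therefore learns a slightly different threshold; the vote smooths out these differences, which is the sense in which the method extends bagging.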
A neural network consists of the weight values that connect the elements of the network to each other and the summation functions computed from these values. Artificial neural networks imitate learning, one of the main activities of the human brain. The technique was developed to automatically realize capabilities such as creating, discovering and deriving new information without any assistance [36].
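
The role of the weights and the summation function can be shown with a single artificial neuron. The sketch below is a generic illustration, not the model trained in this study: the sigmoid activation is an assumption chosen for simplicity, and the names neuron and layer are hypothetical.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then an activation."""
    # Summation function: each input is multiplied by its weight and added up.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Sigmoid activation (an assumption) squashes z into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

Training a network means adjusting the weights and biases so that the outputs match the known labels; optimizers such as Adam, SGD and L-BFGS-B differ in how they perform that adjustment.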

Data Collection and Pre-processing
During data pre-processing, noisy records were separated out and processing continued on a total of 500 samples. The resulting data set has a multivariate structure suitable for classification, consisting of 238 women and 262 men, with 23 characteristics.
Whether the individual has depression or not is coded in column 24 (1 or 0). The "Orange" data analysis software was used to analyse and interpret the data. Orange is a data science platform in which data mining techniques and algorithms are used, data are pre-processed, and data analysis models are created and evaluated. The characteristics of the attributes in the data set, which was pre-processed with tools on the Orange platform and made ready for data mining techniques, are given in Table 2.
There are 25 variables in the data set.

Proposed Model
In creating the model, all existing data were first pre-processed and cleaned of noise, yielding a total of 500 real records. The data were obtained from the survey. The "id" column was not included in the data pre-processing. The data were divided into two groups, training and testing, using the "Fixed Proportion of Data" sampling type with a proportion of 75%, together with the "Replicable Sampling" and "Stratify Sample" options. There are 375 records (75%) in the training set and 125 records (25%) in the test set (Table 3). The training process was started with 24 data inputs and one data output (1 classification). The training parameters were: maximum number of iterations = 200, activation function = recipients, and learning rate = 0.005. Then the testing process was carried out. The outputs of the model are 0 (no depression) and 1 (depression).

After many trials, the artificial neural network model that achieved the best performance has 23 inputs, 1 output and 3 hidden layers (Figure 4). Figure 1 shows the model created with Neural Network. According to the model, there are 23 inputs in the input layer and 3 hidden layers with 3, 5 and 7 neurons. The best result was achieved with this model.

The Naive Bayes algorithm, which works according to Bayes' theorem, calculates the probability of each state for an element in this study and classifies it according to the output with the highest probability value.

For the decision tree algorithm model, the data set is divided into smaller and smaller parts. A decision node in the study contains more than one branch, and the first node is called the root node. The decision tree algorithm makes a categorical classification according to its nodes. The decision tree model was created with Nodes = 81, Leaves = 41, Depth (Levels) = 9 and Edge Width = Fixed.

According to Figure 2, in the Random Forest algorithm model the total number of trees is 20 and the depth is 15. A Pythagoras tree graph was obtained by selecting no target class. The graph was drawn according to the target class of depression.
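
The 75%/25% stratified, replicable split described in this section can be sketched as follows. This is an illustration of the idea rather than Orange's own sampling code: the function name stratified_split is hypothetical, and the class balance of 300 to 200 used in the example is invented, since the per-class counts are not given in the text.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split record indices into train and test sets, class by class.

    Stratifying keeps the class proportions the same in both sets, and a
    fixed seed makes the split replicable. Returns two sorted index lists.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)

# Hypothetical labels: 500 records with an assumed 300/200 class balance.
labels = [0] * 300 + [1] * 200
train_idx, test_idx = stratified_split(labels)
```

With 500 records this gives 375 training and 125 test records, matching the sizes reported above, and both sets keep the same ratio of the two classes.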

Results and Discussion
Using the "Beck Depression Inventory" data, models were created with different methods and techniques for the detection of depression, and the results were analyzed. A comparison of the performance of the models for the diagnosis of depression is given in Table 4. Table 4, Figure 4 and Figure 5 compare the training and test results of all algorithms used in the study. According to Table 4, Figure 3 and Figure 4, the best results were obtained with the "Artificial Neural Network" method: the training rate was 99.9% and the test rate was 98.5%, with loss rates of 0.1% for training and 1.5% for testing. The lowest rates were obtained with the "Decision Tree" technique, with 90.8% training and 87.1% test rates. The training and test accuracy and loss rates of the other algorithms are also given. The performance of the models prepared with the Neural Network algorithm and the different optimizations is compared in Table 6. According to Table 5, the optimizations with the highest success rate and the lowest margin of error are Adam and SGD. According to the results in Table 5, the Artificial Neural Network method is the most successful technique in terms of accuracy and has the lowest error rates. ROC analysis graphs for the algorithms used in the study are given in Figure 5 and Figure 6. Figure 6 shows the accuracy graph for patients with positive depression according to the estimation percentages of the algorithms. According to the graph, the best accuracy rate was obtained with the Neural Network technique. The gender distribution of the patients in the data set is given in Figure 7. The distribution by gender of the cases diagnosed with depression can be seen in Figure 8. It is observed that more men than women are diagnosed with depression.
When the data set is examined for correlations, the number of people with suicidal thoughts is given in Figure 9 by gender and by their degree in the questionnaire. People with suicidal ideation are mostly at the minimal degree. Table 6 shows the confusion matrices of the algorithms used in the study on the test data. On the 125 test records, the Neural Network algorithm provides the best performance, with rates of 100% for no depression and 95.7% for depression. The Random Forest and Naive Bayes algorithms also performed well, achieving more than 90% success. The Decision Tree technique gave the lowest result among the algorithms used, with a rate of approximately 87%. According to the experimental results (Table 7), the best performance and the lowest error rate were obtained with the Neural Network technique. According to the 15 test samples in Table 8, Neural Network achieved 98% accuracy, Decision Tree 87%, Random Forest 91% and Naive Bayes 93%.
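
Per-class rates such as those quoted from the confusion matrix can be computed as in the sketch below. It assumes that the reported percentages are the share of each actual class that was predicted correctly, which the text does not state explicitly. The function names are hypothetical and the eight labels are toy values, not the study's test data.

```python
def confusion_counts(actual, predicted):
    """Count true/false negatives and positives for labels 0 and 1."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == 0 and p == 0:
            tn += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 1 and p == 0:
            fn += 1
        else:
            tp += 1
    return tn, fp, fn, tp

def class_rates(tn, fp, fn, tp):
    """Overall accuracy and the correctly predicted share of each actual class."""
    return {
        "accuracy": (tn + tp) / (tn + fp + fn + tp),
        "no_depression": tn / (tn + fp),  # actual 0 predicted as 0
        "depression": tp / (tp + fn),     # actual 1 predicted as 1
    }

# Toy labels for illustration only.
actual = [0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 0]
rates = class_rates(*confusion_counts(actual, predicted))
```

Reporting the two class rates separately, as the confusion matrix does, shows whether a model's errors fall mainly on depressed or on non-depressed individuals, which a single accuracy figure hides.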

Conclusion
The "Beck Depression Inventory" is now frequently used to assess depression and measure its severity. The main purpose of this study is to build an algorithm model that can train itself. Alternative models were created using the Random Forest, Decision Tree, Naive Bayes and Neural Network methods, and the results obtained from these models were compared. The best performance was obtained with the artificial neural network model with 23 inputs, 1 output and 3 hidden layers. When the performance of all algorithm models for the diagnosis of depression is examined, the best results were obtained with the "Artificial Neural Network" method: the training rate was 99.9% and the test rate was 98.5%, with loss rates of 0.1% for training and 1.5% for testing. The lowest rates were obtained with the "Decision Tree" technique, with 90.8% training and 87.1% test rates. When the performance of the models prepared with the Artificial Neural Network algorithm and the different optimizations was examined, the Adam and SGD optimizations were seen to have the highest success rate and the lowest margin of error.
In the next phase of the current study, higher success rates may be achieved by using different algorithms and techniques. In addition, since the subject of the study belongs to the field of health, it would be more appropriate to interpret the results with the opinions of experts in that field. In conclusion, the success rates and detection performance of the applied algorithms reveal the potential of artificial intelligence techniques and optimizations for depression prediction in the field of health.