Psycholinguistics words cause mental disorders

Absolutistic word usage and depression. Is the increased use of absolutist words a marker of self-reported depressive symptoms?


List of abbreviations


1 Introduction

2 Theory
2.1 Cognitive Theory of Depression
2.2 Dichotomous thinking
2.3 Development and psychometry of text analysis programs
2.3.1 Ethical perspective
2.5 Linguistic markers in the context of depression
2.6 Absolutistic thinking and depressive symptoms
2.6.1 Previous study and hypotheses

3 Method
3.1 Data collection
3.2 Text analysis
3.3 Experimental design

4 Results
4.1 Hypothesis 1, 2 and 3
4.2 Hypothesis 4

5 Discussion
5.1 Investigation of depression-related markers
5.1.1 Implications for Practice
5.2 Limitations and Outlook
5.3 Conclusion



For better readability, the feminine form has largely been omitted in this work. This is not due to discriminatory motives but serves only to make the language more fluent.

List of abbreviations

AV Dependent variable

BMI Body mass index

CBT Cognitive Behavioral Therapy

DSM-5 Diagnostic and Statistical Manual of Mental Disorders 5th Edition

HE Extreme responses

LIWC Linguistic Inquiry and Word Count

NLP Natural Language Processing

UV Independent variable

VIF Variance Inflation Factor


There is ample evidence that mental disorders manifest in human language. Dichotomous thinking related to depression can be operationalized at the linguistic level in the form of absolutistic words. In the present observational study, this was done using computer-based analysis of natural language from online forums. For this purpose, forum posts were collected from n = 329 participants, who were divided into a depression-related test group and a control group. Using Linguistic Inquiry and Word Count (LIWC), the posts were examined with regard to the percentage occurrence of selected word categories. In addition to absolutistic words, first-person singular personal pronouns and negative emotion words were also recorded. Using t-tests and a logistic regression, the relationships between the word variables and the dependent variable depression were examined. Absolutistic words, negative emotion words, and personal pronouns are all significantly related to depression. The logistic regression, however, qualifies the predictive power of absolutistic words. These results indicate, on the one hand, the potential of linguistic analyses and, on the other hand, the continuing need for experimental research in the field of linguistic markers.

1 Introduction

Depression is a common mental disorder that affects both mental and physical health. The main symptoms of depression include a lack of interest in normal life activities, insomnia, inability to enjoy life, and suicidal thoughts (Cui, 2015). The Diagnostic and Statistical Manual of Mental Disorders (DSM-5) describes depressive disorders as characterized by the presence of sad, empty, or irritable mood, accompanied by somatic and cognitive changes that significantly impair the functioning of the individual (American Psychiatric Association, 2013). There are also documented negative effects of depression on interpersonal relationships, educational attainment, and financial security (Kessler & Wang, 2009). In addition, patients with major depressive disorder have an increased risk of developing cardiovascular disease, as well as increased morbidity and mortality (Seligman & Nemeroff, 2015). Accordingly, depression is associated with enormous costs at both the individual and the societal level. It is estimated that around 300 million people worldwide suffer from depression, which the World Health Organization classifies as the leading cause of global disability (Smith, 2014). One of the most worrying aspects is that adolescents with major depression are 30 times more likely to die by suicide than their non-depressed peers (Stringaris, 2017). Epidemiological surveys indicate that the lifetime prevalence of depression is 16.6%; in women it is estimated at 21.3% (Kessler & Bromet, 2013). Importantly, depression is a highly recurrent disorder: every depressive episode increases the likelihood that a person will develop recurrent depression (Solomon, 2000).

Although depression is one of the most prevalent global health problems today, its complex pathogenesis is still not well understood, as cultural, psychological, and biological factors all contribute to its development (Gross, 2014). This inadequate understanding of the disease could play a mediating role in the social stigmatization of depression. Those affected are often perceived as lacking willpower and as unwilling to deal with the disease. Because of this, they often hide their condition and refuse to seek help, and as a consequence the symptoms become entrenched.

The report of the European Branch of the World Health Organization (2016) paid particular attention to identifying signs of depression and personalizing online methods of preventing it. Linguistic analysis of specific online forums and social media is discussed as a way to identify symptoms of depression in a timely manner. This enables psychologists to suggest preventive and intervention measures to risk groups at an early stage.

According to Beck's cognitive model of depression, the biased acquisition and processing of information plays a primary role in the development and maintenance of depression (Beck, 1972). The mechanisms underlying the model, such as dysfunctional cognitions, have been linked in numerous studies to the onset and persistence of depression (Disner et al., 2011). There are various research approaches to investigating individual constructs of the cognitive model of depression. There are also scientific efforts to integrate modern neurobiological and psychological findings into the model. The present work, however, focuses on the connection between depression and a dichotomous style of thinking, colloquially known as black-and-white thinking. In this context, Al-Mosaiwi and Johnstone (2018) examined the extent to which a dichotomous style of thinking becomes visible at the linguistic level. In a software-based linguistic analysis of test and control forums, they found an increased use of absolutistic words in the forums that were associated with affective and depressive disorders. Absolutistic words are understood here to mean terms that do not allow any nuance.

Based on the study by Al-Mosaiwi and Johnstone (2018), the present study addresses the question of the extent to which absolutistic word usage functions as a marker for self-reported depressive symptoms in online forums. This central question is examined using an analytical research design.

Chapter 2.1 first presents the cognitive theory of depression according to Beck. Chapter 2.2 examines dichotomous thinking in more detail. Chapter 2.3 covers the development and psychometry of text analysis programs, including the ethical discourse on the use of such software. This is followed by a presentation of the relevant literature on linguistic markers in the context of depression. Chapter 2.6 reviews previous research on absolutistic word usage and depression and then sets out the hypotheses of the present work. Chapter 3 describes the methodological procedure with regard to the research design, the evaluation software, and the statistical measures. Chapter 4 presents both the descriptive statistics of the evaluated data and the inferential statistical tests of the hypotheses. The work closes with the discussion in Chapter 5, which ends with a conclusion.

2 Theory

Absolutistic words are terms that by definition do not allow any nuance, such as the words "always" or "never". They are hypothesized to be a manifest variable of a dichotomous thinking style, which in turn is related to depressive symptoms and represents a cognitive bias (Al-Mosaiwi & Johnstone, 2018). For this reason, the present work speaks of an absolutist style of thinking. The basic understanding of cognitive distortions was established by Beck (1963) with his cognitive theory of depression. The findings of the present study are discussed in the context of Beck's depression model in order to work out the connection between language and depression.

2.1 Cognitive Theory of Depression

The cognitive theory, or cognitive model, of depression was formulated by Beck and addresses the development and maintenance of depression. The theory is based on various clinical and theoretical papers (Beck, 1963, 1964). Beck's theory differs fundamentally from previous theories in that he sees thought processes as the primary source of depression and not as a consequence of it (Beck, 1963).

His theory is based on three main assumptions: (1) dysfunctional schemata, (2) automatic thoughts, and (3) the negative cognitive triad. According to Beck, experiences shape schemata, which in turn determine cognitions and thus the way people think. According to this view, depressed people have dysfunctional schemata that lead to automatic negative thoughts, which in turn revolve around the three constructs of the negative cognitive triad: self, environment, and future. The negative cognitive triad is also maintained by so-called cognitive distortions, such as dichotomous thinking, selective abstraction, or arbitrary inference. It is easy to see how each of these cognitive biases tends to maintain the negative cognitive triad. If a person's views of themselves, their environment, and the future are already negative, and the person tends to focus on negative events or jump to negative conclusions, those negative thoughts are unlikely to go away. Accordingly, Beck's Cognitive Behavioral Therapy (CBT), which is used to treat depression, requires a critical analysis of exactly which cognitive patterns and distortions are associated with a patient's symptoms.

Although Beck's theory has been supplemented by empirical evidence over time and has been confirmed on many points (Clark, Beck, & Alford, 1999), it has also drawn criticism. Beck postulated that dysfunctional schemata are a vulnerability factor for depression and are activated when stressors occur. In fact, it is usually sufficient to induce a depressed mood in a previously depressed person, for example by having them listen to sad music or recall sad memories, in order to activate latent depressive schemata (Ingram et al., 2006). In a more recent work, Beck (2016) was able to integrate the controversial and diverse findings, so that his model still appears up to date today. Accordingly, there is a continuity of cognitive structure and function in all areas that are relevant to depression. Starting with the earliest stage of cognition, negative perceptions and appraisals lead successively to negative thoughts and beliefs. The beliefs embedded in schemata further influence information processing and interpretation with regard to the predisposition to and triggering of major depression. The fact that the model is still up to date supports the assumption that dichotomous thinking is a relevant factor in maintaining depression.

2.2 Dichotomous thinking

Consistent with cognitive models of psychopathology, a tendency towards dichotomous thinking has been associated with a range of psychiatric symptoms and disorders, suggesting a transdiagnostic process. For example, this dysfunctional thinking style is relatively common in people with mood disorders. As early as the 1960s it was established that suicidal patients see life in black and white and are "locked up" in their perception, unable to imagine alternatives or consider new ways of solving their problems. Taken separately, cognitive rigidity, dichotomous thinking, and problem solving have shown consistent and strong associations with suicidal thinking and behavior, but they are increasingly viewed as closely related processes (Ellis, 1987; Ellis & Rutherford, 2008).

A study by Teasdale et al. (2001) examined the cognitive mediation of relapse prevention through cognitive therapy in 158 patients with persistent depressive symptoms. Test scores based on agreement with the item contents of five questionnaires on depression-related cognition did not provide any evidence of cognitive mediation. However, a measure of the form of responses to these questionnaires, namely the number of times patients chose extreme response categories such as "totally agree" and "totally disagree", significantly and substantially predicted relapse, showed a differential response to cognitive therapy, and met the criteria for mediation. The therapy reduced the relapse rate by reducing the absolutist or dichotomous style of thinking. Thus, cognitive therapy may prevent relapse by training patients to change the way they process depression-related material rather than by changing their belief in depressive thought content. The Dichotomous Thinking Inventory (Oshio, 2009) assesses three dimensions: (1) preference for dichotomy, (2) dichotomous beliefs, and (3) profit-and-loss thinking. The preference for dichotomy reflects a preference for distinctness and unambiguous situations. Profit-and-loss thinking involves planning how to take advantage of situations and avoid disadvantages, such as the desire to clarify whether information is useful or useless. Oshio also provided preliminary evidence of test-retest reliability, internal consistency, factor structure, and convergent validity. The Dichotomous Thinking Inventory makes clear that this cognitive distortion has several facets, which can be considered separately depending on the context.
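The extreme-response measure reported for Teasdale et al. (2001) above can be illustrated with a minimal sketch. The 7-point scale, the function name, and the two sets of answers are assumptions made purely for illustration; they are not taken from the original study.

```python
# Assumption: a 7-point agreement scale on which 1 = "totally disagree"
# and 7 = "totally agree". The scale actually used in the study may differ.
SCALE_MIN, SCALE_MAX = 1, 7


def count_extreme_responses(responses, low=SCALE_MIN, high=SCALE_MAX):
    """Count answers at either end point of the scale, ignoring item content."""
    return sum(1 for answer in responses if answer in (low, high))


# Two hypothetical patients with the same total score (24) ...
patient_a = [1, 7, 7, 1, 4, 4]
patient_b = [3, 5, 4, 4, 4, 4]

# ... but a very different form of responding.
extreme_a = count_extreme_responses(patient_a)  # 4 extreme answers
extreme_b = count_extreme_responses(patient_b)  # 0 extreme answers
```

Both hypothetical patients agree with the item content to the same overall degree, yet only the first answers in an all-or-nothing manner. This is the distinction between the content and the form of responses described above.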

A study by Dove, Byrne, and Bruce (2009) showed that dichotomous thinking moderated the association between depression and obesity among obese and overweight individuals in weight-loss treatment. Individuals with a dichotomous thinking style tend to have similar levels of depressive symptoms regardless of whether they are obese or overweight. This suggests that any deviation from what a dichotomous thinker considers an acceptable body weight can increase the risk of depression. The aim of a study by Antoniou et al. (2017) was to investigate the mediating role of emotional eating and dichotomous thinking in the relationship between depression and obesity. The study used data from 205 participants in a community-based study carried out at Maastricht University in the Netherlands. Self-reported data on depression, emotional eating, body mass index (BMI), and dichotomous thinking were collected and the corresponding scores calculated in a cross-sectional research design. The primary analysis tested the hypothesis that depression affects BMI indirectly through dichotomous thinking and emotional eating. A two-mediator model was used to estimate the direct and indirect effects of depression on BMI via emotional eating and dichotomous thinking. Depressive symptoms were positively correlated with BMI, emotional eating, and dichotomous thinking. Accordingly, a dichotomous style of thinking could partially explain the relationship between depression and BMI.
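As a rough orientation, a two-mediator model of this kind can be written as a set of linear regression equations. The parallel specification below is an assumption, since the summary above does not state whether the mediators were modeled in parallel or in sequence. All symbols are introduced here for illustration: X stands for depressive symptoms, M_1 for emotional eating, M_2 for dichotomous thinking, and Y for BMI.

```latex
\begin{align}
M_1 &= i_1 + a_1 X + \varepsilon_1 \\
M_2 &= i_2 + a_2 X + \varepsilon_2 \\
Y   &= i_3 + c' X + b_1 M_1 + b_2 M_2 + \varepsilon_3 \\
c   &= c' + a_1 b_1 + a_2 b_2
\end{align}
```

Here c' is the direct effect of depression on BMI, the products a_1 b_1 and a_2 b_2 are the indirect effects via the two mediators, and c is the total effect. Partial mediation, as described above, corresponds to non-zero indirect effects alongside a remaining direct effect c'.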

The investigation of connections between a dichotomous thinking style and human language requires different measuring instruments than, for example, the Likert scales used by Teasdale et al. (2001). These include psycholinguistic methods that are used today in the form of computer programs and are based on a number of psychological principles.

2.3 Development and psychometry of text analysis programs

The roots of modern text analysis go back to the beginnings of psychology. Freud (1901) wrote about slips of the tongue, in which a person's hidden intentions are revealed in apparent language errors. Rorschach (1921) developed projective tests for detecting thoughts, intentions, and motives from the description of ambiguous ink blots. McClelland (1979) and a generation of researchers working with the Thematic Apperception Test found that the stories people tell in response to drawings provide important clues to their needs for affiliation, achievement, and power. More general and less stimulus-bound approaches developed in the 1950s. Gottschalk and Gleser (1969) developed a content-analytical method with which themes based on Freud's theory could be traced in text samples.

The first computerized text analysis program for measuring general relationships in the field of psychology was developed by Stone, Dunphy, Smith, and Ogilvie (1966). With the help of a mainframe computer, they built a complex program that adapted McClelland's need-based coding scheme to any open-ended text. The program, called General Inquirer, is based on a number of complex algorithms. The General Inquirer has proven to be a valuable measuring instrument in the differentiation of mental disorders, the assessment of personality dimensions, and the assessment of speeches. One limitation of this program, however, is that it relied on the manipulation and weighting of language variables that were not visible to the user. The first transparent text analysis method was developed by Weintraub (1989). Over a decade, he hand-counted people's words in political speeches and medical interviews. He noted that the use of first-person singular pronouns such as "I" or "my" is reliably related to the degree of depression. Although his methods were valid and his findings consistently correlated with relevant outcome measures, his work was largely ignored. His observation that the words of everyday language reflect psychological states was nevertheless far-sighted and correct.

In the 1980s, researchers discovered that people showed improvements in physical health when asked to write about emotional upheavals in their lives (Pennebaker & Beall, 1986). To find a more efficient way of evaluating these essays, they looked for a computerized text analysis program. No such program existed at the time, so they set about developing suitable software themselves. Their idea was to create a program that counted words in psychologically relevant categories across several text files. The result is a constantly evolving computer program (LIWC; Pennebaker & Francis, 1999). The LIWC program has two central features: the processing component and the associated dictionaries. The processing component is the program itself, which opens a series of text files from different sources. Each word in a given text file is compared with the dictionary file and assigned to a category. The program is continuously revised (Pennebaker, Boyd, Jordan & Blackburn, 2015) and adapted to new findings (Tausczik & Pennebaker, 2010). Further methodological details of the software are examined in the method section of the present work, since LIWC serves as the evaluation method.
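The dictionary-based counting principle described above can be sketched in a few lines of Python. This is a simplified illustration and not the LIWC implementation: the three categories and their word lists are hypothetical stand-ins for the much larger LIWC dictionaries, the function name is invented, and the sketch matches whole words only.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only; the actual LIWC
# dictionaries are far larger and are not reproduced here.
CATEGORIES = {
    "absolutist": {"always", "never", "completely", "totally", "nothing", "everything"},
    "i_pronoun": {"i", "me", "my", "mine", "myself"},
    "negemo": {"sad", "hopeless", "worthless", "afraid", "hate"},
}


def category_percentages(text, categories=CATEGORIES):
    """Return, for each category, the percentage of words in `text` that belong to it."""
    # Lower-case the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        # Compare each word with every category's word list.
        for name, vocabulary in categories.items():
            if word in vocabulary:
                counts[name] += 1
    total = len(words)
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in categories
    }


post = "I always feel sad and nothing ever helps me"
percentages = category_percentages(post)
```

For the nine-word example post, two words fall into the hypothetical absolutist category ("always", "nothing"), two into the first-person singular category ("I", "me"), and one into the negative emotion category ("sad"). Expressing each category as a percentage of all words makes texts of different lengths comparable.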

The methods described here fall under what is known as natural language processing (NLP), that is, the computer-based processing of large data sets containing natural human language. The use of language analysis programs not only opens up new opportunities, such as predicting academic success (Pennebaker et al., 2014), but also entails social and ethical responsibility.

2.3.1 Ethical perspective

The medical sciences have long had codes of ethics for experiments in order to minimize the risk of harm to participants. NLP was initially used to examine mostly anonymous corpora with the aim of enriching linguistic analysis, so it initially seemed unlikely to raise ethical concerns. As NLP continues to spread and draws on more and more data, especially from social media, the situation has changed. The results of NLP experiments and applications can now have a direct impact on the lives of individual users. Until recently, this topic received little attention within the field. Public discourse, on the other hand, focused on a disproportionate portrayal of the dangers (Trotzek et al., 2018). The scientific discussion mainly concerns exclusion, over-generalization, exposure problems, and dual use.

As a result of the situatedness of language, each data set carries a demographic bias, that is, latent information about the demographics of the people who produced it. Overfitting to this information can have serious implications for the applicability of findings. In psychology, most studies are carried out with Western, educated, industrialized, rich, and democratic research participants. The tacit assumption that knowledge about this group can easily be transferred to other demographic groups has resulted in a distorted body of psychological data. The possible consequences include the exclusion of groups of people or demographic misrepresentation. This in itself poses an ethical problem for research and threatens the universality and objectivity of scientific knowledge (Merton, 1973).

Over-generalization is seen as a side effect of modeling. One example is the inference of user attributes in NLP, which leads to promising areas of application such as recommendation engines and fraud detection (Badaskar et al., 2008). At first glance, the costs of false positive results seem negligible. An incorrectly addressed email or a mistaken birthday greeting is usually considered harmless. In scientific practice, however, reliance on models that produce false positive results can lead to massive bias. A flawed system for predicting sexual orientation, religious beliefs, or psychological abnormalities would likely have far more serious consequences. Depending on the sensitivity of the data, appropriate precautionary measures should be taken.

Thematic overexposure also creates distortions. Unlike exclusion and over-generalization, which can be addressed algorithmically, overexposure arises from the research design. Such thematic overexposure can lead to a psychological effect known as the availability heuristic (Tversky & Kahneman, 1973). When people can remember a certain event or have specific knowledge, they automatically attach greater importance to it. Such heuristics are ethically charged when characteristics such as violence or negative emotions are more strongly associated with certain groups or ethnicities (Slovic et al., 2007). If research repeatedly finds that the language of a particular demographic group is abnormal, it can create a situation in which that group itself is considered abnormal. The group is thus stigmatized through thematic overexposure.

Another concern is subsumed under the term dual use. In this context, this means the misappropriation of linguistic methods, for example through the commercial use of acquired data (Oltmann, 2015). The main issue here is the extent to which scientists should make the programs they develop accessible to the general public. For example, software for the linguistic identification of symptoms of depression could be used by employers to screen the online profiles of potential applicants in advance.

The problems presented imply that advances in language analysis can have negative consequences. Countermeasures against exclusion include bias control techniques. To avoid over-generalization, dummy labels, error weightings, or confidence intervals are used. Exposure problems can only be addressed through careful, objective research design. Dual-use problems seem difficult to solve at the level of the individual researcher and instead require the concerted effort of the scientific community. In order to comply with a code of ethics, it should be a priority to emphasize the usefulness of NLP for humans. Text analysis related to depression, for example, has produced numerous linguistic markers that can lead to better early detection. According to Brockmeyer et al. (2015), the analysis of narrative language already counts among the diagnostic instruments of psychotherapy. Linguistic indicators are considered to be psychopathological markers that can contribute to the success of psychotherapy, early detection, or suicide prevention (Leiva & Freire, 2017).

2.5 Linguistic markers in the context of depression

James J. Bradac (1999) emphasized the numerous ways in which scientists can study both language and human communication at the same time. He understood the value of highly controlled laboratory studies while also recognizing the importance of natural human language. It was particularly important to him, however, that linguistic research replicate its theories and results across a wide range of methods and samples so that a consistent picture emerges.

Negative emotion words are a widely studied marker of depression. This rests mainly on the seemingly obvious logic that depressed people experience more negative emotions. Several studies describe such a connection (Bucci & Freedman, 1981; Fekete, 2002; Rude et al., 2004; Weintraub, 1989). Despite these intuitive relationships, it has been claimed that pronouns and absolutistic words are stronger predictors of depression than negative emotion words, which are content words. Pronouns and absolutistic words are function words; they can be measured as implicit markers and reflect a person's way of thinking regardless of content (Pennebaker & Chung, 2013). Nevertheless, negative emotion words have long been considered a suitable marker, which is why they are also examined in this work.

A study by Rude, Gortner, and Pennebaker (2004) examined essays written by college students who were currently depressed, had previously been depressed and recovered, or had never been depressed. The essays were examined for linguistic differences. The aim was to shed light on the cognitive operations associated with depression and vulnerability to depression. A text analysis program calculated the frequency of words in predetermined categories. In accordance with Beck's cognitive model of depression, currently depressed participants used more negative emotion words and more self-related words than never-depressed participants. Formerly depressed participants did not differ from never-depressed participants on these indices of depressive processing. Consistent with the prediction, however, the formerly depressed participants used the word "I" across all essays significantly more often than participants without depression. The frequent use of singular pronouns is commonly explained by a model by Pyszczynski and Greenberg (1987), in which heightened self-awareness is viewed as a consequence of the loss of a source of self-worth. Edwards and Holtzman (2017) performed a meta-analysis of studies that measured correlations between depression and the use of first-person singular pronouns. This also included numerous unpublished studies. The fixed-effects analysis showed a small correlation by modern standards (r = .13). The correlation was not moderated by gender or by whether the effect had been published. These results support the view that the use of singular pronouns is a linguistic marker of depression, although the correlations are relatively low. In addition, Zimmermann, Brockmeyer, Hunn, Schauenberg, and Wolf (2017) carried out a longitudinal study with 29 patients with clinical depression. Self-focused attention, measured by first-person singular pronouns, significantly predicted depressive symptoms.
Exploratory analyses showed that this effect was mainly driven by the use of reflexive and possessive words such as "me" or "my". Consequently, it seems reasonable to include these pronouns in a dictionary in the context of depression research. Lumontod (2020), who examined depressive symptoms in students, showed that these results are inconsistent: he found that the word "I" in particular was related to depression. Such contradictions show that no universally valid statements can be made about singular pronouns.

