Sentiment Analysis

Sentiment analysis detects the emotion the author expresses in a text. Was the author

  • happy (I loved it.),

  • neutral (We went to London.),

  • unhappy (The lunch was not good at all.), or

  • ambivalent (The lunch was good, but expensive.)

about their experience?

You can detect the sentiment of reviews, feedback, or customer service inquiries. Analysing the sentiment of news articles (opinion pieces aside) usually does not make much sense, because sentiment analysis does not aim to distinguish good news from bad news; it aims to capture the opinion the author expresses about something.

Our sentiment analysis is driven by practical needs, not academic purity.

Textual sentiment

Since it is often hard to express sentiment as a single number, we give you three numbers (all three in the interval [-1.0, 1.0]):

  • mean - the average sentiment

  • pos - the positive sentiment of the text (ignoring its negative sentiment); returned as "positive" in the response

  • neg - the negative sentiment of the text (ignoring its positive sentiment); returned as "negative" in the response

{
    "language": {"detected": "en"},
    "docSentiment": {
        "mean": 0.6,
        "label": "positive",
        "positive": 0.8,
        "negative": -0.2
    },
    "usedChars": 100
}
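As a minimal sketch of how a client might read these numbers, the snippet below parses the example response above into a small typed record. The `Sentiment` class and the `parse_sentiment` helper are hypothetical names introduced for illustration; only the JSON keys come from the example.

```python
import json
from typing import NamedTuple


class Sentiment(NamedTuple):
    """Hypothetical container for one sentiment object from the response."""
    mean: float
    pos: float
    neg: float
    label: str


def parse_sentiment(obj: dict) -> Sentiment:
    # The response spells the components out as "positive" and "negative".
    return Sentiment(
        mean=obj["mean"],
        pos=obj["positive"],
        neg=obj["negative"],
        label=obj["label"],
    )


# The example response shown above, embedded as a string.
response = json.loads("""
{
    "language": {"detected": "en"},
    "docSentiment": {
        "mean": 0.6,
        "label": "positive",
        "positive": 0.8,
        "negative": -0.2
    },
    "usedChars": 100
}
""")

doc = parse_sentiment(response["docSentiment"])
print(doc.label, doc.mean, doc.pos, doc.neg)
```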

For clearly negative or clearly positive texts, reporting a single number (mean) is enough:

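| mean | label         | pos  | neg  | text                             |
|------|---------------|------|------|----------------------------------|
| +0.5 | positive      | +0.5 | +0.0 | Staff was nice.                  |
| +0.7 | very positive | +0.7 | +0.0 | Staff was absolutely astounding. |
| -0.5 | negative      | +0.0 | -0.5 | Food was overpriced.             |
| -0.7 | very negative | +0.0 | -0.7 | Food was absolutely terrible.    |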

Here, the positive (pos) and negative (neg) components do not provide any extra information: each is either zero or equal to the mean.

However, reviews are often ambivalent. There are some things that the author liked and some that they did not. For such texts, expressing their sentiment as a single number is typically not enough. This is when the positive and negative sentiment components come in handy.

Consider the following examples. In both cases, the overall sentiment is 0, but the second text is more extreme (in both its positive and its negative judgment). This is expressed in the positive and negative components.

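| mean | label      | pos  | neg  | text                                                           |
|------|------------|------|------|----------------------------------------------------------------|
| +0.0 | ambivalent | +0.3 | -0.3 | Staff was nice. Food was overpriced.                           |
| +0.0 | ambivalent | +0.4 | -0.4 | Staff was absolutely astounding. Food was absolutely terrible. |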

When calculating the components, passages with the opposite polarity are considered neutral. Therefore, the negative components of the following two texts are identical:

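| mean | label      | pos  | neg  | text                                                           |
|------|------------|------|------|----------------------------------------------------------------|
| +0.0 | ambivalent | +0.4 | -0.4 | Staff was absolutely astounding. Food was absolutely terrible. |
| -0.4 | negative   | +0.0 | -0.4 | Today is Tuesday. Food was absolutely terrible.                |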
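To illustrate the idea of treating opposite-polarity passages as neutral, here is a rough sketch. It assumes hypothetical per-sentence scores and plain averaging over sentences. This is a simplified model chosen for illustration, not necessarily how the service computes the components. The `components` function is an invented name.

```python
def components(sentence_scores):
    """Return (mean, pos, neg) for a list of per-sentence scores.

    Assumption: each component is a plain average over all sentences,
    with sentences of the opposite polarity counted as 0.0 (neutral).
    """
    n = len(sentence_scores)
    if n == 0:
        return 0.0, 0.0, 0.0
    # Positive component: negative sentences contribute nothing.
    pos = sum(max(s, 0.0) for s in sentence_scores) / n
    # Negative component: positive sentences contribute nothing.
    neg = sum(min(s, 0.0) for s in sentence_scores) / n
    return pos + neg, pos, neg


# Assumed sentence scores: "astounding" = +0.7, "terrible" = -0.7,
# and "Today is Tuesday." = 0.0.
mixed = components([0.7, -0.7])
plain = components([0.0, -0.7])

# The negative component is the same whether the other sentence
# is positive or neutral.
print(mixed[2] == plain[2])
```

In this model, the positive sentence in the first text and the neutral sentence in the second text affect the negative component in exactly the same way: both only increase the number of sentences the negative score is averaged over.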

The labels are provided just for convenience. By default, they are set as follows:

  • neutral: the value is in the interval [-0.1, 0.1]

  • ambivalent:
    • neither component is neutral, and

    • neither component is significant (i.e., the mean sentiment is in the interval [-0.5, 0.5])

    • for example, {mean: 0.0, pos: 0.4, neg: -0.4} and {mean: -0.1, pos: 0.4, neg: -0.6} are both ambivalent,

    • whereas {mean: -0.6, pos: 0.2, neg: -1.0} is negative, because there is far more negativity than positivity
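One possible reading of these default rules, written as a small function. This is only a sketch: the constant names and the `label` function are hypothetical, and the threshold separating "positive" from "very positive" is not stated above, so it is left out. The service may apply different settings (for example per domain), so the label returned in the response should be treated as authoritative.

```python
NEUTRAL_LIMIT = 0.1      # values in [-0.1, 0.1] count as neutral
SIGNIFICANT_LIMIT = 0.5  # a mean in [-0.5, 0.5] counts as not significant


def is_neutral(value):
    return abs(value) <= NEUTRAL_LIMIT


def label(mean, pos, neg):
    """Assign a coarse label under one reading of the default rules."""
    # Ambivalent: both components are present, and neither dominates.
    if (not is_neutral(pos) and not is_neutral(neg)
            and abs(mean) <= SIGNIFICANT_LIMIT):
        return "ambivalent"
    if is_neutral(mean):
        return "neutral"
    # Assumption: the "very positive"/"very negative" cut-off is unknown,
    # so only the polarity is reported here.
    return "positive" if mean > 0 else "negative"


print(label(0.0, 0.4, -0.4))
print(label(-0.6, 0.2, -1.0))
```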

Note that a typical text contains a lot of neutral passages that can “dilute” the sentiment of the whole text.

Item sentiment

In addition to the sentiment of the text, we can return so-called item sentiment: sentiment related to entities or relations. Note that at this moment, the sentiment of an item is derived from the sentiment of the sentences the item occurs in. This has a practical consequence: a positively judged entity in a sentence that is negative overall will be assigned a negative sentiment. For example, actor would be assigned negative sentiment in the following sentence: "Even the talented actor could not make up for the absolutely disastrous script."

Sample call

You can easily try it yourself:

# On Windows, use \" instead of " and " instead of '
curl -X POST https://api.geneea.com/v3/analysis \
-H 'Authorization: user_key <YOUR USER KEY>' \
-H 'Content-Type: application/json' \
-d '{
    "id": "1",
    "text": "The trip to London was amazing. Only the food was weird. Especially the pizza was terrible.",
    "analyses": ["sentiment", "entities", "relations"],
    "returnItemSentiment": "true",
    "domain": "voc-hospitality"
}'
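The same request can be built in Python with the standard library only. This is a sketch that mirrors the curl command above; `build_request` and `analyze` are hypothetical helper names, and error handling is omitted.

```python
import json
import urllib.request

API_URL = "https://api.geneea.com/v3/analysis"


def build_request(user_key, text, domain="voc-hospitality"):
    """Build the POST request with the same headers and body as the curl call."""
    payload = {
        "id": "1",
        "text": text,
        "analyses": ["sentiment", "entities", "relations"],
        "returnItemSentiment": "true",
        "domain": domain,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"user_key {user_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def analyze(user_key, text):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(user_key, text)) as resp:
        return json.load(resp)
```

Calling `analyze("<YOUR USER KEY>", "The trip to London was amazing.")` sends the request; `build_request` alone performs no network access, which makes it easy to inspect the payload first.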

You should get the following response:

{
    "id": "1",
    "language": {"detected": "en"},
    "entities": [
        {"id": "E0", "gkbId": "HSP-1091", "stdForm": "pizza", "type": "food"}
    ],
    "relations": [
        {"id": "R0", "name": "amazing", "textRepr": "amazing(trip)", "type": "ATTR", "args": [{"type": "SUBJECT", "name": "trip"}], "feats": {"negated": "false", "modality": ""}},
        {"id": "R1", "name": "weird", "textRepr": "weird(food)", "type": "ATTR", "args": [{"type": "SUBJECT", "name": "food"}], "feats": {"negated": "false", "modality": ""}},
        {"id": "R2", "name": "terrible", "textRepr": "terrible(pizza)", "type": "ATTR", "args": [{"type": "SUBJECT", "name": "pizza", "entityId": "E0"}], "feats": {"negated": "false", "modality": ""}}
    ],
    "docSentiment": {"mean": -0.1, "label": "negative", "positive": 0.2, "negative": -0.3},
    "itemSentiments": {
        "E0": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5},
        "R0": {"mean": 0.5, "label": "positive", "positive": 0.5, "negative": 0.0},
        "R1": {"mean": -0.4, "label": "negative", "positive": 0.0, "negative": -0.4},
        "R2": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5}
    },
    "usedChars": 100
}
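The keys of "itemSentiments" are the ids of the entities and relations in the same response. As a sketch of how a client could join them, the snippet below works on a trimmed copy of the response above (only the fields needed for the join are kept). `describe_items` is a hypothetical helper name.

```python
import json

# Trimmed copy of the sample response: ids, display names, item sentiments.
RESPONSE = json.loads("""
{
    "entities": [
        {"id": "E0", "stdForm": "pizza", "type": "food"}
    ],
    "relations": [
        {"id": "R0", "textRepr": "amazing(trip)"},
        {"id": "R1", "textRepr": "weird(food)"},
        {"id": "R2", "textRepr": "terrible(pizza)"}
    ],
    "itemSentiments": {
        "E0": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5},
        "R0": {"mean": 0.5, "label": "positive", "positive": 0.5, "negative": 0.0},
        "R1": {"mean": -0.4, "label": "negative", "positive": 0.0, "negative": -0.4},
        "R2": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5}
    }
}
""")


def describe_items(response):
    """Return (name, label, mean) for every item that has a sentiment."""
    # Map each id to a readable name: stdForm for entities,
    # textRepr for relations.
    names = {e["id"]: e["stdForm"] for e in response.get("entities", [])}
    names.update({r["id"]: r["textRepr"] for r in response.get("relations", [])})
    return [
        (names.get(item_id, item_id), sentiment["label"], sentiment["mean"])
        for item_id, sentiment in response.get("itemSentiments", {}).items()
    ]


items = describe_items(RESPONSE)
for name, item_label, mean in items:
    print(f"{name}: {item_label} ({mean:+.1f})")
```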

You can see that

  • the whole document is judged as slightly negative ("docSentiment": {"mean": -0.1, "label": "negative", ...}),

  • with some positive ("docSentiment": { ... "positive": 0.2 ... }) and some negative aspects ("docSentiment": { ... "negative": -0.3}),

  • pizza is judged as negative ("E0": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5}),

  • amazing trip is judged as positive ("R0": {"mean": 0.5, "label": "positive", "positive": 0.5, "negative": 0.0}),

  • weird food is judged as negative ("R1": {"mean": -0.4, "label": "negative", "positive": 0.0, "negative": -0.4}), and

  • terrible pizza is judged as negative ("R2": {"mean": -0.5, "label": "negative", "positive": 0.0, "negative": -0.5}).

Customization

We can customize our sentiment analysis to your needs.