Keboola App

Our Keboola app makes it easy to use our General API in Keboola Connection, a cloud ETL.

The app can analyze any text, but the standard models are optimized for three domains: news articles, hospitality customer care and transportation customer care. In other domains, the results will be less accurate. To get the best results for your domain, we will be happy to provide you with a customized model; basic customization is free. Contact us at info@geneea.com.

Output Tables

When you run the app, it creates the following tables:

  • analysis-result-documents – document-level results

  • analysis-result-entities – entity-level results

  • analysis-result-relations – contains relations and attributes found

  • analysis-result-sentences – contains information about individual sentences

analysis-result-documents table

The analysis-result-documents table contains document-level results in the following columns:

  • id – all id columns from the input table (used as primary keys)

  • language – detected language of the document, as ISO 639-1 language code

  • sentimentValue – detected sentiment of the document (a decimal number between -1 and 1)

  • sentimentPolarity – detected sentiment of the document (1, 0 or -1)

  • sentimentLabel – sentiment of the document as a label (positive, neutral or negative)

  • usedChars – the number of characters used by this document

For the input "I ordered a good pizza.", the table will contain the following information:

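| id_article | language | sentimentValue | sentimentPolarity | sentimentLabel | usedChars |
|------------|----------|----------------|-------------------|----------------|-----------|
| 123        | en       | 0.5            | 1                 | positive       | 22        |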
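As a rough illustration of how this table might be consumed downstream, the sketch below counts documents per sentiment label and sums usedChars. It assumes the table has been exported as CSV with the column names listed above; the second data row and the function name summarize_documents are made up for this example.

```python
import csv
import io
from collections import Counter

# Assumed CSV export of analysis-result-documents. The first data row is the
# example from this page; the second row is hypothetical.
DOCUMENTS_CSV = """id_article,language,sentimentValue,sentimentPolarity,sentimentLabel,usedChars
123,en,0.5,1,positive,22
124,en,-0.4,-1,negative,31
"""


def summarize_documents(csv_text):
    """Count documents per sentiment label and sum the characters used."""
    labels = Counter()
    total_chars = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        labels[row["sentimentLabel"]] += 1
        total_chars += int(row["usedChars"])
    return dict(labels), total_chars


label_counts, used_chars = summarize_documents(DOCUMENTS_CSV)
print(label_counts, used_chars)
```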

analysis-result-entities table

The analysis-result-entities table contains entity-level results in the following columns:

  • id – all id columns from the input table (used as primary keys)

  • type – type of the found entity, e.g. person, time, number, organization, tag (main topic of the document)

  • text – disambiguated and standardized form of the entity, e.g. John Smith, Keboola, safe carseat

  • score – expresses the importance of a tag in the text

  • entityUid – ID of the entity

  • sentimentValue – detected sentiment of the entity (a decimal number between -1 and 1)

  • sentimentPolarity – detected sentiment of the entity (1, 0 or -1)

  • sentimentLabel – sentiment of the entity as a label (positive, neutral or negative)

For the input "I ordered a good pizza.", the table will contain the following information:

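| id_article | type | text  | score | entityUid | sentimentValue | sentimentPolarity | sentimentLabel |
|------------|------|-------|-------|-----------|----------------|-------------------|----------------|
| 123        | food | pizza | 1.0   | HSP-1091  | 0.5            | 1                 | positive       |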

Notes:

  • There can be multiple rows per document; each entity is on a separate row. All columns are part of the primary key.

  • The variety of entity types depends on the chosen domain. For all domains we distinguish entities such as person, time, number, organization and more. For specific domains we add other types, e.g. food and restaurant for voc-hospitality.

  • For some entities, we perform ontology expansion. For example, when the text mentions beer, the table will contain multiple entities: beer, alcoholic drink, drink. The exact set depends on the domain and workflow.

  • Entity sentiment is calculated from the sentiment of the sentence.
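Because each entity occupies its own row, a consumer usually has to group rows by document id. The following is a minimal sketch of that grouping, assuming a CSV export with the columns listed above; the second data row (including its entityUid) and the function name entities_by_document are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed CSV export of analysis-result-entities. The first data row is the
# example from this page; the second row is hypothetical.
ENTITIES_CSV = """id_article,type,text,score,entityUid,sentimentValue,sentimentPolarity,sentimentLabel
123,food,pizza,1.0,HSP-1091,0.5,1,positive
124,organization,Keboola,1.0,X-0001,0.0,0,neutral
"""


def entities_by_document(csv_text):
    """Group (type, text) pairs by document id, one list per document."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["id_article"]].append((row["type"], row["text"]))
    return dict(grouped)


entities = entities_by_document(ENTITIES_CSV)
print(entities)
```

With ontology expansion, a single mention can yield several rows for the same document, so the per-document lists may contain related entities of increasing generality.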

analysis-result-relations table

The analysis-result-relations table contains relations and attributes found in the text. For example, "good" in "a good pizza" or "the pizza is good" is an attribute of "pizza", while "eat" in "John ate a pizza" is a relation between "John" and "pizza".

The table has the following columns:

  • id – all id columns from the input table (used as primary keys)

  • type – ATTR for an attribute relation, VERB for a verb relation

  • name – the standard form of the relation

  • negated – true for negated relations, false otherwise

  • subject – the subject of the relation or target of the attribute

  • object – the object of the relation, if any

  • subjectType – when the subject is an entity, its type (e.g. organization, food)

  • objectType – when the object is an entity, its type

  • subjectUid – ID of the subject entity

  • objectUid – ID of the object entity

  • sentimentValue – detected sentiment of the relation (a decimal number between -1 and 1)

  • sentimentPolarity – detected sentiment of the relation (1, 0 or -1)

  • sentimentLabel – sentiment of the relation as a label (positive, neutral or negative)

For the input "I ordered a good pizza.", the table will contain the following information:

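| id_article | type | name  | negated | subject | object | subjectType | objectType | subjectUid | objectUid | sentimentValue | sentimentPolarity | sentimentLabel |
|------------|------|-------|---------|---------|--------|-------------|------------|------------|-----------|----------------|-------------------|----------------|
| 123        | VERB | order | false   | I       | pizza  |             | food       |            | HSP-1091  | 0.1            | 1                 | positive       |
| 123        | ATTR | good  | false   | pizza   |        | food        |            | HSP-1091   |           | 0.8            | 1                 | positive       |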

There can be multiple relations per document.
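To show how the type, name, negated, subject and object columns fit together, the sketch below turns each relation row into a short readable phrase. It assumes a CSV export in which cells that do not apply (for example the object of an attribute) are empty strings; the function name describe_relation and the phrase format are illustrative choices, not part of the app.

```python
import csv
import io

# Assumed CSV export of analysis-result-relations, using the two example rows
# from this page. Empty cells are an assumption about how missing values appear.
RELATIONS_CSV = """id_article,type,name,negated,subject,object,subjectType,objectType,subjectUid,objectUid,sentimentValue,sentimentPolarity,sentimentLabel
123,VERB,order,false,I,pizza,,food,,HSP-1091,0.1,1,positive
123,ATTR,good,false,pizza,,food,,HSP-1091,,0.8,1,positive
"""


def describe_relation(row):
    """Render one relation row as a short phrase, honoring negation."""
    neg = "not " if row["negated"] == "true" else ""
    if row["type"] == "ATTR":
        # Attribute: the subject column holds the target of the attribute.
        return f"{row['subject']} is {neg}{row['name']}"
    # Verb relation: the object is optional.
    obj = f" {row['object']}" if row["object"] else ""
    return f"{row['subject']} {neg}{row['name']}{obj}"


phrases = [describe_relation(r) for r in csv.DictReader(io.StringIO(RELATIONS_CSV))]
print(phrases)
```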

analysis-result-sentences table

The analysis-result-sentences table contains information about individual sentences in the documents, in the following columns. These results are in beta.

  • id_article – all id columns from the input table (used as primary keys)

  • index – a zero-based index of the sentence in the document

  • segment – segment of the document: text, title or lead

  • text – text of the sentence

  • sentimentValue – detected sentiment of the sentence (a decimal number between -1 and 1)

  • sentimentPolarity – detected sentiment of the sentence (1, 0 or -1)

  • sentimentLabel – sentiment of the sentence as a label (positive, neutral or negative)

For the input "I ordered a good pizza.", the table will contain the following information:

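| id_article | index | segment | text                    | sentimentValue | sentimentPolarity | sentimentLabel |
|------------|-------|---------|-------------------------|----------------|-------------------|----------------|
| 123        | 1     | text    | I ordered a good pizza. | 0.5            | 1                 | positive       |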
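Since rows are not guaranteed to arrive in reading order, a consumer may want to sort the sentences of a document by the index column. The sketch below does that, assuming a CSV export with the columns listed above; the row with index 2 and the function name sentences_in_order are hypothetical.

```python
import csv
import io

# Assumed CSV export of analysis-result-sentences. The row with index 1 is the
# example from this page; the row with index 2 is hypothetical and is listed
# first on purpose to show the sorting.
SENTENCES_CSV = """id_article,index,segment,text,sentimentValue,sentimentPolarity,sentimentLabel
123,2,text,It arrived quickly.,0.3,1,positive
123,1,text,I ordered a good pizza.,0.5,1,positive
"""


def sentences_in_order(csv_text, doc_id):
    """Return the sentence texts of one document, sorted by their index."""
    rows = [
        row
        for row in csv.DictReader(io.StringIO(csv_text))
        if row["id_article"] == doc_id
    ]
    rows.sort(key=lambda row: int(row["index"]))
    return [row["text"] for row in rows]


ordered = sentences_in_order(SENTENCES_CSV, "123")
print(ordered)
```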