Keboola App
Our Keboola app makes it easy to use our General API in Keboola Connection, a cloud ETL platform.
The app can be used to analyze any text, but the standard models are
optimized for three domains: news articles, hospitality customer care
and transportation customer care. Results will be of lower quality when
the app is used outside of these domains. To ensure the best
possible outcome for your domain, we will be happy to provide you with a
customized model. We offer a basic customization for free. Contact us at
info@geneea.com.
Output Tables
When you run the app, it creates the following tables:
- analysis-result-documents – document-level results
- analysis-result-entities – entity-level results
- analysis-result-relations – contains relations and attributes found
- analysis-result-sentences – contains information about individual sentences
analysis-result-documents table
The analysis-result-documents table contains document-level results in the following columns:
- id – all id columns from the input table (used as primary keys)
- language – detected language of the document, as an ISO 639-1 language code
- sentimentValue – detected sentiment of the document (a decimal number between -1 and 1)
- sentimentPolarity – detected sentiment of the document (1, 0 or -1)
- sentimentLabel – sentiment of the document as a label (positive, neutral, negative, or ambivalent)
- sentimentDetailedLabel – similar to sentimentLabel but adding very positive and very negative labels for extreme sentiment
- usedChars – the number of characters used by this document

For the sentence "We bought some excellent wine.", the table will contain the following information:
| id_article | language | sentimentValue | sentimentPolarity | sentimentLabel | sentimentDetailedLabel | usedChars |
|---|---|---|---|---|---|---|
| 123 | en | 0.5 | 1 | positive | positive | 100 |
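
As a minimal sketch of how these columns might be consumed downstream, the following parses document-level rows and converts the numeric columns. It assumes the table is available as CSV text; the inline sample, the extra row with id 124, and the helper name `load_documents` are hypothetical and used only for illustration.

```python
import csv
import io

# Sample rows mirroring the example above. Row 124 is a made-up
# negative document added purely for illustration.
DOCUMENTS_CSV = """id_article,language,sentimentValue,sentimentPolarity,sentimentLabel,sentimentDetailedLabel,usedChars
123,en,0.5,1,positive,positive,100
124,en,-0.7,-1,negative,negative,100
"""


def load_documents(csv_text):
    """Parse document-level rows and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["sentimentValue"] = float(row["sentimentValue"])
        row["sentimentPolarity"] = int(row["sentimentPolarity"])
        row["usedChars"] = int(row["usedChars"])
        rows.append(row)
    return rows


documents = load_documents(DOCUMENTS_CSV)

# IDs of documents with negative polarity, and total characters used.
negative_ids = [d["id_article"] for d in documents if d["sentimentPolarity"] < 0]
total_chars = sum(d["usedChars"] for d in documents)
```

The id values stay strings here because the input table may use any id columns, not necessarily numeric ones.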
analysis-result-entities table
The analysis-result-entities table contains entity-level results in
the following columns:
- id – all id columns from the input table (used as primary keys)
- type – type of the found entity, e.g., person, time, number, organization, tag (main topic of the document)
- text – disambiguated and standardized form of the entity, e.g., John Smith, Keboola, safe carseat
- score – expresses the importance of a tag in the text
- entityUid – ID of the entity
- sentimentValue – detected sentiment of the entity (a decimal number between -1 and 1)
- sentimentPolarity – detected sentiment of the entity (1, 0 or -1)
- sentimentLabel – sentiment of the entity as a label (positive, neutral or negative)
- sentimentDetailedLabel – similar to sentimentLabel but adding very positive and very negative labels for extreme sentiment

For the sentence "We bought some excellent wine." and the hospitality domain, the table will contain the following information:
| id_article | type | text | score | entityUid | sentimentValue | sentimentPolarity | sentimentLabel | sentimentDetailedLabel |
|---|---|---|---|---|---|---|---|---|
| 2 | voc-topic | food & drink | 0 | HSP-6009 | | | | |
| 2 | voc-topic | food & drink > quality | 0 | HSP-6015 | | | | |
| 2 | food | drink | 0 | HSP-1477 | 0 | 0 | neutral | neutral |
| 2 | food | alcoholic drink | 0 | HSP-190 | 0 | 0 | neutral | neutral |
| 2 | food | wine | 1.9 | HSP-521 | 0.5 | 1 | positive | positive |
| 2 | tag | wine | 6 | HSP-521 | | | | |
| 2 | tag | alcoholic drink | 6 | HSP-190 | | | | |
| 2 | tag | drink | 6 | HSP-1477 | | | | |
| 2 | tag | buy(we,wine) | 3.75 | | | | | |
| 2 | tag | wine: excellent | 3.75 | | | | | |
Notes:
- There can be multiple rows per one document - each entity will be on a separate row. In some cases when the entity is detected as important and becomes a tag, the same entity will appear on two rows.
- The variety of entity types depends on the chosen domain.
  For all domains we distinguish entities such as
  person, time, number, organization and more. For specific domains we add other types, e.g., food and restaurant for voc-hospitality.
- For some entities, we perform ontology expansion.
  For instance, in the example above, the text mentions wine,
  but the table contains multiple entities:
  wine, alcoholic drink, drink. The exact set depends on the domain and workflow.
- Entity sentiment is calculated from the sentiment of the sentence.
- In short documents, tags are similar to entities.
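
The notes above can be illustrated with a short sketch that separates tag rows from other entity rows, finds entities that appear both as an entity and as a tag, and handles the empty sentiment cells. It assumes the table is available as CSV text with empty strings for missing values; the sample is a subset of the example rows, and the helper names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# A subset of the example rows above, written as CSV (an assumed format).
ENTITIES_CSV = """id_article,type,text,score,entityUid,sentimentValue,sentimentPolarity,sentimentLabel,sentimentDetailedLabel
2,voc-topic,food & drink,0,HSP-6009,,,,
2,food,drink,0,HSP-1477,0,0,neutral,neutral
2,food,wine,1.9,HSP-521,0.5,1,positive,positive
2,tag,wine,6,HSP-521,,,,
2,tag,"buy(we,wine)",3.75,,,,,
"""


def parse_optional_float(cell):
    """Return a float, or None when the cell is empty."""
    return float(cell) if cell != "" else None


rows = list(csv.DictReader(io.StringIO(ENTITIES_CSV)))

# Tags and other entities live in the same table; split them by type.
tags = [r for r in rows if r["type"] == "tag"]
entities = [r for r in rows if r["type"] != "tag"]

# An entity that also became a tag shares its entityUid across two rows.
types_by_uid = defaultdict(set)
for r in rows:
    if r["entityUid"]:
        types_by_uid[r["entityUid"]].add(r["type"])
also_tags = sorted(
    uid for uid, types in types_by_uid.items() if "tag" in types and len(types) > 1
)

# Sentiment per non-tag entity; None where no sentiment was reported.
sentiment_by_entity = {
    r["text"]: parse_optional_float(r["sentimentValue"]) for r in entities
}
```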
analysis-result-relations table
The analysis-result-relations table contains relations and attributes found in the text.
For example,
"good" in "a good pizza" or "the pizza is good" is an attribute of "pizza", while
"eat" in "John ate a pizza" is a relation between "John" and "pizza".
The table has the following columns:
- id – all id columns from the input table (used as primary keys)
- type – ATTR for an attribute relation, VERB for a verb relation, EXTERNAL for knowledgebase relations
- name – the standard form of the relation (e.g., expensive for type=ATTR, buy for type=VERB and parent for type=EXTERNAL)
- negated – true for negated relations, false otherwise
- subject – the subject of the relation or target of the attribute
- object – the object of the relation, if any
- subjectType – when the subject is an entity, its type (e.g., organization, food)
- objectType – when the object is an entity, its type
- subjectUid – when the subject is an entity, its ID
- objectUid – when the object is an entity, its ID
- sentimentValue – detected sentiment of the relation (a decimal number between -1 and 1)
- sentimentPolarity – detected sentiment of the relation (1, 0 or -1)
- sentimentLabel – sentiment of the relation as a label (positive, neutral or negative)
- sentimentDetailedLabel – similar to sentimentLabel but adding very positive and very negative labels for extreme sentiment

For the sentence "We bought some excellent wine.", the table will contain the following information:
| id_article | type | name | negated | subject | object | subjectType | objectType | subjectUid | objectUid | sentimentValue | sentimentPolarity | sentimentLabel | sentimentDetailedLabel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 123 | VERB | buy | false | we | wine | | food | | HSP-521 | 0.5 | 1 | positive | positive |
| 123 | ATTR | excellent | false | wine | | food | | HSP-521 | | 0.5 | 1 | positive | positive |
| 123 | EXTERNAL | parent | false | wine | drink | food | food | HSP-521 | HSP-1477 | 0 | 0 | neutral | neutral |
| 123 | EXTERNAL | parent | false | wine | alcoholic drink | food | food | HSP-521 | HSP-190 | 0 | 0 | neutral | neutral |
There can be multiple relations per one document.
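
To make the column semantics concrete, here is a small sketch that turns a relation row into a compact string. The output notation mirrors the tag rows in the entities example (`buy(we,wine)` and `wine: excellent`); the `Relation` class, the helper names, and the `not` prefix for negated relations are assumptions for illustration, not part of the app's output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Relation:
    type: str                      # ATTR, VERB or EXTERNAL
    name: str                      # standard form of the relation
    negated: bool
    subject: str
    object: Optional[str] = None   # attributes have no object


def from_row(row):
    """Build a Relation from a table row given as a dict of strings."""
    return Relation(
        type=row["type"],
        name=row["name"],
        negated=row["negated"] == "true",
        subject=row["subject"],
        object=row.get("object") or None,
    )


def describe(rel):
    """Render a relation as a short human-readable string."""
    name = f"not {rel.name}" if rel.negated else rel.name
    if rel.type == "ATTR":
        return f"{rel.subject}: {name}"
    args = rel.subject if rel.object is None else f"{rel.subject},{rel.object}"
    return f"{name}({args})"


verb = from_row({"type": "VERB", "name": "buy", "negated": "false",
                 "subject": "we", "object": "wine"})
attr = from_row({"type": "ATTR", "name": "excellent", "negated": "false",
                 "subject": "wine", "object": ""})
```

With these definitions, `describe(verb)` yields `buy(we,wine)` and `describe(attr)` yields `wine: excellent`.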
analysis-result-sentences table
The analysis-result-sentences table contains information about individual sentences in the documents.
These results are in beta. The table has the following columns:
- id_article – all id columns from the input table (used as primary keys)
- index – a zero-based index of the sentence in the document
- segment – segment of the document: text, title or lead
- text – text of the sentence
- sentimentValue – detected sentiment of the sentence (a decimal number between -1 and 1)
- sentimentPolarity – detected sentiment of the sentence (1, 0 or -1)
- sentimentLabel – sentiment of the sentence as a label (positive, neutral or negative)
- sentimentDetailedLabel – similar to sentimentLabel, but adding very positive and very negative labels for extreme sentiment

For the sentence "We bought some excellent wine.", the table will contain the following information:
| id_article | index | segment | text | sentimentValue | sentimentPolarity | sentimentLabel | sentimentDetailedLabel |
|---|---|---|---|---|---|---|---|
| 123 | 1 | text | We bought some excellent wine. | 0.5 | 1 | positive | positive |
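
One possible use of the index and sentimentValue columns is to restore sentence order within each document and pick out the most negative sentence. The sketch below assumes the table is available as CSV text; the two sample sentences, their sentiment values, the document id 7, and the helper names are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for a hypothetical two-sentence document, listed out of order.
SENTENCES_CSV = """id_article,index,segment,text,sentimentValue,sentimentPolarity,sentimentLabel,sentimentDetailedLabel
7,1,text,The service was slow.,-0.4,-1,negative,negative
7,0,text,The room was clean.,0.5,1,positive,positive
"""


def sentences_by_document(csv_text):
    """Group sentence rows by document id and sort them by sentence index."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["index"] = int(row["index"])
        row["sentimentValue"] = float(row["sentimentValue"])
        grouped[row["id_article"]].append(row)
    for rows in grouped.values():
        rows.sort(key=lambda r: r["index"])
    return dict(grouped)


def most_negative(rows):
    """Return the sentence row with the lowest sentimentValue."""
    return min(rows, key=lambda r: r["sentimentValue"])


docs = sentences_by_document(SENTENCES_CSV)
worst = most_negative(docs["7"])
```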