Feedback
If you provide feedback to us, our models will learn from it. This means the quality is automatically tuned, and even your preferences are taken into account. In the following examples, we show how to provide feedback on tags and entities. Providing feedback for sentiment is analogous; see the API reference.
Feedback on Tags
Opt-in/Opt-out/Mixed
The form of the feedback depends on how journalists accept or reject tag suggestions. There are three basic scenarios for using Geneea's tag suggestions:
- opt-in: journalists accept some of the tags suggested by the API; the rest is not used.
- opt-out: journalists reject some of the tags suggested by the API; the rest is used.
- mixed: tags are separated into two groups by their score: the most relevant tags are in the opt-out mode (journalists reject rare errors), and the rest are in the opt-in mode (journalists pick rare omissions).
In the opt-in case (when no tag is used without an explicit approval), the following types of feedback are typically sent:
- accepted-actively: the tag was explicitly selected from the suggestions as correct and relevant,
- rejected-passively: the tag was not selected (it is either incorrect or not relevant),
- rejected-actively: a stronger version of rejected-passively, when the tag is explicitly marked as non-desirable,
- expected: the tag was not part of the suggestions, but should have been.
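The opt-in mapping above can be sketched in a few lines of Python. This is only an illustration: the function name `opt_in_statuses` and its arguments (collections of tag standard forms gathered from a hypothetical editor UI) are assumptions, not part of the API. Only the status strings come from the list above.

```python
def opt_in_statuses(suggested, selected, disliked=(), missing=()):
    """Assign a feedback status to each tag in an opt-in workflow.

    Hypothetical helper: every argument is a collection of tag standard
    forms collected from the editor UI; only the status strings are
    defined by the API.
    """
    selected, disliked = set(selected), set(disliked)
    statuses = {}
    for tag in suggested:
        if tag in selected:
            statuses[tag] = 'accepted-actively'    # explicitly picked
        elif tag in disliked:
            statuses[tag] = 'rejected-actively'    # explicitly marked as non-desirable
        else:
            statuses[tag] = 'rejected-passively'   # simply not picked
    for tag in missing:
        # Not suggested by the API, but the journalist wanted it
        statuses.setdefault(tag, 'expected')
    return statuses


print(opt_in_statuses(
    suggested=['Tesla, Inc.', 'Elon Musk', 'Dogecoin', 'Reuters'],
    selected=['Tesla, Inc.', 'Elon Musk'],
    disliked=['Reuters'],
    missing=['cryptocurrency'],
))
```

Tags that were suggested but neither picked nor marked fall through to rejected-passively, which matches the default the opt-in scenario implies.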
In the opt-out case (when all tags are used, unless explicitly forbidden), the following types of feedback are typically sent:
- rejected-actively: the tag was explicitly marked as wrong (it is either incorrect or not relevant),
- accepted-passively: the tag was not removed from the suggestions,
- expected: the tag was not part of the suggestions but should have been.
In all three scenarios, rejected-actively can be split into three more informative values:
- rejected-actively-wrong: the tag has been explicitly rejected because the article does not mention the concept at all (e.g., it might mention a different person or place with the same name),
- rejected-actively-marginal: the tag has been explicitly rejected because, despite being correct, it is not relevant enough to be selected; for example, it is mentioned only marginally,
- blocked: the tag should never be returned (e.g., the tag is too specific and a more general tag should be returned).
The rejected-actively value always works in their place, but providing the more specific values helps the system improve its predictions faster.
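As a rough sketch of how a client might assign statuses in the mixed scenario, including the more specific rejection values, consider the following. The helper names, the use of the tag's relevance as the score, and the cut-off of 15.0 are all assumptions made for illustration; the API does not prescribe them.

```python
# Specific values that may be sent in place of 'rejected-actively'
REJECTION_BY_REASON = {
    'wrong': 'rejected-actively-wrong',        # concept not mentioned at all
    'marginal': 'rejected-actively-marginal',  # correct but not relevant enough
    'blocked': 'blocked',                      # should never be returned
}


def rejection_status(reason=None):
    """Return the most specific rejection status for an optional reason."""
    return REJECTION_BY_REASON.get(reason, 'rejected-actively')


def mixed_statuses(tags, selected, rejected, threshold=15.0):
    """Assign statuses in a mixed workflow (hypothetical helper).

    tags      -- list of (std_form, relevance) pairs returned by the API
    selected  -- std forms explicitly picked from the low-score group
    rejected  -- dict mapping explicitly rejected std forms to a reason or None
    threshold -- assumed relevance cut-off between the two groups
    """
    statuses = {}
    for std_form, relevance in tags:
        if std_form in rejected:
            statuses[std_form] = rejection_status(rejected[std_form])
        elif relevance >= threshold:
            statuses[std_form] = 'accepted-passively'   # opt-out group: kept by default
        elif std_form in selected:
            statuses[std_form] = 'accepted-actively'    # opt-in group: explicitly picked
        else:
            statuses[std_form] = 'rejected-passively'   # opt-in group: not picked
    return statuses
```

When the journalist gives no reason for a rejection, the sketch falls back to the generic rejected-actively value, which is always valid.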
Technical details
Feedback on tags consists of two main pieces of information: a tag and its status. The tag is the GKB id and/or the standard form of a tag that was returned or should have been returned. The status expresses whether the tag was returned correctly, incorrectly or not at all. The subset of reasonable status values depends on the scenario (opt-in, opt-out, or mixed) described above.
Specifying the GKB id for a tag is optional but highly desirable, as it allows automatic processing of the feedback. Tags without a GKB id are regularly reviewed by our data scientists.
To summarize the most common status values:
- accepted-actively: the tag has been explicitly accepted as correct.
- accepted-passively: the tag has been accepted, but not explicitly. Typically, this means it was not rejected in the opt-out mode.
- rejected-actively: the tag has been explicitly rejected. Optionally, a more specific value can be used instead:
  - rejected-actively-wrong: the tag has been explicitly rejected because the article does not mention the concept at all (e.g., it might mention a different person or place with the same name).
  - rejected-actively-marginal: the tag has been explicitly rejected because, despite being correct, it is not relevant enough to be selected; for example, it is mentioned only marginally.
  - blocked: the tag should never be returned (e.g., the tag is too specific and a more general tag should be returned).
- rejected-passively: the tag has been rejected but not explicitly. Typically, this means it was not accepted in the opt-in mode.
- expected: the tag should have been returned but wasn't.
We also accept the following legacy values: correct (prefer accepted-actively), accepted (accepted-passively), wrong (rejected-actively), and rejected (rejected-passively).
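If an existing integration still produces the legacy values, a small client-side translation step can move it to the preferred ones. The sketch below uses exactly the pairs listed above; the function names are hypothetical and the translation is optional, since the legacy values are still accepted.

```python
# Legacy status -> preferred status, as listed above
LEGACY_TO_CURRENT = {
    'correct': 'accepted-actively',
    'accepted': 'accepted-passively',
    'wrong': 'rejected-actively',
    'rejected': 'rejected-passively',
}


def modernize_status(status):
    """Translate a legacy status; current values pass through unchanged."""
    return LEGACY_TO_CURRENT.get(status, status)


def modernize_tags(tags):
    """Return copies of the feedback items with their statuses translated."""
    return [{**tag, 'status': modernize_status(tag['status'])} for tag in tags]
```

For example, `modernize_tags([{'gkbId': 'G478214', 'status': 'correct'}])` yields the same item with the status accepted-actively.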
Obligatory, Recommended, and Optional Information
- It is mandatory to provide the ID of the originally analyzed document (e.g., article-123). Additionally, it is highly recommended to include the referenceKey that identifies the specific analysis (e.g., 211201-103000-d64a0290).
- For each tag, you should specify either its GKB ID, its standard form, or both. While it is recommended to provide the GKB ID, it may be unknown for expected but missing items. In such cases, you should specify only the standard form.
- We recommend sending information for every returned tag. If a tag is not mentioned in the feedback, we will assume it was processed in the default manner: rejected-passively in the opt-in setup and accepted-passively in the opt-out setup. In a mixed setup, feedback with missing tags is considered incorrect.
- If multiple feedback calls are made for the same analysis (or document if the referenceKey is not provided), only the most recent call will be taken into account.
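The rules above lend themselves to a client-side check before the feedback is sent. The following is a minimal sketch, assuming the payload shape used in the example later on this page (docId, referenceKey, tags); the helper names `check_feedback` and `fill_defaults` and the mode labels are hypothetical, and the API itself may validate differently.

```python
# Status assumed by the service for returned tags missing from the feedback
DEFAULT_STATUS = {'opt-in': 'rejected-passively', 'opt-out': 'accepted-passively'}


def check_feedback(payload):
    """Return a list of problems found in a feedback payload (empty if none)."""
    problems = []
    if not payload.get('docId'):
        problems.append('docId is mandatory')
    for i, tag in enumerate(payload.get('tags', [])):
        if not (tag.get('gkbId') or tag.get('stdForm')):
            problems.append(f'tags[{i}]: gkbId or stdForm is required')
    return problems


def fill_defaults(returned_tags, feedback_tags, mode):
    """Add an explicit default status for every returned tag not yet mentioned."""
    seen_ids = {t['gkbId'] for t in feedback_tags if t.get('gkbId')}
    seen_forms = {t['stdForm'] for t in feedback_tags if t.get('stdForm')}
    missing = [t for t in returned_tags
               if t.get('gkbId') not in seen_ids and t.get('stdForm') not in seen_forms]
    if missing and mode not in DEFAULT_STATUS:
        # In a mixed setup there is no default, so incomplete feedback is an error
        raise ValueError('mixed setup: every returned tag needs an explicit status')
    completed = list(feedback_tags)
    for t in missing:
        item = {key: t[key] for key in ('gkbId', 'stdForm') if key in t}
        item['status'] = DEFAULT_STATUS[mode]
        completed.append(item)
    return completed
```

Filling in the defaults explicitly is not required in the opt-in and opt-out setups, but it follows the recommendation to send information for every returned tag.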
Example
Basic code common for all the guide pages
To use the API, you need a valid API key with appropriate authorizations.
If you do not have one, please get in touch with us.
In the code below, replace <YOUR_API_KEY> with your API key.
Note that we do not provide SDKs for the API yet, but our G3 SDKs can be used to perform NLP analysis.
cURL, cURL (Windows):
No special setup necessary.

JavaScript:
// HTTP client; see https://github.com/axios/axios
const axios = require('axios');

const config = {
    baseURL: 'https://media-api.geneea.com/v2/',
    headers: {
        'X-API-KEY': '<YOUR_API_KEY>'
    }
};

// A simple function to report the returned JSON objects
const report = (output) => console.dir(output, { depth: null });

// In a production environment, the API should always be called from the backend,
// otherwise you run into CORS problems

Python:
# HTTP client; see https://docs.python-requests.org/en/latest/
import requests

BASE_URL = 'https://media-api.geneea.com/v2/'

HEADERS = {
    'content-type': 'application/json',
    'X-API-Key': '<YOUR_API_KEY>'
}

Python SDK:
## We do not provide a dedicated SDK for the Media API yet.
## However, the SDK for General API can be used for the content analysis (i.e. NLP) part

## Geneea NLP client SDK; see https://help.geneea.com/sdk/index.html
from geneeanlpclient import g3

BASE_URL = 'https://media-api.geneea.com/v2/'
API_KEY = '<YOUR_API_KEY>'
As an example, consider the following article (source: Reuters). When we use the API to obtain tags:
cURL:
POST https://media-api.geneea.com/v2/nlp/analyze
{
    "id": "article-123",
    "title": "Tesla to accept Dogecoin as payment for merchandise, says Musk",
    "text": "Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.",
    "language": "en"
}

Python:
def analyze(doc):
    # POST the document to the analysis endpoint and return the parsed JSON
    return requests.post(f'{BASE_URL}nlp/analyze', json=doc, headers=HEADERS).json()

article = {
    'id': 'article-123',
    'title': 'Tesla to accept Dogecoin as payment for merchandise, says Musk',
    'text': 'Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.',
    'language': 'en'
}

analyze(article)

Python SDK:
requestBuilder = g3.Request.Builder()

with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    # Authenticate every request of this session with the API key
    analyzer.session.headers.update({'X-API-Key': API_KEY})

    request = requestBuilder.build(
        id='article-123',
        title='Tesla to accept Dogecoin as payment for merchandise, says Musk',
        text='Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.'
    )
    result = analyzer.analyze(request)

print('Tags:')
for t in result.tags:
    print(f' \t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')

we receive the following results (depending on your account, there might be other features than tags):
{
    "id": "article-123",
    "language": { "detected": "en" },
    "tags": [
        {
            "id": "t0",
            "gkbId": "G478214",
            "stdForm": "Tesla, Inc.",
            "type": "media",
            "relevance": 24.33
        },
        {
            "id": "t1",
            "gkbId": "G317521",
            "stdForm": "Elon Musk",
            "type": "media",
            "relevance": 22.5
        },
        {
            "id": "t2",
            "gkbId": "G15377916",
            "stdForm": "Dogecoin",
            "type": "media",
            "relevance": 19.24
        },
        {
            "id": "t3",
            "gkbId": "G130879",
            "stdForm": "Reuters",
            "type": "media",
            "relevance": 9.275
        }
    ],
    "metadata": { "referenceKey": "211201-103000-d64a0290" }
}
Let's assume that the journalist is using the opt-in system, i.e., they select some tags. Namely, they:
- select Tesla and Elon Musk,
- explicitly mark Reuters as not relevant (they could have just left it alone, and it would get the status rejected-passively, but they decided to be more specific, which is nice),
- are missing cryptocurrency (GKB id G13479982) among the results.
To provide this feedback, we can use the following call:
cURL:
POST https://media-api.geneea.com/v2/nlp/analyze/feedback
{
    "docId": "article-123",
    "referenceKey": "211201-103000-d64a0290",
    "tags": [
        { "gkbId": "G478214", "stdForm": "Tesla, Inc.", "status": "accepted-actively"},
        { "gkbId": "G317521", "stdForm": "Elon Musk", "status": "accepted-actively"},
        { "gkbId": "G130879", "stdForm": "Reuters", "status": "rejected-actively-marginal", "comment": "Reuters not wanted" },
        { "gkbId": "G13479982", "stdForm": "cryptocurrency", "status": "expected", "comment": "cryptocurrency is missing"}
    ]
}

Python:
def feedback(payload):
    # POST the feedback to the feedback endpoint and return the parsed JSON
    return requests.post(f'{BASE_URL}nlp/analyze/feedback', json=payload, headers=HEADERS).json()

payload = {
    'docId': 'article-123',
    'referenceKey': '211201-103000-d64a0290',
    'tags': [
        { 'gkbId': 'G478214', 'stdForm': 'Tesla, Inc.', 'status': 'accepted-actively'},
        { 'gkbId': 'G317521', 'stdForm': 'Elon Musk', 'status': 'accepted-actively'},
        { 'gkbId': 'G130879', 'stdForm': 'Reuters', 'status': 'rejected-actively-marginal', 'comment': 'Reuters not wanted' },
        { 'gkbId': 'G13479982', 'stdForm': 'cryptocurrency', 'status': 'expected', 'comment': 'cryptocurrency is missing'}
    ]
}

feedback(payload)
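Rather than writing the feedback by hand, the payload can be derived from the analysis response, so that the docId and the referenceKey always match the analysis being rated. The helper below is a sketch under that assumption; `build_tag_feedback` and its arguments are hypothetical names, and it only assembles the dictionary without calling the API.

```python
def build_tag_feedback(analysis, statuses, expected=()):
    """Assemble a tag feedback payload for one analysis response.

    analysis -- parsed JSON returned by nlp/analyze
    statuses -- dict mapping the gkbId of a returned tag to its status
    expected -- items (gkbId and/or stdForm) that the API should have returned
    """
    payload = {'docId': analysis['id']}
    reference_key = analysis.get('metadata', {}).get('referenceKey')
    if reference_key:
        payload['referenceKey'] = reference_key
    # Returned tags that received a status, in the order the API returned them
    payload['tags'] = [
        {'gkbId': tag['gkbId'], 'stdForm': tag['stdForm'], 'status': statuses[tag['gkbId']]}
        for tag in analysis['tags'] if tag['gkbId'] in statuses
    ]
    # Tags the journalist missed among the suggestions
    payload['tags'] += [{**tag, 'status': 'expected'} for tag in expected]
    return payload
```

Called with the response shown earlier, the statuses `{'G478214': 'accepted-actively', 'G317521': 'accepted-actively', 'G130879': 'rejected-actively-marginal'}`, and `expected=[{'gkbId': 'G13479982', 'stdForm': 'cryptocurrency'}]`, it produces the same payload as above, minus the optional comments.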
Feedback on Entities
Providing feedback for entities is analogous:
cURL:
POST https://media-api.geneea.com/v2/nlp/analyze/feedback
{
    "docId": "article-123",
    "referenceKey": "211201-103000-d64a0290",
    "entities": [
        { "gkbId": "G130879", "status": "accepted-actively", "comment": "Reuters is ok as an entity" },
        { "stdForm": "cryptocurrency", "status": "expected", "comment": "cryptocurrency is missing"},
        { "gkbId": "G478214", "stdForm": "Tesla, Inc.", "status": "accepted-actively"},
        { "gkbId": "G317521", "stdForm": "Elon Musk", "status": "accepted-actively"}
    ]
}

Python:
def feedback(payload):
    # POST the feedback to the feedback endpoint and return the parsed JSON
    return requests.post(f'{BASE_URL}nlp/analyze/feedback', json=payload, headers=HEADERS).json()

payload = {
    'docId': 'article-123',
    'referenceKey': '211201-103000-d64a0290',
    'entities': [
        { 'gkbId': 'G130879', 'status': 'accepted-actively', 'comment': 'Reuters is ok as an entity' },
        { 'stdForm': 'cryptocurrency', 'status': 'expected', 'comment': 'cryptocurrency is missing'},
        { 'gkbId': 'G478214', 'stdForm': 'Tesla, Inc.', 'status': 'accepted-actively'},
        { 'gkbId': 'G317521', 'stdForm': 'Elon Musk', 'status': 'accepted-actively'}
    ]
}

feedback(payload)