Semantic Tagging
The Media API can perform semantic tagging of articles. Semantic tags represent entities, keywords, or concepts relevant to the article. We rank and standardize them based on their purpose and context.
For a non-technical overview, see this page and this case study.
Below, we discuss various technical topics related to obtaining semantic tags:
- First steps: basic setup for calling the API
- Basic tagging: a simple call to the API to obtain semantic tags
- Tag mentions: expressions in the article corresponding to the tags
- Other features: entities, sentiment, and more (depending on configuration)
- Paragraphs: handling structured content like lead or multiple paragraphs
- Topic categories: specify known topics or sections to guide analysis
- Presentation language: return results in a specific language
- Knowledge base properties: additional information drawn from the knowledge base
For full API reference, see the reference pages. Note that the exact output depends on your account plan and configuration.
Basic Code
To use the API, you'll need a valid API key with the appropriate permissions. If you don't have one, please contact us here.
In the code below, replace <YOUR_API_KEY> with your actual API key.
Note: We do not currently provide dedicated SDKs for this API, but our G3 SDKs can be used to perform NLP analysis.
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
# No special setup necessary
# No special setup necessary
// HTTP client; see https://github.com/axios/axios
const axios = require('axios');
const config = {
baseURL: 'https://media-api.geneea.com/v2/',
headers: {
'X-API-KEY': '<YOUR_API_KEY>'
}
};
// A simple function to report the returned json objects
const report = (output) => console.dir(output, { depth: null });
// In a production environment, always call the API from the backend;
// otherwise you will run into CORS problems
# http client; see https://docs.python-requests.org/en/latest/
import requests
BASE_URL = 'https://media-api.geneea.com/v2/'
HEADERS = {
'content-type': 'application/json',
'X-API-Key': '<YOUR_API_KEY>'
}
# We do not provide a dedicated SDK for the Media API yet.
# However, the SDK for General API can be used for the content analysis (i.e. NLP) part of the Media API.
# Geneea NLP client SDK; see https://help.geneea.com/sdk/index.html
# Use `pip install geneea-nlp-client`
from geneeanlpclient import g3
BASE_URL = 'https://media-api.geneea.com/v2/'
API_KEY = '<YOUR_API_KEY>'
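Pasting the key into the source is fine for a quick test, but deployed code usually reads it from the environment. The following is a minimal sketch of that approach. It assumes a hypothetical environment variable named MEDIA_API_KEY (the name is our choice for illustration, not something the API prescribes); the header names are the ones used in the Python setup above.

```python
import os


def build_headers(env_var='MEDIA_API_KEY'):
    """Build the request headers, reading the API key from the environment.

    MEDIA_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get(env_var, '').strip()
    if not api_key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key')
    return {
        'content-type': 'application/json',
        'X-API-Key': api_key,
    }
```

The returned dictionary can be used in place of HEADERS in the Python examples.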
Tags – Basic Analysis
To perform a basic semantic analysis and obtain tags (keywords), use the following request:
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich."
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\"
}"
const analyze = async (config, input) => {
const response = await axios.post('nlp/analyze', input, config);
return response.data;
};
const input = {
id: '1234',
title: 'Emmanuel Macron in Germany.',
text: 'Mr. Macron visited a trade show in Munich.'
}
analyze(config, input).then(report);
def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()
input = {
'id': '1234',
'title': 'Emmanuel Macron in Germany.',
'text': 'Mr. Macron visited a trade show in Munich.'
}
analyze(input)
requestBuilder = g3.Request.Builder()
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    result = analyzer.analyze(request)
print("Entities:")  ## there will be no entities, unless they are specified in your plan and configuration
for e in result.entities:
    print(f' {e.type}: {e.stdForm} ({e.gkbId}) relevance: {e.feats.get("relevance")}, derivedBy: {e.feats.get("derivedBy", "N/A")}, derivedOnly: {e.feats.get("derivedOnly", "false")}')
print("Tags:")
for t in result.tags:
    print(f'\t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
The code above produces results similar to the example below. Your actual results may include additional features (e.g., relations, sentiment), depending on your configuration – see entities and sentiment below.
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
version: '3.3.0',
id: '1234',
language: {detected: 'en'},
tags: [
{id: 't1', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'media', relevance: 96.0, feats: {wikidataId: 'Q3052772', gkbEntityType: 'person'}},
{id: 't2', gkbId: 'G183', stdForm: 'Germany', type: 'media', relevance: 94.0, feats: {wikidataId: 'Q183', gkbEntityType: 'location'}},
{id: 't3', gkbId: 'G1726', stdForm: 'Munich', type: 'media', relevance: 66.0, feats: {wikidataId: 'Q1726', gkbEntityType: 'location'}},
{id: 't4', gkbId: 'IPTC-11000000', stdForm: 'politics', type: 'media-topic', relevance: 68.51, feats: {MediaTopicId: '11000000', wikidataId: 'Q7163', gkbEntityType: 'general'}}
],
usedChars: 100,
metadata: {referenceKey: '241014-164726-9bdaf485'},
}
{
'version': '3.3.0',
'id': '1234',
'language': {'detected': 'en'},
'tags': [
{'id': 't1', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 96.0, 'feats': {'wikidataId': 'Q3052772', 'gkbEntityType': 'person'}},
{'id': 't2', 'gkbId': 'G183', 'stdForm': 'Germany', 'type': 'media', 'relevance': 94.0, 'feats': {'wikidataId': 'Q183', 'gkbEntityType': 'location'}},
{'id': 't3', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'media', 'relevance': 66.0, 'feats': {'wikidataId': 'Q1726', 'gkbEntityType': 'location'}},
{'id': 't4', 'gkbId': 'IPTC-11000000', 'stdForm': 'politics', 'type': 'media-topic', 'relevance': 68.51, 'feats': {'MediaTopicId': '11000000', 'wikidataId': 'Q7163', 'gkbEntityType': 'general'}}
],
'usedChars': 100,
'metadata': {'referenceKey': '241014-164726-9bdaf485'},
}
Entities: ## there are no entities here, unless they are enabled in your plan and configuration
Tags:
media: Emmanuel Macron (G3052772) relevance: 96.0
media: Germany (G183) relevance: 94.0
media: Munich (G1726) relevance: 66.0
media-topic: politics (IPTC-11000000) relevance: 68.51
This example includes two types of tags:
- Entity-based tags ("type": "media"): the most relevant entities, both names (e.g., people, locations, organizations) and keywords. See here.
- IPTC Media Topics ("type": "media-topic"): an industry taxonomy with over 1,200 categories organized hierarchically. The above result includes politics; other examples are sport, basketball, music, classical music, etc.
For more detail, see this article.
Each tag includes:
- A unique ID ("gkbId") linking it to the knowledge base.
- A standardized name ("stdForm"), optionally localized (see Presentation Language).
- A relevance score ("relevance") from 0 to 100, representing the importance of the tag in relation to both the article and the customer's needs. This is distinct from entity relevance, which only considers the article itself when determining importance.
- Third-party identifiers (e.g., Wikidata, IPTC Media Topics).
- The type of the knowledge base item (person, organization, location, event, product, general).
- An internal reference ID (e.g., "id": "t2") used for linking within the system.
Tag Mentions
To receive mentions — text snippets in the article that correspond to tags — use the "returnMentions": "true" parameter.
Mentions help link tags to specific expressions in the article. Typically, entity-based tags (people, organizations, etc.) have mentions; abstract topics like IPTC Media Topics categories usually do not.
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich.",
"returnMentions": "true"
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\",
\"returnMentions\": \"true\"
}"
const analyze = async (config, input) => {
const response = await axios.post('nlp/analyze', input, config);
return response.data;
};
const input = {
id: '1234',
title: 'Emmanuel Macron in Germany.',
text: 'Mr. Macron visited a trade show in Munich.',
returnMentions: true
}
analyze(config, input).then(report);
def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()
input = {
'id': '1234',
'title': 'Emmanuel Macron in Germany.',
'text': 'Mr. Macron visited a trade show in Munich.',
'returnMentions': True
}
analyze(input)
requestBuilder = g3.Request.Builder(returnMentions=True)
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    result = analyzer.analyze(request)
print("Tags:")
for t in result.tags:
    print(f'\t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
    for m in t.mentions:
        ## charSpan can be used for highlighting in the original text
        print(f'\t{m.text}; {m.mwl}; {m.tokens.charSpan}')
In comparison with the previous response, this one includes mentions of the individual tags: their text and a reference to the relevant tokens. The full tokenized structure of the text—split into paragraphs, sentences, and tokens—is automatically added to the response.
Entities, Sentiment, etc.
By default, only tags are returned. Depending on your account plan and configuration, additional outputs may include:
- Entities
- Relations
- Document-level sentiment
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"entities": [
{"id": "e0", "gkbId": "G57305", "stdForm": "trade fair", "type": "general", "feats": {"relevance": "11", "ranking": "11"}},
{"id": "e1", "gkbId": "G183", "stdForm": "Germany", "type": "location", "feats": {"derivedBy": "country", "relevance": "94", "ranking": "94"}},
{"id": "e2", "gkbId": "G1726", "stdForm": "Munich", "type": "location", "feats": {"derivedBy": "city", "relevance": "66", "ranking": "66"}},
{"id": "e3", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "person", "feats": {"relevance": "96", "ranking": "96"}},
{"id": "e4", "gkbId": "G980", "stdForm": "Bavaria", "type": "location", "feats": {"derivedBy": "region", "derivedOnly": "true", "relevance": "42", "ranking": "42"}},
{"id": "e5", "gkbId": "G10562", "stdForm": "Upper Bavaria", "type": "location", "feats": {"derivedBy": "district", "derivedOnly": "true", "relevance": "41", "ranking": "41"}}
],
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"entities": [
{"id": "e0", "gkbId": "G57305", "stdForm": "trade fair", "type": "general", "feats": {"relevance": "11", "ranking": "11"}},
{"id": "e1", "gkbId": "G183", "stdForm": "Germany", "type": "location", "feats": {"derivedBy": "country", "relevance": "94", "ranking": "94"}},
{"id": "e2", "gkbId": "G1726", "stdForm": "Munich", "type": "location", "feats": {"derivedBy": "city", "relevance": "66", "ranking": "66"}},
{"id": "e3", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "person", "feats": {"relevance": "96", "ranking": "96"}},
{"id": "e4", "gkbId": "G980", "stdForm": "Bavaria", "type": "location", "feats": {"derivedBy": "region", "derivedOnly": "true", "relevance": "42", "ranking": "42"}},
{"id": "e5", "gkbId": "G10562", "stdForm": "Upper Bavaria", "type": "location", "feats": {"derivedBy": "district", "derivedOnly": "true", "relevance": "41", "ranking": "41"}}
],
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
version: '3.3.0',
id: '1234',
language: {detected: 'en'},
entities: [
{id: 'e0', gkbId: 'G57305', stdForm: 'trade fair', type: 'general', feats: {relevance: '11', ranking: '11'}},
{id: 'e1', gkbId: 'G183', stdForm: 'Germany', type: 'location', feats: {derivedBy: 'country', relevance: '94', ranking: '94'}},
{id: 'e2', gkbId: 'G1726', stdForm: 'Munich', type: 'location', feats: {derivedBy: 'city', relevance: '66', ranking: '66'}},
{id: 'e3', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'person', feats: {relevance: '96', ranking: '96'}},
{id: 'e4', gkbId: 'G980', stdForm: 'Bavaria', type: 'location', feats: {derivedBy: 'region', derivedOnly: 'true', relevance: '42', ranking: '42'}},
{id: 'e5', gkbId: 'G10562', stdForm: 'Upper Bavaria', type: 'location', feats: {derivedBy: 'district', derivedOnly: 'true', relevance: '41', ranking: '41'}}
],
tags: [
{id: 't1', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'media', relevance: 96.0, feats: {wikidataId: 'Q3052772', gkbEntityType: 'person'}},
{id: 't2', gkbId: 'G183', stdForm: 'Germany', type: 'media', relevance: 94.0, feats: {wikidataId: 'Q183', gkbEntityType: 'location'}},
{id: 't3', gkbId: 'G1726', stdForm: 'Munich', type: 'media', relevance: 66.0, feats: {wikidataId: 'Q1726', gkbEntityType: 'location'}},
{id: 't4', gkbId: 'IPTC-11000000', stdForm: 'politics', type: 'media-topic', relevance: 68.51, feats: {MediaTopicId: '11000000', wikidataId: 'Q7163', gkbEntityType: 'general'}}
],
usedChars: 100,
metadata: {referenceKey: '241014-164726-9bdaf485'},
}
{
'version': '3.3.0',
'id': '1234',
'language': {'detected': 'en'},
'entities': [
{'id': 'e0', 'gkbId': 'G57305', 'stdForm': 'trade fair', 'type': 'general', 'feats': {'relevance': '11', 'ranking': '11'}},
{'id': 'e1', 'gkbId': 'G183', 'stdForm': 'Germany', 'type': 'location', 'feats': {'derivedBy': 'country', 'relevance': '94', 'ranking': '94'}},
{'id': 'e2', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'location', 'feats': {'derivedBy': 'city', 'relevance': '66', 'ranking': '66'}},
{'id': 'e3', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'person', 'feats': {'relevance': '96', 'ranking': '96'}},
{'id': 'e4', 'gkbId': 'G980', 'stdForm': 'Bavaria', 'type': 'location', 'feats': {'derivedBy': 'region', 'derivedOnly': 'true', 'relevance': '42', 'ranking': '42'}},
{'id': 'e5', 'gkbId': 'G10562', 'stdForm': 'Upper Bavaria', 'type': 'location', 'feats': {'derivedBy': 'district', 'derivedOnly': 'true', 'relevance': '41', 'ranking': '41'}}
],
'tags': [
{'id': 't1', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 96.0, 'feats': {'wikidataId': 'Q3052772', 'gkbEntityType': 'person'}},
{'id': 't2', 'gkbId': 'G183', 'stdForm': 'Germany', 'type': 'media', 'relevance': 94.0, 'feats': {'wikidataId': 'Q183', 'gkbEntityType': 'location'}},
{'id': 't3', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'media', 'relevance': 66.0, 'feats': {'wikidataId': 'Q1726', 'gkbEntityType': 'location'}},
{'id': 't4', 'gkbId': 'IPTC-11000000', 'stdForm': 'politics', 'type': 'media-topic', 'relevance': 68.51, 'feats': {'MediaTopicId': '11000000', 'wikidataId': 'Q7163', 'gkbEntityType': 'general'}}
],
'usedChars': 100,
'metadata': {'referenceKey': '241014-164726-9bdaf485'},
}
Entities: ## there are no entities here, unless they are enabled in your plan and configuration
general: trade fair (G57305) relevance: 11
location: Germany (G183) relevance: 94
location: Munich (G1726) relevance: 66
person: Emmanuel Macron (G3052772) relevance: 96
location: Bavaria (G980) relevance: 42
location: Upper Bavaria (G10562) relevance: 41
Tags:
media: Emmanuel Macron (G3052772) relevance: 96.0
media: Germany (G183) relevance: 94.0
media: Munich (G1726) relevance: 66.0
media-topic: politics (IPTC-11000000) relevance: 68.51
In addition to tags, we now also receive entities. Here are a few key points to keep in mind:
- media tags are a subset of the identified entities. In general, their relevance scores match those of the corresponding entities, meaning we can interpret them as the most relevant entities. However, this equivalence is not always guaranteed. Entity relevance is determined solely by the article's content, while tag relevance can also be influenced by other factors, such as editorial preferences or contextual weighting. Additionally, although tags may reflect the top N entities, we can adjust the relevance of specific tag types, or individual tags, based on their context. For example, location tags might carry more weight in a travel article than in a sports article.
- Some entities are classified as derived. These are inferred from context rather than mentioned explicitly. For example, the state of Bavaria and the region of Upper Bavaria may be included because the text refers to Munich, even though those entities are not directly stated. An entity like Germany may combine both direct and indirect references: it may be explicitly mentioned while also implied through mentions of locations like Munich.
- Certain metadata is attached to entities and tags in the form of features
(e.g., relevance, derivation method). These are expressed as key-value pairs,
where both the keys and values are always strings. If a feature represents
a different semantic type, such as a number, it must be converted accordingly.
For example, a "relevance": "94" feature should be interpreted as the number 94, not a string.
Paragraphs
The API and SDKs allow easy specification of an article's title and body.
To include other types of paragraphs (such as the lead paragraph) or multiple text blocks,
use the paraSpecs field. The public API currently recognizes three paragraph types:
title, abstract (also referred to as lead) and text (the body of the article).
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"paraSpecs": [
{"type": "title", "text": "Macron in Germany."},
{"type": "abstract", "text": "Emmanuel Macron is visiting Germany again."},
{"type": "text", "text": "Mr. Macron visited a trade show in Munich."}
]
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"paraSpecs\": [
{\"type\": \"title\", \"text\": \"Macron in Germany.\"},
{\"type\": \"abstract\", \"text\": \"Emmanuel Macron is visiting Germany again.\"},
{\"type\": \"text\", \"text\": \"Mr. Macron visited a trade show in Munich.\"}
]
}"
const input = {
id: '1234',
paraSpecs: [
{type: 'title', text: 'Macron in Germany.'},
{type: 'abstract', text: 'Emmanuel Macron is visiting Germany again.'},
{type: 'text', text: 'Mr. Macron visited a trade show in Munich.'}
]
}
// see the definition of analyze above
analyze(config, input).then(report);
input = {
'id': '1234',
'paraSpecs': [
{'type': 'title', 'text': 'Macron in Germany.'},
{'type': 'abstract', 'text': 'Emmanuel Macron is visiting Germany again.'},
{'type': 'text', 'text': 'Mr. Macron visited a trade show in Munich.'}
]
}
## see the definition of analyze above
analyze(input)
requestBuilder = g3.Request.Builder()
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(
        id='1234',
        paraSpecs=[
            g3.ParaSpec.title('Macron in Germany.'),
            g3.ParaSpec.lead('Emmanuel Macron is visiting Germany again.'),  ## g3.ParaSpec.abstract is equivalent
            g3.ParaSpec.body('Mr. Macron visited a trade show in Munich.')
        ]
    )
    result = analyzer.analyze(request)
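When articles come from a CMS, the title, lead, and body blocks are typically stored in separate fields. The following is a minimal sketch of a helper that turns them into a paraSpecs list for the plain HTTP API. The function name build_para_specs is hypothetical; the three paragraph types are the ones listed above.

```python
def build_para_specs(title=None, lead=None, body=()):
    """Build a paraSpecs list from a title, a lead, and any number of body blocks.

    Empty or whitespace-only parts are skipped.
    """
    if isinstance(body, str):
        body = [body]  # accept a single body string as well as a list of blocks
    specs = []
    if title and title.strip():
        specs.append({'type': 'title', 'text': title})
    if lead and lead.strip():
        specs.append({'type': 'abstract', 'text': lead})
    specs.extend({'type': 'text', 'text': block} for block in body if block and block.strip())
    return specs


payload = {
    'id': '1234',
    'paraSpecs': build_para_specs(
        title='Macron in Germany.',
        lead='Emmanuel Macron is visiting Germany again.',
        body=['Mr. Macron visited a trade show in Munich.'],
    ),
}
```

The resulting payload has the same shape as the input in the Python example above and can be passed to analyze.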
Topic Categories (Sections)
Often, the topic or section of an article is known in advance—for example, when
the article appears under a particular section of a website, such as sport or hobby.
Providing this information is optional, as the system will always attempt to detect
the topic automatically during analysis. However, if the category is known, including it
can improve the quality and accuracy of the results.
We support two types of topic categories:
- Standard IPTC Media Topics
- Custom categories or sections defined by the publisher (these must be configured on our side to have any effect)
These two types can be used together, as shown in the example below:
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich.",
"presentationLanguage": "fr",
"categories": [{"taxonomy": "MediaTopic", "code": "11000000"}, {"taxonomy": "Custom", "code": "politics"} ]
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\",
\"presentationLanguage\": \"fr\",
\"categories\": [{\"taxonomy\": \"MediaTopic\", \"code\": \"11000000\"}, {\"taxonomy\": \"Custom\", \"code\": \"politics\"} ]
}"
const categories = [
{taxonomy: 'MediaTopic', code: '11000000'}, // IPTC category
{taxonomy: 'Custom', code: 'politics'} // custom category
]
const input = {
id: '1234',
title: 'Emmanuel Macron in Germany.',
text: 'Mr. Macron visited a trade show in Munich.',
categories: categories
}
// see the definition of analyze above
analyze(config, input).then(report);
categories = [
{'taxonomy': 'MediaTopic', 'code': '11000000'}, ## IPTC category
{'taxonomy': 'Custom', 'code': 'politics'}, ## custom category
]
input = {
'id': '1234',
'title': 'Emmanuel Macron in Germany.',
'text': 'Mr. Macron visited a trade show in Munich.',
'categories': categories,
}
## see the definition of analyze above
analyze(input)
requestBuilder = g3.Request.Builder()
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    categories = [
        {'taxonomy': 'MediaTopic', 'code': '11000000'},  ## IPTC category
        {'taxonomy': 'Custom', 'code': 'politics'},  ## custom category
    ]
    request = requestBuilder.build(
        id='1234',
        title='Emmanuel Macron in Germany.',
        text='Mr. Macron visited a trade show in Munich.'
    )
    request.setCustomConfig(categories=categories)
    result = analyzer.analyze(request)
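Since a misspelled taxonomy name is easy to overlook, the categories can be built through a small helper. The following is a minimal sketch for the plain HTTP API. It assumes that MediaTopic and Custom, the two taxonomy names used in the examples above, are the only accepted values; category and with_categories are hypothetical helper names.

```python
# Assumption: these are the only taxonomy names the API accepts
TAXONOMIES = {'MediaTopic', 'Custom'}


def category(taxonomy, code):
    """Build one category entry, rejecting unknown taxonomy names."""
    if taxonomy not in TAXONOMIES:
        raise ValueError(f'Unknown taxonomy {taxonomy!r}; expected one of {sorted(TAXONOMIES)}')
    return {'taxonomy': taxonomy, 'code': str(code)}


def with_categories(payload, *categories):
    """Return a copy of the payload with the categories attached (omitted if none)."""
    result = dict(payload)
    if categories:
        result['categories'] = list(categories)
    return result


payload = with_categories(
    {
        'id': '1234',
        'title': 'Emmanuel Macron in Germany.',
        'text': 'Mr. Macron visited a trade show in Munich.',
    },
    category('MediaTopic', 11000000),  # IPTC category
    category('Custom', 'politics'),    # custom category
)
```

Codes are converted to strings, so an IPTC code passed as an integer yields the same payload as in the examples above.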
Presentation Language
By default, entities and tags are presented in the language of the document—typically
English. However, you can request that they be returned in a different language by
specifying the presentationLanguage parameter using the appropriate ISO code.
Supported languages include Czech, Dutch, English, French, German, Polish, Portuguese, Slovak, and Spanish.
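The language code can be checked before a request is sent. The following is a minimal sketch: the two-letter codes are the ISO 639-1 codes of the languages listed above, on the assumption that the API expects that form (the examples in this guide use "fr") and that the list is complete; with_presentation_language is a hypothetical helper, not part of the API or the SDK.

```python
# ISO 639-1 codes of the languages listed above (assumed to be the accepted form)
PRESENTATION_LANGUAGES = {
    'cs': 'Czech',
    'nl': 'Dutch',
    'en': 'English',
    'fr': 'French',
    'de': 'German',
    'pl': 'Polish',
    'pt': 'Portuguese',
    'sk': 'Slovak',
    'es': 'Spanish',
}


def with_presentation_language(payload, code):
    """Return a copy of the payload that requests results in the given language."""
    code = code.strip().lower()
    if code not in PRESENTATION_LANGUAGES:
        raise ValueError(f'Unsupported presentation language: {code!r}')
    return {**payload, 'presentationLanguage': code}


payload = with_presentation_language(
    {
        'id': '1234',
        'title': 'Emmanuel Macron in Germany.',
        'text': 'Mr. Macron visited a trade show in Munich.',
    },
    'FR',
)
print(payload['presentationLanguage'])
# fr
```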
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich.",
"presentationLanguage": "fr"
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\",
\"presentationLanguage\": \"fr\"
}"
const analyze = async (config, input) => {
const response = await axios.post('nlp/analyze', input, config);
return response.data;
};
const input = {
id: '1234',
title: 'Emmanuel Macron in Germany.',
text: 'Mr. Macron visited a trade show in Munich.',
presentationLanguage: 'fr'
}
analyze(config, input).then(report);
def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()
input = {
'id': '1234',
'title': 'Emmanuel Macron in Germany.',
'text': 'Mr. Macron visited a trade show in Munich.',
'presentationLanguage': 'fr'
}
analyze(input)
requestBuilder = g3.Request.Builder(customConfig={'presentationLanguage': 'fr'})
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    result = analyzer.analyze(request)
print("Entities:")
for e in result.entities:
    print(f' {e.type}: {e.stdForm} ({e.gkbId}) relevance: {e.feats.get("relevance")}, derivedBy: {e.feats.get("derivedBy", "N/A")}, derivedOnly: {e.feats.get("derivedOnly", "false")}')
print("Tags:")
for t in result.tags:
    print(f' {t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
The following is an example response.
For an explanation of each field, see the Analysis reference page.
Note that we've omitted the relations field for brevity.
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"entities": [
{"id": "e0", "gkbId": "G57305", "stdForm": "salon", "type": "general", "feats": {"relevance": "11", "ranking": "11"}},
{"id": "e1", "gkbId": "G183", "stdForm": "Allemagne", "type": "location", "feats": {"derivedBy": "country", "relevance": "94", "ranking": "94"}},
{"id": "e2", "gkbId": "G1726", "stdForm": "Munich", "type": "location", "feats": {"derivedBy": "city", "relevance": "66", "ranking": "66"}},
{"id": "e3", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "person", "feats": {"relevance": "96", "ranking": "96"}},
{"id": "e4", "gkbId": "G980", "stdForm": "Bavière", "type": "location", "feats": {"derivedBy": "region", "derivedOnly": "true", "relevance": "42", "ranking": "42"}},
{"id": "e5", "gkbId": "G10562", "stdForm": "Haute-Bavière", "type": "location", "feats": {"derivedBy": "district", "derivedOnly": "true", "relevance": "41", "ranking": "41"}}
],
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Allemagne", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "Politique", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-ab2eaf07"}
}
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"entities": [
{"id": "e0", "gkbId": "G57305", "stdForm": "salon", "type": "general", "feats": {"relevance": "11", "ranking": "11"}},
{"id": "e1", "gkbId": "G183", "stdForm": "Allemagne", "type": "location", "feats": {"derivedBy": "country", "relevance": "94", "ranking": "94"}},
{"id": "e2", "gkbId": "G1726", "stdForm": "Munich", "type": "location", "feats": {"derivedBy": "city", "relevance": "66", "ranking": "66"}},
{"id": "e3", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "person", "feats": {"relevance": "96", "ranking": "96"}},
{"id": "e4", "gkbId": "G980", "stdForm": "Bavière", "type": "location", "feats": {"derivedBy": "region", "derivedOnly": "true", "relevance": "42", "ranking": "42"}},
{"id": "e5", "gkbId": "G10562", "stdForm": "Haute-Bavière", "type": "location", "feats": {"derivedBy": "district", "derivedOnly": "true", "relevance": "41", "ranking": "41"}}
],
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Allemagne", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "Politique", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-ab2eaf07"}
}
{
version: '3.3.0',
id: '1234',
language: {detected: 'en'},
entities: [
{id: 'e0', gkbId: 'G57305', stdForm: 'salon', type: 'general', feats: {relevance: '11', ranking: '11'}},
{id: 'e1', gkbId: 'G183', stdForm: 'Allemagne', type: 'location', feats: {derivedBy: 'country', relevance: '94', ranking: '94'}},
{id: 'e2', gkbId: 'G1726', stdForm: 'Munich', type: 'location', feats: {derivedBy: 'city', relevance: '66', ranking: '66'}},
{id: 'e3', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'person', feats: {relevance: '96', ranking: '96'}},
{id: 'e4', gkbId: 'G980', stdForm: 'Bavière', type: 'location', feats: {derivedBy: 'region', derivedOnly: 'true', relevance: '42', ranking: '42'}},
{id: 'e5', gkbId: 'G10562', stdForm: 'Haute-Bavière', type: 'location', feats: {derivedBy: 'district', derivedOnly: 'true', relevance: '41', ranking: '41'}}
],
tags: [
{id: 't1', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'media', relevance: 96.0, feats: {wikidataId: 'Q3052772', gkbEntityType: 'person'}},
{id: 't2', gkbId: 'G183', stdForm: 'Allemagne', type: 'media', relevance: 94.0, feats: {wikidataId: 'Q183', gkbEntityType: 'location'}},
{id: 't3', gkbId: 'G1726', stdForm: 'Munich', type: 'media', relevance: 66.0, feats: {wikidataId: 'Q1726', gkbEntityType: 'location'}},
{id: 't4', gkbId: 'IPTC-11000000', stdForm: 'Politique', type: 'media-topic', relevance: 68.51, feats: {MediaTopicId: '11000000', wikidataId: 'Q7163', gkbEntityType: 'general'}}
],
usedChars: 100,
metadata: {referenceKey: '241014-164726-ab2eaf07'},
}
{
'version': '3.3.0',
'id': '1234',
'language': {'detected': 'en'},
'entities': [
{'id': 'e0', 'gkbId': 'G57305', 'stdForm': 'salon', 'type': 'general', 'feats': {'relevance': '11', 'ranking': '11'}},
{'id': 'e1', 'gkbId': 'G183', 'stdForm': 'Allemagne', 'type': 'location', 'feats': {'derivedBy': 'country', 'relevance': '94', 'ranking': '94'}},
{'id': 'e2', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'location', 'feats': {'derivedBy': 'city', 'relevance': '66', 'ranking': '66'}},
{'id': 'e3', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'person', 'feats': {'relevance': '96', 'ranking': '96'}},
{'id': 'e4', 'gkbId': 'G980', 'stdForm': 'Bavière', 'type': 'location', 'feats': {'derivedBy': 'region', 'derivedOnly': 'true', 'relevance': '42', 'ranking': '42'}},
{'id': 'e5', 'gkbId': 'G10562', 'stdForm': 'Haute-Bavière', 'type': 'location', 'feats': {'derivedBy': 'district', 'derivedOnly': 'true', 'relevance': '41', 'ranking': '41'}}
],
'tags': [
{'id': 't1', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 96.0, 'feats': {'wikidataId': 'Q3052772', 'gkbEntityType': 'person'}},
{'id': 't2', 'gkbId': 'G183', 'stdForm': 'Allemagne', 'type': 'media', 'relevance': 94.0, 'feats': {'wikidataId': 'Q183', 'gkbEntityType': 'location'}},
{'id': 't3', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'media', 'relevance': 66.0, 'feats': {'wikidataId': 'Q1726', 'gkbEntityType': 'location'}},
{'id': 't4', 'gkbId': 'IPTC-11000000', 'stdForm': 'Politique', 'type': 'media-topic', 'relevance': 68.51, 'feats': {'MediaTopicId': '11000000', 'wikidataId': 'Q7163', 'gkbEntityType': 'general'}}
],
'usedChars': 100,
'metadata': {'referenceKey': '241014-164726-ab2eaf07'},
}
Entities: ## there are no entities here, unless they are enabled in your plan and configuration
general: salon (G57305) relevance: 11
location: Allemagne (G183) relevance: 94
location: Munich (G1726) relevance: 66
person: Emmanuel Macron (G3052772) relevance: 96
location: Bavière (G980) relevance: 42
location: Haute-Bavière (G10562) relevance: 41
Tags:
media: Emmanuel Macron (G3052772) relevance: 96.0
media: Allemagne (G183) relevance: 94.0
media: Munich (G1726) relevance: 66.0
media-topic: Politique (IPTC-11000000) relevance: 68.51
If you need tags and entities translated into more than one language, see Multiple Presentation Languages.
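In the sample response above, a tag and an entity that refer to the same knowledge base item share a gkbId, while entity relevance is a string inside feats and tag relevance is a number. The sketch below pairs tags with entities on that basis. The helper names are hypothetical, the data is trimmed from the sample, and reading derivedOnly as "inferred rather than mentioned in the text" is an assumption based on the sample, not a documented guarantee.

```python
# Data trimmed from the sample response above.
response = {
    "entities": [
        {"id": "e1", "gkbId": "G183", "stdForm": "Allemagne", "type": "location",
         "feats": {"derivedBy": "country", "relevance": "94", "ranking": "94"}},
        {"id": "e3", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "person",
         "feats": {"relevance": "96", "ranking": "96"}},
        {"id": "e4", "gkbId": "G980", "stdForm": "Bavière", "type": "location",
         "feats": {"derivedBy": "region", "derivedOnly": "true",
                   "relevance": "42", "ranking": "42"}},
    ],
    "tags": [
        {"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron",
         "type": "media", "relevance": 96.0},
        {"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "Politique",
         "type": "media-topic", "relevance": 68.51},
    ],
}


def entity_relevance(entity):
    """Convert the string relevance stored in feats to a float (0.0 if absent)."""
    return float(entity.get("feats", {}).get("relevance", "0"))


def is_derived_only(entity):
    """True when feats.derivedOnly is the string 'true' (assumed meaning: inferred only)."""
    return entity.get("feats", {}).get("derivedOnly") == "true"


def tags_with_entities(resp):
    """Pair each tag with the entity sharing its gkbId, or None if there is none."""
    by_gkb_id = {e["gkbId"]: e for e in resp.get("entities", [])}
    return [(t, by_gkb_id.get(t["gkbId"])) for t in resp.get("tags", [])]


pairs = tags_with_entities(response)
mentioned = [e["stdForm"] for e in response["entities"] if not is_derived_only(e)]
```

Topic tags such as Politique have no matching entity, so their pair holds None. The entities list is absent entirely when entities are not enabled for your plan, which is why the lookups use .get with a default.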
Knowledge Base Properties
Additional information from the Geneea Knowledge Base can be returned along with tags and entities.
The specific set of properties is configurable.
In the example below, the description property is returned for each tag or entity.
A GKB property has three types of attributes:
- name: a language-independent identifier. Multiple properties may share the same name (e.g., several occupation values).
- label: a human-readable label of the property in the presentation language of the analysis.
- One of the following value fields:
  - boolValue
  - floatValue
  - intValue
  - strValue

Exactly one of these fields will be present for each property.
If a property is not available for a specific tag or entity, it will not be included in the output.
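As a rough illustration of the rules above, the following sketch reads properties from a raw JSON response parsed into Python dicts (for example, the result of the plain Python call rather than the SDK). The helper names property_value and properties_by_name are hypothetical, and the sketch assumes that unused value fields are omitted from the JSON rather than sent as null.

```python
VALUE_FIELDS = ("boolValue", "floatValue", "intValue", "strValue")


def property_value(prop):
    """Return the value of the single value field present in a property dict."""
    # Test for key presence, not truthiness, so that False and 0 are kept.
    present = [field for field in VALUE_FIELDS if field in prop]
    if len(present) != 1:
        raise ValueError(f"expected exactly one value field, found {present}")
    return prop[present[0]]


def properties_by_name(item):
    """Group the property values of a tag or entity by property name."""
    grouped = {}
    # gkbProperties may be missing when no property is available for the item.
    for prop in item.get("gkbProperties", []):
        # A name can occur several times, so each name maps to a list.
        grouped.setdefault(prop["name"], []).append(property_value(prop))
    return grouped


tag = {
    "stdForm": "Emmanuel Macron",
    "gkbProperties": [
        {"name": "description", "label": "description",
         "strValue": "President of France and Co-Prince of Andorra since 2017"},
    ],
}
print(properties_by_name(tag))
```

Mapping each name to a list keeps multi-valued properties intact. Callers that expect a single value, such as description, can take the first element.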
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich."
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\"
}"
const analyze = async (config, input) => {
  const response = await axios.post('nlp/analyze', input, config);
  return response.data;
};
const input = {
  id: '1234',
  title: 'Emmanuel Macron in Germany.',
  text: 'Mr. Macron visited a trade show in Munich.'
};
analyze(config, input).then(report);
def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()

input = {
    'id': '1234',
    'title': 'Emmanuel Macron in Germany.',
    'text': 'Mr. Macron visited a trade show in Munich.'
}
analyze(input)
requestBuilder = g3.Request.Builder()

def value(prop: g3.GkbProperty) -> Union[bool, float, int, str]:
    for v in [prop.boolValue, prop.floatValue, prop.intValue, prop.strValue]:
        if v is not None:
            return v

with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    ## request.custom['returnGkbProperties'] = False ## optionally disable the feature
    result = analyzer.analyze(request)

    print("Tags:")
    for t in result.tags:
        print(f'\t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
        for prop in t.gkbProperties:
            print(f'\t\t{prop.name} - {prop.label}: {value(prop)}')
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
{
"version": "3.3.0",
"id": "1234",
"language": { "detected": "en" },
"tags": [
{ "id": "t0", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 22.605,
"feats": { "wikidataId": "Q3052772" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "President of France and Co-Prince of Andorra since 2017"}]
},
{ "id": "t1", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 18.365,
"feats": { "wikidataId": "Q183" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "country in Central Europe"}]
},
{ "id": "t2", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 7.57,
"feats": { "wikidataId": "Q1726" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "capital and most populous city of Bavaria, Germany"}]
}
],
"usedChars": 100,
"metadata": {"referenceKey": "311441-120020-a24f0281"}
}
{
"version": "3.3.0",
"id": "1234",
"language": { "detected": "en" },
"tags": [
{ "id": "t0", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 22.605,
"feats": { "wikidataId": "Q3052772" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "President of France and Co-Prince of Andorra since 2017"}]
},
{ "id": "t1", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 18.365,
"feats": { "wikidataId": "Q183" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "country in Central Europe"}]
},
{ "id": "t2", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 7.57,
"feats": { "wikidataId": "Q1726" },
"gkbProperties": [{"name": "description", "label": "description", "strValue": "capital and most populous city of Bavaria, Germany"}]
}
],
"usedChars": 100,
"metadata": {"referenceKey": "311441-120020-a24f0281"}
}
{
version: '3.3.0',
id: '1234',
language: { detected: 'en' },
tags: [
{ id: 't0', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'media', relevance: 22.605,
feats: { wikidataId: 'Q3052772' },
gkbProperties: [{name: 'description', label: 'description', strValue: 'President of France and Co-Prince of Andorra since 2017'}]
},
{ id: 't1', gkbId: 'G183', stdForm: 'Germany', type: 'media', relevance: 18.365,
feats: { wikidataId: 'Q183' },
gkbProperties: [{name: 'description', label: 'description', strValue: 'country in Central Europe'}]
},
{ id: 't2', gkbId: 'G1726', stdForm: 'Munich', type: 'media', relevance: 7.57,
feats: { wikidataId: 'Q1726' },
gkbProperties: [{name: 'description', label: 'description', strValue: 'capital and most populous city of Bavaria, Germany'}]
}
],
usedChars: 100,
metadata: {referenceKey: '311441-120020-a24f0281'}
}
{
'version': '3.3.0',
'id': '1234',
'language': { 'detected': 'en' },
'tags': [
{ 'id': 't0', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 22.605,
'feats': { 'wikidataId': 'Q3052772' },
'gkbProperties': [{'name': 'description', 'label': 'description', 'strValue': 'President of France and Co-Prince of Andorra since 2017'}]
},
{ 'id': 't1', 'gkbId': 'G183', 'stdForm': 'Germany', 'type': 'media', 'relevance': 18.365,
'feats': { 'wikidataId': 'Q183' },
'gkbProperties': [{'name': 'description', 'label': 'description', 'strValue': 'country in Central Europe'}]
},
{ 'id': 't2', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'media', 'relevance': 7.57,
'feats': { 'wikidataId': 'Q1726' },
'gkbProperties': [{'name': 'description', 'label': 'description', 'strValue': 'capital and most populous city of Bavaria, Germany'}]
}
],
'usedChars': 100,
'metadata': {'referenceKey': '311441-120020-a24f0281'}
}
Tags:
    media: Emmanuel Macron (G3052772) relevance: 22.605
        description - description: President of France and Co-Prince of Andorra since 2017
    media: Germany (G183) relevance: 18.365
        description - description: country in Central Europe
    media: Munich (G1726) relevance: 7.57
        description - description: capital and most populous city of Bavaria, Germany
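If you call the API with plain Python rather than the SDK, a listing like the one above can be produced from the raw response dict. The sketch below is one possible way to do it. The function name format_tags is hypothetical, the response is trimmed from the sample output, and the sketch assumes each property carries at most one of the four value fields.

```python
VALUE_FIELDS = ("boolValue", "floatValue", "intValue", "strValue")

# Response trimmed from the sample output above.
response = {
    "tags": [
        {"id": "t1", "gkbId": "G183", "stdForm": "Germany", "type": "media",
         "relevance": 18.365,
         "gkbProperties": [{"name": "description", "label": "description",
                            "strValue": "country in Central Europe"}]},
        {"id": "t2", "gkbId": "G1726", "stdForm": "Munich", "type": "media",
         "relevance": 7.57},  # no gkbProperties: nothing is printed below the tag
    ],
}


def format_tags(resp):
    """Render tags and their knowledge base properties as tab-indented text."""
    lines = ["Tags:"]
    for tag in resp.get("tags", []):
        lines.append(
            f"\t{tag['type']}: {tag['stdForm']} ({tag['gkbId']}) "
            f"relevance: {tag['relevance']}"
        )
        for prop in tag.get("gkbProperties", []):
            # Pick whichever value field is present; None if there is none.
            value = next((prop[f] for f in VALUE_FIELDS if f in prop), None)
            lines.append(f"\t\t{prop['name']} - {prop['label']}: {value}")
    return "\n".join(lines)


print(format_tags(response))
```

The second tag in the sample deliberately has no gkbProperties key, which mirrors the case where a property is not available and is therefore left out of the output.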