Knowledge Base
All returned tags and entities are linked to the Geneea Knowledge Base (GKB, Geneea KB). The GKB combines open data (e.g., Wikidata, DBpedia, OpenStreetMap, company registries) with Geneea's proprietary data.
It also supports custom properties such as your internal IDs and custom items.
This section covers:
- Entity info-boxes: retrieve info-boxes for entities by their IDs
- Entity details: retrieve details about entities by their IDs
- Multiple presentation languages: get the standard form (title) in a specific language
- Deprecated and duplicate IDs: how duplicates are handled
- Knowledge base search: search for entities by name
Basic Code
The setup code below is common to all guide pages.
To use the API, you'll need a valid API key with the appropriate permissions. If you don't have one, please contact us here.
In the code below, replace <YOUR_API_KEY> with your actual API key.
Note: We do not currently provide dedicated SDKs for this API, but our G3 SDKs can be used to perform NLP analysis.
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
# No special setup necessary
# No special setup necessary
// HTTP client; see https://github.com/axios/axios
const axios = require('axios');
const config = {
baseURL: 'https://media-api.geneea.com/v2/',
headers: {
'X-API-KEY': '<YOUR_API_KEY>'
}
};
// A simple function to report the returned json objects
const report = (output) => console.dir(output, { depth: null });
// In a production environment, always call the API from the backend;
// otherwise you will run into CORS problems
# Type hints used by the function signatures in this guide
from typing import Any, List, Mapping
# http client; see https://docs.python-requests.org/en/latest/
import requests
BASE_URL = 'https://media-api.geneea.com/v2/'
HEADERS = {
'content-type': 'application/json',
'X-API-Key': '<YOUR_API_KEY>'
}
# We do not provide a dedicated SDK for the Media API yet.
# However, the SDK for General API can be used for the content analysis (i.e. NLP) part of the Media API.
# Geneea NLP client SDK; see https://help.geneea.com/sdk/index.html
# Use `pip install geneea-nlp-client`
from geneeanlpclient import g3
BASE_URL = 'https://media-api.geneea.com/v2/'
API_KEY = '<YOUR_API_KEY>'
Entity Info Boxes
Sometimes, it is useful to show more than just an entity's name. The following code retrieves brief descriptions and links (e.g., Wikipedia, Wikidata) for specific entities. More features will be added soon.
- JavaScript
- Python
const kbInfoBoxes = async (input, config) => {
const response = await axios.post('knowledgebase/infoboxes', input, config)
return response.data;
};
input = {
ids: ['G458', 'G567'],
language: 'fr'
};
kbInfoBoxes(input, config).then(report);
def kbInfoBoxes(input: Mapping[str, Any]):
    return requests.post(f'{BASE_URL}knowledgebase/infoboxes', json=input, headers=HEADERS).json()
input = {
'ids': ['G458', 'G567'],
'language': 'fr',
}
kbInfoBoxes(input)
- JavaScript
- Python
{
G458: {
value: {
title: "Union européenne",
header: "union politico-économique sui generis d'États européens",
body: "",
footer: {
cswiki: "https://cs.wikipedia.org/wiki/Evropská_unie",
enwiki: "https://en.wikipedia.org/wiki/European_Union",
wikidata: "https://www.wikidata.org/entity/Q458",
},
},
language: "fr",
},
G567: {
value: {
title: "Angela Merkel",
header: "chancelière fédérale allemande",
body: "",
footer: {
cswiki: "https://cs.wikipedia.org/wiki/Angela_Merkelová",
enwiki: "https://en.wikipedia.org/wiki/Angela_Merkel",
facebook: "https://www.facebook.com/AngelaMerkel",
instagram: "https://www.instagram.com/bundeskanzlerin",
wikidata: "https://www.wikidata.org/entity/Q567",
},
},
language: "fr",
},
}
{
'G458': {
'value': {
'title': 'Union européenne',
'header': 'union politico-économique sui generis d\'États européens',
'body': '',
'footer': {
'cswiki': 'https://cs.wikipedia.org/wiki/Evropská_unie',
'enwiki': 'https://en.wikipedia.org/wiki/European_Union',
'wikidata': 'https://www.wikidata.org/entity/Q458',
},
},
'language': 'fr',
},
'G567': {
'value': {
'title': 'Angela Merkel',
'header': 'chancelière fédérale allemande',
'body': '',
'footer': {
'cswiki': 'https://cs.wikipedia.org/wiki/Angela_Merkelová',
'enwiki': 'https://en.wikipedia.org/wiki/Angela_Merkel',
'facebook': 'https://www.facebook.com/AngelaMerkel',
'instagram': 'https://www.instagram.com/bundeskanzlerin',
'wikidata': 'https://www.wikidata.org/entity/Q567',
},
},
'language': 'fr',
},
}
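Turning an info-box into display text is mostly a matter of reading `title`, `header`, and `footer`. The following is a minimal sketch that assumes the response shape shown above; `format_infobox` is a hypothetical helper, not part of the API.

```python
from typing import Any, Mapping


def format_infobox(box: Mapping[str, Any]) -> str:
    """Render one info-box entry (shaped like the response above) as plain text."""
    value = box.get('value', {})
    lines = [value.get('title', '')]
    if value.get('header'):
        lines.append(value['header'])
    # Footer keys are link names (e.g. 'enwiki', 'wikidata'); sort for stable output
    for name, url in sorted(value.get('footer', {}).items()):
        lines.append(f'{name}: {url}')
    return '\n'.join(lines)


# Sample entry copied from the example response above
sample = {
    'value': {
        'title': 'Union européenne',
        'header': "union politico-économique sui generis d'États européens",
        'body': '',
        'footer': {
            'cswiki': 'https://cs.wikipedia.org/wiki/Evropská_unie',
            'enwiki': 'https://en.wikipedia.org/wiki/European_Union',
            'wikidata': 'https://www.wikidata.org/entity/Q458',
        },
    },
    'language': 'fr',
}
print(format_infobox(sample))
```

The empty `body` field is skipped here because the example responses leave it blank; adjust the helper if your data populates it.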
Entity Details
You can also retrieve details about an entity by its external ID, such as a Wikidata ID.
- JavaScript
- Python
const kbDetails = async (input, config) => {
const response = await axios.post('knowledgebase/details', input, config)
return response.data;
};
input = {
ids: ['Q42'],
language: 'en',
externalSource: 'wikidata'
};
kbDetails(input, config).then(report);
def kbDetails(input: Mapping[str, Any]):
    return requests.post(f'{BASE_URL}knowledgebase/details', json=input, headers=HEADERS).json()
input = {
'ids': ['Q42'],
'language': 'en',
'externalSource': 'wikidata',
}
kbDetails(input)
- JavaScript
- Python
{
Q42: {
gkbId: "G42",
stdForm: {
value: "Douglas Adams",
language: "en"
},
description: {
value: "English science fiction writer and humorist (1952–2001)",
language: "en"
},
type: "person",
externalIds: {
wikidata: "Q42"
},
externalLinks: {
"enwiki": "https://en.wikipedia.org/wiki/Douglas_Adams",
"wikidata": "https://www.wikidata.org/entity/Q42"
}
}
}
{
'Q42': {
'gkbId': 'G42',
'stdForm': {
'value': 'Douglas Adams',
'language': 'en',
},
'description': {
'value': 'English science fiction writer and humorist (1952–2001)',
'language': 'en',
},
'type': 'person',
'externalIds': {
'wikidata': 'Q42',
},
'externalLinks': {
'enwiki': 'https://en.wikipedia.org/wiki/Douglas_Adams',
'wikidata': 'https://www.wikidata.org/entity/Q42',
}
}
}
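A common use of this endpoint is translating external IDs into GKB IDs. The sketch below assumes the response is keyed by the requested external ID, as in the example above; `external_to_gkb` is a hypothetical helper, and how the API reports unknown IDs is not covered here, so entries without a `gkbId` are simply skipped.

```python
from typing import Any, Dict, Mapping


def external_to_gkb(details: Mapping[str, Mapping[str, Any]]) -> Dict[str, str]:
    """Map each external ID in a details response to its GKB ID."""
    return {
        ext_id: item['gkbId']
        for ext_id, item in details.items()
        if item.get('gkbId')  # skip entries that carry no GKB ID
    }


# Sample trimmed from the example response above
sample = {
    'Q42': {
        'gkbId': 'G42',
        'stdForm': {'value': 'Douglas Adams', 'language': 'en'},
        'type': 'person',
        'externalIds': {'wikidata': 'Q42'},
    },
}
print(external_to_gkb(sample))  # {'Q42': 'G42'}
```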
Multiple Presentation Languages
The knowledge base supports presenting entity names (or tags) in multiple languages.
While this can be achieved by running analyses with different Presentation Language values, you can also retrieve translated forms directly.
Here's how to get Polish names for entities/tags from the example in the photo recommendation guide:
- JavaScript
- Python
const kbStdForms = async (input, config) => {
const response = await axios.post('knowledgebase/stdforms', input, config)
return response.data;
};
input = {
ids: ['G458', 'G567'],
language: 'pl',
};
kbStdForms(input, config).then(report);
def kbStdForms(input: Mapping[str, Any]):
    return requests.post(f'{BASE_URL}knowledgebase/stdforms', json=input, headers=HEADERS).json()
input = {
'ids': ['G458', 'G567'],
'language': 'pl',
}
kbStdForms(input)
- JavaScript
- Python
{
G458: { value: 'Unia Europejska', language: 'pl' },
G567: { value: 'Angela Merkel', language: 'pl' }
}
{
'G458': {'value': 'Unia Europejska', 'language': 'pl'},
'G567': {'value': 'Angela Merkel', 'language': 'pl'}
}
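When rendering tags, it can help to turn this response into a simple ID-to-label lookup. The sketch below assumes the response shape shown above; `labels_for` is a hypothetical helper, and falling back to the raw ID when no form is returned is a design choice of this example, not documented API behavior.

```python
from typing import Any, Dict, Iterable, Mapping


def labels_for(ids: Iterable[str],
               std_forms: Mapping[str, Mapping[str, Any]]) -> Dict[str, str]:
    """Return a display label per ID, falling back to the ID itself."""
    return {i: std_forms.get(i, {}).get('value') or i for i in ids}


# Sample copied from the example response above
sample = {
    'G458': {'value': 'Unia Europejska', 'language': 'pl'},
    'G567': {'value': 'Angela Merkel', 'language': 'pl'},
}
# 'G999' is a made-up ID used only to show the fallback
print(labels_for(['G458', 'G567', 'G999'], sample))
```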
Deprecated and Duplicate IDs
Over time, the knowledge base may accumulate duplicate items—entities that have different IDs but represent the same real-world concept.
For example, the items with IDs G22262439 and G8ad70d13-E both refer to the same person:
Jiří Kulhánek, a Czech local politician. These items are considered duplicates.
There are several reasons why such duplicates may occur:
- Crowdsourced sources (such as Wikidata) may include independently created items for the same entity.
- Multiple data sources (e.g., a business register and Wikipedia) may each include the same real-world entity, but reconciling them is not always straightforward.
- Automated data analysis may produce noisy results.
For instance, our system might detect a John Doe, actor, and a John Doe, politician, as two separate individuals. Only later might it become clear that both refer to the same person: John Doe, who is both an actor and a politician.
Whenever such duplicates are detected, one of them is selected as the primary (canonical) entry, and the others are marked as inactive.
Inactive items are no longer used in new outputs, and only the primary item will appear in results.
How the API Communicates Duplicates
Our API communicates this information in two ways:
- As part of the NLP response: If an entity or tag has had alternative (now deprecated) IDs, these are included in the response.
- Via a dedicated redirects API: This API allows you to query the status of any knowledge base ID and see whether it has been marked as a duplicate of another.
See below for more information.
Duplicates in NLP Response
Information about deprecated IDs is included in the NLP response (see Article Content Analysis).
If an entity or tag has any deprecated duplicate IDs, they are listed under the field [feats.duplicateGkbIds]. If multiple deprecated IDs exist, they are separated by commas.
Example
In the example below, the person Jiří Kulhánek is associated with the active ID [G22262439].
Two additional IDs, [G8ad70d13-E] and [Gfd6d708c-C], are listed as deprecated duplicates.
Note: the response is simplified; irrelevant fields have been omitted.
- JavaScript
- Python
{
...
entities: [
{
id: "e1",
stdForm: "Jiří Kulhánek",
gkbId: "G22262439",
feats: {
duplicateGkbIds: "G8ad70d13-E,Gfd6d708c-C",
},
},
],
tags: [
{
id: "t1",
stdForm: "Jiří Kulhánek",
gkbId: "G22262439",
feats: {
duplicateGkbIds: "G8ad70d13-E,Gfd6d708c-C",
},
},
],
...
}
{
...
'entities': [
{
'id': 'e1',
'stdForm': 'Jiří Kulhánek',
'gkbId': 'G22262439',
'feats': {
'duplicateGkbIds': 'G8ad70d13-E,Gfd6d708c-C',
},
},
],
'tags': [
{
'id': 't1',
'stdForm': 'Jiří Kulhánek',
'gkbId': 'G22262439',
'feats': {
'duplicateGkbIds': 'G8ad70d13-E,Gfd6d708c-C',
},
},
],
...
}
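Because `duplicateGkbIds` is a comma-separated string rather than a list, client code needs to split it before comparing against stored IDs. The following sketch assumes the entity/tag shape shown above; `all_gkb_ids` is a hypothetical helper.

```python
from typing import Any, List, Mapping


def all_gkb_ids(item: Mapping[str, Any]) -> List[str]:
    """Return the active GKB ID followed by any deprecated duplicate IDs."""
    raw = item.get('feats', {}).get('duplicateGkbIds', '')
    # Split on commas and drop empty pieces (e.g. when the field is missing)
    duplicates = [part.strip() for part in raw.split(',') if part.strip()]
    return [item['gkbId'], *duplicates]


# Sample copied from the simplified response above
entity = {
    'id': 'e1',
    'stdForm': 'Jiří Kulhánek',
    'gkbId': 'G22262439',
    'feats': {'duplicateGkbIds': 'G8ad70d13-E,Gfd6d708c-C'},
}
print(all_gkb_ids(entity))
# ['G22262439', 'G8ad70d13-E', 'Gfd6d708c-C']
```

Matching against the full list lets older records that still hold a deprecated ID be associated with the current entity.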
Knowledge Base Item Redirects
You can also query the status of knowledge base IDs directly using the redirects API.
This is useful for checking whether an ID is still active or has been replaced.
- JavaScript
- Python
const itemRedirects = async (ids, config) => {
const response = await axios.post('knowledgebase/redirects', {gkbIds: ids}, config)
return response.data;
};
ids = ['G1', 'G22262439', 'G8ad70d13-E', 'Gfd6d708c-C'];
itemRedirects(ids, config).then(report);
def itemRedirects(ids: List[str]):
    return requests.post(f'{BASE_URL}knowledgebase/redirects', json={'gkbIds': ids}, headers=HEADERS).json()
ids = ['G1', 'G22262439', 'G8ad70d13-E', 'Gfd6d708c-C']
itemRedirects(ids)
Example response
In the example below, we can see:
- G8ad70d13-E is inactive and replaced by G22262439
- Gfd6d708c-C is also inactive and replaced by G22262439
- G22262439 is the active ID that replaces both deprecated IDs
- G1 remains an active ID with no replacements
- JavaScript
- Python
{
G1: {status: 'active'},
G22262439: {status: 'active', replaces: ['G8ad70d13-E', 'Gfd6d708c-C']},
'G8ad70d13-E': {status: 'inactive', replacedBy: 'G22262439'},
'Gfd6d708c-C': {status: 'inactive', replacedBy: 'G22262439'}
}
{
'G1': {'status': 'active'},
'G22262439': {'status': 'active', 'replaces': ['G8ad70d13-E', 'Gfd6d708c-C']},
'G8ad70d13-E': {'status': 'inactive', 'replacedBy': 'G22262439'},
'Gfd6d708c-C': {'status': 'inactive', 'replacedBy': 'G22262439'}
}
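A redirects response like this can be used to bring a stored list of IDs up to date. The sketch below assumes the response shape shown above and follows a single `replacedBy` hop; whether replacements can chain is not stated here, so treat that as an assumption. `canonicalize` is a hypothetical helper.

```python
from typing import Any, Iterable, List, Mapping


def canonicalize(ids: Iterable[str],
                 redirects: Mapping[str, Mapping[str, Any]]) -> List[str]:
    """Replace inactive IDs with their replacement and drop resulting repeats."""
    result: List[str] = []
    for gkb_id in ids:
        entry = redirects.get(gkb_id, {})
        # Only inactive entries carry a replacement; everything else stays as-is
        target = entry.get('replacedBy') if entry.get('status') == 'inactive' else None
        new_id = target or gkb_id
        if new_id not in result:
            result.append(new_id)
    return result


# Sample copied from the example response above
redirects = {
    'G1': {'status': 'active'},
    'G22262439': {'status': 'active', 'replaces': ['G8ad70d13-E', 'Gfd6d708c-C']},
    'G8ad70d13-E': {'status': 'inactive', 'replacedBy': 'G22262439'},
    'Gfd6d708c-C': {'status': 'inactive', 'replacedBy': 'G22262439'},
}
print(canonicalize(['G8ad70d13-E', 'G1', 'Gfd6d708c-C'], redirects))
# ['G22262439', 'G1']
```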
Knowledge Base Search
We provide an API for searching the knowledge base using text queries.
This is especially useful during feedback and innovation workflows—for example, when the correct knowledge base ID is unknown but can be inferred from the entity's name.
Since multiple entities can share the same name, search results may include several candidates.
Usage example
Let's define a function to perform the search:
- JavaScript
- Python
const itemSearch = async (query, lang) => {
const response = await axios.post('knowledgebase/search', {query: query, language: lang}, config)
return response.data;
};
def itemSearch(query: str, lang: str):
    return requests.post(
        f'{BASE_URL}knowledgebase/search',
        json={'query': query, 'language': lang},
        headers=HEADERS
    ).json()
Now, let's search for Michael Jordan in English:
- JavaScript
- Python
itemSearch('Michael Jordan', 'en').then(report);
itemSearch('Michael Jordan', 'en')
Example search results
- JavaScript
- Python
{
query: "Michael Jordan",
hits: 8,
itemDetails: [
{
gkbId: "G41421",
stdForm: { value: "Michael Jordan", language: "en" },
description: {
value: "American basketball player and businessman",
language: "en",
},
type: "person",
},
{
gkbId: "G3308285",
stdForm: { value: "Michael I. Jordan", language: "en" },
description: {
value: "American computer scientist, University of California, Berkeley",
language: "en",
},
type: "person",
},
{
gkbId: "G6831716",
stdForm: { value: "Michael Jordan", language: "en" },
description: {
value: "English footballer (born 1984)",
language: "en",
},
type: "person",
},
{
gkbId: "G65029442",
stdForm: { value: "Michael Jordan", language: "en" },
description: {
value: "American football offensive lineman",
language: "en",
},
type: "person",
},
{
gkbId: "G27069141",
stdForm: { value: "Michael Jordan", language: "en" },
description: {
value: "American football cornerback",
language: "en",
},
type: "person",
},
{
gkbId: "G6831715",
stdForm: { value: "Michael Jordan", language: "en" },
description: { value: "Irish politician", language: "en" },
type: "person",
},
{
gkbId: "G1928047",
stdForm: { value: "Michael Jordan", language: "en" },
description: {
value: "German draughtsperson, artist and comics artist",
language: "en",
},
type: "person",
},
{
gkbId: "G6831719",
stdForm: { value: "Michael Jordan", language: "en" },
description: { value: "British mycologist", language: "en" },
type: "person",
},
],
}
{
'query': 'Michael Jordan',
'hits': 8,
'itemDetails': [
{
'gkbId': 'G41421',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {
'value': 'American basketball player and businessman',
'language': 'en',
},
'type': 'person',
},
{
'gkbId': 'G3308285',
'stdForm': {'value': 'Michael I. Jordan', 'language': 'en'},
'description': {
'value': 'American computer scientist, University of California, Berkeley',
'language': 'en',
},
'type': 'person',
},
{
'gkbId': 'G6831716',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {
'value': 'English footballer (born 1984)',
'language': 'en',
},
'type': 'person',
},
{
'gkbId': 'G65029442',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {
'value': 'American football offensive lineman',
'language': 'en',
},
'type': 'person',
},
{
'gkbId': 'G27069141',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {'value': 'American football cornerback', 'language': 'en'},
'type': 'person',
},
{
'gkbId': 'G6831715',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {'value': 'Irish politician', 'language': 'en'},
'type': 'person',
},
{
'gkbId': 'G1928047',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {
'value': 'German draughtsperson, artist and comics artist',
'language': 'en',
},
'type': 'person',
},
{
'gkbId': 'G6831719',
'stdForm': {'value': 'Michael Jordan', 'language': 'en'},
'description': {'value': 'British mycologist', 'language': 'en'},
'type': 'person',
},
],
}
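Since a search can return several candidates with the same name, client code usually needs to narrow them down using `type` and `description`. The following is a minimal sketch assuming the result shape shown above; `filter_candidates` is a hypothetical helper, and keyword matching on the description is only one possible heuristic.

```python
from typing import Any, List, Mapping, Optional


def filter_candidates(result: Mapping[str, Any],
                      item_type: Optional[str] = None,
                      keyword: Optional[str] = None) -> List[Mapping[str, Any]]:
    """Keep search hits matching the given type and description keyword."""
    matches = []
    for item in result.get('itemDetails', []):
        if item_type and item.get('type') != item_type:
            continue
        description = item.get('description', {}).get('value', '')
        # Case-insensitive substring match on the description text
        if keyword and keyword.lower() not in description.lower():
            continue
        matches.append(item)
    return matches


# Sample trimmed to three of the candidates from the results above
result = {
    'query': 'Michael Jordan',
    'itemDetails': [
        {'gkbId': 'G41421', 'type': 'person',
         'description': {'value': 'American basketball player and businessman', 'language': 'en'}},
        {'gkbId': 'G3308285', 'type': 'person',
         'description': {'value': 'American computer scientist, University of California, Berkeley', 'language': 'en'}},
        {'gkbId': 'G6831715', 'type': 'person',
         'description': {'value': 'Irish politician', 'language': 'en'}},
    ],
}
print([hit['gkbId'] for hit in filter_candidates(result, keyword='basketball')])
# ['G41421']
```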
At this time, searches match full entity names, not substrings. This means that searching for Jordan will not return results for Michael Jordan.
Here's what a search for Jordan returns:
- JavaScript
- Python
itemSearch('Jordan', 'en').then(report);
{
query: 'Jordan',
hits: 1,
itemDetails: [
{
gkbId: 'G810',
stdForm: {value: 'Jordan', language: 'en'},
description: {value: 'constitutional monarchy in Western Asia', language: 'en'},
type: 'location'
}
]
}
itemSearch('Jordan', 'en')
{
'query': 'Jordan',
'hits': 1,
'itemDetails': [
{
'gkbId': 'G810',
'stdForm': {'value': 'Jordan', 'language': 'en'},
'description': {'value': 'constitutional monarchy in Western Asia', 'language': 'en'},
'type': 'location'
}
]
}