
Knowledge Base

All returned tags and entities are linked to the Geneea Knowledge Base (GKB, Geneea KB). The GKB combines open data (e.g., Wikidata, DBpedia, OpenStreetMap, company registries) with Geneea's proprietary data.

It also supports custom properties such as your internal IDs and custom items.

This section covers:

Basic code common to all guide pages

Basic Code

To use the API, you'll need a valid API key with the appropriate permissions. If you don't have one, please contact us.

In the code below, replace <YOUR_API_KEY> with your actual API key.

Note: We do not currently provide dedicated SDKs for this API, but our G3 SDKs can be used to perform NLP analysis.

# No special setup necessary
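The snippets on this page pass a `config` object to axios and hand results to a `report` callback, but neither is defined here. The following is a minimal sketch of what they might look like. The `makeConfig` helper, the `X-API-Key` header name, and the `<API_BASE_URL>` placeholder are assumptions made for illustration, not values confirmed by this page; use whatever your account documentation specifies.

```javascript
// Hypothetical helper: builds the axios request config used by the snippets below.
// ASSUMPTION: the header name and base URL are placeholders, not confirmed values.
const makeConfig = ({ apiKey, baseURL, keyHeader = 'X-API-Key' }) => ({
  baseURL,
  headers: {
    [keyHeader]: apiKey,
    'Content-Type': 'application/json',
  },
});

// Pretty-prints whatever a request resolves to.
const report = (data) => console.log(JSON.stringify(data, null, 2));

const config = makeConfig({
  apiKey: '<YOUR_API_KEY>',
  baseURL: '<API_BASE_URL>',
});
```

With axios, relative paths such as 'knowledgebase/infoboxes' are resolved against `baseURL`, which is why the request functions on this page can use short endpoint names.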

Entity Info Boxes

Sometimes, it is useful to show more than just an entity's name. The following code retrieves brief descriptions and links (e.g., Wikipedia, Wikidata) for specific entities. More features will be added soon.

const kbInfoBoxes = async (input, config) => {
  // POST the entity IDs and the presentation language to the info box endpoint
  const response = await axios.post('knowledgebase/infoboxes', input, config);
  return response.data;
};

const infoBoxInput = {
  ids: ['G458', 'G567'],
  language: 'fr'
};

kbInfoBoxes(infoBoxInput, config).then(report);

Example response

{
  G458: {
    value: {
      title: "Union européenne",
      header: "union politico-économique sui generis d'États européens",
      body: "",
      footer: {
        cswiki: "https://cs.wikipedia.org/wiki/Evropská_unie",
        enwiki: "https://en.wikipedia.org/wiki/European_Union",
        wikidata: "https://www.wikidata.org/entity/Q458",
      },
    },
    language: "fr",
  },
  G567: {
    value: {
      title: "Angela Merkel",
      header: "chancelière fédérale allemande",
      body: "",
      footer: {
        cswiki: "https://cs.wikipedia.org/wiki/Angela_Merkelová",
        enwiki: "https://en.wikipedia.org/wiki/Angela_Merkel",
        facebook: "https://www.facebook.com/AngelaMerkel",
        instagram: "https://www.instagram.com/bundeskanzlerin",
        wikidata: "https://www.wikidata.org/entity/Q567",
      },
    },
    language: "fr",
  },
}
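To display an info box, the fields of each entry can be flattened into a few lines of text. The sketch below assumes the response shape shown in the example above; `formatInfoBox` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: turns one info box entry into plain text lines.
// Empty header/body fields are skipped; each footer link gets its own line.
const formatInfoBox = (box) => {
  const { title, header, body, footer = {} } = box.value;
  const lines = [title];
  if (header) lines.push(header);
  if (body) lines.push(body);
  for (const [source, url] of Object.entries(footer)) {
    lines.push(`${source}: ${url}`);
  }
  return lines.join('\n');
};

// Sample entry, shortened from the example response.
const sampleBox = {
  value: {
    title: 'Union européenne',
    header: "union politico-économique sui generis d'États européens",
    body: '',
    footer: { wikidata: 'https://www.wikidata.org/entity/Q458' },
  },
  language: 'fr',
};

console.log(formatInfoBox(sampleBox));
```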

Entity Details

It's also possible to retrieve details about an entity by its external ID, such as Wikidata ID.

const kbDetails = async (input, config) => {
  // externalSource tells the API which ID scheme the given ids belong to
  const response = await axios.post('knowledgebase/details', input, config);
  return response.data;
};

const detailsInput = {
  ids: ['Q42'],
  language: 'en',
  externalSource: 'wikidata'
};

kbDetails(detailsInput, config).then(report);

Example response

{
  Q42: {
    gkbId: "G42",
    stdForm: {
      value: "Douglas Adams",
      language: "en"
    },
    description: {
      value: "English science fiction writer and humorist (1952–2001)",
      language: "en"
    },
    type: "person",
    externalIds: {
      wikidata: "Q42"
    },
    externalLinks: {
      enwiki: "https://en.wikipedia.org/wiki/Douglas_Adams",
      wikidata: "https://www.wikidata.org/entity/Q42"
    }
  }
}
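A common follow-up is to translate a batch of external IDs into GKB IDs. The sketch below assumes the response is keyed by the external ID, as in the example above. How the API reports IDs it cannot resolve is not stated on this page, so the hypothetical `externalToGkb` helper simply skips entries without a `gkbId`.

```javascript
// Hypothetical helper: maps each external ID to its GKB ID.
// ASSUMPTION: unresolved IDs are either absent or lack a gkbId; both are skipped.
const externalToGkb = (details) =>
  Object.fromEntries(
    Object.entries(details)
      .filter(([, item]) => item && item.gkbId)
      .map(([externalId, item]) => [externalId, item.gkbId])
  );

// Sample data, shortened from the example response.
const detailsSample = {
  Q42: { gkbId: 'G42', type: 'person' },
};

console.log(externalToGkb(detailsSample));
```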

Multiple Presentation Languages

The knowledge base supports presenting entity names (or tags) in multiple languages.

While this can be achieved by running analyses with different Presentation Language values, you can also retrieve translated forms directly.

Here's how to get Polish names for entities/tags from the example in the photo recommendation guide:

const kbStdForms = async (input, config) => {
  // Returns the standard form (display name) of each ID in the requested language
  const response = await axios.post('knowledgebase/stdforms', input, config);
  return response.data;
};

const stdFormsInput = {
  ids: ['G458', 'G567'],
  language: 'pl',
};

kbStdForms(stdFormsInput, config).then(report);

Example response

{
  G458: { value: 'Unia Europejska', language: 'pl' },
  G567: { value: 'Angela Merkel', language: 'pl' }
}
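The translated forms can then be attached to tags that were produced by an earlier analysis. The sketch below assumes tags carry `gkbId` and `stdForm` fields, as in the NLP response, and falls back to the original name when no translation is present. `localizeTags` and the `label` field are hypothetical names chosen for this illustration.

```javascript
// Hypothetical helper: adds a translated label to each tag,
// falling back to the tag's own stdForm when no translation is available.
const localizeTags = (tags, stdForms) =>
  tags.map((tag) => ({
    ...tag,
    label: stdForms[tag.gkbId]?.value ?? tag.stdForm,
  }));

const sampleTags = [
  { gkbId: 'G458', stdForm: 'European Union' },
  { gkbId: 'G0', stdForm: 'Untranslated tag' }, // made-up ID with no translation
];
const polishForms = {
  G458: { value: 'Unia Europejska', language: 'pl' },
  G567: { value: 'Angela Merkel', language: 'pl' },
};

console.log(localizeTags(sampleTags, polishForms).map((tag) => tag.label));
```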

Deprecated and Duplicate IDs

Over time, the knowledge base may accumulate duplicate items—entities that have different IDs but represent the same real-world concept.

For example, the items with IDs G22262439 and G8ad70d13-E both refer to the same person: Jiří Kulhánek, a Czech local politician. These items are considered duplicates.

There are several reasons why such duplicates may occur:

  • Crowdsourced sources (such as Wikidata) may include independently created items for the same entity.
  • Multiple data sources (e.g., a business register and Wikipedia) may each include the same real-world entity, but reconciling them is not always straightforward.
  • Automated data analysis may produce noisy results.
    For instance, our system might detect John Doe, an actor, and John Doe, a politician, as two separate individuals. Only later might it become clear that both refer to the same person: John Doe, who is both an actor and a politician.

Whenever such duplicates are detected, one of them is selected as the primary (canonical) entry, and the others are marked as inactive.
Inactive items are no longer used in new outputs, and only the primary item will appear in results.

How the API Communicates Duplicates

Our API communicates this information in two ways:

  • As part of the NLP response: If an entity or tag has had alternative (now deprecated) IDs, these are included in the response.
  • Via a dedicated redirects API: This API allows you to query the status of any knowledge base ID and see whether it has been marked as a duplicate of another.

See below for more information.

Duplicates in NLP Response

Information about deprecated IDs is included in the NLP response (see Article Content Analysis).

If an entity or tag has any deprecated duplicate IDs, they are listed under the field [feats.duplicateGkbIds]. If multiple deprecated IDs exist, they are separated by commas.

Example
In the example below, the person Jiří Kulhánek is associated with the active ID [G22262439]. Two additional IDs, [G8ad70d13-E] and [Gfd6d708c-C], are listed as deprecated duplicates.

Note: the response is simplified; irrelevant fields have been omitted.

{
  ...
  entities: [
    {
      id: "e1",
      stdForm: "Jiří Kulhánek",
      gkbId: "G22262439",
      feats: {
        duplicateGkbIds: "G8ad70d13-E,Gfd6d708c-C",
      },
    },
  ],
  tags: [
    {
      id: "t1",
      stdForm: "Jiří Kulhánek",
      gkbId: "G22262439",
      feats: {
        duplicateGkbIds: "G8ad70d13-E,Gfd6d708c-C",
      },
    },
  ],
  ...
}
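If you store GKB IDs on your side, the comma-separated `duplicateGkbIds` field can be used to build a lookup from deprecated IDs to the active one. The sketch below assumes the simplified response shape shown above; `deprecatedIds` and `buildAliasMap` are hypothetical helper names.

```javascript
// Hypothetical helper: splits feats.duplicateGkbIds into an array of IDs.
// Returns an empty array when the field is missing.
const deprecatedIds = (item) => {
  const raw = item.feats?.duplicateGkbIds;
  return raw ? raw.split(',').map((id) => id.trim()).filter(Boolean) : [];
};

// Hypothetical helper: maps every deprecated ID to the active ID that replaced it.
const buildAliasMap = (items) => {
  const aliases = new Map();
  for (const item of items) {
    for (const oldId of deprecatedIds(item)) {
      aliases.set(oldId, item.gkbId);
    }
  }
  return aliases;
};

// Sample data, taken from the simplified response above.
const nlpSample = {
  entities: [
    { id: 'e1', gkbId: 'G22262439', feats: { duplicateGkbIds: 'G8ad70d13-E,Gfd6d708c-C' } },
  ],
  tags: [
    { id: 't1', gkbId: 'G22262439', feats: { duplicateGkbIds: 'G8ad70d13-E,Gfd6d708c-C' } },
  ],
};

const aliasMap = buildAliasMap([...nlpSample.entities, ...nlpSample.tags]);
console.log([...aliasMap.entries()]);
```

A `Map` is used so that an ID appearing in both `entities` and `tags` is recorded only once.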

Knowledge Base Item Redirects

You can also query the status of knowledge base IDs directly using the redirects API. This is useful for checking whether an ID is still active or has been replaced.

const itemRedirects = async (ids, config) => {
  // The endpoint expects the IDs under the gkbIds key
  const response = await axios.post('knowledgebase/redirects', {gkbIds: ids}, config);
  return response.data;
};

const redirectIds = ['G1', 'G22262439', 'G8ad70d13-E', 'Gfd6d708c-C'];

itemRedirects(redirectIds, config).then(report);

Example response
In the example below, we can see:

  • G8ad70d13-E is inactive and replaced by G22262439
  • Gfd6d708c-C is also inactive and replaced by G22262439
  • G22262439 is the active ID that replaces both deprecated IDs
  • G1 remains an active ID with no replacements

{
  G1: {status: 'active'},
  G22262439: {status: 'active', replaces: ['G8ad70d13-E', 'Gfd6d708c-C']},
  'G8ad70d13-E': {status: 'inactive', replacedBy: 'G22262439'},
  'Gfd6d708c-C': {status: 'inactive', replacedBy: 'G22262439'}
}
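A stored ID can be normalized to its active counterpart by following `replacedBy`. The sketch below assumes the response shape shown above. Whether a replacement can itself be inactive is not stated on this page, so the hypothetical `toActiveId` helper follows chains defensively and stops if it detects a cycle.

```javascript
// Hypothetical helper: resolves an ID to the active ID that replaces it.
// IDs that are active, or missing from the response, are returned unchanged.
const toActiveId = (id, redirects) => {
  const seen = new Set();
  let current = id;
  while (redirects[current]?.status === 'inactive' && redirects[current].replacedBy) {
    if (seen.has(current)) break; // guard against cyclic data
    seen.add(current);
    current = redirects[current].replacedBy;
  }
  return current;
};

// Sample data, taken from the example response above.
const redirectsSample = {
  G1: { status: 'active' },
  G22262439: { status: 'active', replaces: ['G8ad70d13-E', 'Gfd6d708c-C'] },
  'G8ad70d13-E': { status: 'inactive', replacedBy: 'G22262439' },
  'Gfd6d708c-C': { status: 'inactive', replacedBy: 'G22262439' },
};

console.log(toActiveId('G8ad70d13-E', redirectsSample));
```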

Search

We provide an API for searching the knowledge base using text queries.

This is especially useful during feedback and innovation workflows—for example, when the correct knowledge base ID is unknown but can be inferred from the entity's name.

Since multiple entities can share the same name, search results may include several candidates.

Usage example
Let's define a function to perform the search:

const itemSearch = async (query, lang) => {
  // Uses the shared config object defined for all requests
  const response = await axios.post('knowledgebase/search', {query: query, language: lang}, config);
  return response.data;
};

Now, let's search for Michael Jordan in English:

itemSearch('Michael Jordan', 'en').then(report);

Example search results

{
  query: "Michael Jordan",
  hits: 8,
  itemDetails: [
    {
      gkbId: "G41421",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: {
        value: "American basketball player and businessman",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G3308285",
      stdForm: { value: "Michael I. Jordan", language: "en" },
      description: {
        value: "American computer scientist, University of California, Berkeley",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G6831716",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: {
        value: "English footballer (born 1984)",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G65029442",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: {
        value: "American football offensive lineman",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G27069141",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: {
        value: "American football cornerback",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G6831715",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: { value: "Irish politician", language: "en" },
      type: "person",
    },
    {
      gkbId: "G1928047",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: {
        value: "German draughtsperson, artist and comics artist",
        language: "en",
      },
      type: "person",
    },
    {
      gkbId: "G6831719",
      stdForm: { value: "Michael Jordan", language: "en" },
      description: { value: "British mycologist", language: "en" },
      type: "person",
    },
  ],
}
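Because a search can return several candidates with the same name, the caller usually has to narrow them down. One possible approach, sketched below, filters hits by entity type and by a keyword in the description. `pickHits` is a hypothetical client-side helper that assumes the result shape shown above; it is not a feature of the search API.

```javascript
// Hypothetical helper: keeps only hits that match the given type and/or
// contain the keyword in their description (case-insensitive).
const pickHits = (result, { type, keyword } = {}) =>
  result.itemDetails.filter((item) => {
    const description = item.description?.value ?? '';
    const typeOk = !type || item.type === type;
    const keywordOk =
      !keyword || description.toLowerCase().includes(keyword.toLowerCase());
    return typeOk && keywordOk;
  });

// Sample data, shortened from the example search results.
const searchSample = {
  query: 'Michael Jordan',
  hits: 3,
  itemDetails: [
    {
      gkbId: 'G41421',
      description: { value: 'American basketball player and businessman', language: 'en' },
      type: 'person',
    },
    {
      gkbId: 'G6831716',
      description: { value: 'English footballer (born 1984)', language: 'en' },
      type: 'person',
    },
    {
      gkbId: 'G6831719',
      description: { value: 'British mycologist', language: 'en' },
      type: 'person',
    },
  ],
};

console.log(pickHits(searchSample, { keyword: 'basketball' }).map((item) => item.gkbId));
```

If the filter still leaves more than one candidate, it is safer to show the list to a person than to pick the first hit automatically.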

At this time, searches match full entity names, not substrings. This means that searching for Jordan will not return results for Michael Jordan.

Here's what a search for Jordan returns:

itemSearch('Jordan', 'en').then(report);

{
  query: 'Jordan',
  hits: 1,
  itemDetails: [
    {
      gkbId: 'G810',
      stdForm: {value: 'Jordan', language: 'en'},
      description: {value: 'constitutional monarchy in Western Asia', language: 'en'},
      type: 'location'
    }
  ]
}