NLP API
Overview
The NLP API provides endpoints for semantic analysis of articles. It can extract tags, entities, sentiment, and other semantic information from the text.
Endpoints
| Endpoint | Description | Request | 200 Response |
|---|---|---|---|
| POST /v2/nlp/analyze | Analyzes an article and returns tags, entities, sentiment, and more. | NlpRequest | Analysis |
| POST /v2/nlp/analyze/feedback | Sends feedback on a specific analysis result. | AnalyzeFeedback | (empty) |
POST /v2/nlp/analyze
Analyzes an article and returns tags, entities, sentiment, and more.
- Request: NlpRequest
- Response: Analysis (same as G3 response)
Examples
Basic analysis
Request - send an article for the default analysis configured for the customer, typically semantic tagging:
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich."
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\"
}"
const analyze = async (config, input) => {
const response = await axios.post('nlp/analyze', input, config);
return response.data;
};
const input = {
id: '1234',
title: 'Emmanuel Macron in Germany.',
text: 'Mr. Macron visited a trade show in Munich.'
}
analyze(config, input).then(report);
import requests

def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()

input = {
    'id': '1234',
    'title': 'Emmanuel Macron in Germany.',
    'text': 'Mr. Macron visited a trade show in Munich.'
}
analyze(input)
requestBuilder = g3.Request.Builder()
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    result = analyzer.analyze(request)
    print("Entities:")  ## there will be no entities, unless they are specified in your plan and configuration
    for e in result.entities:
        print(f' {e.type}: {e.stdForm} ({e.gkbId}) relevance: {e.feats.get("relevance")}, derivedBy: {e.feats.get("derivedBy", "N/A")}, derivedOnly: {e.feats.get("derivedOnly", "false")}')
    print("Tags:")
    for t in result.tags:
        print(f'\t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
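The Python snippets in this section use BASE_URL and HEADERS without defining them. The following is a minimal sketch of one possible setup, inferred from the cURL example above: the base URL and the X-API-KEY header come from that example, while the environment variable name GENEEA_API_KEY and the helper build_request are hypothetical. It uses only the standard library and builds the request without sending it.

```python
import json
import os
import urllib.request

# Base URL and header name are taken from the cURL example; the
# environment variable name is an assumption made for this sketch.
BASE_URL = 'https://media-api.geneea.com/v2/'
API_KEY = os.environ.get('GENEEA_API_KEY', '<YOUR_API_KEY>')
HEADERS = {'X-API-KEY': API_KEY, 'content-type': 'application/json'}


def build_request(path, payload):
    """Build (but do not send) a POST request for the given endpoint path."""
    body = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        f'{BASE_URL}{path}', data=body, headers=HEADERS, method='POST')


request = build_request('nlp/analyze', {
    'id': '1234',
    'title': 'Emmanuel Macron in Germany.',
    'text': 'Mr. Macron visited a trade show in Munich.',
})
# To send it: urllib.request.urlopen(request), then JSON-decode the body.
```

Keeping the key in an environment variable rather than in the source is a design choice of this sketch, not a requirement stated by the API.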
Response (see Analysis for the description of the fields)
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
"version": "3.3.0",
"id": "1234",
"language": {"detected": "en"},
"tags": [
{"id": "t1", "gkbId": "G3052772", "stdForm": "Emmanuel Macron", "type": "media", "relevance": 96.0, "feats": {"wikidataId": "Q3052772", "gkbEntityType": "person"}},
{"id": "t2", "gkbId": "G183", "stdForm": "Germany", "type": "media", "relevance": 94.0, "feats": {"wikidataId": "Q183", "gkbEntityType": "location"}},
{"id": "t3", "gkbId": "G1726", "stdForm": "Munich", "type": "media", "relevance": 66.0, "feats": {"wikidataId": "Q1726", "gkbEntityType": "location"}},
{"id": "t4", "gkbId": "IPTC-11000000", "stdForm": "politics", "type": "media-topic", "relevance": 68.51, "feats": {"MediaTopicId": "11000000", "wikidataId": "Q7163", "gkbEntityType": "general"}}
],
"usedChars": 100,
"metadata": {"referenceKey": "241014-164726-9bdaf485"}
}
{
version: '3.3.0',
id: '1234',
language: {detected: 'en'},
tags: [
{id: 't1', gkbId: 'G3052772', stdForm: 'Emmanuel Macron', type: 'media', relevance: 96.0, feats: {wikidataId: 'Q3052772', gkbEntityType: 'person'}},
{id: 't2', gkbId: 'G183', stdForm: 'Germany', type: 'media', relevance: 94.0, feats: {wikidataId: 'Q183', gkbEntityType: 'location'}},
{id: 't3', gkbId: 'G1726', stdForm: 'Munich', type: 'media', relevance: 66.0, feats: {wikidataId: 'Q1726', gkbEntityType: 'location'}},
{id: 't4', gkbId: 'IPTC-11000000', stdForm: 'politics', type: 'media-topic', relevance: 68.51, feats: {MediaTopicId: '11000000', wikidataId: 'Q7163', gkbEntityType: 'general'}}
],
usedChars: 100,
metadata: {referenceKey: '241014-164726-9bdaf485'}
}
{
'version': '3.3.0',
'id': '1234',
'language': {'detected': 'en'},
'tags': [
{'id': 't1', 'gkbId': 'G3052772', 'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 96.0, 'feats': {'wikidataId': 'Q3052772', 'gkbEntityType': 'person'}},
{'id': 't2', 'gkbId': 'G183', 'stdForm': 'Germany', 'type': 'media', 'relevance': 94.0, 'feats': {'wikidataId': 'Q183', 'gkbEntityType': 'location'}},
{'id': 't3', 'gkbId': 'G1726', 'stdForm': 'Munich', 'type': 'media', 'relevance': 66.0, 'feats': {'wikidataId': 'Q1726', 'gkbEntityType': 'location'}},
{'id': 't4', 'gkbId': 'IPTC-11000000', 'stdForm': 'politics', 'type': 'media-topic', 'relevance': 68.51, 'feats': {'MediaTopicId': '11000000', 'wikidataId': 'Q7163', 'gkbEntityType': 'general'}}
],
'usedChars': 100,
'metadata': {'referenceKey': '241014-164726-9bdaf485'}
}
Entities: ## there are no entities here, unless they are enabled in your plan and configuration
Tags:
media: Emmanuel Macron (G3052772) relevance: 96.0
media: Germany (G183) relevance: 94.0
media: Munich (G1726) relevance: 66.0
media-topic: politics (IPTC-11000000) relevance: 68.51
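Once the response is parsed, a common next step is to keep only the more relevant tags and group them by type. The sketch below assumes the response has already been decoded into a Python dict shaped like the sample above. The helper name tags_by_type and the threshold of 67.0 are illustrative choices, not part of the API.

```python
from collections import defaultdict


def tags_by_type(analysis, min_relevance=0.0):
    """Group tag standard forms by tag type, keeping relevant tags only."""
    grouped = defaultdict(list)
    # Sort by descending relevance so each group lists its best tags first.
    ordered = sorted(analysis.get('tags', []),
                     key=lambda t: t['relevance'], reverse=True)
    for tag in ordered:
        if tag['relevance'] >= min_relevance:
            grouped[tag['type']].append(tag['stdForm'])
    return dict(grouped)


# Abridged copy of the sample response above.
analysis = {
    'id': '1234',
    'tags': [
        {'stdForm': 'Emmanuel Macron', 'type': 'media', 'relevance': 96.0},
        {'stdForm': 'Germany', 'type': 'media', 'relevance': 94.0},
        {'stdForm': 'Munich', 'type': 'media', 'relevance': 66.0},
        {'stdForm': 'politics', 'type': 'media-topic', 'relevance': 68.51},
    ],
}

print(tags_by_type(analysis, min_relevance=67.0))
# {'media': ['Emmanuel Macron', 'Germany'], 'media-topic': ['politics']}
```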
Analysis with mentions
To receive mentions — text snippets in the article that correspond to tags —
use the "returnMentions": "true" parameter.
Request
- cURL
- cURL (Windows)
- JavaScript
- Python
- Python SDK
curl -X POST -H 'X-API-KEY: <YOUR_API_KEY>' -H 'accept: */*' -H 'content-type: application/json' 'https://media-api.geneea.com/v2/nlp/analyze' -d '{
"id": "1234",
"title": "Emmanuel Macron in Germany.",
"text": "Mr. Macron visited a trade show in Munich.",
"returnMentions": "true"
}'
curl -X POST -H "X-API-KEY: <YOUR_API_KEY>" -H "content-type: application/json" "https://media-api.geneea.com/v2/nlp/analyze" -d "{
\"id\": \"1234\",
\"title\": \"Emmanuel Macron in Germany.\",
\"text\": \"Mr. Macron visited a trade show in Munich.\",
\"returnMentions\": \"true\"
}"
const analyze = async (config, input) => {
const response = await axios.post('nlp/analyze', input, config);
return response.data;
};
const input = {
  id: '1234',
  title: 'Emmanuel Macron in Germany.',
  text: 'Mr. Macron visited a trade show in Munich.',
  returnMentions: true
};
analyze(config, input).then(report);
import requests

def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()

input = {
    'id': '1234',
    'title': 'Emmanuel Macron in Germany.',
    'text': 'Mr. Macron visited a trade show in Munich.',
    'returnMentions': True
}
analyze(input)
requestBuilder = g3.Request.Builder(returnMentions=True)
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(id=str('1234'), title='Emmanuel Macron in Germany.', text='Mr. Macron visited a trade show in Munich.')
    result = analyzer.analyze(request)
    print("Tags:")
    for t in result.tags:
        print(f'\t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
        for m in t.mentions:
            ## charSpan can be used for highlighting in the original text
            print(f'\t{m.text}; {m.mwl}; {m.tokens.charSpan}')
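The comment above notes that charSpan can be used for highlighting. The following sketch shows the highlighting step only, under stated assumptions: each span has already been converted to a half-open (start, end) pair of character offsets into the same string that is being highlighted. How charSpan maps to such a pair, and whether offsets refer to the text alone or to title plus text, is not specified in this section. The helper highlight and the bracket markers are hypothetical.

```python
def highlight(text, spans, start_mark='[', end_mark=']'):
    """Wrap each (start, end) character span of `text` in marker strings."""
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already emitted
        pieces.append(text[cursor:start])
        pieces.append(f'{start_mark}{text[start:end]}{end_mark}')
        cursor = end
    pieces.append(text[cursor:])
    return ''.join(pieces)


text = 'Mr. Macron visited a trade show in Munich.'
# Hypothetical spans, computed here by string search for illustration;
# in practice they would be derived from the returned mentions.
spans = [(text.index(w), text.index(w) + len(w)) for w in ('Macron', 'Munich')]

print(highlight(text, spans))
# Mr. [Macron] visited a trade show in [Munich].
```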
POST /v2/nlp/analyze/feedback
Sends feedback on a specific analysis result.
- Request: AnalyzeFeedback
- Response: (empty)
Example
Tags to send feedback on
The request sending an article with ID article-123 for tagging:
- cURL
- Python
- Python SDK
POST https://media-api.geneea.com/v2/nlp/analyze
{
"id": "article-123",
"title": "Tesla to accept Dogecoin as payment for merchandise, says Musk",
"text": "Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.",
"language": "en"
}
import requests

def analyze(input):
    return requests.post(f'{BASE_URL}nlp/analyze', json=input, headers=HEADERS).json()

input = {
    'id': 'article-123',
    'title': 'Tesla to accept Dogecoin as payment for merchandise, says Musk',
    'text': 'Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.',
    'language': 'en'
}
analyze(input)
requestBuilder = g3.Request.Builder()
with g3.Client.create(url=f'{BASE_URL}nlp/analyze') as analyzer:
    analyzer.session.headers.update({'X-API-Key': API_KEY})
    request = requestBuilder.build(
        id=str('article-123'),
        title='Tesla to accept Dogecoin as payment for merchandise, says Musk',
        text='Dec 14 (Reuters) - Tesla Inc (TSLA.O) chief Elon Musk said on Tuesday the electric carmaker will accept Dogecoin as payment for merchandise on a test basis, sending the meme-based cryptocurrency up 24%.'
    )
    result = analyzer.analyze(request)
    print('Tags:')
    for t in result.tags:
        print(f' \t{t.type}: {t.stdForm} ({t.gkbId}) relevance: {t.relevance}')
A sample result (in this case, the Assistant is configured to return tags only):
{
"id": "article-123",
"language": { "detected": "en" },
"tags": [
{
"id": "t0",
"gkbId": "G478214",
"stdForm": "Tesla, Inc.",
"type": "media",
"relevance": 24.33
},
{
"id": "t1",
"gkbId": "G317521",
"stdForm": "Elon Musk",
"type": "media",
"relevance": 22.5
},
{
"id": "t2",
"gkbId": "G15377916",
"stdForm": "Dogecoin",
"type": "media",
"relevance": 19.24
},
{
"id": "t3",
"gkbId": "G130879",
"stdForm": "Reuters",
"type": "media",
"relevance": 9.275
}
],
"metadata": { "referenceKey": "211201-103000-d64a0290" }
}
Feedback on the returned tags
The feedback references:
- the ID of the article (article-123)
- the referenceKey (211201-103000-d64a0290) identifying the particular result
- individual tags:
  - some were accepted (accepted-actively),
  - some were correct but not important enough (rejected-actively-marginal),
  - some were missing from the result and had to be entered manually (expected).
- cURL
- Python
POST https://media-api.geneea.com/v2/nlp/analyze/feedback
{
"docId": "article-123",
"referenceKey": "211201-103000-d64a0290",
"tags": [
{ "gkbId": "G478214", "stdForm": "Tesla, Inc.", "status": "accepted-actively"},
{ "gkbId": "G317521", "stdForm": "Elon Musk", "status": "accepted-actively"},
{ "gkbId": "G130879", "stdForm": "Reuters", "status": "rejected-actively-marginal", "comment": "Reuters not wanted" },
{ "gkbId": "G13479982", "stdForm": "cryptocurrency", "status": "expected", "comment": "cryptocurrency is missing"}
]
}
import requests

def feedback(input):
    # The endpoint returns an empty body, so check the status instead of parsing JSON.
    response = requests.post(f'{BASE_URL}nlp/analyze/feedback', json=input, headers=HEADERS)
    response.raise_for_status()
    return response

input = {
    'docId': 'article-123',
    'referenceKey': '211201-103000-d64a0290',
    'tags': [
        { 'gkbId': 'G478214', 'stdForm': 'Tesla, Inc.', 'status': 'accepted-actively'},
        { 'gkbId': 'G317521', 'stdForm': 'Elon Musk', 'status': 'accepted-actively'},
        { 'gkbId': 'G130879', 'stdForm': 'Reuters', 'status': 'rejected-actively-marginal', 'comment': 'Reuters not wanted' },
        { 'gkbId': 'G13479982', 'stdForm': 'cryptocurrency', 'status': 'expected', 'comment': 'cryptocurrency is missing'}
    ]
}
feedback(input)
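The feedback payload above can also be assembled from the analysis result instead of being written by hand. The sketch below is one way to do that, assuming the result is a dict shaped like the sample result in this section. The helper build_feedback and its argument shapes are hypothetical. The field names (docId, referenceKey, gkbId, stdForm, status, comment) and the status values are taken from the example request.

```python
def build_feedback(analysis, decisions, missing=()):
    """Assemble a feedback payload from an analysis result.

    decisions: maps a returned tag's gkbId to a status string
    missing:   tag dicts (gkbId, stdForm, optional comment) that were
               expected but absent from the result
    """
    tags = [
        {'gkbId': t['gkbId'], 'stdForm': t['stdForm'],
         'status': decisions[t['gkbId']]}
        for t in analysis['tags'] if t['gkbId'] in decisions
    ]
    # Tags entered manually are reported with the 'expected' status.
    tags += [{**m, 'status': 'expected'} for m in missing]
    return {
        'docId': analysis['id'],
        'referenceKey': analysis['metadata']['referenceKey'],
        'tags': tags,
    }


# Abridged copy of the sample result above.
analysis = {
    'id': 'article-123',
    'tags': [
        {'gkbId': 'G478214', 'stdForm': 'Tesla, Inc.'},
        {'gkbId': 'G317521', 'stdForm': 'Elon Musk'},
        {'gkbId': 'G15377916', 'stdForm': 'Dogecoin'},
        {'gkbId': 'G130879', 'stdForm': 'Reuters'},
    ],
    'metadata': {'referenceKey': '211201-103000-d64a0290'},
}

payload = build_feedback(
    analysis,
    decisions={
        'G478214': 'accepted-actively',
        'G317521': 'accepted-actively',
        'G130879': 'rejected-actively-marginal',
    },
    missing=[{'gkbId': 'G13479982', 'stdForm': 'cryptocurrency'}],
)
```

Tags without an entry in decisions (Dogecoin here) are left out of the payload, which mirrors the example request.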