Language Detection
Language detection can be called separately or as part of other functions.
Recognized languages
Our default mode distinguishes 31 languages:
ar - Arabic | el - Greek | he - Hebrew | it - Italian | nl - Dutch | sk - Slovak | zh - Chinese |
bg - Bulgarian | en - English | hi - Hindi | ja - Japanese | pa - Punjabi | sv - Swedish | |
cs - Czech | es - Spanish | hr - Croatian | ko - Korean | pl - Polish | tr - Turkish | |
da - Danish | fi - Finnish | hu - Hungarian | lt - Lithuanian | pt - Portuguese | uk - Ukrainian | |
de - German | fr - French | id - Indonesian | nl - Dutch | ru - Russian | vi - Vietnamese |
Sample call
- cURL

    curl -X POST https://api.geneea.com/v3/analysis \
        -H 'Authorization: user_key <YOUR USER KEY>' \
        -H 'Content-Type: application/json' \
        -d '{
            "id": "1",
            "text": "The trip to Innsbruck was great.",
            "analyses": ["language"]
        }'
    # On Windows, use \" instead of " and " instead of '

- Python SDK

    from geneeanlpclient import g3

    # Request only the language analysis.
    requestBuilder = g3.Request.Builder(analyses=[g3.AnalysisType.LANGUAGE])

    with g3.Client.create(userKey='<YOUR USER KEY>') as analyzer:
        result = analyzer.analyze(requestBuilder.build(
            id=str(1),
            text='The trip to Innsbruck was great.'
        ))
        print(result.language.detected)

- plain Python

    import requests

    def callGeneea(payload):
        # POST the request as JSON and return the decoded JSON response.
        url = 'https://api.geneea.com/v3/analysis'
        headers = {
            'content-type': 'application/json',
            'Authorization': 'user_key <YOUR USER KEY>'
        }
        return requests.post(url, json=payload, headers=headers).json()

    responseObj = callGeneea({
        'id': '1',
        'text': 'The trip to Innsbruck was great.',
        'analyses': ['language']
    })
    print(responseObj)
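The plain Python sample above relies on the third-party requests package. If you would rather avoid that dependency, the same call can be sketched with the standard library only. The helper names build_language_request and detect_language and the 10-second timeout are assumptions made for this illustration, not part of the Geneea API; the URL, headers, and request fields are taken from the samples above.

```python
import json
import urllib.request

API_URL = 'https://api.geneea.com/v3/analysis'  # endpoint used in the samples above


def build_language_request(doc_id, text, user_key):
    """Prepare (but do not send) a language-detection request.

    Hypothetical helper: it mirrors the JSON body of the cURL sample.
    """
    payload = {'id': doc_id, 'text': text, 'analyses': ['language']}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode('utf-8'),
        headers={
            'Content-Type': 'application/json',
            'Authorization': f'user_key {user_key}',
        },
        method='POST',
    )


def detect_language(doc_id, text, user_key, timeout=10):
    """Send the request and return the decoded JSON response.

    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_language_request(doc_id, text, user_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling detect_language('1', 'The trip to Innsbruck was great.', '<YOUR USER KEY>') should return the same JSON object that the requests-based sample prints. Keeping request construction separate from sending makes it possible to inspect the request without touching the network.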
Priors
If you know that your texts can only be in certain languages, you can specify a prior: a single language or a combination of several languages. Currently, the supported priors are:
cs,de | cs,en,sk | cs,de,es,nl,pl | de,en,es,nl | es,nl,pl |
cs,en | cs,es,nl | cs,en,es,nl,pl | de,en,es,pl | nl,pl |
cs,es | cs,es,pl | cs,de,en,es,nl,pl | de,en,nl,pl | en,zh |
cs,nl | cs,nl,pl | de,en | de,es,nl,pl | |
cs,pl | cs,de,en,es | de,es | en,es | |
cs,sk | cs,de,en,nl | de,nl | en,nl | |
cs,de,en | cs,de,en,pl | de,pl | en,pl | |
cs,de,es | cs,de,en,sk | de,en,es | en,es,nl | |
cs,de,nl | cs,en,es,nl | de,en,nl | en,es,pl | |
cs,de,pl | cs,en,es,pl | de,en,pl | en,nl,pl | |
cs,en,es | cs,es,nl,pl | de,es,nl | en,es,nl,pl | |
cs,en,nl | cs,de,en,es,nl | de,es,pl | es,nl | |
cs,en,pl | cs,de,en,es,pl | de,nl,pl | es,pl | EU |
Use the prior exactly as written above (the same order, no spaces) and pass it as lang_prior via the options parameter:
- cURL

    curl -X POST https://api.geneea.com/v3/analysis \
        -H 'Authorization: user_key <YOUR USER KEY>' \
        -H 'Content-Type: application/json' \
        -d '{
            "id": "1",
            "text": "The trip to Innsbruck was great.",
            "options": {"lang_prior": "en,nl"},
            "analyses": ["language"]
        }'
    # On Windows, use \" instead of " and " instead of '

- Python SDK

    from geneeanlpclient import g3

    # Request the language analysis and restrict detection to English and Dutch.
    requestBuilder = g3.Request.Builder(
        analyses=[g3.AnalysisType.LANGUAGE],
        customConfig={'options': {'lang_prior': 'en,nl'}}
    )

    with g3.Client.create(userKey='<YOUR USER KEY>') as analyzer:
        result = analyzer.analyze(requestBuilder.build(
            id=str(1),
            text='The trip to Innsbruck was great.'
        ))
        print(result.language.detected)

- plain Python

    import requests

    def callGeneea(payload):
        # POST the request as JSON and return the decoded JSON response.
        url = 'https://api.geneea.com/v3/analysis'
        headers = {
            'content-type': 'application/json',
            'Authorization': 'user_key <YOUR USER KEY>'
        }
        return requests.post(url, json=payload, headers=headers).json()

    responseObj = callGeneea({
        'id': '1',
        'text': 'The trip to Innsbruck was great.',
        'options': {'lang_prior': 'en,nl'},
        'analyses': ['language']
    })
    print(responseObj)
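Because a prior has to match one of the listed strings exactly, it can help to normalize and check it on the client before sending a request. The sketch below relies on one observation about the table above: every combination lists its codes in alphabetical order, so sorting the codes yields the expected spelling. The helper normalize_prior and the SUPPORTED_PRIORS set are hypothetical names for this illustration, not part of the Geneea API. Passing single codes (and the named prior EU) through unchecked is also an assumption, since the table only enumerates combinations.

```python
# Combinations copied from the table of supported priors above.
SUPPORTED_PRIORS = frozenset("""
cs,de cs,en cs,es cs,nl cs,pl cs,sk
cs,de,en cs,de,es cs,de,nl cs,de,pl cs,en,es cs,en,nl cs,en,pl
cs,en,sk cs,es,nl cs,es,pl cs,nl,pl
cs,de,en,es cs,de,en,nl cs,de,en,pl cs,de,en,sk
cs,en,es,nl cs,en,es,pl cs,es,nl,pl
cs,de,en,es,nl cs,de,en,es,pl cs,de,es,nl,pl cs,en,es,nl,pl
cs,de,en,es,nl,pl
de,en de,es de,nl de,pl
de,en,es de,en,nl de,en,pl de,es,nl de,es,pl de,nl,pl
de,en,es,nl de,en,es,pl de,en,nl,pl de,es,nl,pl
en,es en,nl en,pl en,zh
en,es,nl en,es,pl en,nl,pl en,es,nl,pl
es,nl es,pl es,nl,pl nl,pl
""".split())


def normalize_prior(languages):
    """Return the prior string for an iterable of language codes.

    Codes are de-duplicated, sorted alphabetically, and joined with commas
    (no spaces). Raises ValueError if the combination is not in the table.
    """
    codes = sorted({code.strip() for code in languages})
    if not codes:
        raise ValueError('at least one language code is required')
    if len(codes) == 1:
        # Assumption: a single code (or the named prior 'EU') needs no lookup.
        return codes[0]
    prior = ','.join(codes)
    if prior not in SUPPORTED_PRIORS:
        raise ValueError(f'unsupported prior: {prior}')
    return prior
```

For example, normalize_prior(['nl', 'en']) returns 'en,nl', which can be passed as the lang_prior option, while normalize_prior(['en', 'fr']) raises ValueError because that combination is not listed.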
Customization
We can customize our language detection to your needs. For example, your emails may contain error messages in English, or product names that sound French.