Language Detection
Language detection can be called separately or as part of other functions.
Recognized Languages
Our default mode distinguishes 31 languages:
| ar - Arabic | el - Greek | he - Hebrew | it - Italian | nl - Dutch | sk - Slovak | zh - Chinese |
| bg - Bulgarian | en - English | hi - Hindi | ja - Japanese | pa - Punjabi | sv - Swedish | |
| cs - Czech | es - Spanish | hr - Croatian | ko - Korean | pl - Polish | tr - Turkish | |
| da - Danish | fi - Finnish | hu - Hungarian | lt - Lithuanian | pt - Portuguese | uk - Ukrainian | |
| de - German | fr - French | id - Indonesian | nl - Dutch | ru - Russian | vi - Vietnamese | |
Sample Call
- cURL
- Python SDK
- plain Python
curl -X POST https://api.geneea.com/v3/analysis \
-H 'Authorization: user_key <YOUR USER KEY>' \
-H 'Content-Type: application/json' \
-d '{
"id": "1",
"text": "The trip to Innsbruck was great.",
"analyses": ["language"]
}'
## On Windows, use \" instead of " and " instead of '
from geneeanlpclient import g3

# Request only the language-detection analysis
requestBuilder = g3.Request.Builder(analyses=[g3.AnalysisType.LANGUAGE])

with g3.Client.create(userKey='<YOUR USER KEY>') as analyzer:
    result = analyzer.analyze(requestBuilder.build(
        id=str(1),
        text='The trip to Innsbruck was great.'
    ))
    print(result.language.detected)
import requests

def callGeneea(input):
    url = 'https://api.geneea.com/v3/analysis'
    headers = {
        'content-type': 'application/json',
        'Authorization': 'user_key <your user key>'
    }
    return requests.post(url, json=input, headers=headers).json()

responseObj = callGeneea({
    'id': '1',
    'text': 'The trip to Innsbruck was great.',
    'analyses': ['language']
})
print(responseObj)
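The plain Python example above prints the whole response. A minimal sketch of extracting just the detected language code follows. It assumes the JSON response mirrors the SDK's result.language.detected attribute, that is, a language object with a detected field; this page does not confirm that shape, so check it against a real response. The helper name detected_language is hypothetical.

```python
from typing import Any, Dict, Optional


def detected_language(response: Dict[str, Any]) -> Optional[str]:
    """Return the detected language code, or None if it is absent.

    ASSUMPTION: the response looks like {"language": {"detected": "en"}, ...},
    inferred from the SDK's `result.language.detected`. Verify against a real
    response before relying on it.
    """
    language = response.get('language')
    if not isinstance(language, dict):
        return None
    detected = language.get('detected')
    return detected if isinstance(detected, str) else None


# Hand-written sample shaped like the assumed response (not real API output).
sample = {'id': '1', 'language': {'detected': 'en'}}
print(detected_language(sample))
```

Returning None instead of raising keeps the caller in control when the field is missing, for example when the request did not include the language analysis.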
Priors
If your input is limited to specific languages, you can specify a language prior—either a single language or a set.
Supported priors include:
| cs,de | cs,en,sk | cs,de,es,nl,pl | de,en,es,nl | es,nl,pl |
| cs,en | cs,es,nl | cs,en,es,nl,pl | de,en,es,pl | nl,pl |
| cs,es | cs,es,pl | cs,de,en,es,nl,pl | de,en,nl,pl | en,zh |
| cs,nl | cs,nl,pl | de,en | de,es,nl,pl | |
| cs,pl | cs,de,en,es | de,es | en,es | |
| cs,sk | cs,de,en,nl | de,nl | en,nl | |
| cs,de,en | cs,de,en,pl | de,pl | en,pl | |
| cs,de,es | cs,de,en,sk | de,en,es | en,es,nl | |
| cs,de,nl | cs,en,es,nl | de,en,nl | en,es,pl | |
| cs,de,pl | cs,en,es,pl | de,en,pl | en,nl,pl | |
| cs,en,es | cs,es,nl,pl | de,es,nl | en,es,nl,pl | |
| cs,en,nl | cs,de,en,es,nl | de,es,pl | es,nl | |
| cs,en,pl | cs,de,en,es,pl | de,nl,pl | es,pl | EU |
Use the prior exactly as listed above (same order, no spaces), and pass it as lang_prior in the options parameter:
- cURL
- Python SDK
- plain Python
curl -X POST https://api.geneea.com/v3/analysis \
-H 'Authorization: user_key <YOUR USER KEY>' \
-H 'Content-Type: application/json' \
-d '{
"id": "1",
"text": "The trip to Innsbruck was great.",
"options": {"lang_prior": "en,nl"},
"analyses": ["language"]
}'
## On Windows, use \" instead of " and " instead of '
from geneeanlpclient import g3

# Request language detection only, restricted to the en,nl prior
requestBuilder = g3.Request.Builder(
    analyses=[g3.AnalysisType.LANGUAGE],
    customConfig={'options': {'lang_prior': 'en,nl'}}
)

with g3.Client.create(userKey='<YOUR USER KEY>') as analyzer:
    result = analyzer.analyze(requestBuilder.build(
        id=str(1),
        text='The trip to Innsbruck was great.'
    ))
    print(result.language.detected)
import requests

def callGeneea(input):
    url = 'https://api.geneea.com/v3/analysis'
    headers = {
        'content-type': 'application/json',
        'Authorization': 'user_key <your user key>'
    }
    return requests.post(url, json=input, headers=headers).json()

responseObj = callGeneea({
    'id': '1',
    'text': 'The trip to Innsbruck was great.',
    'options': {'lang_prior': 'en,nl'},
    'analyses': ['language']
})
print(responseObj)
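Because the prior must match a listed entry exactly, it can help to build the string from a collection of language codes and check it locally before sending the request. The sketch below relies on an observation rather than a documented rule: every prior in the table above lists its codes in alphabetical order, so sorting the codes reproduces the listed form. The names SUPPORTED_PRIORS and normalize_prior are hypothetical and not part of the API or the SDK.

```python
from typing import Iterable

# Supported priors copied from the table above (56 entries, including "EU").
SUPPORTED_PRIORS = frozenset("""
cs,de cs,en cs,es cs,nl cs,pl cs,sk
cs,de,en cs,de,es cs,de,nl cs,de,pl cs,en,es cs,en,nl cs,en,pl cs,en,sk
cs,es,nl cs,es,pl cs,nl,pl
cs,de,en,es cs,de,en,nl cs,de,en,pl cs,de,en,sk
cs,en,es,nl cs,en,es,pl cs,es,nl,pl
cs,de,en,es,nl cs,de,en,es,pl cs,de,es,nl,pl cs,en,es,nl,pl
cs,de,en,es,nl,pl
de,en de,es de,nl de,pl
de,en,es de,en,nl de,en,pl de,es,nl de,es,pl de,nl,pl
de,en,es,nl de,en,es,pl de,en,nl,pl de,es,nl,pl
en,es en,nl en,pl en,zh
en,es,nl en,es,pl en,nl,pl en,es,nl,pl
es,nl es,pl es,nl,pl nl,pl
EU
""".split())


def normalize_prior(languages: Iterable[str]) -> str:
    """Build a lang_prior string from language codes.

    Codes are stripped of whitespace, de-duplicated, sorted alphabetically,
    and joined with commas (no spaces). Raises ValueError if the result is
    not one of the priors listed in the table.
    """
    codes = sorted({code.strip() for code in languages})
    prior = ','.join(codes)
    if prior not in SUPPORTED_PRIORS:
        raise ValueError(f'unsupported language prior: {prior!r}')
    return prior


print(normalize_prior(['nl', 'en']))  # en,nl
```

The returned string can then be used as the lang_prior value in the options of any of the requests above.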
Customization
We can tailor language detection to your specific content. For example, if your emails contain English error messages or French-sounding product names, we can adjust detection accordingly.