Input preprocessing

The following preprocessing steps can be performed before analysis. The parameters named below are set on the Request.

Text extraction - extract plain text from various data formats.
Content extraction - extract the main content from an HTML page. Use the Request's htmlExtractor parameter with one of these values:
- default
- article
- keep-everything
Sentence segmentation - in the media domains, any line break (newline character) separates sentences. In the other domains, a sentence boundary requires at least two line breaks, possibly separated by other whitespace.
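The line-break rule can be illustrated with a minimal Python sketch. It models only the boundary rule described here, not the service's actual sentence splitter; the function name split_on_linebreaks and the media_domain flag are hypothetical names chosen for the illustration.

```python
import re
from typing import List


def split_on_linebreaks(text: str, media_domain: bool) -> List[str]:
    """Split text at the line-break boundaries described above.

    Hypothetical helper: it only models the line-break rule and does not
    reproduce any further sentence detection the service may perform.
    """
    if media_domain:
        # Media domains: every single newline is a boundary.
        pattern = r"\n"
    else:
        # Other domains: a newline followed by one or more further newlines,
        # each optionally preceded by non-newline whitespace.
        pattern = r"\n(?:[^\S\n]*\n)+"
    parts = re.split(pattern, text)
    # Drop empty pieces and surrounding whitespace (including stray "\r").
    return [part.strip() for part in parts if part.strip()]


sample = "First line\nSecond line\n \nThird line"
media_segments = split_on_linebreaks(sample, media_domain=True)
other_segments = split_on_linebreaks(sample, media_domain=False)
```

With the sample above, the media rule yields three segments, while the rule for other domains keeps the first two lines together because only a single line break separates them.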
Spelling correction - fixes some common spelling errors. Correction runs automatically when the Request's textType parameter is set to casual.
Adding diacritics - adds diacritical marks to a text written without them. Currently only Czech is supported. Use the Request's diacritization parameter with one of these values:
- none - do nothing
- auto - add diacritics if necessary
- yes - add diacritics
- redo - remove and then add diacritics
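As a rough sketch of how these parameters might be combined, the Python helper below assembles a request body as a plain dictionary and checks the values listed above. The parameter names htmlExtractor, textType and diacritization come from this section; the helper build_request and the "text" field are assumptions made for illustration, not a documented client API.

```python
from typing import Optional

# Values listed in this section.
HTML_EXTRACTORS = ("default", "article", "keep-everything")
DIACRITIZATION_MODES = ("none", "auto", "yes", "redo")


def build_request(
    text: str,
    html_extractor: Optional[str] = None,
    text_type: Optional[str] = None,
    diacritization: Optional[str] = None,
) -> dict:
    """Assemble a request body with the preprocessing parameters.

    Hypothetical helper: the "text" field name is an assumption.
    """
    if html_extractor is not None and html_extractor not in HTML_EXTRACTORS:
        raise ValueError(f"unknown htmlExtractor value: {html_extractor!r}")
    if diacritization is not None and diacritization not in DIACRITIZATION_MODES:
        raise ValueError(f"unknown diacritization value: {diacritization!r}")

    request = {"text": text}  # assumed field name for the input text
    if html_extractor is not None:
        request["htmlExtractor"] = html_extractor
    if text_type is not None:
        # Only "casual" is named in this section, so the value is not validated.
        request["textType"] = text_type
    if diacritization is not None:
        request["diacritization"] = diacritization
    return request


# Casual Czech text typed without diacritics: spelling correction runs
# because textType is casual, and diacritics are added if necessary.
request = build_request(
    "Prilis zlutoucky kun",
    text_type="casual",
    diacritization="auto",
)
```

Leaving a parameter out of the dictionary, rather than sending an explicit default, is a design choice of this sketch; the behaviour of omitted parameters is not stated in this section.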