Guide

Our Media API offers the following functionalities:

  • Article analysis: We analyze the text of the article in detail and return:

    • Entities (see here): Important expressions, both named (e.g., organizations, cities) and unnamed (e.g., dates).

    • Derived entities: Entities that are not explicitly mentioned in the text but can be derived with the help of our knowledge base (see below). For example, we can return countries and regions for locations (Germany when the article mentions Berlin), industries for organizations (automotive for Toyota), etc.

    • IPTC Media Topics: An industry-standard taxonomy used for categorizing articles by content. The current version consists of over 1200 categories (e.g., sport, basketball, music, classical music) organized into a hierarchy of up to 5 levels. For more detail, see this article.

    • Semantic tags (see here): Entities, keywords or concepts relevant to the article. We rank and standardize them based on their purpose. For a non-technical overview, see this page and this case study.

    • Sentiment (see here): We perform sentiment analysis for the whole article, individual sentences and entities.

  • Recommendation of related content: Based on the content of the article, we recommend:

    • Relevant photos (see here) from multiple photobanks, including our partners’ photos, big and small third-party photobanks, public photos from Wikipedia, etc.

    • Related articles – articles about similar topics

  • Knowledge base (see here): All returned tags and entities are linked to the Geneea Knowledge Base (GKB, Geneea KB). GKB combines existing open data (Wikidata, DBpedia, OpenStreetMap, company registries, etc.) with our own private resources. GKB also supports custom properties (e.g., your internal IDs) and items.

  • Localized entities and tags (see here): All entities and tags in analysis and recommendations can be presented in a language of your choice.

  • Feedback (see here): The quality of the results and the tag preferences can be tuned automatically based on feedback from users.

  • Integration: Typically, our API is called from the publisher’s CMS, and journalists review the suggested tags. The API can also be used in a fully automated pipeline without any supervision.

  • Customization: Typically, the Media models are thoroughly customized (entity IDs and labels, number of tags, preference for certain types of tags, creation of new tags, etc.).
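
To make the Integration and Customization items more concrete, here is a minimal sketch of how a CMS might narrow down suggested tags before showing them to a journalist. It makes no call to the API and does not reproduce its response format: the SuggestedTag record, its fields (tag_id, label, tag_type, relevance), the select_tags helper and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuggestedTag:
    """Hypothetical tag record; the real response schema may differ."""
    tag_id: str       # assumed identifier (placeholder values below)
    label: str        # assumed display label
    tag_type: str     # assumed type, e.g. "organization" or "location"
    relevance: float  # assumed score, higher means more relevant


def select_tags(tags, max_tags=3, preferred_types=()):
    """Return at most max_tags tags: preferred types first, then by relevance."""
    def sort_key(tag):
        # False sorts before True, so preferred types come first;
        # negating the relevance puts higher scores earlier within each group.
        return (tag.tag_type not in preferred_types, -tag.relevance)

    return sorted(tags, key=sort_key)[:max_tags]


# Made-up sample data standing in for tags suggested for one article.
suggested = [
    SuggestedTag("example-1", "Toyota", "organization", 0.92),
    SuggestedTag("example-2", "Berlin", "location", 0.81),
    SuggestedTag("example-3", "automotive", "industry", 0.64),
    SuggestedTag("example-4", "basketball", "topic", 0.12),
]

for tag in select_tags(suggested, max_tags=2, preferred_types=("location",)):
    print(tag.label, tag.tag_type, tag.relevance)
```

With these sample values the loop prints Berlin first (a preferred type) and then Toyota (the highest remaining relevance). In practice, this kind of preference is usually configured as part of the model customization rather than applied on the client side, so treat the sketch as an illustration of the idea only.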