Article Search Filters
By default, the research and search/recommendation agents use all articles in a dataset to find information relevant for answering questions.
This search can be restricted by specifying article filters in the articleFilters field
of ResearchRequest and RecommendRequest, respectively.
Article filters are constraints applied to the metadata fields indexed with the articles.
If the metadata fields satisfy the constraints, the article is eligible for information retrieval; otherwise, it is ignored.
- It is only possible to filter by fields that were actually indexed, and these fields must be configured for each API key before the articles are sent to the Indexing API.
- There is one exception: the filters may always contain the special articleId key to enable filtering by article IDs, for example: "articleId": "(\"1\" OR \"2\" OR \"3\" OR \"4\")"
The articleFilters field in the API request is a map where keys are the names of the indexed article or paragraph metadata fields and values are constraints for the indexed field values in a Solr-like format.
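As an illustration of the map format, the following minimal sketch builds the articleId constraint from a list of IDs. The helper name article_id_filter is hypothetical and not part of the API, and the sketch assumes the IDs contain no double quotes or backslashes.

```python
import json


def article_id_filter(article_ids):
    """Build an OR constraint over article IDs, e.g. ("1" OR "2")."""
    if not article_ids:
        raise ValueError("at least one article ID is required")
    # Assumption: IDs contain no double quotes or backslashes.
    quoted = ['"{}"'.format(article_id) for article_id in article_ids]
    return "(" + " OR ".join(quoted) + ")"


# Keys are indexed field names, values are constraint strings.
article_filters = {"articleId": article_id_filter(["1", "2", "3", "4"])}

# json.dumps adds the backslashes needed to embed the quotes in a JSON body.
print(json.dumps({"articleFilters": article_filters}))
```

Building the string in code and letting the JSON serializer handle the quoting avoids hand-written backslashes in the request body.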
Constraint types:
- Primitive value constraint
The constraint value must not start and end with brackets, that is, with [, {, or ( and the closing counterparts ), }, or ]. Such a value is interpreted as a primitive type value (e.g., a string foo or a number 1), and an exact match of the indexed field's value is required. All characters are interpreted literally, so you should not use escaping.
- Date-range query
The constraint value is a date range such as:
[2025-01-01T00:00:00Z TO NOW]
The indexed field's value must be a date that falls within the range.
- Negated query (values that must not match)
Example:
(-"europe")
- Query with logical OR
Example:
("Donald Tusk" OR "Donald Trump" OR "Donald Duck")
- Complex query
Example:
(Donald AND -"Trump" AND ("Walt Disney \(company\)" OR Hollywood))
Examples of constraints for various article metadata fields:
# requires the publishDate field to be after 2025-01-01 midnight UTC
"publishDate": "[2025-01-01T00:00:00Z TO NOW]"
# selects articles with "Donald Trump" entity
"entities": "Donald Trump"
# selects articles without "Donald Trump" entity
"entities": "(-\"Donald Trump\")"
# articles from "europe" but not "sport" sections
"sections": "(\"europe\" AND -\"sport\")"
Indexed field types
In general, constraint values are assumed to be valid Solr filters and must be applicable to the type of the constrained field in the Solr index. For example, a date-range filter should only be used on date fields. A mismatch between constraints and the fields that they are applied to may result in API errors or unexpected behavior. Refer to the Solr documentation for more details.
Character escaping
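In query constraints (values enclosed in brackets), characters such as +, -, " and : have special meaning and must be escaped with a backslash when used literally, as in the complex query example above. Primitive value constraints are matched literally and need no escaping.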
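The escaping rule can be sketched as a small helper. The character set below is the list of special characters documented for the standard Solr query parser. Whether this API expects exactly that set is an assumption, and the names escape_term and quoted are hypothetical.

```python
# Special characters of the standard Solr query parser (assumed to apply here).
SPECIAL_CHARS = set('+-!(){}[]^"~*?:\\/&|')


def escape_term(text):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def quoted(text):
    """Escape a phrase and wrap it in double quotes for use in a query."""
    return '"' + escape_term(text) + '"'


# Mirrors the escaped phrase used in the complex query example.
constraint = "(Donald AND ({} OR Hollywood))".format(quoted("Walt Disney (company)"))
print(constraint)
```

Apply the helper only to literal terms inside bracketed queries. Primitive value constraints are matched as-is and should not be escaped.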
Multiple filters
When filters for several indexed fields are specified, they must all be satisfied at the same time for an article to be selected. All field constraints are implicitly joined by the logical AND operator.
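The implicit AND can be pictured with a simplified local model. This is not how the service evaluates filters (that happens in the search index). It only handles primitive exact-match constraints, and the function name matches_all and the sample metadata are made up for illustration.

```python
def matches_all(metadata, filters):
    """Return True only if every primitive filter matches its field."""
    for field, wanted in filters.items():
        value = metadata.get(field)
        # Treat single-valued and multi-valued fields the same way.
        values = value if isinstance(value, list) else [value]
        if wanted not in values:
            return False  # one failing field rejects the article
    return True


# Hypothetical article metadata.
article = {"sections": ["europe", "politics"], "entities": ["Donald Tusk"]}

print(matches_all(article, {"sections": "europe", "entities": "Donald Tusk"}))
print(matches_all(article, {"sections": "europe", "entities": "Donald Trump"}))
```

The second call fails because only one of the two field constraints is satisfied.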
Advanced Filter
In addition to the flat articleFilters map described above, an advancedFilter field is available
on ResearchRequest, RecommendRequest,
and ArticleStatsRequest.
The advanced filter allows expressing complex, nested filter conditions using a recursive tree structure with logical operators.
The advancedFilter works in conjunction with articleFilters. When both are specified, an article must satisfy both the basic article filters and the advanced filter to be selected. The advancedFilter is mutually exclusive with articleIds.
Structure
An advanced filter is a recursive tree where each node is one of two types, distinguished by the type field:
Filter Literal (type: "literal")
A leaf node that applies a filter to a single metadata field.
| Property | Required | Type | Description |
|---|---|---|---|
| type | True | String | Must be "literal". |
| field | True | String | The name of the metadata field to filter on. |
| filter | True | String | A Solr query applied to the field. This value is passed directly to the Solr query parser. |
| negated | False | Boolean | Whether to negate this filter literal. Defaults to false. |
Filter Clause (type: "clause")
A branch node that combines multiple filter expressions with a logical operator.
| Property | Required | Type | Description |
|---|---|---|---|
| type | True | String | Must be "clause". |
| operator | True | String | Logical operator: "AND" or "OR". |
| negated | False | Boolean | Whether to negate the entire clause. Defaults to false. |
| clause | True | List[AdvancedFilter] | A non-empty list of nested filter expressions (literals or clauses). |
The clause list can contain any mix of filter literals and nested clauses, enabling arbitrary nesting depth.
The filter value in a filter literal is expected to be a Solr query. It is passed directly to the Solr query parser,
so it must follow the Solr query syntax.
The same character-escaping rules as for articleFilters apply.
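One way to assemble such a tree on the client side is with small builder functions plus a recursive check against the property tables above. This is a sketch only: literal, clause, and validate are hypothetical helper names, and the checks reflect this page rather than the server's actual validation.

```python
def literal(field, solr_filter, negated=False):
    """Build a filter literal (leaf node)."""
    return {"type": "literal", "field": field,
            "filter": solr_filter, "negated": negated}


def clause(operator, children, negated=False):
    """Build a filter clause (branch node)."""
    return {"type": "clause", "operator": operator,
            "negated": negated, "clause": list(children)}


def validate(node):
    """Recursively check a tree; raise ValueError on the first problem."""
    kind = node.get("type")
    if kind == "literal":
        for key in ("field", "filter"):
            if not isinstance(node.get(key), str):
                raise ValueError("literal needs a string '%s'" % key)
    elif kind == "clause":
        if node.get("operator") not in ("AND", "OR"):
            raise ValueError("operator must be 'AND' or 'OR'")
        children = node.get("clause")
        if not isinstance(children, list) or not children:
            raise ValueError("clause must be a non-empty list")
        for child in children:
            validate(child)
    else:
        raise ValueError("type must be 'literal' or 'clause'")
    if not isinstance(node.get("negated", False), bool):
        raise ValueError("negated must be a boolean")
    return node


# ("europe" section OR recent publish date) AND NOT "sport" entity
tree = validate(clause("AND", [
    clause("OR", [
        literal("sections", "europe"),
        literal("publishDate", "[2025-01-01T00:00:00Z TO NOW]"),
    ]),
    literal("entities", "sport", negated=True),
]))
```

The resulting dictionary can be serialized with json.dumps and sent as the advancedFilter value.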
Examples
Simple filter: single literal
Select only articles from the "europe" section:
{
"query": "What is the latest news?",
"advancedFilter": {
"type": "literal",
"field": "sections",
"filter": "europe"
}
}
Nested filter: combining conditions with AND / OR
Select articles that are in the "europe" section or were published after 2025-01-01, excluding any article tagged with the "sport" entity:
{
"query": "What is the latest news?",
"advancedFilter": {
"type": "clause",
"operator": "AND",
"clause": [
{
"type": "clause",
"operator": "OR",
"clause": [
{
"type": "literal",
"field": "sections",
"filter": "europe"
},
{
"type": "literal",
"field": "publishDate",
"filter": "[2025-01-01T00:00:00Z TO NOW]"
}
]
},
{
"type": "literal",
"field": "entities",
"filter": "sport",
"negated": true
}
]
}
}
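When reviewing a nested filter, it can help to print it as a single boolean expression. The sketch below renders the tree from the example above in a readable field:filter notation. The describe function is a hypothetical debugging aid, and its output is not necessarily the query the service sends to Solr.

```python
def describe(node):
    """Render an advanced filter tree as a readable boolean expression."""
    if node["type"] == "literal":
        text = "{}:{}".format(node["field"], node["filter"])
    elif node["type"] == "clause":
        joiner = " {} ".format(node["operator"])
        text = "(" + joiner.join(describe(child) for child in node["clause"]) + ")"
    else:
        raise ValueError("unknown node type: {!r}".format(node["type"]))
    # negated is optional and defaults to false.
    return "NOT " + text if node.get("negated", False) else text


tree = {
    "type": "clause",
    "operator": "AND",
    "clause": [
        {
            "type": "clause",
            "operator": "OR",
            "clause": [
                {"type": "literal", "field": "sections", "filter": "europe"},
                {"type": "literal", "field": "publishDate",
                 "filter": "[2025-01-01T00:00:00Z TO NOW]"},
            ],
        },
        {"type": "literal", "field": "entities", "filter": "sport",
         "negated": True},
    ],
}

print(describe(tree))
```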
Using advanced filter together with basic article filters
The advancedFilter and articleFilters can be used together. In this example, articleFilters restricts articles to those with the "economy" entity, while the advancedFilter adds a nested condition: the section must be "europe" or the publish date must be after 2025-01-01.
{
"query": "What is the economic outlook?",
"articleFilters": {
"entities": "economy"
},
"advancedFilter": {
"type": "clause",
"operator": "OR",
"clause": [
{
"type": "literal",
"field": "sections",
"filter": "europe"
},
{
"type": "literal",
"field": "publishDate",
"filter": "[2025-01-01T00:00:00Z TO NOW]"
}
]
}
}
Filter Validation
The POST /v1/articles/filter_check endpoint allows you to validate a filter configuration against a set of expected article IDs. This is useful for verifying that your filters match the articles you expect before using them in research or recommendation requests.
- Request: FilterValidationRequest
- Response: FilterValidationResponse
The request accepts both articleFilters and advancedFilter (which work in conjunction, as described above), along with a list of expectedIds — the article IDs you expect the filters to match.
The response contains:
- matchedIds: the subset of expectedIds that matched the filters.
- totalCount: the total number of articles in the dataset matching the filters (not limited to the expected IDs).
Example
Request:
{
"expectedIds": ["article-1", "article-2", "article-3"],
"articleFilters": {
"sections": "europe"
},
"advancedFilter": {
"type": "literal",
"field": "entities",
"filter": "economy"
}
}
Response:
{
"matchedIds": ["article-1", "article-3"],
"totalCount": 42
}
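A typical follow-up is to list the expected articles that the filters missed. The sketch below works on the request and response bodies shown above and makes no HTTP call. The function name unmatched_ids is hypothetical.

```python
def unmatched_ids(expected_ids, response):
    """Return the expected IDs absent from matchedIds, in request order."""
    matched = set(response["matchedIds"])
    return [article_id for article_id in expected_ids if article_id not in matched]


expected = ["article-1", "article-2", "article-3"]
response = {"matchedIds": ["article-1", "article-3"], "totalCount": 42}

missing = unmatched_ids(expected, response)
print(missing)  # ['article-2']

# totalCount covers the whole dataset, so it can exceed len(expected).
extra_matches = response["totalCount"] - len(response["matchedIds"])
print(extra_matches)  # 40
```

A non-empty result means the filters are stricter than intended. A large totalCount relative to the matched IDs suggests they may be broader than intended.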