Highlighting
Highlighting shows matching query terms in context, enabling you to display relevant snippets in search results with matched terms emphasized.
Solr Documentation: Highlighting
Configuration Approaches
from taiyo.parsers import ExtendedDisMaxQueryParser
from taiyo.params import HighlightParamsConfig
parser = ExtendedDisMaxQueryParser(
query="python programming",
query_fields={"title": 2.0, "content": 1.0},
configs=[
HighlightParamsConfig(
fields=["title", "content"], fragment_size=150, snippets_per_field=3
)
],
)
from taiyo.parsers import ExtendedDisMaxQueryParser
parser = ExtendedDisMaxQueryParser(
query="python programming", query_fields={"title": 2.0, "content": 1.0}
).highlight(fields=["title", "content"], fragment_size=150, snippets_per_field=3)
Basic Usage
parser = ExtendedDisMaxQueryParser(
query="python programming", query_fields={"title": 2.0, "content": 1.0}
).highlight(fields=["title", "content"], fragment_size=150, snippets_per_field=3)
results = client.search(parser)
# Access highlights
for doc in results.docs:
print(f"\n{doc.title}")
if results.highlighting and doc.id in results.highlighting:
highlights = results.highlighting[doc.id]
if "content" in highlights:
for snippet in highlights["content"]:
print(f" ...{snippet}...")
Key Parameters
HighlightParamsConfig(
fields=["title", "content"], # Fields to highlight
fragment_size=150, # Snippet size in characters
snippets_per_field=3, # Max snippets per field
simple_pre="<em>", # Opening tag
simple_post="</em>", # Closing tag
require_field_match=True, # Only highlight matched fields
method="unified", # Highlighter method: unified, original, fastVector
max_analyzed_chars=51200, # Maximum characters to analyze
# Unified Highlighter options
bs_type="SENTENCE", # Boundary scanner type
bs_language="en", # Language for boundary detection
# Multi-term support
multiterm=True, # Highlight wildcard/fuzzy matches
)
Refer to the Apache Solr documentation for the full list of parameters and defaults.
Handling Results
Highlighting responses are available through SolrResponse.highlighting:
SolrResponse.highlighting: Dictionary mapping document IDs to highlighted field snippets- Each document's highlights contain field names mapped to lists of snippet strings
- Query terms are wrapped in the configured tags (default:
<em>tags)
Example:
from taiyo.parsers import ExtendedDisMaxQueryParser
parser = ExtendedDisMaxQueryParser(
query="python programming", query_fields={"title": 2.0, "content": 1.0}
).highlight(
fields=["title", "content"],
fragment_size=150,
snippets_per_field=3,
simple_pre="<mark>",
simple_post="</mark>",
)
results = client.search(parser)
for doc in results.docs:
print(f"\nDocument: {doc.id}")
print(f"Title: {doc.title}")
if results.highlighting and doc.id in results.highlighting:
highlights = results.highlighting[doc.id]
# Title highlights
if "title" in highlights:
print("\nTitle Match:")
print(f" {highlights['title'][0]}")
# Content highlights
if "content" in highlights:
print("\nContent Snippets:")
for i, snippet in enumerate(highlights["content"], 1):
print(f" [{i}] ...{snippet}...")
Next Steps
- Learn about Faceting for aggregations
- Explore Grouping for result organization
- See Query Parsers for search options