[TXTREC-64] Support "by property name" in PageTextAggregator Created: 26/Aug/19 Updated: 26/Aug/22 |
|
| Status: | Open |
| Project: | Text Classification |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Neutral |
| Reporter: | Evzen Fochr | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Template: |
|
||||||||
| Acceptance criteria: |
Empty
|
||||||||
| Task DoD: |
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ] Architecture Decision Record (ADR)
|
||||||||
| Date of First Response: | |||||||||
| Epic Link: | Txt Classification | ||||||||
| Team: | |||||||||
| Description |
|
To overcome the complexity of multifields and custom transformers, we should add "by property name" support to PageTextAggregator |
| Comments |
| Comment by Antti Hietala [ 24/Sep/19 ] |
|
Today, you define by field type which content should be recognized. Basically you say "recognize content from all the richText fields" in your text classification config. This ticket proposes a much more intuitive way: you would instead say "classify the Headline and Intro properties for pages" or "classify the Description and Body Text properties for tours". Content practitioners (marketers, authors) don't know about Magnolia field types, but they know what content matters for classification. |
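A rough sketch of what selection by property name could look like, to make the proposal concrete. This is not the existing PageTextAggregator API: the class name, the `PROPERTIES_BY_TYPE` map, the `aggregate` method, and the property names (`headline`, `intro`, `description`, `body`, `footerNote`) are all hypothetical, and content is modeled as a plain map rather than a JCR node.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class PropertyNameAggregatorSketch {

    /** Hypothetical config: content type -> names of the properties to classify. */
    static final Map<String, Set<String>> PROPERTIES_BY_TYPE = Map.of(
            "page", Set.of("headline", "intro"),
            "tour", Set.of("description", "body"));

    /**
     * Joins the text of the configured properties, keeping the iteration
     * order of the supplied map. Properties that are not configured for
     * the content type are ignored, whatever their field type.
     */
    static String aggregate(String contentType, Map<String, String> properties) {
        Set<String> wanted = PROPERTIES_BY_TYPE.getOrDefault(contentType, Set.of());
        return properties.entrySet().stream()
                .filter(e -> wanted.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .filter(v -> v != null && !v.isBlank())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Assumed sample content; LinkedHashMap keeps insertion order.
        Map<String, String> page = new LinkedHashMap<>();
        page.put("headline", "Hiking in the Alps");
        page.put("intro", "Five trails for beginners.");
        page.put("footerNote", "Terms apply.");
        // Only headline and intro are aggregated; footerNote is skipped.
        System.out.println(aggregate("page", page));
    }
}
```

With a mapping like this, whoever edits the config only needs to know the property names of the content, not which Magnolia field type backs each one.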