[MGNLEESOLR-156] Make crawler maxOutgoingLinksToFollow configurable Created: 05/Mar/21 Updated: 20/Mar/23 Resolved: 08/Mar/21 |
|
| Status: | Closed |
| Project: | Solr Search Provider |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.5.5 |
| Type: | Improvement | Priority: | Normal |
| Reporter: | Milan Divilek | Assignee: | Milan Divilek |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Template: |
|
||||||||
| Acceptance criteria: |
Empty
|
||||||||
| Task DoD: |
[X]*
Doc/release notes changes? Comment present?
[X]*
Downstream builds green?
[X]*
Solution information and context easily available?
[X]*
Tests
[X]*
FixVersion filled and not yet released
[ ] 
Architecture Decision Record (ADR)
|
||||||||
| Story Points: | 1 | ||||||||
| Description |
|
By default, Crawler4j limits the number of outgoing links parsed from a page to 5000. This may not be enough, for example when a sitemap is used to supply all the URLs to the crawler. maxOutgoingLinksToFollow should be configurable from the Magnolia crawler configuration. |
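
A minimal sketch of how such a setting might be resolved, assuming the value arrives as a string property. The class `CrawlerLinkLimit`, the helper `resolveMaxOutgoingLinks`, and the use of a plain `Map` as the configuration source are hypothetical and only for illustration; they are not the module's actual implementation. The fallback of 5000 is the Crawler4j default mentioned in the description.

```java
import java.util.Map;

public class CrawlerLinkLimit {

    // Crawler4j default named in the description above.
    static final int DEFAULT_MAX_OUTGOING_LINKS = 5000;

    // Hypothetical property key; the real Magnolia configuration name may differ.
    static final String PROPERTY = "maxOutgoingLinksToFollow";

    // Returns the configured limit, or the default when the property is
    // missing, blank, non-numeric, or not a positive number.
    static int resolveMaxOutgoingLinks(Map<String, String> props) {
        String raw = props.get(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MAX_OUTGOING_LINKS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_MAX_OUTGOING_LINKS;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_OUTGOING_LINKS;
        }
    }

    public static void main(String[] args) {
        // Example: a large sitemap page needs a higher limit than the default.
        int limit = resolveMaxOutgoingLinks(Map.of(PROPERTY, "50000"));
        System.out.println("maxOutgoingLinksToFollow = " + limit);
        // With crawler4j on the classpath, the resolved value would then be
        // handed to CrawlConfig.setMaxOutgoingLinksToFollow(limit).
    }
}
```

Falling back to the default for invalid input keeps existing crawler configurations working unchanged when the property is not set.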