Details
Improvement
Resolution: Won't Do
Neutral
1.1.3, 1.2.1
Description
My fixes for MGNLRSSAGG-47 introduced spaces where there should be none. The attached screenshot is the "On the Blogs" part of the current (2.3/2.4) corporate website, http://www.magnolia-cms.com/. (One space is caused by the blog post, not by the tag removal, and thus not pointed at in the screenshot.) The problem is that all tags are converted to a space (" "). In some situations, that is not appropriate, and the result is ugly.
The original solution tackled the case of "... paragraph.</p><h1>Heading" becoming "... paragraph.Heading".
Solution proposal: Change:
?replace("</?[^>]*(>|$)|&nbsp;?|&#?[0-9A-Za-z]*$", " ", "r")
into:
?replace("</[^>]*><[^/>]*>|&nbsp;?", " ", "r")?replace("</?[^>]*(>|$)|&#?[0-9A-Za-z]*$", "", "r")
such that "</x><y>" and "&nbsp;" become " " (one space character), whereas any other tags and end-of-string (partial) entities become "" (i.e. are removed).
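To compare the two behaviours outside a template, here is a minimal sketch in Python rather than FreeMarker. It assumes the entity alternative in the patterns is "&nbsp;?" and that Python's re module treats these particular patterns the same way as the regex engine behind ?replace(..., "r"). The function names strip_old and strip_new are hypothetical and not part of the module.

```python
import re

# Current behaviour: every tag, &nbsp; and trailing partial entity becomes a space.
# ASSUMPTION: the entity alternative is "&nbsp;?" (semicolon optional).
OLD_PATTERN = r"</?[^>]*(>|$)|&nbsp;?|&#?[0-9A-Za-z]*$"

# Proposed behaviour, step 1: a closing tag directly followed by an opening
# tag, or an &nbsp; entity, becomes a single space.
SPACE_PATTERN = r"</[^>]*><[^/>]*>|&nbsp;?"

# Proposed behaviour, step 2: any remaining tag and any partial entity at the
# end of the string is removed outright.
REMOVE_PATTERN = r"</?[^>]*(>|$)|&#?[0-9A-Za-z]*$"


def strip_old(html: str) -> str:
    """Hypothetical helper mirroring the single ?replace call."""
    return re.sub(OLD_PATTERN, " ", html)


def strip_new(html: str) -> str:
    """Hypothetical helper mirroring the two chained ?replace calls."""
    spaced = re.sub(SPACE_PATTERN, " ", html)
    return re.sub(REMOVE_PATTERN, "", spaced)


if __name__ == "__main__":
    for sample in (
        "<p>One.</p><h1>Two</h1>",
        "a <b>bold</b>, word",
        "x&nbsp;y",
    ):
        print(repr(strip_old(sample)), "vs", repr(strip_new(sample)))
```

With the inline-tag sample, the old pattern leaves a stray space before the comma, while the chained version keeps the text intact and still separates a paragraph from the heading that follows it.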
Issue Links
- supersedes MGNLRSSAGG-47 Improve HTML removal in feed ftl's (Closed)