Details
- Bug
- Resolution: Workaround exists
- Major
- 5.7.6
- JBoss 7.2.4, MySQL 5.5.68-MariaDB
Description
Steps to reproduce
- Create a page with a non-ASCII Unicode character such as 🏠 in the title.
- Publish page.
- server.log then contains the following exception:
2020-11-16 10:55:41,310 INFO [stdout] (default task-55) Caused by: org.xml.sax.SAXParseException: Character reference "�" is an invalid XML character.
The same problem occurs when we try to import this page.
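To illustrate why a character like 🏠 can trigger this message, here is a minimal sketch. It assumes the serializer writes one numeric character reference per UTF-16 code unit instead of one per code point; that is consistent with the error text, but the report does not confirm it. The class and method names are hypothetical and are not part of Magnolia, JBoss, or Xalan.

```java
// Hypothetical demo class; not part of Magnolia, JBoss, or Xalan.
final class SurrogateDemo {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Correct escaping: one character reference per Unicode code point. */
    static String escapePerCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append("&#").append(cp).append(';'));
        return sb.toString();
    }

    /** ASSUMED faulty behaviour: one character reference per UTF-16 code unit. */
    static String escapePerCodeUnit(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append("&#").append((int) c).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String house = "\uD83C\uDFE0"; // U+1F3E0, stored as a surrogate pair in UTF-16

        System.out.println(escapePerCodePoint(house)); // &#127968;
        System.out.println(escapePerCodeUnit(house));  // &#55356;&#57312;

        // The full code point is a legal XML character; the lone surrogates are not.
        System.out.println(isValidXmlChar(house.codePointAt(0))); // true
        System.out.println(isValidXmlChar(house.charAt(0)));      // false
        System.out.println(isValidXmlChar(house.charAt(1)));      // false
    }
}
```

An XML parser rejects a reference to a lone surrogate, so output of the second form cannot be read back even though the original string is valid.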
Development notes
We do not see this problem on Tomcat, nor with Magnolia 5.7.5. We found the following related Red Hat issue: https://issues.redhat.com/browse/JBEAP-14542?attachmentViewMode=list&_sscc=t
We therefore tried switching the database to 4-byte UTF-8 (utf8mb4), but this did not solve the problem.
Workaround
We found that the issue comes from a bug in the Xalan library bundled with JBoss (XALANJ-2560). We tried to exclude this library in two ways: by excluding the xalan module in the deployment descriptor, and by creating a new module with the corrected Xalan library. Neither approach worked for us; perhaps the JBoss side can provide a working approach.
Until a working way to exclude Xalan is found, you can work around the issue as follows:
- Replace the existing jars
- <JBOSS_HOME>/modules/org/apache/xalan/main/xalan-2.7.1.jbossorg-1.jar
- <JBOSS_HOME>/modules/org/apache/xalan/main/serializer-2.7.1.jbossorg-1.jar
with the equivalents in: https://ftp.cixug.es/apache/xalan/xalan-j/binaries/.
- Modify module.xml to point at the new jars:
<JBOSS_HOME>/modules/org/apache/xalan/main/module.xml
<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="urn:jboss:module:1.5" name="org.apache.xalan">
    <resources>
        <resource-root path="serializer.jar"/> <!-- changed -->
        <resource-root path="xalan.jar"/> <!-- changed -->
    </resources>
    <dependencies>
        <module name="javax.api"/>
    </dependencies>
</module>
With these changes, the issue with the Unicode characters no longer occurred.
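One possible way to check whether the serializer in use handles such characters is a small round-trip test, sketched below. It uses only standard JAXP calls; the class name is hypothetical. Which implementation `TransformerFactory.newInstance()` resolves to depends on the classpath and module setup, so the check only says something about the JBoss Xalan module when it runs inside the affected deployment. It serializes a text node, which may not be the exact code path Magnolia uses on publish or import.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of Magnolia or JBoss.
final class XmlRoundTripCheck {

    /** Serializes <title>text</title>, parses the result, and returns the parsed text. */
    static String roundTrip(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element title = doc.createElement("title");
            title.setTextContent(text);
            doc.appendChild(title);

            // The factory lookup decides which serializer implementation is exercised.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));

            // A serializer that wrote lone-surrogate references fails here
            // with a SAXParseException.
            Document parsed = builder.parse(new InputSource(new StringReader(out.toString())));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String house = "\uD83C\uDFE0"; // U+1F3E0
        boolean ok = house.equals(roundTrip(house));
        System.out.println(ok ? "round trip OK" : "round trip altered the text");
    }
}
```

If the round trip throws, the cause of the wrapped exception shows whether it is the same SAXParseException as in server.log.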