[MAGNOLIA-585] export and import add new line breaks Created: 26/Oct/05  Updated: 23/Jan/13  Resolved: 26/Jan/06

Status: Closed
Project: Magnolia
Component/s: None
Affects Version/s: 2.1.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Philipp Bärfuss Assignee: Philipp Bärfuss
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

from the user-list:

Some data corruption occurs. Example:

Content exported via the GUI:

website.camelot.main.xml {
... skip ...
<sv:property sv:name="text" sv:type="String">
<sv:value>text text text text<span
id="Camelot">TextTex</span>&nbsp;&mdash; text
text text text text text text.</sv:value>
</sv:property>
... skip ...
}

When I import this file, everything appears to work, but I get this in the
HTML:

{
text text text text<span <br/>
id="Camelot">TextTex</span>&nbsp&mdash text <br/>
text text text text text text.
}

Why does Magnolia insert <br/> wherever a line ends (\n)?
Is this a bug, or maybe a feature?
In version 2.1 this worked fine, I think because the exported
file didn't contain any "\n".



 Comments   
Comment by Stojan Peshov [ 12/Dec/05 ]

I have this problem too,
and the problem is definitely in the XML formatter.

When I exported via /.magnolia/mgnl-export without "format xml", the XML was OK,
but the newlines (<br/>) were gone; I guess that is because the formatter was not used.

The one thing that bothers me is that this problem does not occur locally (with Tomcat on my computer at home);
it only occurs on the Magnolia instance hosted on Sun Application Server,
and it started after a month of use. In the beginning everything was OK.

could this be some clue?

This bug is urgent for me because it has slowed down our development:
we cannot rely on export.

Comment by Markus Strickler [ 12/Dec/05 ]

I haven't looked into this too deeply, but it seems that Magnolia converts all <br/> to newlines when storing them in the repository on content creation. That's why one has to convert newlines back to <br/>s on display.

This is generally a rather unclean solution, and sooner or later it leads to trouble somewhere, especially when it comes to import/export. In XML, whitespace is not preserved outside of CDATA sections, so one cannot rely on newlines staying in the right places on import/export.

I would suggest to always store the <br/>s in the repository. This most likely will require modifications to the dialogs/controls for editing textfields and to templates currently converting newlines into <br/>s.
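
The round trip described in this comment can be sketched in a few lines. This is only an illustration under assumptions: the class and method names (`BrRoundTrip`, `toStored`, `toDisplay`, `wrapAtFirstSpaceAfter`) are hypothetical and are not Magnolia's actual code, and the "formatter" is a stand-in that replaces one space with a newline, roughly what a line-wrapping pretty-printer might do.

```java
class BrRoundTrip {
    // Hypothetical store step: <br>, <br/> and <br /> become a newline.
    static String toStored(String html) {
        return html.replaceAll("<br\\s*/?>", "\n");
    }

    // Hypothetical display step: every newline becomes <br/>.
    static String toDisplay(String stored) {
        return stored.replace("\n", "<br/>");
    }

    // Stand-in for a line-wrapping formatter: turns the first space at or
    // after the given column into a newline.
    static String wrapAtFirstSpaceAfter(String text, int column) {
        int i = text.indexOf(' ', column);
        return i < 0 ? text : text.substring(0, i) + "\n" + text.substring(i + 1);
    }

    public static void main(String[] args) {
        String stored = toStored("text text<span id=\"Camelot\">TextTex</span> text");
        String exported = wrapAtFirstSpaceAfter(stored, 12); // wraps inside the <span ...> tag
        System.out.println(toDisplay(exported));
    }
}
```

Because the display step cannot tell a newline that came from a <br/> apart from one added by the formatter, the wrapped space inside the tag ends up as a <br/>, which matches the symptom in the description.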

Comment by Ramon Buckland [ 11/Jan/06 ]

My guess, without looking at any code,
is that it has to do with the JDK version in use and
that it is occurring in the Jackrabbit implementation and not in Magnolia.

We have had a problem with a library that uses the XMLSerializer from Apache (org.apache.xml...XMLSerializer).
With the XMLSerializer, you have to supply an OutputFormat.
According to the API docs, the line width defaults to 0 (no line wrapping).

When we ran this library on BEA JRockit (1.5.0) (WebLogic 9), the
line width defaulted to 55 (instead of 0).

We had to change our code to force a line width of 0.

OutputFormat of = new OutputFormat("xml", "UTF-8", true); // method, encoding, indenting
of.setIndent(1); // indent by one space
of.setLineWidth(0); // no line wrapping; set explicitly, because on BEA JRockit 1.5.0 it defaulted to 55

My thought (given the response from Stojan Peshov that it only happens on the Sun server)
is that there is a similar issue in the JDK being used, i.e. the <br/> problem is environmental
and not "code" related per se.

The line-wrapping problem described above also affects the export and import of XML nodes
on BEA WebLogic 9 in Magnolia. We have lived with it by watching for it.

BTW, we have Magnolia running on WebLogic 9 in production (what an effort that was).

Hope that helps; it may be a bit off centre.
ramon

Comment by Philipp Bracher [ 26/Jan/06 ]

For compatibility reasons we store the <br> and <p> tags as line breaks. In future versions we should avoid this, but it cannot easily be done without corrupting content. There is no distinction between Strings and HTML in the repository, and therefore it is impossible to write a converter that works in all scenarios.

For example, we would still have this import/export problem if someone stores text from a textarea, which can contain plain line breaks.

What we could do is manipulate the imported and exported text (using \n, for example), but since this bug report is related to the environment, we won't fix it.

In general, import and export work fine.

Comment by Michael Aemisegger [ 26/Jan/06 ]

So what will you do in the future? I didn't get it. Will you avoid it or not fix it?

Personally, I avoided this conversion from the beginning and used my own store logic. I cannot see how Magnolia can be smart enough to decide which conversion to apply in which scenario. I have users on a wide variety of operating systems and browsers and have never had a problem with line breaks. What goes in must come out, I think.

Would it be an option to fix the bug (avoid the conversion) and do a one-time conversion that is only partially perfect? You have to write one-time converters for the 2.2 or 3.0 release anyway.

Comment by David Smith [ 26/Jan/06 ]

Personally, I don't think this is really an environment problem. XML treats spaces, tabs, and carriage returns/line feeds as generic whitespace. So does HTML. If the line break is to be preserved from export to import, it needs to be encoded in some manner so that it is not confused with generic XML whitespace. The benefit is that the file could be pushed through an XSLT transform without corrupting the data or worrying about whether the line breaks are preserved. Maybe a <sv:br/> tag?

At least that's my two cents on the issue.
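
A minimal string-level sketch of that idea, under stated assumptions: the marker text, the class name `LineBreakMarker`, and the `encode`/`decode` helpers are all hypothetical, and nothing here reflects how the JCR system view or Magnolia actually serialises values. The point is only that once real line breaks are written as an explicit marker, any whitespace a formatter adds later can be treated as insignificant.

```java
class LineBreakMarker {
    // Hypothetical marker, borrowed from the "<sv:br/>" idea in the comment above.
    static final String MARKER = "<sv:br/>";

    // Before export: make every real line break explicit.
    static String encode(String value) {
        return value.replace("\r\n", "\n").replace("\n", MARKER);
    }

    // After import: collapse whitespace runs (assumed to be formatting only),
    // then restore the real line breaks from the marker.
    static String decode(String value) {
        return value.replaceAll("\\s+", " ").replace(MARKER, "\n");
    }

    public static void main(String[] args) {
        String exported = encode("line one\nline two");
        String reformatted = exported.replace("line one", "line\n   one"); // simulated re-wrapping
        System.out.println(decode(reformatted).equals("line one\nline two"));
    }
}
```

The trade-off in this sketch is that `decode` also collapses runs of spaces that were in the original value, so it only suits content where such runs carry no meaning.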

Generated at Mon Feb 12 03:18:53 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.