[MAGNOLIA-7022] Restore filtering of unwanted namespaces in export Created: 03/May/17  Updated: 06/Oct/17  Resolved: 07/Sep/17

Status: Closed
Project: Magnolia
Component/s: core
Affects Version/s: None
Fix Version/s: 5.5.7

Type: Task Priority: Major
Reporter: Roman Kovařík Assignee: Mikaël Geljić
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to MAGNOLIA-2756 Cleanup namespaces in some of our boo... Closed
relates to MAGNOLIA-6683 Offer YAML as alternative format for ... Closed
relates to MAGNOLIA-2960 Automatic cleanup of unwanted namespa... Closed
Template:
Acceptance criteria:
Empty
Task DoR:
Empty
Date of First Response:
Sprint: Saigon 111, Saigon 112
Story Points: 2

 Description   

The core node type definition contains the following:

<nodeTypes
    xmlns:rep="internal"
    xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
    xmlns:mix="http://www.jcp.org/jcr/mix/1.0"
    xmlns:mgnl="http://www.magnolia.info/jcr/mgnl"
    xmlns:jcr="http://www.jcp.org/jcr/1.0">

Some of the URIs seem obsolete/invalid.
This issue became more visible with MAGNOLIA-6683, whose new export command doesn't go through the magical filtering via info.magnolia.importexport.filters.MetadataUuidFilter#removeUnwantedNamespaces/validNs (as you can see, the filter name is completely unrelated), so the export now contains all the namespaces:

<sv:node xmlns:sv="http://www.jcp.org/jcr/sv/1.0" xmlns:rep="internal" xmlns:mix="http://www.jcp.org/jcr/mix/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:jcrfn="http://www.jcp.org/jcr/xpath-functions/1.0" xmlns:fn_old="http://www.w3.org/2004/10/xpath-functions" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:mgnl="http://www.magnolia.info/jcr/mgnl" xmlns:nt="http://www.jcp.org/jcr/nt/1.0" sv:name="travel">

An explanation of the huge number of namespaces in exports might be found in MAGNOLIA-2756, although the solution there does not seem correct, as such filtering should not be hardcoded by prefixes only.
We should validate whether the URIs are still valid and whether they should still be filtered out.
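
To illustrate filtering that is keyed on namespace URIs rather than prefixes, here is a minimal sketch of a SAX filter. It is not the Magnolia implementation: the class name UriNamespaceFilterSketch, the helper forwardedPrefixes and the allow-list (the sv and xsi URIs taken from the export above) are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: drops namespace mappings whose URI is not allow-listed.
class UriNamespaceFilterSketch extends XMLFilterImpl {

    // Assumed allow-list, keyed by URI instead of prefix.
    private static final Set<String> KEPT_URIS = new HashSet<>(Arrays.asList(
            "http://www.jcp.org/jcr/sv/1.0",
            "http://www.w3.org/2001/XMLSchema-instance"));

    // Per prefix, remembers whether each open mapping was forwarded,
    // so that endPrefixMapping is only forwarded for mappings that were started.
    private final Map<String, Deque<Boolean>> forwarded = new HashMap<>();

    UriNamespaceFilterSketch(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        boolean keep = KEPT_URIS.contains(uri);
        forwarded.computeIfAbsent(prefix, p -> new ArrayDeque<>()).push(keep);
        if (keep) {
            super.startPrefixMapping(prefix, uri);
        }
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        Deque<Boolean> stack = forwarded.get(prefix);
        if (stack != null && !stack.isEmpty() && stack.pop()) {
            super.endPrefixMapping(prefix);
        }
    }

    // Parses the XML through the filter and returns the prefixes that
    // still reach the downstream handler.
    static Set<String> forwardedPrefixes(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        UriNamespaceFilterSketch filter = new UriNamespaceFilterSketch(reader);
        Set<String> seen = new TreeSet<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                seen.add(prefix);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sv:node xmlns:sv=\"http://www.jcp.org/jcr/sv/1.0\""
                + " xmlns:jcr=\"http://www.jcp.org/jcr/1.0\""
                + " xmlns:mgnl=\"http://www.magnolia.info/jcr/mgnl\""
                + " sv:name=\"travel\"/>";
        System.out.println(forwardedPrefixes(xml));
    }
}
```

In this sketch only the sv mapping is forwarded for the sample element; the jcr and mgnl declarations are dropped because their URIs are not on the assumed allow-list. Whether the dropped declarations also disappear from the serialized output depends on the serializer placed downstream, which is not covered here.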

As a side note, the new YAML bootstrap doesn't export any namespaces, so there's no such problem there.



 Comments   
Comment by Mikaël Geljić [ 06/Jul/17 ]

All of these are valid URIs, but not necessarily existing URLs. Some of them never existed; they don't have to. They can be URNs as well. From https://www.w3.org/TR/REC-xml-names/#ns-decl:

The attribute's normalized value MUST be either a URI reference — the namespace name identifying the namespace — or an empty string. The namespace name, to serve its intended purpose, SHOULD have the characteristics of uniqueness and persistence. It is not a goal that it be directly usable for retrieval of a schema (if any exists).

My understanding is that the old export command was filtering namespaces "incidentally" by using the MetadataUuidFilter (which was not this class' main job). I would still go back to the same approach for the new command, because for system-view, only sv & xsi make sense (doc-view would need to figure out which namespaces are actually used throughout the document). See description and comments on MAGNOLIA-2960, they're interesting in that regard.
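
For the doc-view case, "figuring out which namespaces are actually used" could look roughly like the sketch below. It is only an illustration under assumptions: the class name UsedPrefixCollectorSketch and the helper usedPrefixes are hypothetical, and the sketch only inspects element and attribute names, not prefixed names that appear inside attribute values or text.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: collects the prefixes used by element and attribute names.
class UsedPrefixCollectorSketch extends DefaultHandler {

    private final Set<String> used = new TreeSet<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        addPrefixOf(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            addPrefixOf(atts.getQName(i));
        }
    }

    // Records the prefix of a qualified name; namespace declarations
    // themselves (xmlns:...) do not count as usage.
    private void addPrefixOf(String qName) {
        int colon = qName.indexOf(':');
        if (colon > 0) {
            String prefix = qName.substring(0, colon);
            if (!"xmlns".equals(prefix)) {
                used.add(prefix);
            }
        }
    }

    // Parses the XML and returns the prefixes that occur in names.
    static Set<String> usedPrefixes(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        UsedPrefixCollectorSketch collector = new UsedPrefixCollectorSketch();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), collector);
        return collector.used;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<travel xmlns:jcr=\"http://www.jcp.org/jcr/1.0\""
                + " xmlns:mgnl=\"http://www.magnolia.info/jcr/mgnl\""
                + " xmlns:nt=\"http://www.jcp.org/jcr/nt/1.0\""
                + " jcr:primaryType=\"mgnl:page\"/>";
        System.out.println(usedPrefixes(xml));
    }
}
```

For the sample element, only jcr counts as used, because mgnl occurs in an attribute value and nt is declared but never referenced. A real doc-view filter would have to decide how to treat prefixes that appear only in values.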

For practical reasons, in my DocumentView command, I was doing the filtering within a jdom XMLOutputter; not sure where it fits best within the new command.

I would rephrase-repurpose this ticket to port the old 2960 fix to the new command.

Comment by Jan Haderka [ 22/Aug/17 ]

We filtered all namespaces out explicitly, as in "on purpose", exactly because they were causing issues on import in instances with conflicting namespace registrations, in installations where such namespaces were not allowed, and in instances where customers for whatever reason enabled validation of namespaces. Lastly, updating bootstrap files generated unnecessary diffs when new namespaces appeared, just because they were registered in the instance from which the bootstrap file came. To keep it simple, I'd opt for reintroducing the strip-all "policy", as it worked quite well without any side effects as long as we adhered to it.

Comment by Mikaël Geljić [ 22/Aug/17 ]

Yes, all namespaces but sv and xsi were filtered explicitly; I meant "incidentally" because the MetadataUuidFilter was the one doing it, even though that was not its primary purpose (rather just because it was already there). The new command legitimately stopped using it, so the ns-filtering fell off the wagon as well.

Got a PR incoming.

Generated at Mon Feb 12 04:20:00 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.