Purpose

The Problem

The fact that Magnolia stores MetaData on a separate subnode creates additional effort in various places:

  • makes complex queries (JCR_SQL2/JQOM) more complicated (if not outright impossible): we have to use joins to access MetaData (e.g. for sorting by values in metadata)
  • doubles the number of nodes in the repository
  • duplicates the properties jcr:created and jcr:createdBy
  • forces us to treat MetaData separately in Java
  • makes exported XML harder to read
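
To make the first point concrete, here is a minimal sketch of the two JCR-SQL2 queries involved when sorting by modification date. The node type name [mgnl:metaData] for the MetaData subnode and the class and method names are assumptions for illustration only; the property names come from the Design table on this page.

```java
// Hypothetical illustration only. [mgnl:metaData] as the type of the MetaData
// subnode is an assumption; the property names are taken from this page.
final class MetaDataQueryExample {

    private MetaDataQueryExample() {}

    /** Today: sorting by modification date needs a join onto the MetaData subnode. */
    static String sortByLastModifiedWithSubnode() {
        return "SELECT page.* FROM [mgnl:content] AS page "
             + "INNER JOIN [mgnl:metaData] AS meta ON ISCHILDNODE(meta, page) "
             + "ORDER BY meta.[mgnl:lastmodified] DESC";
    }

    /** With the proposal: the property sits on the node itself, so no join is needed. */
    static String sortByLastModifiedWithMixin() {
        return "SELECT * FROM [mgnl:content] AS page "
             + "ORDER BY page.[mgnl:lastModified] DESC";
    }

    public static void main(String[] args) {
        System.out.println(sortByLastModifiedWithSubnode());
        System.out.println(sortByLastModifiedWithMixin());
    }
}
```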

Goals

Simplify accessing MetaData information.

Proposal

Design

The MetaData subnode hosts a range of properties - this chart details what will happen with them:

old property | new property (defined by) | remark
MetaData/mgnl:title | <none> | not in use anymore, will not be supported
MetaData/mgnl:creationdate | mgnl:created (mgnl:created) | can't use jcr:created -> it's AUTOCREATED and PROTECTED
<none> | mgnl:createdBy (mgnl:created) | new property keeping who originally created something (as with mgnl:created it will be set by us)
MetaData/mgnl:templatetype | <none> | not in use anymore, will not be supported
MetaData/mgnl:lastaction | mgnl:lastActivated (mgnl:activatable) |
MetaData/mgnl:activatorid | mgnl:lastActivatedBy (mgnl:activatable) |
MetaData/mgnl:activated | mgnl:activationStatus (mgnl:activatable) | boolean - jcr is not powerful enough to have it auto-generated
MetaData/mgnl:template | mgnl:template (mgnl:renderable) | we'll create an additional mixin: mgnl:renderable
MetaData/mgnl:authorid | mgnl:lastModifiedBy (mgnl:lastModified) |
MetaData/mgnl:lastmodified | mgnl:lastModified (mgnl:lastModified) |
MetaData/mgnl:comment | mgnl:comment (mgnl:versionable) |

Even though JCR defines a mixin, mix:lastModified, we use our own mixin to have greater control: the JCR specification (section 3.7.11.8) does not clearly define when these properties should be updated.

So the MetaData subnode will be replaced by the following:

[mgnl:lastModified]
  MIXIN

  • mgnl:lastModified (DATE)
  • mgnl:lastModifiedBy (STRING)

[mgnl:activatable]
  MIXIN

  • mgnl:lastActivated (DATE)
  • mgnl:lastActivatedBy (STRING)
  • mgnl:activationStatus (STRING)

[mgnl:renderable]
  MIXIN

  • mgnl:template (STRING)

[mgnl:created]
  MIXIN

  • mgnl:created (DATE)
  • mgnl:createdBy (STRING)

[mgnl:versionable]
  MIXIN

  • mgnl:comment (STRING)

Implementation details

agreed

  • single class NodeTypes hosting our mixins as inner classes
    • Mixins define the constants for the names of the properties defined by the corresponding mixin
    • Mixins define convenience methods to set properties
      • Property setting methods on Mixins will check for appropriate node type - no checks on legacy MetaData + MetaDataUtil
  • constants for jcr-1.0 defined NodeTypes or Properties can be taken from JcrConstants (Jackrabbit commons)
    • if we realize we'd need to keep jcr-2.0 defined stuff, we'd add an interface for it
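
A minimal sketch of what such a NodeTypes class could look like, shown for the mgnl:lastModified mixin only. SimpleNode is a hypothetical stand-in for javax.jcr.Node so that the sketch runs without a repository, and the method name update as well as the choice to throw an exception on a failed type check are assumptions, not agreed API.

```java
import java.util.Calendar;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for javax.jcr.Node, used only to keep the sketch self-contained.
final class SimpleNode {
    final Set<String> mixins = new HashSet<>();
    final Map<String, Object> properties = new HashMap<>();
}

final class NodeTypes {

    private NodeTypes() {}

    /** Mirrors the mgnl:lastModified mixin from the Design section. */
    static final class LastModified {
        static final String NAME = "mgnl:lastModified";
        static final String LAST_MODIFIED = "mgnl:lastModified";
        static final String LAST_MODIFIED_BY = "mgnl:lastModifiedBy";

        private LastModified() {}

        /** Convenience setter: checks that the node carries the mixin, then sets both properties. */
        static void update(SimpleNode node, String userName, Calendar when) {
            if (!node.mixins.contains(NAME)) {
                // Assumption: reject the call; the page only says the type is checked.
                throw new IllegalArgumentException("Node is not of type " + NAME);
            }
            node.properties.put(LAST_MODIFIED, when);
            node.properties.put(LAST_MODIFIED_BY, userName);
        }
    }
}
```

The other mixins (mgnl:activatable, mgnl:renderable, mgnl:created, mgnl:versionable) would follow the same pattern as further inner classes.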

to be discussed

  • MetaData will be adapted to be backwards compatible (use new property names and wrap workingNode - no longer a subnode)
  • Property-files for PropertyImportExport (e.g. for tests) have to be adapted -> provide script to detect?

Migration

This step would require a very well prepared migration. Hence we'd have to consider:

  • migration effort
    • update custom node types
    • add mixin
    • remove subnode
  • update utils and add filter to filter the mgnl:* properties
  • search in templates (and other places)
    • will have to be re-written
    • have a query-wrapper which warns
  • STK
    • we have to update all models and utils which used meta data (mainly templates)
  • Communication
    • this is considered a change in the templating
    • we have to communicate that
  • Provide a crawler
    • similar as in integration test
    • check there's no warnings (related to this change)

Content Migration

  • bootstrap files
    • filter which changes the files on the fly
    • put warnings in the log
    • Q: can we provide a tool for converting the bootstrap files and get them cleaned once and for all?
    • A: will try to do so
  • update task for migrating live content, in all workspaces
    • not using recursion (some clients have deep hierarchy)
    • save after each page modification
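
The non-recursive update task could be sketched roughly as follows. TreeNode is a hypothetical in-memory stand-in for a JCR node, and only a subset of the renames from the Design table is listed; a real task would work on javax.jcr.Node, also add the mixins, and save the session where the comment indicates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a JCR node; not Magnolia's API.
final class TreeNode {
    final String name;
    final Map<String, Object> properties = new LinkedHashMap<>();
    final Map<String, TreeNode> children = new LinkedHashMap<>();

    TreeNode(String name) {
        this.name = name;
    }

    TreeNode addChild(String childName) {
        TreeNode child = new TreeNode(childName);
        children.put(childName, child);
        return child;
    }
}

final class MetaDataMigration {

    /** Old MetaData property name -> new property name (subset of the Design table). */
    static final Map<String, String> RENAMES = new LinkedHashMap<>();

    static {
        RENAMES.put("mgnl:creationdate", "mgnl:created");
        RENAMES.put("mgnl:lastmodified", "mgnl:lastModified");
        RENAMES.put("mgnl:authorid", "mgnl:lastModifiedBy");
        RENAMES.put("mgnl:template", "mgnl:template");
    }

    private MetaDataMigration() {}

    /** Walks the tree with an explicit stack (no recursion) and returns the number of migrated nodes. */
    static int migrate(TreeNode root) {
        int migrated = 0;
        Deque<TreeNode> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            TreeNode node = pending.pop();
            // Detach the MetaData subnode, if any, and copy the supported properties up.
            TreeNode metaData = node.children.remove("MetaData");
            if (metaData != null) {
                for (Map.Entry<String, String> rename : RENAMES.entrySet()) {
                    Object value = metaData.properties.get(rename.getKey());
                    if (value != null) {
                        node.properties.put(rename.getValue(), value);
                    }
                }
                migrated++;
                // A real update task would save here, e.g. after each page.
            }
            for (TreeNode child : node.children.values()) {
                pending.push(child);
            }
        }
        return migrated;
    }
}
```

Properties that are not in the rename map (for example mgnl:title) are dropped together with the subnode in this sketch, in line with the "will not be supported" remarks in the Design table.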

Hint: a similar migration was done between Magnolia 3.5.9 and 3.6.

Backwards compatibility in Templating

TemplatingFunctions.metaData(Content/Node)

JspTemplatingFunctions.metaDataProperty(Node/ContentMap, String)

JspTemplatingFunctions.metaDataTemplate(Node/ContentMap)

Templates query for properties using the old names, both with and without the prefixes.

Templates that access the MetaData nodes directly will not work.

Templates that access the MetaData via ContentModel will still work, they will get the old MetaData object.

Q: Do we need to support both styles of property names and map old ones to new names?
A: We'll support it but warn users that there are old-style templates. Consider refactoring the above three methods to use a common code base.
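
The common code base for mapping old names to new ones might look like the sketch below. The class name LegacyMetaDataNames and the method resolve are hypothetical; the name pairs come from the Design table, and rejecting unknown names follows the answer given under API Changes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper; not part of the Magnolia API.
final class LegacyMetaDataNames {

    private static final Logger LOG = Logger.getLogger(LegacyMetaDataNames.class.getName());
    private static final String PREFIX = "mgnl:";

    /** Old property name -> new property name, for properties that were renamed. */
    private static final Map<String, String> RENAMED = new HashMap<>();
    /** All property names that are valid after the change. */
    private static final Set<String> CURRENT = new HashSet<>();

    static {
        RENAMED.put("mgnl:creationdate", "mgnl:created");
        RENAMED.put("mgnl:lastaction", "mgnl:lastActivated");
        RENAMED.put("mgnl:activatorid", "mgnl:lastActivatedBy");
        RENAMED.put("mgnl:activated", "mgnl:activationStatus");
        RENAMED.put("mgnl:authorid", "mgnl:lastModifiedBy");
        RENAMED.put("mgnl:lastmodified", "mgnl:lastModified");
        CURRENT.addAll(RENAMED.values());
        CURRENT.add("mgnl:createdBy");
        CURRENT.add("mgnl:template");
        CURRENT.add("mgnl:comment");
    }

    private LegacyMetaDataNames() {}

    /** Accepts names with or without the mgnl: prefix and returns the new, prefixed name. */
    static String resolve(String requested) {
        String prefixed = requested.startsWith(PREFIX) ? requested : PREFIX + requested;
        String renamed = RENAMED.get(prefixed);
        if (renamed != null) {
            LOG.warning("Old-style MetaData property '" + requested + "' used, please use '" + renamed + "'");
            return renamed;
        }
        if (CURRENT.contains(prefixed)) {
            return prefixed;
        }
        throw new IllegalArgumentException("Unknown MetaData property: " + requested);
    }
}
```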

Lifecycle operations

When a node is deleted, the mgnl:deleted mixin is added; it has the properties mgnl:deletedBy and mgnl:deletedOn.
Q: Should these properties be defined on the mixin in nodetypes xml?
A: Yes

Q: Can we rename mgnl:deletedOn to mgnl:deleted?
A: we'll use consistent naming for our Date properties (lastModified, lastActivated, created, deleted)

When a node is versioned or deleted we add a property mgnl:comment on the MetaData node

Q: Where do we store this now? Should it be part of the mixins?
A: mixin mgnl:versionable -> property mgnl:comment (String) (do a text search to find occurrences)

Activation

info.magnolia.cms.exchange.ActivationUtil will be used to set and query properties related to activation.

Versioning

old property | new property (defined by) | remark
mgnl:comment | mgnl:comment | just added explicitly

[mgnl:versionable]
  MIXIN

  • mgnl:comment (STRING)

Deleted nodes

old property | new property (defined by) | remark
mgnl:deletedOn | mgnl:deleted | consistency - see other date properties
mgnl:deletedBy | mgnl:deletedBy | just added explicitly
mgnl:comment | mgnl:comment | just added explicitly

[mgnl:deleted]
  MIXIN

  • mgnl:deleted (DATE)
  • mgnl:deletedBy (STRING)
  • mgnl:comment (STRING)

Modules to check and update

<TO BE VERIFIED>

  • Observation: be sure to consider Jan's comment: we need a listener excluding mgnl:* (and maybe jcr:lastModified*)
  • STK
  • Form
  • Mail
  • InplaceTemplating
  • Data
  • Scheduler 
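
For the Observation item, a listener-side check along these lines could skip events caused by system properties. This is only a sketch: the class and method names are hypothetical, and it assumes the listener receives the property path of each event (as javax.jcr.observation.Event#getPath provides for property events).

```java
// Hypothetical helper; not part of the Magnolia API.
final class SystemPropertyFilter {

    private SystemPropertyFilter() {}

    /** Returns true if the event path points at a mgnl:* or jcr:lastModified* property. */
    static boolean isSystemProperty(String eventPath) {
        // The property name is the last path segment.
        String name = eventPath.substring(eventPath.lastIndexOf('/') + 1);
        return name.startsWith("mgnl:") || name.startsWith("jcr:lastModified");
    }
}
```

A listener that triggers automatic activation would then ignore every event for which isSystemProperty returns true, which avoids the infinite loop described in the comments below.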

API Changes

  • info.magnolia.jcr.util.NodeUtil#ALL_NODES_EXCEPT_JCR_FILTER deprecated since there are no node types with jcr prefix
  • info.magnolia.jcr.util.NodeUtil#EXCLUDE_META_DATA_FILTER deprecated since the meta data nodes won't be there and there are no node types with jcr prefix
  • info.magnolia.jcr.MgnlNodeTypeNames introduced containing constants for node types
  • info.magnolia.jcr.MgnlPropertyNames introduced containing constants for property names
  • info.magnolia.cms.core.MgnlNodeType deprecated and replaced by MgnlNodeTypeNames and MgnlPropertyNames
  • info.magnolia.cms.core.MetaData deprecated and replaced by NodeUtil and ActivationUtil

Q: What do we do with the possibility of setting arbitrary properties on the MetaData node? The API supports this. Is it enough to allow the supported parameters and fail on unknown names?
A: Will fail if someone passes in "unknown" property name.

Q: What is the state of audit logging? We log modification as we update modification on nodes
A: Needs to be kept

Note: there will be places where we have to exclude jcr properties. Places where this has already been done include MgnlUserManager and MgnlGroupManager.

Note, the MetaData class will remain as long as we keep the content API. 

Won't do's

Note: This section only contains "decisions" that are not yet integrated in their proper section or cannot be integrated because they were dropped.

  • Wouldn't that be a good occasion to switch to CND notation (see the JR website for additional information)?
    • will not be considered right now
  • Should we also introduce a more enhanced/flexible activation status using jcr calculated values (would finally enable things like sort per activationStatus)?
    • judging from a post on the jr mailing list it's not supported out-of-the-box
    • we won't solve it with the current step


29 Comments

  1. mgnl:template should definitely be moved as well. If we want to be thorough, we could consider having that property be part of a different mixin, since it doesn't apply to data or dms nodes, for example.

    Switching to CND notation, while interesting, seems fairly out of scope, IMHO.

    Additionally, I'd like to point out there are more cases where we could/should use namespaced properties, i.e any property which isn't editor/author centric, but needed by the system. Can't find a good example right now, but … they exist.

    Ho, also: Proposal - prefixing or namespacing of certain properties in paragraphs

    1. Yes, I support having an extra mixin if a node is renderable (for mgnl:template) --> mgnl:renderable

      Prefix: is a must, otherwise we will conflict with content properties

  2. Can we require a mixin? I mean can we say that each node of type mgnl:content needs the mixin mgnl:metaData?

    1. What I would like is such a node type definition.

    2. Yes. We already do that with mix:referenceable and others (wink)

  3. I'm not sure I understand why content migration would be that complicated - a visitor should visit all workspaces starting at all roots and copy/move the properties. Now the one thing that might be a little tricky is the update of the node type definitions, and we might have to go through a "temporary" definition where neither the MetaData node nor the md mixin is mandatory, so we can remove and create them respectively. And that might require the usage of repo-specific APIs? (if that's the case, let's do the right thing and move that to the Provider interface - I think you'll find examples of similar changes in the forum module's 2.0 branch, where it's possible that I used Jackrabbit-specific code, which you'll want to avoid)

    1. Migrating the content is definitely doable. The bigger concern was about the changes it causes for queries. Some of the template models would need a change because of that. 

      1. Hmm indeed, that sounds like it'll be a 1-by-1 process...

  4. 2 more things to think about:

    • on activation, we should refuse activating from/to older versions due to content mismatch
    • restore from older version - do we update content on the fly or do we refuse it?
  5. Just realized there is one more issue that could really easily blow the whole instance into a banana pile when we do this change - observation. Quite often we update metadata after content modification and right now if you trigger content modification by observation, you just need to exclude metadata to not make your observer go into an infinite loop. This will NOT work after the change. You would need to exclude properties individually (sad)

    1. You mean if we had a meta data updating observation in place? do we have such a 'thing'?

      Would help to have a concrete use case for the discussion. 

      1. Most often used case for observation (and even in our code - OnAir) is automatic activation. Activation is triggered by page or paragraph/component modification, once page is activated, metadata are updated. To prevent infinite loop, one could right now just not react on MetaData modifications. All this code would be affected and would have to be modified after migration. The least we can do is provide listeners that would provide option to not react on metadata props (or ideally on all props brought in by any mixin)

  6. MetaData/mgnl:activated | mgnl:activationStatus (mgnl:activatable) | boolean - TO BE CHECKED whether it can be jcr-calculated

     

    This flag is wrong. There are a few issues in JIRA about how wrong it is. We need more than a boolean, e.g. for scheduled activations, for things held by workflow, etc.

    1. We thought that this information would go into another mixin which would probably define a more complex structure (with sub node). something like mix:trackable or mix:audit
      The activationStatus property would hold the (green, orange, red) value to make searching and sorting possible. We hope to make that a calculated property on jcr level. 

      1. Unfortunately there is only so much you can achieve with a single property. The most common complaints I hear from customers are:

        • status does not reflect reality for scheduled activations
        • status ends up being wrong for concurrent activation/modification
        • there is no status to indicate workflow in progress (which indeed can be solved by another mixin)
        • on top of that (but has to be solved elsewhere) status is often wrong: 
          • when restoring version
          • when importing content
          • when copying content
  7. MetaData/mgnl:authorid | jcr:lastModifiedBy (mix:lastModified) | use better, self-explanatory, name - be consistent with other field

     

    Did you consider that if we use jcr property instead, we run out of possibility to set correct value for all modifications done in system context?

    1. Yes, and importing might conflict too.

  8. mgnl:created mixin needs both date and user id not just the date

    1. Can u explain why? Right now we don't have a mgnl:creationBy or similar thing: Current mgnl:authorid will be replaced by jcr:lastModifiedBy and then there's the two protected ones introduced by mix:created - they would stay as is.

      1. Thought I already did in the meeting. We can't modify jcr:createdBy and by relying on it we can't set correct user ID when creating content in system context.

  9. yeah, now i remember - there is another task related to this change - we should exclude jcr:createdBy and other props we don't plan to use from indexes. Why? First to keep indexes leaner and faster and second to make sure that we don't get false positives when searching.

  10. @Tobias: Q: Should these properties be defined on the mixin in nodetypes xml? - yes

  11. Q: What is the state of audit logging? We log modification as we update modification on nodes

    We have customers who actively use audit logging to track modifications for legal purposes (half of bank/insurance industry customers)

    Note: there will be places where we have to exclude jcr properties. Places where this has already been done include MgnlUserManager and MgnlGroupManager.

    The more this comes up, the more I'm convinced that we should not reuse jcr: props for anything but rely on ours exclusively

  12. "at" would only be more appropriate if the value was only a time. "on" is more correct for dates.

     

     

    1. yup - that's what we also realized just after I saved the above (wink) Thx!

    1. Good point - but I assume we would just get inspired by these and create our custom ones (as with all the others), no?

    2. Tracked as SCRUM-1879