
Concepts for the DEV ticket - https://jira.magnolia-cms.com/browse/DEV-916

Requirements

  • Concept for migrating content to match the new content type definition. Such as:
    • Delete content type
    • Delete field
    • Change content type name
    • Change field name
    • Change field type
  • Could include tools and alerts for content that has become outdated with the active content type definitions.
  • Concept for how to push the migration to other servers.
  • Recommendations for how to store the JCR content of a content type (workspace names, field names) based on the concepts.

Ideas generation

Agreements

  • We should not run any migration that modifies user data implicitly in the background without user input, because:
    • Migration runs on production data and requires a full understanding of the side effects of each change
    • Production data sets are large
    • During development, migrating data in the background brings no benefit and only degrades performance, because definitions change frequently

Content type events require migration awareness

  • Information
    • Datasource could be JCR or RESTful or something else
    • May or may not have model definition
    • Model definition may or may not have a nodeType. It has a nodeType only if it is a JCR model definition
  • Content type definition file
    • Renaming
      • We're using content type name as the identifier
      • Renaming requires manually updating references, including:
        • CT definition files which inherit from that CT
        • App descriptors which are declared for the CT using the `!withType` syntax
        • App name may or may not be the same as the CT identifier. If it is, then we need to modify the `appName` on referencing link fields in the system.
    • Deleting
      • Generated app will be deleted when the changes take effect
      • Related data
        • Should not delete data automatically
        • Should notify the user about abandoned data and suggest actions
  • JCR Datasource
    • Namespace - MGNLCT-36
    • Workspace - MGNLCT-35
      • Data migration - MGNLCT-42
      • A content type can use a shared workspace with existing systems or with other content types, so deleting a workspace is not logically possible and it's not supported by Jackrabbit either.
      • Solution:
        • Search data based on model definition - node type
        • Copy old data to the new workspace
        • Delete old data from the old one - optional
    • Node type definition file - MGNLCT-37
  • JCR Model/SubModel
    • Node type changed on model or subModel
      • Migration Task
        • A custom task that searches for data matching the model/subModel and changes the node type
        • Requires the user to add the task to a ModuleVersionHandler
        • See ChangeNodeTypeTask
        • Note: required properties on the new node type may cause the update to fail
      • Advices app
        • Actions: Show affected items, Resolve/Replace node type, Mark as resolved, Permanently Delete (all or per item).
    • Deleting
      • A custom task that searches for data matching the model/subModel and reports it to the Advices app
      • Actions: Show affected items, Mark as resolved, Permanently Delete (all or per item).
    • Properties
      • Renaming
        • Property can be stored as a JCR property or a Node
        • See RenameNodeTask, RenamePropertyTask
      • Deleting
        • See RemoveNodeTask, RemovePropertyTask
      • Change property type
        • Data can't be migrated automatically, because a property can be stored either as a property or as a node.
        • Requires a user-provided custom migration task; otherwise the data will be deleted.
      • Adding new required property
        • Insert empty data?
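
The workspace migration steps listed under "JCR Datasource" (search by node type, copy, optionally delete) can be sketched as follows. This is a minimal in-memory illustration and not Magnolia or JCR API: workspaces are modeled as plain dicts from node path to node, `migrate_workspace` is a hypothetical helper, and the sample paths and node types are made up for the example.

```python
from copy import deepcopy


def migrate_workspace(old_ws, new_ws, node_type, delete_old=False):
    """Copy every node of `node_type` from old_ws to new_ws.

    Workspaces are modeled as dicts: path -> {"type": ..., "props": {...}}.
    This is an assumption for illustration; a real implementation would
    query the JCR workspace by node type instead.
    """
    # Step 1: search data based on the model definition's node type.
    matches = [path for path, node in old_ws.items() if node["type"] == node_type]

    for path in matches:
        # Step 2: copy old data to the new workspace.
        new_ws[path] = deepcopy(old_ws[path])
        # Step 3 (optional): delete old data from the old workspace.
        if delete_old:
            del old_ws[path]
    return sorted(matches)


# Hypothetical sample data: a workspace shared by two node types.
old = {
    "/contacts/jdoe": {"type": "mgnl:contact", "props": {"name": "J. Doe"}},
    "/pages/home": {"type": "mgnl:page", "props": {"title": "Home"}},
}
new = {}
moved = migrate_workspace(old, new, "mgnl:contact", delete_old=True)
print(moved)  # ['/contacts/jdoe']
```

Because only nodes of the given type are touched, data of other content types in a shared workspace stays where it is, which matches the point that the workspace itself is never deleted.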

Concerns

  • How do we detect changes in the runtime environment, and across a server restart or shutdown?
    • We may be able to detect changes at runtime by comparing the last valid CT definition with the new definition. But when the server restarts or shuts down, we cannot detect changes unless the old CT definition is stored.
    • Renaming can't be detected
    • A possible solution is to require the user to declare changes explicitly by taking the appropriate action for each change. This should not be a problem with the existing mechanism.
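
To illustrate why comparing definitions cannot detect a rename, here is a minimal sketch. It assumes a model definition can be reduced to a mapping from property name to type; `diff_properties` and the sample definitions are hypothetical and not part of Magnolia.

```python
def diff_properties(old_def, new_def):
    """Compare two model definitions given as {property_name: type}."""
    old_names, new_names = set(old_def), set(new_def)
    return {
        "removed": sorted(old_names - new_names),
        "added": sorted(new_names - old_names),
        "type_changed": sorted(
            name for name in old_names & new_names if old_def[name] != new_def[name]
        ),
    }


# Hypothetical definitions: 'foo' was renamed to 'bar', 'age' changed type.
last_valid = {"foo": "String", "age": "String"}
current = {"bar": "String", "age": "Long"}
print(diff_properties(last_valid, current))
# {'removed': ['foo'], 'added': ['bar'], 'type_changed': ['age']}
```

The rename of `foo` to `bar` shows up only as one removal plus one addition, so the comparison alone cannot tell it apart from deleting one property and adding an unrelated one. This is why an explicit declaration by the user is needed.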

Content Type Advices app

Goal

  • Show out-of-sync status between definition and existing data
  • Suggest appropriate actions to resolve migration issue
  • Notice changes on data
  • Consistent state at runtime and across shutdown/restart of the instance
  • Allow users to publish and revert changed data.
    • Without this feature, the Advices app is useless.
    • The app is mainly used by developers or people who understand the side effects of making changes well.
    • Changes to data should be publishable to other servers. Otherwise, migration tasks must be run when upgrading the system to a newer version.
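
The first goal, showing the out-of-sync status between a definition and existing data, could be approached along these lines. This is only a sketch under assumptions: nodes are modeled as dicts of properties keyed by node path, the definition is reduced to a mapping from property name to a "required" flag, and `find_out_of_sync` is a hypothetical name.

```python
def find_out_of_sync(nodes, definition):
    """Return {node_path: [problem, ...]} for nodes that do not match the definition.

    nodes:      {path: {property_name: value}}
    definition: {property_name: is_required}
    """
    report = {}
    for path, props in sorted(nodes.items()):
        problems = []
        # Properties stored on the node that the definition no longer knows.
        for name in sorted(props):
            if name not in definition:
                problems.append(f"unknown property '{name}'")
        # Required properties that the node does not have (yet).
        for name, required in sorted(definition.items()):
            if required and name not in props:
                problems.append(f"missing required property '{name}'")
        if problems:
            report[path] = problems
    return report


# Hypothetical sample: the definition now expects 'bar' instead of 'foo'.
definition = {"bar": True, "email": False}
nodes = {
    "/contacts/a": {"foo": "x"},
    "/contacts/b": {"bar": "y"},
}
report = find_out_of_sync(nodes, definition)
for path, problems in report.items():
    print(path, problems)
```

Keying the report by node path means each reported problem can be listed and opened per item, and a suggested action can be attached to each entry.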

Option 1: Integrate with Definition app

  • Recognize Content Type problems
  • Custom actions
  • Benefits
    • A unique app for every definition
  • Drawbacks
    • Work in runtime
  • Conclusion:
    • The Definition app is not a suitable integration point because migration actions do not belong to the definition itself.

Option 2: Pulse integration

  • Create a task with appropriate actions and push to Pulse app.
  • Allows user to decide
  • Preview changes with sample data

Option 3: New dedicated content app

Pushing migration data to other servers

Module Version Handler in YAML format

Instead of putting Java code in a ModuleVersionHandler, provide common migration tasks in YAML to support hot migration in the runtime environment.

When a new CT definition is published from the author instance to public instances, the migration process is performed.

Benefits

  • No need to track updates to the CT definition. Users declare changes explicitly.
  • Easy to configure inside a content type definition file or a separate VersionHandler YAML file.
  • When a new CT definition file is published, migration tasks run in the runtime environment.
  • Users can still write more complex migration tasks in Java code and use them in the YAML file if needed.

Drawback

  • Common tasks need to be reimplemented to accept configurable parameters.
  • The CT version of the definition file and of the created data needs to be tracked to distinguish versions and perform migration.

Example:

  • myct_v1.yaml
model:
  - name: 'foo'
  • Data
/fooNode
  + mgnl:ctModelVersion = 1
  • myct_v2.yaml
model:
  - name: 'bar'
updates:
  - class: ...RenamePropertyTask
    fromPropertyName: foo
    toPropertyName: bar
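
How declared updates and the stored model version could work together is sketched below. This is an illustration under assumptions, not the Magnolia implementation: nodes are plain dicts carrying `mgnl:ctModelVersion` as in the example above, updates are assumed to be grouped by the definition version that introduced them, and `TASKS`, `rename_property` and `migrate_node` are hypothetical names.

```python
def rename_property(node, from_name, to_name):
    """Move a value from one property name to another, if it exists."""
    if from_name in node:
        node[to_name] = node.pop(from_name)


# Hypothetical registry mapping a task name from the YAML to a function.
TASKS = {"RenamePropertyTask": rename_property}


def migrate_node(node, updates_by_version, target_version):
    """Apply all updates between the node's stored version and the target."""
    version = node.get("mgnl:ctModelVersion", 1)
    while version < target_version:
        version += 1
        for update in updates_by_version.get(version, []):
            task = TASKS[update["task"]]
            task(node, update["fromPropertyName"], update["toPropertyName"])
    # Record the version so the same updates are not applied again.
    node["mgnl:ctModelVersion"] = version
    return node


# Updates declared by version 2 of the definition (mirrors myct_v2.yaml).
updates = {
    2: [{"task": "RenamePropertyTask", "fromPropertyName": "foo", "toPropertyName": "bar"}],
}
node = {"foo": "hello", "mgnl:ctModelVersion": 1}
migrate_node(node, updates, target_version=2)
print(node["bar"], node["mgnl:ctModelVersion"])  # hello 2
```

Because the version is stored on the data, running the migration a second time is a no-op, and data created under an older definition can be told apart from data that is already up to date.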


Campaign Publisher integration

Changes on content type definition may need to be published to public instances and related data need to be re-structured as well. However, in order to speed up the process, migration doesn't need to run in every instance (assuming that public instances are in sync with the author instance).

A solution for this scenario is integration with the Campaign Publisher module - https://documentation.magnolia-cms.com/display/DOCS57/Campaign+Publisher+module

  • Every migration should automatically create a campaign to contain changes.
  • Content Type related migration tasks should add changes to the created campaign.

Benefits:

  • Users are able to
    • see what the changes are and their reasons
    • decide to publish after finishing the integration test

Drawback

  • Campaign Publisher is an EE module.
  • Requires publishing large amounts of data between servers

Conclusion

  • Running migration tasks is still an effective way to manipulate data during the migration process.

Actions

  • PoC for renaming a simple property - DEV-1048
    • Acceptance criteria
      • Configurable renaming action for a specified property in the CT definition file
      • Detect out-of-sync data => a simple app that lists all problems
        • Should display affected data by node path.
      • Suggested actions: executing a task (configurable by class)
      • Preview for one of the reported problems, e.g. open the contacts detail app
      • Allow scheduling the publishing of changed data: nice to have
  • Research extraction of Version Handler in the YAML syntax - DEV-1049
    • Tasks should be runnable without InstallContext
    • Goal
      • Rough estimation of refactoring
      • Keep minimal changes on any existing tasks
  • PoC for maintaining author experience when working with content from different models (versions) - DEV-1050
    • Acceptance criteria
      • Author doesn't perceive any data loss


5 Comments

  1. Rails content migrations are interesting: https://edgeguides.rubyonrails.org/active_record_migrations.html

    You create a separate file for each migration. The filenames have timestamps. At any time you can run a command-line tool "rails db:migrate", then it runs all the migrations that have not yet been run on that system. (I guess each system stores which migrations it has run.)

    1. Same concept as https://flywaydb.org/


      I'm quite sure I have a concept somewhere about it. It was the initial idea to handle migration cases for content-types in the POC.

    2. Re #Active Record migration: As I understand it, it's the same as our current VersionHandler, which contains migration tasks, except for the syntax.

      Re #flywaydb: Ilgun Ilgun, could you find the PoC again? That sounds promising to me.

  2. Agree there are similarities! Some of the things that looked interesting with ActiveRecord were: Each migration step is in a different file (A script which does not need compilation). You can control the update steps manually. The steps are reversible. The very handy "generators" they have which we could also integrate in our CLI. The ability to "backfill" content when adding a new property. Not that we need to do any of these things! In general I just found it useful to look at how another system handled schema changes and data migration. 

    1. Yep, I see your point now. I've updated the ticket DEV-1048. We'll experiment with a solution that offers those benefits.

      Thank you for pointing that out.