Proposal to simplify updates of Magnolia instances
Module updates and Magnolia updates are currently a little cumbersome; one needs to either rebuild the webapp or move/copy lots of jar files. Making sure superfluous modules are removed and extra modules re-added can be tedious.
Other webapps have shown for a while that it is possible to have plugins downloaded from the internet right into the webapp and deployed automatically. In some cases, they don't even require a restart of the application in the application server.
- ease deployment, maintenance and updates of Magnolia modules
- faster spread of updates
- if modules are easily updated, the core of Magnolia and the bundling don't need to be updated as much, and the modules can really start flying at their own pace.
- (as a positive side-effect, we might be able to slim the main download's sizes)
Features / dependencies
In approximate dependency order, here's a quick breakdown of independent features:
Features that can be implemented separately and before the actual "module downloader" features
I was never exactly satisfied with the "update mechanism" name (since it also provides for install, module lifecycle and so on), and this extra feature only makes it worse.
Splitting some components might shed some light on a better naming, and "module mechanism" might be the generic/overall name.
The Magnolia webapp should be markable as read-only; module jars should be loadable from outside the webapp folder (we don't want to, or can not, write inside /WEB-INF/). Other files (repository, indexes, cache, ...) should also be written outside the webapp. See MAGNOLIA-2170@jira and the linked user-list thread for some background discussion.
This is independent of a library choice, but should be implemented prior to actual module download/upgrade. (we'll need to download the modules, extract bundles, ...)
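As a rough illustration of loading module jars from a folder outside the webapp, here is a minimal sketch using the JDK's `URLClassLoader`. The class name `ExternalModuleLoader` and the idea of a single flat modules directory are assumptions made for this example; nothing like this exists in Magnolia today.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Magnolia: exposes jars stored outside /WEB-INF/. */
final class ExternalModuleLoader {

    private ExternalModuleLoader() {
    }

    /** Lists the *.jar files directly inside the given directory, in a stable order. */
    static URL[] jarUrls(File modulesDir) throws MalformedURLException {
        File[] jars = modulesDir.listFiles((dir, name) -> name.endsWith(".jar"));
        List<URL> urls = new ArrayList<>();
        if (jars != null) { // null when the directory is missing or unreadable
            Arrays.sort(jars);
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Creates a classloader over those jars that delegates to the webapp's loader first. */
    static URLClassLoader newModuleClassLoader(File modulesDir, ClassLoader parent)
            throws MalformedURLException {
        return new URLClassLoader(jarUrls(modulesDir), parent);
    }
}
```

The webapp folder itself is never written to here; only the external directory needs to be writable by the downloader.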
Magnolia-core will probably have to remain in the webapp folder, so we should extract more out of core so that the extracted parts can also benefit from easy updates. This will also facilitate a potential future migration to OSGi.
Other than obvious candidate components such as i18n, links, audit, Jackrabbit support, etc, we could also envision that core or one of its extracted modules would register the "main" workspaces (website, users and so on) instead of having them hardcoded in repositories.xml. See MAGNOLIA-1666@jira.
Most of our modules are actually fairly easily uninstallable (see the documentation; it's mostly removing the jar and removing a few specific nodes), so this should be feasible. Provide backup of removed nodes for safety. Uninstalls could be "automatic" when a jar removal is detected (i.e when restarting the server) or done through some form of gui. A potential implementation would imply serializing the tasks-for-uninstalling at install time (i.e. storing the result of ModuleVersionHandler.getUninstallTasks()). See MAGNOLIA-769@jira.
We currently store module-related information in the configuration workspace, in various places: current module version under /module/xyz/version, backup of some config nodes under /server/install/backup, information about extracted files under /server/install/mgnl-files. We could instead use a specific workspace.
Splitting of module "configuration" class and ModuleLifecycle
These are IMO two separate concerns: the lifecycle (starting 3rd party components, preparing resources, ...) and holding configuration. Of course, the lifecycle will need the configuration, so the configuration will be passed or made available to the lifecycle.
Missing dependencies should be reported in the ui
Better feedback to user during installation process
Provide module status page
This is essentially provided for free if we have a module downloader/installer feature.
I am currently not exactly sure, but I have recently seen "strange" things happen during installation. See MAGNOLIA-1663@jira.
Support for milestone and other -xyz versions
It is sometimes necessary to provide updates for such versions; ideally, we should discourage it, and promote releases instead. This was needed for instance with the insm project; but it might also be fulfilled by providing a scripting interface (shell module) that has access to the module update tasks api.
The current update UI is totally open: provide a better/configurable page for public instances while updates are being performed. See MAGNOLIA-1629@jira.
For download and install, we'll want to display decent module descriptions etc, maybe even screenshots. This implies we might need to add some features to the module descriptor.
Once the above has been implemented or decided upon:
Modules need to be bundled with 3rd party libs (e.g Quartz with the Scheduler module). A current limitation of our module system is that we don't provide any "check": e.g the Scheduler module bluntly fails to start after installation if its libraries are missing, thus blocking the complete system.
We maybe need to differentiate between inter-module dependencies and dependencies towards 3rd party/external libraries.
Our current module system already provides for the former.
Our current build provides "bundles", zip files including 3rd party libraries.
We can't currently detect 3rd party dependencies (our module descriptors don't mention them), but if we download module "bundles", this wouldn't be an issue at first; we would however need to take into account possible conflicts.
Modules could also do self-checks, maybe by simply checking for a given class.
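A self-check of that kind could be as small as the sketch below, which only asks the classloader whether a given class is visible. `ModuleSelfCheck` is a made-up name for this example; `org.quartz.Scheduler` is the Quartz interface the Scheduler module would presumably look for.

```java
/** Hypothetical helper: reports whether a class a module relies on can be loaded. */
final class ModuleSelfCheck {

    private ModuleSelfCheck() {
    }

    /**
     * Returns true if the named class is visible to the given classloader.
     * The class is located but not initialized, so no static initializers run.
     */
    static boolean isClassAvailable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself is absent, or something it links against is.
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ModuleSelfCheck.class.getClassLoader();
        System.out.println("Quartz present: " + isClassAvailable("org.quartz.Scheduler", loader));
    }
}
```

A module could run such a check before starting and report a missing library in the UI instead of failing and blocking the system.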
Additionally, if the Maven team do the same work for other Maven components than what they've done for Mercury, there might be a simple and small API to read a module's pom, and thus derive the external dependencies from. (I was actually able to do just that using
Mercury provides for this.
OSGi's OBR as well.
Where and how we deploy and publish modules.
Proxies and multiple repositories
The whole mechanism will need configuration for
- proxy usage (i.e so the Magnolia instance can access the Internet)
- location of repositories, along with username and password. Are we satisfied with no encryption of this information? (If so, we only need the url, e.g. http://username:password@example.com/restricted)
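If credentials do end up embedded in the repository url, splitting them off again is straightforward with `java.net.URI`. The sketch below is only an assumption about how such a configuration entry could be read; `RepositoryLocation` and `httpProxy` are hypothetical names, and the url in `main` is a placeholder.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical holder for one repository entry of the downloader configuration. */
final class RepositoryLocation {
    final URI url;          // repository url with any credentials removed
    final String username;  // null when the configured url carried none
    final String password;  // null when the configured url carried none

    private RepositoryLocation(URI url, String username, String password) {
        this.url = url;
        this.username = username;
        this.password = password;
    }

    /** Splits "scheme://user:pass@host/path" into a clean url plus credentials. */
    static RepositoryLocation parse(String configuredUrl) {
        URI raw = URI.create(configuredUrl);
        String userInfo = raw.getUserInfo();
        String user = null;
        String pass = null;
        if (userInfo != null) {
            int colon = userInfo.indexOf(':');
            user = colon < 0 ? userInfo : userInfo.substring(0, colon);
            pass = colon < 0 ? null : userInfo.substring(colon + 1);
        }
        try {
            URI clean = new URI(raw.getScheme(), null, raw.getHost(), raw.getPort(),
                    raw.getPath(), raw.getQuery(), raw.getFragment());
            return new RepositoryLocation(clean, user, pass);
        } catch (URISyntaxException e) {
            // The message deliberately omits the url so the password is not logged.
            throw new IllegalArgumentException("Not a usable repository url", e);
        }
    }

    /** Builds an HTTP proxy setting without triggering a DNS lookup. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        RepositoryLocation repo = parse("http://alice:s3cret@repo.example.com/restricted");
        System.out.println(repo.url + " as user " + repo.username);
    }
}
```

Keeping the clean url separate from the credentials would also make it easier to move to encrypted storage later, should plain text turn out not to be acceptable.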
Splitting our repositories
If we're going to use our Maven repositories (which seems quite likely), we'll need to move away all the projects/3rd party stuff and closely monitor them
No need to restart the app/appserver to deploy a module update. ("updates are ready: [apply now, silently] or [click here for switching to the update UI when ready]"; alternatively, we could maybe make it so that the system still works while updates are being applied, and switches "atomically" at the end? - or at least provide a configurable temp page for public instances)
We can already restart a module. Missing points:
- restart this module's dependencies in the appropriate order. (restarting all is probably good enough)
- load modules in an isolated classloader, so hopefully, we can swap to the new jars once they're downloaded. We have some experience w/ classloaders, see magnolia-tools ! (+include classloader debug info in the diagnostics module !)
- what happens to the public site while updating (is Magnolia accessible, do we have a temp page)
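For the restart-order point, a depth-first topological sort over the declared inter-module dependencies is probably enough. The following is a sketch under the assumption that dependencies are available as a simple name-to-names map; `RestartOrder` and the module names in `main` are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: orders modules so each one starts after its dependencies. */
final class RestartOrder {

    private RestartOrder() {
    }

    /**
     * @param dependencies maps a module name to the names of the modules it depends on
     * @return module names ordered so that every module appears after its dependencies
     * @throws IllegalStateException if the dependencies contain a cycle
     */
    static List<String> startOrder(Map<String, List<String>> dependencies) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // Iterate in sorted order so the result is deterministic.
        for (String module : new TreeSet<>(dependencies.keySet())) {
            visit(module, dependencies, done, inProgress, ordered);
        }
        return ordered;
    }

    private static void visit(String module, Map<String, List<String>> dependencies,
            Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(module)) {
            return;
        }
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Dependency cycle involving " + module);
        }
        for (String dependency : dependencies.getOrDefault(module, Collections.emptyList())) {
            visit(dependency, dependencies, done, inProgress, ordered);
        }
        inProgress.remove(module);
        done.add(module);
        ordered.add(module); // appended only once all of its dependencies are in the list
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("core", Collections.emptyList());
        deps.put("scheduler", Arrays.asList("core"));
        deps.put("workflow", Arrays.asList("scheduler", "core"));
        System.out.println(startOrder(deps));
    }
}
```

Stopping would walk the same list in reverse. Restarting everything, as suggested above, simply means applying this order to the full module set.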
Our current dependency system can state a minimal required version (i.e module X depends on Y 1.3 and up and on core 4.1 and up). Somehow, we'll also have to be able to say that module X 1.2 will not work anymore with Y 1.5, which is not something that we can determine when X is released. So this is probably something that needs to happen on the server-side of things.
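One way to picture this: the module descriptor keeps supplying the lower bound, and the server supplies an optional "broken from" version that is only learned after release. The sketch below assumes plain dotted numeric versions (no -xyz qualifiers) and uses invented names (`Compatibility`, `brokenFrom`); it is not an existing Magnolia API.

```java
/** Hypothetical check combining descriptor data with server-side knowledge. */
final class Compatibility {

    private Compatibility() {
    }

    /** Compares dotted numeric versions; missing segments count as 0, so 1.3 equals 1.3.0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /**
     * @param installed  version of the dependency (e.g. Y) present in the instance
     * @param minimum    lower bound declared in the depending module's descriptor
     * @param brokenFrom first dependency version known to break the module, as
     *                   published by the server after release; null if none is known
     */
    static boolean isCompatible(String installed, String minimum, String brokenFrom) {
        if (compareVersions(installed, minimum) < 0) {
            return false;
        }
        return brokenFrom == null || compareVersions(installed, brokenFrom) < 0;
    }

    public static void main(String[] args) {
        // X 1.2 declared "Y 1.3 and up"; the server later learned that Y 1.5 breaks it.
        System.out.println(isCompatible("1.4", "1.3", "1.5"));
        System.out.println(isCompatible("1.5", "1.3", "1.5"));
    }
}
```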
Hudson has (or had) a feature that would let one restart the app.server at the click of a button. Perhaps this is a Winstone-only feature. Would like to see how they did this ...
Existing implementations, libraries, ...
Examples of existing implementations of such features
- Struts 2?
- Evaluation of OSGi : there are a whole bunch of good things that OSGi would bring us, but right now, there's also a lack of ready-made frameworks/tools/libraries for webapps. We'd have to change way too many things in Magnolia to get it working. See OSGi notes.
Main principles or goal remain the same, but Mercury is now called Aether (complete rewrite afaict, so examples below are probably useless)
- We could re-use (part of) the Maven infrastructure. This might be particularly interesting since our builds already deploy to Maven repositories. The current effort is going into "Mercury", which as far as I understand, will be a foundation block for Maven 3:
- http://repo2.maven.org/maven2/org/apache/maven/mercury/ (1.0-alpha-5 is available at the time of writing)