Implemented in 4.3
There are several obstacles to having full Unicode support in Magnolia.
Also see MAGNOLIA-3009@jira and MAGNOLIA-2929@jira.
Official Documentation Available: This topic is now covered in i18n and l10n > Authoring.
WebDAV was the first "trigger" for all these issues, as we can't have "on-the-fly" node renaming as we have in AdminCentral's tree.
- some clients and operating systems tend to produce names in one normal form (NFC), others in the other (NFD)
- if we allow node names to be created with non-ASCII characters in WebDAV, we must ensure that all of Magnolia (tree, renderers, ...) is able to cope.
- the current pitfalls are:
- tree: see attached patch - we need to URL-encode the node's path, but for the JavaScript to work as-is, we need to avoid encoding the slashes. The tree seems to work with full URL-encoding of the paths (browsing and opening nodes work, for instance) but some features get broken (deletion of nodes, even those with plain ASCII names)
- dialogs: if you URL-encode the path (mgnlPath as a hidden field in the dialog), the browser double-encodes it (which is the expected behavior, so we would need to double-decode on the server as well, which seems quite flaky); if you do not encode this form parameter, some browsers seem to tamper with it: Safari has been seen changing a value that was originally in the NFD form to NFC
- having the new service/rest infrastructure in place should help with such issues, as we'll have better control on what gets encoded where, and how.
- the attached test.jsp shows that Safari swaps an NFD-formed string to its NFC form. (Tested Safari 4.0.3 and Chrome 4.0.237.0 on OSX, and Chromium 4.0.237.0 on Ubuntu, which all show this behaviour; Firefox 3.0.4 and Opera 10, on the other hand, seem better behaved and respect the given encoding)
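To illustrate the tree pitfall above (URL-encoding a node path without encoding its slashes), here is a minimal sketch. The class and the helper name encodePathSegments are hypothetical and not part of Magnolia; the sketch only assumes the standard java.net.URLEncoder, and it is not the attached patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PathEncoding {

    /**
     * Hypothetical helper (not a Magnolia API): URL-encodes each segment
     * of a node path as UTF-8 but leaves the '/' separators untouched.
     */
    static String encodePathSegments(String path) {
        StringBuilder out = new StringBuilder();
        // A limit of -1 keeps trailing empty segments, so a trailing slash survives.
        String[] segments = path.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            try {
                // URLEncoder performs form encoding, which turns a space into '+';
                // a literal '+' becomes %2B, so replacing '+' with %20 afterwards is safe.
                out.append(URLEncoder.encode(segments[i], "UTF-8").replace("+", "%20"));
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is guaranteed to be available on every JVM.
                throw new IllegalStateException(e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same visible name, once in NFC and once in NFD, encodes differently.
        System.out.println(encodePathSegments("/website/gr\u00e4g"));  // /website/gr%C3%A4g
        System.out.println(encodePathSegments("/website/gra\u0308g")); // /website/gra%CC%88g
    }
}
```

The two outputs differ even though both names render identically, which is why a path that a browser silently renormalizes no longer matches the stored node name.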
When trying things out, one might need to manually create names in the NFC or NFD forms specifically:
byte[] nfcBytes = new byte[]{103, 114, -61, -92, 103};
byte[] nfdBytes = new byte[]{103, 114, 97, -52, -120, 103};
String nfc = new String(nfcBytes, "UTF-8");
String nfd = new String(nfdBytes, "UTF-8");
// Then double check with java.text.Normalizer.isNormalized(nfc, Normalizer.Form.NFC)
// and java.text.Normalizer.isNormalized(nfd, Normalizer.Form.NFD)
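The double check can be sketched as a small self-contained program. The class name and the sameName helper are hypothetical; comparing names after normalizing both to NFC is one possible approach, not something Magnolia is known to do.

```java
import java.nio.charset.Charset;
import java.text.Normalizer;

public class NormalFormCheck {

    private static final Charset UTF8 = Charset.forName("UTF-8");

    /** Hypothetical helper: treats two names as equal if their NFC forms match. */
    static boolean sameName(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        // "g", "r", precomposed a-umlaut (U+00E4), "g"
        String nfc = new String(new byte[]{103, 114, -61, -92, 103}, UTF8);
        // "g", "r", "a", combining diaeresis (U+0308), "g"
        String nfd = new String(new byte[]{103, 114, 97, -52, -120, 103}, UTF8);

        System.out.println(Normalizer.isNormalized(nfc, Normalizer.Form.NFC)); // true
        System.out.println(Normalizer.isNormalized(nfd, Normalizer.Form.NFD)); // true
        System.out.println(Normalizer.isNormalized(nfd, Normalizer.Form.NFC)); // false
        System.out.println(nfc.equals(nfd));    // false: different code point sequences
        System.out.println(sameName(nfc, nfd)); // true: equal once both are in NFC
    }
}
```

Passing a Charset to the String constructor avoids the checked UnsupportedEncodingException that the "UTF-8" string variant declares.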
http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.3.3
Seems we're not the only ones struggling: http://twiki.org/cgi-bin/view/Codev/UnicodeMac
And this suggests that WebKit normalizes to NFC on purpose (since 2006): https://bugs.webkit.org/show_bug.cgi?id=8769 and http://www.w3.org/TR/charmod-norm/#sec-UnicodeNormalized