
Implemented in 4.3

 

There are several obstacles to having full Unicode support in Magnolia.

Also see MAGNOLIA-3009@jira and MAGNOLIA-2929@jira.


WebDAV was the first "trigger" for all these issues, as we can't have "on-the-fly" node renaming as we have in AdminCentral's tree.

  • some clients and operating systems tend to produce names in one Unicode normal form (NFC or NFD) rather than the other
  • if we allow node names with non-ASCII characters to be created through WebDAV, we must ensure that all of Magnolia (tree, renderers, ...) is able to cope.
  • the current pitfalls are:
    • tree: see attached patch - we need to URL-encode the node's path, but for the JavaScript to work as-is, we need to avoid encoding the slashes. The tree seems to work with full URL-encoding of the paths (browsing and opening nodes work, for instance) but some features get broken (deletion of nodes, even those with plain ASCII names)
    • dialogs: if you URL-encode the path (mgnlPath is a hidden field in the dialog), the browser encodes it again on submit. That double encoding is the expected behavior, so we'd need to double-decode on the server as well, which seems quite flaky. If you do not encode this form parameter, some browsers seem to tamper with it: Safari has been seen changing a value that was originally in NFD form to NFC.
  • having the new service/REST infrastructure in place should help with such issues, as we'll have better control over what gets encoded where, and how.
  • the attached test.jsp shows that Safari converts an NFD-formed string to its NFC form. (Safari 4.0.3 and Chrome 4.0.237.0 on OS X, and Chromium 4.0.237.0 on Ubuntu, all show this behaviour; Firefox 3.0.4 and Opera 10, on the other hand, seem better behaved and leave the submitted form untouched)
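
The tree and dialog pitfalls above can be illustrated with a small sketch. It is not taken from the attached patch: `NodePathEncoder` and its methods are hypothetical names, and it only assumes that node paths are slash-separated and encoded as UTF-8. It encodes each path segment on its own so the slashes survive, and shows why a value that was encoded twice needs two decodes to come back.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not part of Magnolia.
class NodePathEncoder {

    /** URL-encodes each segment of a slash-separated path and leaves the slashes as they are. */
    static String encodePath(String path) {
        StringBuilder out = new StringBuilder();
        // A limit of -1 keeps trailing empty segments, so a trailing slash survives.
        String[] segments = path.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            out.append(encode(segments[i]));
        }
        return out.toString();
    }

    /** Form-style encoding as done by URLEncoder: a space becomes '+', not "%20". */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "gr\u00e4g" is the NFC spelling used elsewhere on this page.
        System.out.println(encodePath("/website/gr\u00e4g"));

        // Encoding an already encoded value escapes the '%' signs again ...
        String twice = encode(encode("gr\u00e4g"));
        System.out.println(twice);
        // ... so a single decode on the server only undoes the outer layer.
        System.out.println(decode(twice));
        System.out.println(decode(decode(twice)).equals("gr\u00e4g"));
    }
}
```

Note that `URLEncoder` implements form encoding rather than strict URI path encoding, so this is only an approximation of what the tree's JavaScript would need.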

When trying things out, one might need to manually create names in the NFC or NFD forms specifically:

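        // "gräg" as UTF-8 bytes: precomposed U+00E4 (NFC) ...
        byte[] nfcBytes = new byte[]{103, 114, -61, -92, 103};
        // ... and "a" followed by the combining diaeresis U+0308 (NFD)
        byte[] nfdBytes = new byte[]{103, 114, 97, -52, -120, 103};
        String nfc = new String(nfcBytes, "UTF-8");
        String nfd = new String(nfdBytes, "UTF-8");
        // Then double check with java.text.Normalizer.isNormalized(),
        // e.g. Normalizer.isNormalized(nfc, Normalizer.Form.NFC)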
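
As a self-contained variant of the snippet above, the following sketch builds both forms from the same bytes and runs the `java.text.Normalizer` checks. The class name `NormalFormCheck` and the `sameName` helper are made up for this example; comparing names after normalizing both to NFC is one possible approach, not something Magnolia is known to do.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical demo class for illustration only.
class NormalFormCheck {

    // "gräg" with the precomposed character U+00E4 (NFC)
    static final String NFC =
            new String(new byte[]{103, 114, -61, -92, 103}, StandardCharsets.UTF_8);

    // "gräg" with "a" followed by the combining diaeresis U+0308 (NFD)
    static final String NFD =
            new String(new byte[]{103, 114, 97, -52, -120, 103}, StandardCharsets.UTF_8);

    /** Compares two names after bringing both to NFC (an assumption, see lead-in). */
    static boolean sameName(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        System.out.println(Normalizer.isNormalized(NFC, Normalizer.Form.NFC)); // true
        System.out.println(Normalizer.isNormalized(NFD, Normalizer.Form.NFD)); // true
        System.out.println(Normalizer.isNormalized(NFD, Normalizer.Form.NFC)); // false

        // The two strings render identically but are different char sequences.
        System.out.println(NFC.equals(NFD));                       // false
        System.out.println(NFC.length() + " vs " + NFD.length());  // 4 vs 5

        System.out.println(sameName(NFC, NFD));                    // true
    }
}
```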

http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms

http://www.w3.org/TR/html401/interact/forms.html#h-17.13.3.3

URI encoding in Tomcat

Seems we're not the only ones struggling: http://twiki.org/cgi-bin/view/Codev/UnicodeMac
And this suggests that WebKit is normalizing to NFC on purpose: https://bugs.webkit.org/show_bug.cgi?id=8769 (since 2006); see also http://www.w3.org/TR/charmod-norm/#sec-UnicodeNormalized

Attachments:

  • Tree-unicode.patch (modified 2009-11-09 by Magnolia International)
  • test.jsp (modified 2009-11-09 by Magnolia International)
