Draft for 5.2, 5.3
Proposal for cache improvements.
Locking was solved in 4.3; the other improvements are pending.
There are 4 main areas to study/develop in order to have a simplified and faster caching module:
- separate client caching from server-side caching
- remove byte arrays and use streams to write to and read from cache elements
- synchronize read / write operations at cache element level, not at global cache level
- add a global voter
Separate client caching from server-side caching
Split the cache filter into two filters (cacheable resources are the resources that are not bypassed):
- Headers filter, with a manager (the simple implementation is an in-memory ConcurrentHashMap table) to
  - store response headers
  - apply max-age and Expires (or no-cache), or whatever else (for example ETags)
  - check request headers in order to send SC_NOT_MODIFIED back to the client
- Content filter, with a manager (the simple implementation is a filesystem-based manager) to
  - cache resources by streaming the response (multiplexing streams) to an OutputStream taken from the cache element
  - check for SC_NOT_MODIFIED using the cache element creation date
  - stream from the cache using an InputStream
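The "multiplex streams" idea for the content filter could be sketched roughly as follows: a tee stream forwards every write both to the client and to the output stream obtained from the cache element, so the response is never buffered in full. The names `TeeOutputStream`, `ConditionalGet` and `isNotModified` are hypothetical and not part of the existing cache module; the date check only assumes that the cache element exposes its creation time in milliseconds.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper: copies every write to both the client and the cache element stream. */
class TeeOutputStream extends OutputStream {
    private final OutputStream client;
    private final OutputStream cacheElement;

    TeeOutputStream(OutputStream client, OutputStream cacheElement) {
        this.client = client;
        this.cacheElement = cacheElement;
    }

    @Override
    public void write(int b) throws IOException {
        client.write(b);
        cacheElement.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        client.write(buf, off, len);
        cacheElement.write(buf, off, len);
    }

    @Override
    public void flush() throws IOException {
        client.flush();
        cacheElement.flush();
    }

    @Override
    public void close() throws IOException {
        // Close both streams even if closing the first one fails.
        try {
            client.close();
        } finally {
            cacheElement.close();
        }
    }
}

/** Hypothetical check for answering SC_NOT_MODIFIED from the cache element creation date. */
class ConditionalGet {
    static boolean isNotModified(long ifModifiedSinceMillis, long elementCreatedMillis) {
        // A negative value means the If-Modified-Since header was absent.
        // HTTP dates have one-second resolution, so compare whole seconds.
        return ifModifiedSinceMillis >= 0
                && elementCreatedMillis / 1000 <= ifModifiedSinceMillis / 1000;
    }
}
```

In a servlet filter, the tee stream would wrap the response output stream, and the first argument of `isNotModified` would come from the request's If-Modified-Since date header.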
Memory consumption optimization
Optimize memory usage by removing the use of byte arrays both in writing to cache and reading from cache
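As a minimal sketch of what removing byte arrays means in practice: content is moved between an InputStream and an OutputStream through one small, fixed-size buffer, so memory use does not grow with the size of the cached resource. The class name `StreamCopy` and the 8 KB buffer size are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: streams content without holding it all in memory. */
class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // fixed size, independent of the content length
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

On Java 9 and later, `InputStream.transferTo(OutputStream)` does the same job without a helper class.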
Cache locking
Use the java.util.concurrent.locks.ReentrantReadWriteLock ReadLock and WriteLock to do per-element resource locking
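Per-element locking could look roughly like the sketch below, in which each cache element owns its own ReentrantReadWriteLock: many readers can read the same element concurrently, and a writer only blocks access to that one element. The class names `LockedElement` and `PerElementCache` are hypothetical, and the sketch assumes Java 8 or later for `computeIfAbsent` and lambdas.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical cache element guarded by its own read/write lock. */
class LockedElement<V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private V value;

    V read() {
        lock.readLock().lock(); // shared: several readers may hold it at once
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    void write(V newValue) {
        lock.writeLock().lock(); // exclusive: blocks readers and writers of this element only
        try {
            value = newValue;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

/** Hypothetical cache: no global lock, only per-element locks. */
class PerElementCache<K, V> {
    private final ConcurrentHashMap<K, LockedElement<V>> elements = new ConcurrentHashMap<>();

    V get(K key) {
        LockedElement<V> element = elements.get(key);
        return element == null ? null : element.read();
    }

    void put(K key, V value) {
        // The element (and its lock) is created atomically on first use.
        elements.computeIfAbsent(key, k -> new LockedElement<V>()).write(value);
    }
}
```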
Remove boilerplate and hide locking concerns
We could think of adding a "getOrCache" (unconvincing name, to be debated) method on the Cache interface, whose implementation could look something like the following (not taking any locking/synchronization issues into account, so this code might not be accurate):
    Object getOrCache(Object key, Callback c) {
        Object cached = get(key);
        if (cached == null) {
            // Cache miss: ask the callback to generate the value, then store it.
            Object value = c.generateCacheValue();
            put(key, value);
            return value;
        } else {
            return cached;
        }
    }

    interface Callback {
        Object generateCacheValue();
    }
... where the Callback interface would thus be responsible for "generating" the cache value; this new method could thus be called as such:
    cache.getOrCache(key, new Callback() {
        public Object generateCacheValue() {
            return retrieveValueFromSomeRemoteService();
        }
    });
See this diff (this class) for an example of an implementation.
TBH, I wouldn't be surprised if more recent versions of EhCache and other cache libraries had such a construct natively. Seems clean and elegant enough to be used in many cases. Not sure it would work for our page caching, but most likely for many other situations where we want to cache "stuff" (I'm using this in the external-indexing module, for example)
edit: looking at the EhCache 2.5 API, it could perhaps indeed be implemented with Cache.putIfAbsent, by passing a subclass of Element whose getValue (or getObjectValue, not sure which) would be lazy. It also has a SelfPopulatingCache class, which might be worth looking into.
Additionally, if feasible, in some cases, using generics for key and values in the cache would avoid casting.
(there's possibly going to be an unchecked cast at some point when retrieving the cache instance, but eh)
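To make the generics and locking points concrete, here is a minimal sketch of a typed "getOrCache" built on `ConcurrentHashMap.computeIfAbsent`, which runs the callback at most once per key and needs no casting at the call site. This is not the implementation from the diff mentioned above: the class name `SimpleCache` is hypothetical, `java.util.function.Supplier` stands in for the Callback interface, and Java 8 or later is assumed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical typed cache; not the existing Cache interface. */
class SimpleCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();

    /**
     * Returns the cached value for key, generating and storing it on a miss.
     * The callback must not return null (nothing would be cached) and must not
     * modify this cache while it runs.
     */
    V getOrCache(K key, Supplier<? extends V> callback) {
        return store.computeIfAbsent(key, k -> callback.get());
    }
}
```

A call then reads `cache.getOrCache(key, () -> retrieveValueFromSomeRemoteService())`. One design caveat: other threads updating entries in the same part of the map may block while the callback runs, so slow callbacks may need a different approach.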
Caching other objects
See Concept - Cache arbitrary objects.
Global cache voter
MGNLCACHE-37
For example:
    public class AllInOneCacheVoter extends AbstractBoolVoter {

        private String allowedExtensions;
        private String deniedExtensions;
        private String allowedRequestContentTypes;
        private String deniedRequestContentTypes;
        private String allowedResponseContentTypes;
        private String deniedResponseContentTypes;
        private boolean allowRequestWithParameters = false;
        private boolean allowAdmin = false;
        private boolean allowAuthenticated = false;
        private boolean allowDocroot = true;
        private boolean allowDotResources = true;
        private boolean allowDotMagnolia = false;

        private VoterSet voters = new VoterSet();

        // called by Content2Bean
        public void init() {
            if (StringUtils.isNotBlank(allowedExtensions) || StringUtils.isNotBlank(deniedExtensions)) {
                ExtensionVoter voter = new ExtensionVoter();
                voter.setAllow(allowedExtensions);
                voter.setDeny(deniedExtensions);
                voter.setNot(true);
                voters.addVoter(voter);
            }
            // ... create voters and add them to voters
        }

        /**
         * {@inheritDoc}
         */
        @Override
        protected boolean boolVote(Object value) {
            return voters.vote(value) == 0;
        }

        // ... getters and setters ...
    }
6 Comments
Manuel Molaschi
Some thoughts about caching:
Philipp Bärfuss
Manuel Molaschi
About simplified cache configuration: I have just seen MAGNOLIA-2557 (linked at the top of the page... oops...), in which Philipp made the same proposal.
Magnolia International
Feel free to edit the page directly.
Streaming from cache: Have a void stream(Object key, OutputStream out) method instead of the current get(Object key). This would allow an implementation to "find" the source based on some condition (if size>X, serve from cache, otherwise serve from fs - or, to speak implementation details, we'd probably have a different CachedEntry which, instead of holding the content, would have a pointer to the fs or other streamable source).
re:configuration: sounds good and similar to what we had in mind indeed. We'll also need to think about update tasks (yay)
re:decoupling - one reason that might speak against it is that they (i think?) share (or should?) the configuration. The current cache caches all http headers, too, so I'm not sure how that would help? What is the actual problem - other than the complexity (of both the configuration and of the strategy and executors system...)?
re:google gears: yes, scratch that, i'd just noted this down here a while back when i found out about gears, but it's not very relevant for us at the moment.
To re-iterate, client headers and streaming are two issues independent from the configuration one and can/should be solved independently
Philipp Bärfuss
I think we now have the pieces together and should rename this concept page and update the content. Then we can hold a short meeting to finalize the decisions needed for 4.3.
Philipp Bärfuss
Thanks for updating the page. I totally agree that the mentioned issues must be solved, but I am afraid that rewriting the cache completely is too drastic a measure. Some short answers follow.
It is possible that the current implementation needs improvement or is too complex, but this can be solved without rewriting everything. Today's solution has laid the groundwork for some future solutions, such as using content information for setting headers. I am thinking about using template configuration or page properties to decide on caching strategies. Another thing we have is a UUID-to-cache-key mapping, which paves the way for isolated cache flushing (based on a linked graph).