
Rationale

The audit log is active only when the Content API is used. But the Content API is deprecated, and operations performed on nodes directly through the JCR API are not audited at all.

To solve these issues we need to move audit logging into the JCR API. The preferred way of doing so is to introduce AuditLoggingNodeWrapper.

Basic idea of new implementation


We will wrap all sessions during their initialization with MgnlAuditLoggingContentDecoratorSessionWrapper. This is done in DefaultRepositoryManager, the same way we already wrap sessions with MgnlVersioningSession.

Content API implementation vs MgnlAuditLoggingContentDecorator (JCR API) differences

When should we log the action?

Problem:
In the Content API we had full control over the actions that were performed. When a change was made, we logged it and saved it at the same time. In the JCR API, however, creating a node, modifying a property, deleting a node or moving a node is not persisted immediately. So we can't log an action at the moment it is performed: if the session crashes, expires or is not saved for any other reason, the changes are never persisted to the repository and it's as if they didn't happen.

Solution:

We will write actions to the audit-log output only after they are persisted by a call to session.save() or node.save().
Until then, we will keep an entry for each action in a temporary log map. The map will live in the MgnlAuditLoggingContentDecorator class, which is instantiated when we wrap the session with MgnlAuditLoggingContentDecoratorSessionWrapper in DefaultRepositoryManager. On a call to session.save() or node.save() we will write the full or a partial list of operations, respectively, to the audit log.

- Session.save() - we write every log entry held in the temporary log storage to the audit log output, and then clear the storage.
- Node.save() - we write to the log, and remove from the map, only the changes made on the node and its subnodes.
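
The two flush cases above can be sketched as follows. This is only a minimal illustration, not the planned implementation: the class name PendingAuditLog, its methods and the use of plain strings instead of real log entries are all assumptions made for the example. The only thing taken from this page is the idea of a map keyed by the absolute node path.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical buffer for audit entries that are not persisted yet. */
class PendingAuditLog {

    // Keyed by absolute node path; insertion-ordered so paths are flushed
    // in the order in which they were first touched.
    private final Map<String, List<String>> pending = new LinkedHashMap<>();

    /** Remembers an action (e.g. "create") for the node at the given path. */
    void record(String path, String action) {
        pending.computeIfAbsent(path, p -> new ArrayList<>()).add(action);
    }

    /** Session.save() case: return every buffered entry, then clear the buffer. */
    List<String> flushAll() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : pending.entrySet()) {
            for (String action : e.getValue()) {
                out.add(action + " " + e.getKey());
            }
        }
        pending.clear();
        return out;
    }

    /** Node.save() case: return and remove only entries for the node and its subnodes. */
    List<String> flushSubtree(String nodePath) {
        // Compare against "path/" so that /a does not match the sibling /ab.
        String prefix = nodePath.endsWith("/") ? nodePath : nodePath + "/";
        List<String> out = new ArrayList<>();
        Iterator<Map.Entry<String, List<String>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, List<String>> e = it.next();
            String path = e.getKey();
            if (path.equals(nodePath) || path.startsWith(prefix)) {
                for (String action : e.getValue()) {
                    out.add(action + " " + path);
                }
                it.remove();
            }
        }
        return out;
    }

    /** Number of paths that still have unsaved entries. */
    int size() {
        return pending.size();
    }
}
```

In this sketch the caller would write the returned lines to the audit-log output. Keying by path keeps the Node.save() case to a prefix check per key instead of a scan over every single entry.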

There is one exception to the above: Workspace. Workspace.move() and Workspace.copy() operations are persisted immediately, so we can write their log entries to the audit log output immediately.

MetaData change

Problem:
In the Content API, a change to the MetaData of a node was logged as a change on the node itself. We need to preserve the same behaviour in the new API.

Solution:
When we store a log entry in the temporary log map, we can check whether the node is of type mgnl:metaData. If so, we get the parent of the MetaData node and store the action as if it had been performed on the parent node. Due to changes in the handling of MetaData in Magnolia 5.0, this part of the code doesn't need to be ported.

Don't log action on system workspaces

Problem:
Workspaces such as Store, Expression, imaging, mgnlVersion and mgnlSystem are used for Magnolia system functions and produce many unnecessary audit log entries. The actions on them are performed automatically in system context during activation, image variation creation, etc.

Solution:
For now, we will not wrap sessions on the Store, Expression, imaging, mgnlVersion and mgnlSystem workspaces with MgnlAuditLoggingContentDecoratorSessionWrapper. In the future we will make audit logging configurable per workspace.
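
A minimal sketch of such a check, assuming a hard-coded exclusion set. The class and method names are hypothetical, and the workspace names are copied from the list above as written, so their exact spelling in a real repository is an assumption too.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper deciding whether a session on a workspace gets the audit wrapper. */
class AuditedWorkspaces {

    // System workspaces whose sessions are left unwrapped for now.
    private static final Set<String> EXCLUDED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "Store", "Expression", "imaging", "mgnlVersion", "mgnlSystem")));

    /** Returns true if sessions on this workspace should be wrapped for audit logging. */
    static boolean shouldWrap(String workspaceName) {
        return !EXCLUDED.contains(workspaceName);
    }
}
```

Once logging becomes configurable per workspace, the fixed set could be replaced by a value read from configuration without changing the callers.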

Audit logging is not active

Problem:
If audit logging is not active, it is unnecessary to store log entries in the temporary log map.
We can't simply check in DefaultRepositoryManager whether AuditLogging is active and then decide whether to wrap the session, because AuditLoggingManager itself is read from the repository (config:/server/auditLogging). We would need AuditLoggingManager in order to open the very session that provides it.


Solution:
So we wrap all sessions with MgnlAuditLoggingContentDecoratorSessionWrapper and check whether AuditLogging is active before storing a log entry in the temporary log map. The check is done for each audited operation, in MgnlAuditLoggingContentDecorator.logActionCreate(Node), MgnlAuditLoggingContentDecorator.logActionModify(Node) etc.
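
A simplified sketch of the per-operation check. Everything here is an assumption made for illustration: the class name, the BooleanSupplier standing in for the lookup on AuditLoggingManager, and the use of a path string where the real methods take a Node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical recorder that buffers entries only while audit logging is active. */
class GuardedAuditRecorder {

    // Stand-in for asking AuditLoggingManager whether logging is active.
    // It is evaluated lazily, on each operation, not when the session is wrapped.
    private final BooleanSupplier auditActive;
    private final List<String> pending = new ArrayList<>();

    GuardedAuditRecorder(BooleanSupplier auditActive) {
        this.auditActive = auditActive;
    }

    void logActionCreate(String path) {
        store("create", path);
    }

    void logActionModify(String path) {
        store("modify", path);
    }

    private void store(String action, String path) {
        if (!auditActive.getAsBoolean()) {
            return; // logging is off: keep nothing in memory
        }
        pending.add(action + " " + path);
    }

    /** Entries buffered so far. */
    List<String> pending() {
        return pending;
    }
}
```

Because the supplier is only called when an operation is audited, the wrapper can be created before the configuration session exists.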

Open question: Wouldn't it be enough to do the check only on the first change?

Introducing new log entry time stamp

Because we no longer log a change at the moment it is made but only after the call to save(), the log entry also needs to store the time stamp of when the change was originally made.

Example of the output:
The format is: date, action performed, time stamp of the original change, user ID, workspace, node type, node path.
09.05.2013 14:42:38, create, 19517237863844, superuser, website, mgnl:page, /untitled
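
A line in this format could be assembled roughly as below. This is a sketch only: the class and parameter names are hypothetical, the date pattern is inferred from the example line, and the unit of the original-change time stamp is not specified on this page, so it is passed through as an opaque long.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical formatter for one audit log line. */
class AuditLineFormatter {

    static String format(Date loggedAt, TimeZone zone, String action, long changedAt,
                         String userId, String workspace, String nodeType, String nodePath) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat fmt = new SimpleDateFormat("dd.MM.yyyy HH:mm:ss");
        fmt.setTimeZone(zone);
        // Field order follows the format description above.
        return String.join(", ",
                fmt.format(loggedAt), action, Long.toString(changedAt),
                userId, workspace, nodeType, nodePath);
    }
}
```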

Potential new problems

Determining whether an operation should be logged and storing log entries in the temporary log map can lead to performance issues and memory leaks.

Planned tests

We will run the performance and memory usage tests described in our wiki: http://wiki.magnolia-cms.com/display/DEVINT/Magnolia+4.5.x+Performance and http://wiki.magnolia-cms.com/display/DEVINT/Memory+usage

Future improvements

Add the possibility for AuditLogging to monitor only specific workspaces.

Add the possibility to log a warning in the regular log file when the number of unsaved operations in a session exceeds a configurable limit.


8 Comments

  1. Why would the temporary (unflushed) logs be a Map? Sounds like a List would be a more appropriate structure.

  2. For temporary log storage we use a LinkedHashMap<String, LinkedList<MgnlAuditLogEntry>> structure, where the map key is the absolute path of the changed node. There are two reasons for this.

    • When Node.save() is used, only the changes made on the node and its subnodes are saved.
    • We use this structure to limit the number of log entries. For example, if a node was modified more than once, it's enough to log the Modify action only once.

    With a map structure we have better access to the log entries for a specific path. We don't need to go through log entries that belong to other paths.

    1.  if node was modified more than once. It's enough to log Modify action only once.

      It's not!

      The purpose of an audit log is to be able to retrace all actions that were taken, not only their results.

       

      1. This is done the same way as in the Content API, where we also log Modify only once.

        If I did something like session.getRootNode().addNode("test"); session.removeItem("/test"); session.save();

        then the "test" node was never persisted to the repository. So it's not necessary to log it.

         

        1. My point is exactly that this should be logged as well (wink), even if not persisted.

          1. I disagree. If this is not persisted, it didn't happen. The audit log would be useless otherwise.

            What would be the value in seeing what is not persisted? 

            1. Detecting modification attempts (possibly ill-intentioned ones)

              1. How? We have a session per request. The only time this will arise is in scripts or in automatically generated pages on first access. Currently all of that already produces unreadable logs with the old API.

                 

                Another point is that the purpose of the audit log is to show persisted changes, and who made them and when. It is neither a debug log nor a test trace showing method execution order.