Auditing Design and Implementation

From AlfrescoWiki

Jump to: navigation, search

Contents

[edit] Introduction

Auditing is carried out at the service layer of the Repository. This captures both user and application interaction with the repository without recording all the underlying changes to nodes etc. required to achieve this. It is not low level auditing.

The service layer currently defines and enforces transactional behaviour and security. Auditing fits naturally between the two. If auditing is done in transaction and recorded in the database then it reflects what has been done. It will not record actions that have been rolled back, which is a problem with file based auditing. Failed actions are recorded in another transaction.

From version 1.4, auditing is configured for all the write methods on all public services. Auditing is disabled by default and must be turned on via configuration if it is required.

[edit] Structural Overview

This shows where the audit interceptor fits into the public service architecture.

Service Layer
Transaction Interceptor
Audit Interceptor -------- Audit Component
Exception Translator Interceptor
Security Interceptor
Service Description Interceptor
Service Implementation Bean

[edit] Auditing flow

The diagram below shows how the auditing interceptor wraps calling the underlying service and the returned object or exceptions it produces.

... TX Interceptor Audit Interceptor Exception Translator Interceptor ...
(method call) -->
Check Audit Flag
Suspend auditing
Audit this call
Record initial state
Record method arguments
(method call) -->
...
<-- (method return)
Record return object
Record final state
AUDIT
Reactivate auditing
<-- (method return)
<-- (exception thrown)
Record exception
AUDIT in new TX
Reactivate auditing
<-- (exception thrown)

[edit] What is recorded?

  • The current authenticated user (including domain) (done)
  • The session identifier (Future as we do not really have this information)
  • Timestamp (done)
  • The ID of the containing transaction (done)
  • The key node ref (done)
  • Host Machine (done)

[edit] =Not Recorded (TODOs)

  • The path to the key node ref (optional - TODO)
  • Description (done) - only used for API level logging data
  • Source (done) - eg The service method
  • Presence of Service/Method/Key Type/Key Path/Key NodeRef/Key Property filters (optional - TODO)
  • Method return object (string fixed, serialized optional - TODO)
  • Method arguments (first 5 as strings fixed, all serialised as array optional - TODO)
  • Exception Summary (message - fixed, path - fixed, full serialised optional - TODO)
  • Key Node properties before method invocation (optional - TODO)
  • Key Node properties after method invocation (optional -TODO)

[edit] Interceptors

There is one generic interceptor that supports auditing around any public service. This is configured in public-services-context.xml.

Each public API bean is declared as a public service using the PublicService annotation. This makes calls to the bean auditable when wrapped with the audit interceptor. Each method on the public interface must be annotated to describe its audit behaviour. Firstly, if it is audited or not; secondly, what should be audited. For audited methods define:

  • The argument that defines the key node reference
  • A key to lookup up an internationalised description string
  • Argument position to name mappings

The Audit Service API exposes the audit configuration for read access so the interceptor can apply audit control.

[edit] What are we not doing?

We are not going to add behaviours to wire up policies in the node service and other services to the audit service.

[edit] Implementation

[edit] Annotations

Public Service

Auditable

NotAuditable

Sample Example:

@PublicService
public interface AnnotationTestInterface
{
   @Auditable()
   public void noArgs();
   
   @Auditable(key = Auditable.Key.ARG_0, parameters = {"one", "two"})
   public String getString(String one, String two); 
   
   @Auditable(key =  Auditable.Key.ARG_0, parameters = {"one"})
   public String getAnotherString(String one); 
}

[edit] Interceptor

A single instance interceptor that delegates method execution to an authentication component.

[edit] XML Configuration

The structure of the configuration file is defined by the audit schema. The bundled example enables auditing for all public service methods that change the repository.

[edit] API

[edit] Service

AuditService

[edit] Component

Audit Component


[edit] Impl details

  • Log audit config at start up
    • Add as an option to the audit configuration XML
  • Thread local audit flag
    • Only log the outer-most public API call.
    • Can be set in the config.
  • Warn about public service methods without auditing

[edit] Other ideas

  • Possible record and replay?

[edit] Developer guide

Auditing has an impact on public all service APIs.

Public services must be marked as such with the PublicService annotation.

Services not declared as public services will be ignored and no audit information will be recorded.

Methods on all public service interfaces must be defined as one of:

If a method is not annotated and then called with auditing enabled a RuntimeException wil be thrown.

The Auditable annotation requires:

  • names for method arguments - so they can be used to filter on particular parameter values
  • an optional key (which may be an argument or return value) that is the natural way of filtering auditing based on type/aspect/path etc
  • the return string value and argument string values are recorded by default, this can optionally be controlled so they are never recorded in the log (eg password values)

Here are a few examples:

@Auditable(parameters = {"userName", "password"}, recordable = {true, false})
public void createAuthentication(String userName, char[] password) throws AuthenticationException;
@Auditable
public List<StoreRef> getStores();
@Auditable(key = Auditable.Key.ARG_0 ,parameters = {"nodeRef"})
public boolean exists(NodeRef nodeRef);
@Auditable(key = Auditable.Key.ARG_0 ,parameters = {"parentRef", "assocTypeQName", "assocQName", "nodeTypeQName"})
   public ChildAssociationRef createNode(
           NodeRef parentRef,
           QName assocTypeQName,
           QName assocQName,
           QName nodeTypeQName)
           throws InvalidNodeRefException, InvalidTypeException;


@NotAuditable
   public Collection<QName> getAllModels();


Public services should be identified using a marker annotation

@PublicService

[edit] Simple audit template example

Basic audit information can be shown by applying the example audit template show_audit.tfl to spaces, folders, files etc.

[edit] Sample mysql audit queries against repo version 1.4.x and on

To pull back everything:

select * from alf_audit_fact fact                 
  join alf_audit_date date on fact.audit_date_id = date.id           
  join alf_audit_source source on fact.audit_source_id = source.id


To find out what methods and services have been audited so far:

select * from alf_audit_source


To link person information (eg email, forename, surname. etc): add something like

select fact.*, 
       date.*, 
       source.*, 
       convert(p2.string_value using utf8) as Name,
       convert(p3.string_value using utf8) as Surname, 
       convert(p4.string_value using utf8) as EMail
  from alf_audit_fact fact
  join alf_audit_date date on fact.audit_date_id = date.id
  join alf_audit_source source on fact.audit_source_id = source.id
  left outer join alf_node_properties p1 
     on fact.user_id = p1.string_value
        and p1.qname  = '{http://www.alfresco.org/model/content/1.0}userName'
  left outer join alf_node_properties p2 
     on p1.node_id = p2.node_id 
        and p2.qname = '{http://www.alfresco.org/model/content/1.0}firstName'
  left outer join alf_node_properties p3 
     on p1.node_id = p3.node_id 
        and p3.qname = '{http://www.alfresco.org/model/content/1.0}lastName'
  left outer join alf_node_properties p4 
     on p1.node_id = p4.node_id 
        and p4.qname = '{http://www.alfresco.org/model/content/1.0}email' 



To find people who have never logged in or caused anything to be audited:

select convert(prop.string_value using utf8)
   from alf_audit_fact fact
   right outer join alf_node_properties prop on fact.user_id = prop.string_value
   where prop.qname = '{http://www.alfresco.org/model/content/1.0}userName'
     and fact.user_id is null

To find the fist audit entry for all users:

select fact.user_id, min(timestamp)
   from alf_audit_fact fact group by fact.user_id

To find the first login entry for all users:

select fact.user_id, min(timestamp)
   from alf_audit_fact fact
   join alf_audit_source source 
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id


To get roll ups of login information for all time:

select fact.user_id, count(fact.id)
   from alf_audit_fact fact
   join alf_audit_source source 
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id

To get the number of logins broken down by year:

select fact.user_id, date.year, count(fact.id)
   from alf_audit_fact fact
   join alf_audit_date date on fact.audit_date_id = date.id
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id, date.year

To get the number of logins broken down by year and quarter:

select fact.user_id, date.year, date.quarter+1, count(fact.id)
   from alf_audit_fact fact
   join alf_audit_date date on fact.audit_date_id = date.id
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id, date.year, date.quarter+1

To get the number of logins broken down by year and month:

select fact.user_id, date.year, date.month+1, count(fact.id)
   from alf_audit_fact fact
   join alf_audit_date date on fact.audit_date_id = date.id
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id, date.year, date.month+1

To get the number of logins broken down by year, month, and day of month


select fact.user_id, date.year, date.month+1, date.day_of_month, count(fact.id)
   from alf_audit_fact fact
   join alf_audit_date date on fact.audit_date_id = date.id
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   group by fact.user_id, date.year, date.month+1, date.day_of_month


To get the login history for a given user:

select fact.timestamp   
from alf_audit_fact fact
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and source.service = 'AuthenticationService'
         and source.method = 'authenticate'
   where fact.user_id = 'admin'   


To find all content "viewed" by a given user in the last 7 days

select fact.timestamp, convert(prop.string_value using utf8)
from alf_audit_fact fact
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and ( 
                  (source.service = 'ContentService' and source.method = 'getReader') 
               or (source.service = 'FileFolderService' and source.method = 'getReader')
             )
   join alf_node_status ns on ns.guid = fact.node_uuid
   join alf_node_properties prop 
      on ns.node_id = prop.node_id 
         and prop.qname = '{http://www.alfresco.org/model/content/1.0}name'
   where 
         fact.user_id = 'admin'
     and fact.timestamp > ADDDATE(now(), -7)

To find all items created via the node server or file folder service by a given user in the last 7 days (not rules, people, ....):

select fact.timestamp, convert(prop.string_value using utf8)
from alf_audit_fact fact
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and (
                  (source.service = 'NodeService' and source.method = 'createNode')
               or (source.service = 'FileFolderService' and source.method = 'create')
             )
   join alf_node_status ns on ns.guid = fact.node_uuid
   join alf_node_properties prop
      on ns.node_id = prop.node_id
         and prop.qname = '{http://www.alfresco.org/model/content/1.0}name'
   where 
         fact.user_id = 'andy'
     and fact.timestamp > ADDDATE(now(), -7)


If you can create functions on your mysql database you could create a function to build the UI displayed path - the path made up of each item's name, not its full path in the repository.

  DELIMITER $$

DROP FUNCTION IF EXISTS `alfresco`.`getPath` $$
CREATE FUNCTION `alfresco`.`getPath` (id BIGINT(20)) RETURNS VARCHAR(2048)
BEGIN
  DECLARE name VARCHAR(128);
  DECLARE answer VARCHAR(2048);
  DECLARE current BIGINT(20);
  DECLARE previous BIGINT(20);
  SET current = id;
  SET previous = 0;
  REPEAT
   SELECT convert(prop.string_value using utf8) into name from alf_node_properties prop
      where prop.node_id = current
        and prop.qname = '{http://www.alfresco.org/model/content/1.0}name';
   SET previous = current;
   SELECT parent_node_id INTO current from alf_child_assoc
      where child_node_id = current and is_primary > 0;
   IF previous = id
   THEN
      SET answer = name;
   ELSEIF current <> previous
   THEN
      SET answer = concat(name, '/', answer);
   ELSE
      SET answer = concat('/', answer);
   END IF;

  UNTIL current = previous
  END REPEAT;
  RETURN answer;
END $$

DELIMITER ;

If this function is defined, instead of displaying just the name of the docs found you can display the path. This call will not be cheap.

select fact.timestamp, convert(prop.string_value using utf8), getPath(ns.node_id)
from alf_audit_fact fact
   join alf_audit_source source
      on fact.audit_source_id = source.id
         and (
                  (source.service = 'ContentService' and source.method = 'getReader')
               or (source.service = 'FileFolderService' and source.method = 'getReader')
             )
   join alf_node_status ns on ns.guid = fact.node_uuid
   join alf_node_properties prop
      on ns.node_id = prop.node_id
         and prop.qname = '{http://www.alfresco.org/model/content/1.0}name'
   where 
         fact.user_id = 'andy'
     and fact.timestamp > ADDDATE(now(), -7)