Data Dictionary Guide

The Alfresco Repository provides support for the storage, management and retrieval of content. Content may range from coarse-grained documents to fine-grained snippets of information (such as XML elements). To describe the structure of such content, the Alfresco Repository supports a rich Data Dictionary where the properties, associations and constraints of content are described.

Out-of-the-box, the Repository Data Dictionary is pre-populated with definitions that describe common Content constructs such as Folders, Files and meta-data schemes. However, each Content Application will have its own content requirements, so the Data Dictionary is extensible, allowing the Repository to manage new types of Content.

This guide explains the concepts behind the Data Dictionary, how to define new types of Content and use them in a Content Application.

The Data Dictionary

At the heart of the Data Dictionary is a meta-model for describing one or more Content models. For clarity, we refer to the dictionary meta-model as level M2, and a content model as level M1 - see Levels of Modelling.

The meta-model supports two primary constructs: Content Type and Content Aspect. Both provide the ability to describe a specific content structure including properties (meta-data) and associations to other content. Object-oriented constructs such as inheritance are also supported.

What's the difference between a Content Type and a Content Aspect? Content is said to be of a Content Type (i.e. it is of one Type at any one time). The Type describes the fundamental structure of the Content. A Content Aspect represents another form of encapsulation (known as cross cutting) where the described structure can be applied to any content of any Content Type. Any number of Aspects may be applied to a single piece of Content. Common meta-data schemes (such as Dublin Core) are well suited to being described as an Aspect, as the meta-data could be applied to any Content Type in the Repository. With careful design, Aspects can alleviate the problem of all capabilities (or meta-data) being dumped onto a root Content Type, which adds unnecessary overhead. The core Repository services (such as Versioning and Classification) make heavy use of Aspects.

The Alfresco data dictionary meta-model is pictured (UML notation) below.

[Figure: Data Dictionary Meta Model]

Data Types

The following data types are supported:

Type Name | Description                                                      | Java Equivalent
text      |                                                                  | java.lang.String
content   | Content Descriptor (includes content store URL, MIME type, etc.) | org.alfresco.service.cmr.repository.ContentData
int       |                                                                  | java.lang.Integer
long      |                                                                  | java.lang.Long
float     |                                                                  | java.lang.Float
double    |                                                                  | java.lang.Double
date      |                                                                  | java.util.Date
datetime  |                                                                  | java.util.Date
boolean   |                                                                  | java.lang.Boolean
qname     | Qualified Name                                                   | org.alfresco.service.namespace.QName
category  | Reference to a category within a classification                  | java.lang.String
noderef   | Node Reference                                                   | org.alfresco.service.cmr.repository.NodeRef
period    | Period/duration/cron/...                                         | org.alfresco.service.cmr.repository.Period
path      | Path                                                             | org.alfresco.service.cmr.repository.Path
any       | Any of the above                                                 | java.lang.Object

The canonical list of types can be found in tomcat/webapps/alfresco/WEB-INF/classes/alfresco/model/dictionaryModel.xml.

Content Model Schema Tour

Content Model Overview

Now that we have a meta-model, we can start to define a Content Model. A Content Model is a collection of related Content Types and Aspects. As in the XML world, Namespaces are used to ensure that definitions are globally unique. A Content Model may refer to definitions within another model.

As a side note, it is our vision that common Content models will be developed (for example, to represent standards or common classification schemes) and that these will be available via the Internet for download.

[Figure: M2model.gif]

Built-in Content Model Namespaces

Each Content model is described in its own XML file and is identified by its defined Namespace and Name. Out-of-the-box, the Alfresco Repository is primed with several models, including the System, Content, Application and Dictionary models.

These models use Alfresco Namespaces and may be found within the Alfresco source code tree at:

  • projects\repository\config\alfresco\model\systemModel.xml
  • projects\repository\config\alfresco\model\contentModel.xml
  • projects\repository\config\alfresco\model\applicationModel.xml
  • projects\data-model\config\alfresco\model\dictionaryModel.xml

The Repository also defines several other models to support the implementation of services such as User management, Versioning, Actions and Rules. You may also be interested to know that some of our Unit Tests also define their own custom model specifically for the test at hand.

These content models have been modeled using UML Class Diagrams.

NOTE: All models defined within the Alfresco namespace http://www.alfresco.org are "owned" by Alfresco and are subject to change between released versions of Alfresco. Generally, it is not good practice to modify these models, as custom changes may not be compatible with future Alfresco changes. However, if you must change the definition of a type or aspect within one of these models, the approach described under Modifying an Alfresco Content Model may be used.

Schema Namespaces

All Content Model XML must conform to the following Data Dictionary XML Schema. It's time to look at the XML - the following snippets are from the Content Domain Model. Each model starts by introducing itself. Here, you can see the model is named 'cm:contentmodel' and the XML conforms to the Dictionary XML schema.

<model name="cm:contentmodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
   <description>Alfresco Content Model</description>
   <author>Alfresco</author>
   <published>2005-06-03</published>
   <version>1.0</version>
   ...

The 'cm' prefix is defined as follows using the <namespaces> element:

   <namespaces>
      <namespace uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
   </namespaces>

A model may define any number of Namespaces, but may also refer to other model definitions by importing their Namespace. This model is importing the Dictionary namespace under the prefix 'd'. This allows it to refer to data types (such as text and category) as defined by the dictionary model.

   <imports>
      <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
   </imports>

Only those Namespaces that have been defined by models registered with the Repository are valid.
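
To make the role of prefixes concrete, here is a minimal sketch of how a prefixed name such as 'cm:name' could be expanded into a fully qualified name once its prefix has been declared. PrefixResolver and its methods are hypothetical names used for illustration only and are not part of the Alfresco API. The {uri}localName notation is simply used here as a compact way of writing a qualified name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not an Alfresco class): expands prefixed names using declared namespaces. */
class PrefixResolver {
    private final Map<String, String> prefixToUri = new HashMap<>();

    /** Records a prefix, as a <namespace> or <import> element would. */
    void register(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    /** Expands "prefix:localName" to "{uri}localName"; rejects unknown prefixes. */
    String expand(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon <= 0 || colon == prefixedName.length() - 1) {
            throw new IllegalArgumentException("Expected prefix:localName but got " + prefixedName);
        }
        String prefix = prefixedName.substring(0, colon);
        String uri = prefixToUri.get(prefix);
        if (uri == null) {
            throw new IllegalArgumentException("Namespace prefix not registered: " + prefix);
        }
        return "{" + uri + "}" + prefixedName.substring(colon + 1);
    }

    public static void main(String[] args) {
        PrefixResolver resolver = new PrefixResolver();
        resolver.register("cm", "http://www.alfresco.org/model/content/1.0");
        resolver.register("d", "http://www.alfresco.org/model/dictionary/1.0");
        System.out.println(resolver.expand("cm:name"));
        System.out.println(resolver.expand("d:text"));
    }
}
```

A name whose prefix has not been registered is rejected, which mirrors the rule that only Namespaces known to the Repository are valid.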

Content Types

With the context established, a Content Type can now be defined.

   <types>
      <type name="cm:cmobject">
         <title>Object</title>
         <parent>sys:base</parent>
         <properties>
            <property name="cm:name">
               <type>d:text</type>
            </property>
         </properties>
         <mandatory-aspects>
            <aspect>cm:auditable</aspect>
         </mandatory-aspects>
      </type>
      ...

Above, we define a simple abstract Content Type called 'cm:cmobject'. It supports a single property named 'cm:name' of type 'd:text'. It derives from the system type called 'sys:base', thus inheriting all of its definitions. Notice the use of <mandatory-aspects>, which forces the 'cm:auditable' aspect to be applied automatically whenever content of this Type is created.

Aspects

Now, let's define the 'cm:auditable' aspect:

   <aspects>
      <aspect name="cm:auditable">
         <title>Auditable</title>
         <properties>
            <property name="cm:created">
               <type>d:datetime</type>
            </property>
            <property name="cm:creator">
               <type>d:text</type>
            </property>
            <property name="cm:modified">
               <type>d:datetime</type>
            </property>
            <property name="cm:modifier">
               <type>d:text</type>
            </property>
            <property name="cm:accessed">
               <type>d:datetime</type>
            </property>
         </properties>
      </aspect>
      ...

The above aspect can be applied to any content of any Content Type at run-time. In this particular case, once applied, the auditable properties are automatically maintained whenever the content is touched (see Attaching Model Behaviour).

Additional Property Capabilities

Properties also support the following configuration:

  1. <mandatory enforced='true|false'>true|false</mandatory>
    1. The Alfresco Web Client will enforce the setting of all mandatory properties
    2. The Alfresco Repository will only enforce the setting of mandatory properties with the enforced attribute set
  2. <default>{value}</default> - automatically assign the specified {value} when the node is created
      <property name="my:property">
         <type>d:text</type>
         <mandatory enforced='true'>true</mandatory>
         <default>my default value</default>
      </property>

It's possible to describe how properties are indexed. By default, each property is indexed atomically (i.e. synchronized with committed Repository content). The defaults may be changed as follows:

      <property name="cm:example">
          <type>d:text</type>
          <mandatory>false</mandatory>
          <index enabled="true">
             <atomic>false</atomic>       <!-- index in the background -->
             <stored>false</stored>       <!-- store the property value in the index -->
             <tokenised>true</tokenised>
          </index>
      </property>

Child Associations

Let's look at the definition of Folder, which introduces an association:

      <type name="cm:folder">
         <title>Folder</title>
         <parent>cm:cmobject</parent>
         <associations>
            <child-association name="cm:contains">
               <source>
                  <mandatory>false</mandatory>
                  <many>false</many>
               </source>
               <target>
                  <class>sys:base</class>
                  <mandatory>false</mandatory>
                  <many>true</many>
               </target>
               <duplicate>false</duplicate>
               <propagateTimestamps>true</propagateTimestamps>
            </child-association>
         </associations>
      </type>

A child association can be thought of as composition in UML terms. That is, the parent effectively owns the children, so operations such as delete propagate to the children. It is also possible to specify whether modification timestamps should be propagated to the parent. The source and target roles of the association are detailed, allowing constraints such as cardinality and target class (Content Type or Aspect) to be defined.

Each child association is given a name within the parent. Independently, it is possible to specify if the child's name ('cm:name') must be unique for the parent or not.

The above definition specifies a child association named 'cm:contains' that supports any ('sys:base') child type, enforces name ('cm:name') uniqueness and propagates modification timestamps to the parent.
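
The cardinality expressed by <mandatory> and <many> can be sketched as a pair of bounds. The following is an illustration only: AssociationEnd is a hypothetical class, and reading 'mandatory' as a lower bound of one and 'many' as an unbounded upper limit is our assumption based on the element names, not a statement of how the Repository enforces it.

```java
/** Hypothetical model of one end (source or target) of an association definition. */
class AssociationEnd {
    final boolean mandatory; // assumed meaning: at least one node is required at this end
    final boolean many;      // assumed meaning: more than one node is allowed at this end

    AssociationEnd(boolean mandatory, boolean many) {
        this.mandatory = mandatory;
        this.many = many;
    }

    /** True when the given number of nodes at this end fits the declared cardinality. */
    boolean allows(int count) {
        if (count < 0) {
            return false;
        }
        if (mandatory && count == 0) {
            return false;
        }
        return many || count <= 1;
    }

    /** UML-style rendering of the bounds, e.g. "0..*". */
    String cardinality() {
        return (mandatory ? "1" : "0") + ".." + (many ? "*" : "1");
    }

    public static void main(String[] args) {
        // Values taken from the 'cm:contains' definition above.
        AssociationEnd source = new AssociationEnd(false, false);
        AssociationEnd target = new AssociationEnd(false, true);
        System.out.println("source " + source.cardinality());
        System.out.println("target " + target.cardinality());
    }
}
```

Under that reading, a folder may contain any number of children (including none), while each child has at most one 'cm:contains' parent.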

Peer (Non-Child) Associations

It's also possible to define a peer (non-child) association as follows:

      <aspect name="cm:subscribable">
         <associations>
            <association name="cm:subscribedBy">
               <source>
                  <mandatory>false</mandatory>
                  <many>true</many>
               </source>
               <target>
                  <class>cm:person</class>
                  <mandatory>false</mandatory>
                  <many>true</many>
               </target>
            </association>
         </associations>
      </aspect>

Here, we're defining an association on the subscribable aspect to the people who have subscribed. Unlike child associations, delete propagation does not take place and there is no notion of naming the associated node. Alfresco's search languages do not yet support joins across non-child associations.

That's a brief tour of the Content Model, but it touches on most of the concepts and how they're expressed in the XML. There are plenty of example Content Types and Aspects within the models packaged with the Repository, so take a look at them (or use the Forum, of course) if you get stuck.

Step by Step Model Definition

This section describes how to perform the following common model customisations:

  • Defining a new Content Type (by deriving from an existing one)
  • Defining a new Content Aspect
  • Defining a new Association
  • Defining a new Child Association

Steps are provided for creating a new model XML file and registering it with the Repository.

For convenience, the file we're going to create is available in the Alfresco source bundle at /projects/repository/config/alfresco/extension/exampleModel.xml.sample.

Step 1: Create a new Model

Create an XML file (named exampleModel.xml, although any name can be used) and place it in a directory that's accessible via the classpath. The recommended location for this is the alfresco.extension package:

  • Tomcat - TOMCAT_HOME/shared/classes/alfresco/extension
  • JBoss - JBOSS_HOME/server/default/conf/alfresco/extension

First, we provide a Model description. An example is given below. The important parts are the name and namespace.

<model name="my:mynewmodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
   <description>Example custom Model</description>
   <author></author>
   <version>1.0</version>

   <namespaces>
      <!-- Define a Namespace for my new definitions -->
      <namespace uri="my.new.model" prefix="my"/>
   </namespaces>

   <!-- Type and Aspect definitions go here -->
</model>

Step 2: Create a new Content Type

We're going to extend the existing 'cm:content' Type and add some properties for 'Standard Operating Procedure' maintenance, e.g. published date, authorised by.

First, we need to declare that we want to refer to definitions in the Alfresco Content Model (for the cm:content content type) and Dictionary Model (for property type definitions). We do that by adding the following imports before the <namespaces> element in our XML file:

      ...

      <imports>
         <!-- Import Alfresco Dictionary Definitions -->
         <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
         <!-- Import Alfresco Content Domain Model Definitions -->
         <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
      </imports>

      <namespaces>
         ...

      <!-- Type and Aspect definitions go here -->
   </model>

Next, we define a new Content Type that inherits from 'cm:content' and supports the new properties. While it is possible to change the existing 'cm:content' definition directly, it is not recommended since any future upgrades will likely overwrite your changes.

   ...
   <!-- Type and Aspect definitions go here -->
   <types>
      <type name="my:sop">
         <title>Standard Operating Procedure</title>
         <parent>cm:content</parent>
         <properties>
            <property name="my:publishedDate">
               <type>d:datetime</type>
            </property>
            <property name="my:authorisedBy">
               <type>d:text</type>
            </property>
         </properties>
      </type>
    </types>
 </model>

Now, we have a new Content Type.

Step 3: Create a new Aspect

For this example, we're going to define an Aspect that records simple information about Images, e.g. Width, Height and Resolution.

The definition of a Content Aspect is very similar to that of a Content Type. We just replace <type> with <aspect>. Add the following aspect definition to your model XML file:


   ..
   </types>
 
   <aspects>
      <aspect name="my:imageClassification">
         <title>Image Classification</title>
         <properties>
            <property name="my:width">
               <type>d:int</type>
            </property>
            <property name="my:height">
               <type>d:int</type>
            </property>
            <property name="my:resolution">
               <type>d:int</type>
            </property>
         </properties>
      </aspect>
   </aspects>

Now, we have a new Content Aspect.

Step 4: Register the Model with the Repository

When the Repository starts, the Data Dictionary reads in all registered Content Models and compiles them for run-time access. At that point, the Repository can start to manage content as described by those models.

Model Bootstrapping

The Data Dictionary is populated at start-up by a component called the Dictionary Bootstrap. The Dictionary Bootstrap is informed of which models to load via the following configuration:

    <bean id="myModels" parent="dictionaryModelBootstrap">
        <property name="models">
            <list>
                <value>[path reference to model xml file]</value>
            </list>
        </property>
    </bean>

For example, the out-of-the-box models (i.e. System, Content, Application and Dictionary) are registered in the bean called 'dictionaryBootstrap' within the configuration file 'core-services-context.xml' located in the directory projects\repository\config\alfresco.

Your own models can be registered in the bean called 'extension.dictionaryBootstrap' within the configuration file example-model-context.xml.sample located in the directory projects\repository\config\alfresco\extension.

The bean definition is as follows:

    <bean id="extension.dictionaryBootstrap" parent="dictionaryModelBootstrap" depends-on="dictionaryBootstrap">
        <property name="models">
            <list>
                <value>alfresco/extension/exampleModel.xml</value>
            </list>
        </property>
    </bean>

You'll notice that exampleModel.xml is already registered, but you can remove it and add as many other custom models as you like. A best practice is to introduce the content model through AMP Files. For example, to register two custom models:

        ...
        <property name="models">
            <list>
                <value>my/customModel1.xml</value>
                <value>my/customModel2.xml</value>
            </list>
        </property>
        ...

To deploy your model, you will need to rename example-model-context.xml.sample to example-model-context.xml. If you are using a downloaded bundle this file will be in the location mentioned in step 1 above. For more information on configuring the repository please refer to the Repository Configuration Guide.

For more information about Data Bootstrapping, see Bootstrap Data.

Step 5: Testing the Model definition

There are three ways to test that the model syntax and semantics are correct.

  • Make model changes and re-start the server (Tomcat, JBoss, Stand-alone etc)

At startup, the models are read and validated. Any errors are reported and must be fixed before the server will start successfully, which means going round the edit, deploy and start cycle again. This is a bit slow, so the alternatives are ...

  • Execute the org.alfresco.repo.dictionary.TestModel Java Application

NOTE: The TestModel class is no longer included in Alfresco 4.1.6 or later. The jar file containing test classes is available in the Alfresco artifact repository for enterprise customers only. (See Alfresco Jira: MNT-11192)

This test includes a light bootstrap which is extremely fast to execute allowing for quick edit and test cycles of new models.

Usage is as follows:

org.alfresco.repo.dictionary.TestModel [model classpath location]*

Example:

org.alfresco.repo.dictionary.TestModel alfresco/extension/exampleModel.xml

Testing dictionary model definitions...
 alfresco/model/dictionaryModel.xml
 alfresco/model/systemModel.xml
 alfresco/model/contentModel.xml
 alfresco/model/applicationModel.xml
 alfresco/extension/exampleModel.xml
Models are valid.

If you've set up Eclipse as your development environment, this step is very easy to perform.

  • If you're running Alfresco from HEAD (post 2.x) you can also load and activate models dynamically. Refer to Dynamic Models for more details.
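
Before going round any of the cycles above, a much weaker but very quick check is to confirm that the model file is at least well-formed XML and to list the Types and Aspects it defines. The sketch below uses only the standard Java XML parser. ModelOutline and definedClasses are hypothetical names, and the check does not validate against the Data Dictionary schema or test model semantics, so it is no substitute for the options listed above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: well-formedness check plus an outline of a model file. */
class ModelOutline {

    /** Returns the names of the Types, then the Aspects, defined in the given model XML. */
    static List<String> definedClasses(String modelXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(modelXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Model XML could not be parsed: " + e.getMessage(), e);
        }
        List<String> names = new ArrayList<>();
        collect(doc, "type", "types", names);
        collect(doc, "aspect", "aspects", names);
        return names;
    }

    /** Adds the name of every <tag> element that sits directly inside <container>. */
    private static void collect(Document doc, String tag, String container, List<String> out) {
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            // Skips property <type>d:text</type> and <aspect> entries under <mandatory-aspects>.
            if (container.equals(element.getParentNode().getNodeName())) {
                out.add(element.getAttribute("name"));
            }
        }
    }
}
```

For the example model built in Steps 1 to 3, calling ModelOutline.definedClasses with the file contents would be expected to list 'my:sop' followed by 'my:imageClassification'.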

Step 6: Defining Associations

We'll now go back to our SOP Content Type definition and add some association definitions.

First, we define an association "my:signoff" from the SOP Content Type to a "sign-off" document which itself is just another piece of content.

Next, we'll define a child association "my:processSteps" for the SOP Content Type which represents a series of "process step" documents.

      <type name="my:sop">
         ...
         </properties>
         <associations>
            <association name="my:signoff">
               <target>
                  <class>cm:content</class>
                  <mandatory>false</mandatory>
                  <many>false</many>
               </target>
            </association>
            <child-association name="my:processSteps">
               <target>
                  <class>cm:content</class>
                  <mandatory>false</mandatory>
                  <many>true</many>
               </target>
            </child-association>            
         </associations>
      </type>

At this point, we can test our updated model definition and re-deploy.

Note: Associations must be declared after properties, or the model will fail to bootstrap.

Constraints

Properties in a model can have constraints defined as part of the model definition. See Constraints.

Model Localization

Every Type, Aspect, Property, Association and DataType defined within a model has a title and description. Both values are defined in the model XML file, but this supports only one language: the language the values are written in.

To support localization of a model, it is possible to augment the model XML values with locale specific values. This is achieved by registering a standard Java Resource Bundle for each language variant of a model.

The Alfresco models (System, Content, Application, Dictionary) are each packaged with a default Resource Bundle. An example from the Content Model follows:

# Display labels for Content Domain Model

cm_contentmodel.description=Alfresco Content Domain Model

cm_contentmodel.type.cm_cmobject.title=Object
cm_contentmodel.type.cm_cmobject.description=Base Content Domain Object
cm_contentmodel.property.cm_name.title=Name
cm_contentmodel.property.cm_name.description=Name

cm_contentmodel.type.cm_folder.title=Folder
cm_contentmodel.type.cm_folder.description=Folder
cm_contentmodel.property.cm_orderedchildren.title=Ordered Children
cm_contentmodel.property.cm_orderedchildren.description=Indicates whether the children of the folder are ordered
cm_contentmodel.association.cm_contains.title=Contains
cm_contentmodel.association.cm_contains.description=Contains

The key structures within each Resource Bundle are:

  • <model_prefix>_<model_name>.[title|description]
  • <model_prefix>_<model_name>.<model_element>.<element_prefix>_<element_name>.[title|description]

where:

  • <model_prefix> is the model namespace prefix
  • <model_name> is the model name
  • <model_element> is one of type, aspect, property, association, datatype
  • <element_prefix> is the element namespace prefix
  • <element_name> is the element name

For example, we provide the following Resource Bundle for the Example Model called ExampleModelResourceBundle:

my_mynewmodel.description=Example custom model
my_mynewmodel.type.my_sop.title=SOP
my_mynewmodel.type.my_sop.description=Standard Operating Procedure
my_mynewmodel.property.my_publishedDate.title=Published Date
my_mynewmodel.property.my_publishedDate.description=Date signed off and published
...
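
The keys above follow mechanically from the model and element names, so they can be generated rather than typed by hand. The following sketch shows the derivation; LabelKeys and its methods are hypothetical names introduced for illustration and are not part of Alfresco.

```java
/** Hypothetical helper: builds Resource Bundle keys from prefixed model names. */
class LabelKeys {

    /** Turns a prefixed name such as "my:sop" into its key form "my_sop". */
    static String flatten(String prefixedName) {
        return prefixedName.replace(':', '_');
    }

    /** Key for the model itself: <model_prefix>_<model_name>.[title|description] */
    static String modelKey(String modelName, String field) {
        return flatten(modelName) + "." + field;
    }

    /** Key for an element, where elementKind is one of type, aspect, property, association, datatype. */
    static String elementKey(String modelName, String elementKind, String elementName, String field) {
        return flatten(modelName) + "." + elementKind + "." + flatten(elementName) + "." + field;
    }

    public static void main(String[] args) {
        System.out.println(modelKey("my:mynewmodel", "description"));
        System.out.println(elementKey("my:mynewmodel", "type", "my:sop", "title"));
        System.out.println(elementKey("my:mynewmodel", "property", "my:publishedDate", "description"));
    }
}
```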

Since the 4.0 release, LIST constraints can also be localized. For its values to be localised, the constraint must be assigned a name. The key structure used is:

listconstraint.<constraint_prefix>_<constraint_name>.<allowed_value>

For example taking the LIST constraint defined in bpmModel.xml

<constraint name="bpm:allowedPriority" type="LIST">
   <parameter name="allowedValues">
      <list>
         <value>1</value>
         <value>2</value>
         <value>3</value>
      </list>
   </parameter>
</constraint>

the following keys are used to localise the allowed values:

listconstraint.bpm_allowedPriority.1=High
listconstraint.bpm_allowedPriority.2=Medium
listconstraint.bpm_allowedPriority.3=Low

NOTE: The model prefix and name must not be provided.
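
As a rough illustration of how such keys resolve, the sketch below loads the three entries above into a standard Java Resource Bundle and looks up the label for an allowed value. ConstraintLabels is a hypothetical helper, and falling back to the raw value when no key exists is our assumption for the example, not documented Alfresco behaviour.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

/** Hypothetical helper: resolves display labels for LIST constraint values. */
class ConstraintLabels {

    /** Parses properties-format text into a Resource Bundle (in-memory stand-in for a bundle file). */
    static ResourceBundle parse(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read bundle text", e);
        }
    }

    /** Looks up listconstraint.<prefix>_<name>.<value>; returns the raw value if no label exists (assumed fallback). */
    static String label(ResourceBundle bundle, String constraintName, String allowedValue) {
        String key = "listconstraint." + constraintName.replace(':', '_') + "." + allowedValue;
        return bundle.containsKey(key) ? bundle.getString(key) : allowedValue;
    }

    public static void main(String[] args) {
        ResourceBundle bundle = parse(
                "listconstraint.bpm_allowedPriority.1=High\n"
              + "listconstraint.bpm_allowedPriority.2=Medium\n"
              + "listconstraint.bpm_allowedPriority.3=Low\n");
        System.out.println(label(bundle, "bpm:allowedPriority", "1"));
        System.out.println(label(bundle, "bpm:allowedPriority", "3"));
    }
}
```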

Registering a Model Resource Bundle

As described in Step 4: Register the Model with the Repository, a Dictionary Bootstrap component exists for registering models. This component is also used for registering Model Resource Bundles:

   <bean id="extension.dictionaryBootstrap" ...
        ...
        <property name="labels">
            <list>
                <value>ExampleModelResourceBundle</value>
            </list>
        </property>
        ...
    </bean>

The 'labels' property is a list of resource bundles that can be found on the classpath.

Attaching Model Behaviour

TODO: Document how to attach behaviour to Aspects using the PolicyComponent and Java.

For an example, refer to CustomAspect sample in the SDK which demonstrates a 'Content Hits' aspect.

Note: Alfresco Community Member Jeff Potts of Optaros wrote a great tutorial on behaviors.

Using a Custom Model

Web Client

The Web Client can be configured to expose custom Content Types and Aspects. With each release, the depth of configuration will increase. For details on how to configure the Web Client for custom models visit here.

API

The Alfresco public APIs provide support for managing custom Content Types and Aspects. For example, the Java API provides a Node Service which allows the creation of content of any Content Type, the setting and getting of Property values and the application of Content Aspects.

Types, Aspects, Properties and Associations are referred to by their name as defined in the model.

Further reference material for each API can be found at:

Programmatic Access to the Data Dictionary

Data driven applications can be developed against the Data Dictionary. In particular, the Java Service API provides a DictionaryService interface.

Modifying an Alfresco Content Model

Although not recommended, it is sometimes necessary to modify the definition of an out-of-the-box Alfresco content type or aspect.

As an example, imagine you wish to modify the definition of the type cm:content (which resides in contentModel.xml) to add new properties and a default versionable aspect like so:

     <type name="cm:content">
        <title>Content</title>
        <parent>cm:cmobject</parent>
        <archive>true</archive>
        <properties>
           <property name="cm:content">
              <type>d:content</type>
              <mandatory>false</mandatory>
              <index enabled="true">
                 <atomic>true</atomic>
                 <stored>false</stored>
                 <tokenised>true</tokenised>
              </index>
           </property>
           <property name="cm:myprop">
              <type>d:text</type>
           </property>
        </properties>
        <mandatory-aspects>
           <aspect>cm:versionable</aspect>
        </mandatory-aspects>
     </type>

Please note: Modifications must be restricted to additions (e.g. new properties, new associations). Other kinds of modifications (e.g. delete property) may cause an unexpected error in Alfresco.

Alfresco does not yet formally support this type of customisation (see JIRA issue AR-1073). However, the following work-arounds may be employed if it's an absolute must to modify Alfresco definitions:

1) Directly edit an existing model definition. That is, edit contentModel.xml to apply the above modification.
2) Copy an existing model definition, edit it, and re-register. This will completely replace the existing model definition with your new edited version.

Copy contentModel.xml to customContentModel.xml and apply the above modification to customContentModel.xml. Then add the following bean to Alfresco's extension config:

    <bean id="custom.dictionaryBootstrap" parent="dictionaryModelBootstrap" depends-on="dictionaryBootstrap">
        <property name="models">
            <list>
                <value>alfresco/extension/model/customContentModel.xml</value>
            </list>
        </property>
    </bean>

This is very similar to approach 1, except the modification is isolated to the extension directory, rather than editing core Alfresco files.

The other advantage of this approach is that you can register other custom models prior to registering your modified Alfresco model. This means your modified Alfresco model may refer to definitions in other custom models.

3) Define your own model, import the existing model namespace, and re-define any types/aspects you wish to modify. While this approach might seem like a valid one, it DOES NOT WORK. With Alfresco v1.4, the Data Dictionary will allow this (i.e. no error is given). However, Alfresco will still use the original definition and ignore the modified one.