High Availability Configuration V1.3
Introduction
WARNING: THE CONFIGURATIONS IN THIS DOCUMENT WILL NOT WORK WITH 1.4 OR LATER.
The Alfresco server is built up of a vast number of interacting components that are mostly wired together using the Spring framework. This document gives a high-level overview of the components involved in a cluster of Alfresco servers. Canned or sample configurations are available from Alfresco upon request.
NOTE: This document assumes knowledge of how to extend the server configuration. (Repository Configuration)
- <configRoot> - the Alfresco configuration files
- <extConfigRoot> - the customized system configuration files
- All configuration changes are shown against the original files, but it is expected that configuration will be done by means of extension rather than modification. Extension will ease the load on the administrator during future upgrades.
Configuration Components
Alfresco Server

This is the core component in the configuration. Although there are many configuration options controlling the server, these will only be dealt with here insofar as they affect the server's interoperability with other components.
Any server has three essential points of failure when it comes to data loss: the database, the content store (if not in the database) and the indexes. Although the indexes are composed entirely of derived data and can therefore be fully recovered or rebuilt, this can be a time-consuming process.
The root location of all server configuration files is dependent on the particular type of installation. Details of where the configuration can be found are available with the download of each type of installation. Most of the Alfresco configuration files will be located in an alfresco subdirectory located on the classpath. For the purposes of this document, that location will be referred to as <configRoot>/alfresco.
<configRoot>/alfresco/domain/transaction.properties
The settings in this file control whether transactions in the server are read-only or not. In other words, the server can be put into a read-only mode, disallowing any write operations from occurring.
<configRoot>/alfresco/scheduled-jobs-context.xml
The Alfresco server uses the Quartz libraries to manage scheduled jobs. All scheduled jobs are kept here.
Hibernate L2 Cache and other Caches
Caches are used in two primary areas: the Hibernate Level 2 cache and internal caches used by Alfresco components.
Support within Alfresco is limited to EHCache.
<configRoot>/alfresco/domain/hibernate-cfg.properties
The Hibernate cache provider class and transaction strategy are specified here.
<configRoot>/alfresco/hibernate-context.xml
Here the cache strategy is mapped to the persistent entities and collections.
<configRoot>/alfresco/cache-context.xml
This file contains caches configured for direct use by server components.
EHCache
EHCache is not inherently transactional but is wrapped by Hibernate so that all entities visible in the L2 cache conform to the cache strategy specified. A transactional cache adapter, org.alfresco.repo.cache.TransactionalCache, exists within Alfresco so that internal caches function transactionally. All necessary EHCache classes are shipped with the Alfresco server.
The currently shipped version of EHCache, which is 1.2.2, allows the caches to be clustered. Unfortunately, the configuration is quite verbose for the number of caches that are contained in the Alfresco system. See the EHCache homepage or the printable manual for more details.
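To give a feel for the verbosity mentioned above, the following is a minimal sketch of a clustered EHCache configuration using the RMI replication classes that ship with EHCache 1.2. The cache name xyzSharedCache, the multicast address and port, the listener port and the sizing attributes are illustrative assumptions, not values taken from the Alfresco distribution. An entry like the cacheEventListenerFactory element would have to be repeated for every cache that is to be clustered.

```xml
<ehcache>
    <!-- Discover other cache managers in the cluster via multicast (address and port are assumptions) -->
    <cacheManagerPeerProviderFactory
        class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
        properties="peerDiscovery=automatic, multicastGroupAddress=230.0.0.1, multicastGroupPort=4446"/>

    <!-- Listen for replication messages from peers (port is an assumption) -->
    <cacheManagerPeerListenerFactory
        class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
        properties="port=40001"/>

    <!-- Hypothetical cache: replicate only invalidations, not the cached values themselves -->
    <cache name="xyzSharedCache"
           maxElementsInMemory="1000"
           eternal="true"
           overflowToDisk="false">
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
            properties="replicateAsynchronously=false, replicatePuts=false, replicateUpdates=true, replicateUpdatesViaCopy=false, replicateRemovals=true"/>
    </cache>
</ehcache>
```

With replicatePuts=false and replicateUpdatesViaCopy=false, peers are told to drop stale entries rather than receive copies, so each server reloads from the database on its next read. Whether this suits a given cluster depends on the database setup and should be checked against the EHCache manual.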
<configRoot>/alfresco/cache-context.xml
This file contains the bean definitions for the Alfresco-controlled caches. The transactionalEHCacheManager bean defines the location of EHCache configuration for the Alfresco-controlled caches. This bean can be overridden to change the location of the cache configuration file. Note that for EHCache, each permanent cache, e.g. xyzSharedCache, is transactionally wrapped by a xyz bean.
<configRoot>/alfresco/ehcache-default.xml
This file contains the configurations for the Hibernate-controlled caches. Amongst other things, it controls the number of entities that can be loaded into the cache on a per-entity and per-association basis. The default values in the file will lead to a total cache size of approximately 512MB. Generally a server should have much more memory than this available in order to cope with the transient, per-session memory requirements.
<configRoot>/alfresco/ehcache-transactional.xml
This is the default location of the file containing the configurations for the Alfresco-controlled caches.
SwarmCache
The SwarmCache will have an adapter soon, allowing it to be added to the configurable options supported by the Alfresco Server.
This cache supports clustered invalidation, i.e. the local VM is responsible for adding objects to the cache just like EHCache and only removals from the cache are replicated across the cluster.
TreeCache
The JBoss TreeCache is a fully transactional, cluster-capable cache. It uses UDP multicast support, provided by JGroups, to communicate with other caches in the cluster. It should only be used in clusters, i.e. where multiple Alfresco servers will be used concurrently against the same database or database cluster. All objects added to the cache are replicated across the cluster. See the JBoss TreeCache Documentation for more details.
All necessary classes are shipped with the Alfresco server.
<configRoot>/treecache.xml
The JBoss documentation on the cache configuration should be adequate to configure a clustered cache. A sample configuration ships with the Alfresco server, containing the likely defaults needed to get started quickly.
<configRoot>/alfresco/domain/hibernate-cfg.properties
Use the org.hibernate.cache.TreeCacheProvider as the Hibernate cache provider class.
The only cache strategy that can be used by this cache, within Alfresco, is the transactional caching strategy.
This cache can only be used by Hibernate when operating within a JTA environment; for this reason, the Hibernate session factory needs access to the JTA Transaction Manager.
<configRoot>/alfresco/hibernate-context.xml
The transactionManager must be a JtaTransactionManager. The configuration for this will be shown with examples where necessary as there are several changes around the datasource and JNDI lookups that need configuring.
Content Stores
Content is critical data and is stored as binary files using a content store abstraction. In order to support distributed environments, various components have been written that perform functions such as synchronous and asynchronous replication and backup. The server is essentially unaware of the particular configurations and will use a single entry point when accessing content.
The stores' interface contract specifies that content never gets overwritten. Every time a file is updated, the content is actually written to a new file. It is the job of cleanup operations to clear up old content when necessary and appropriate. These are not discussed here.
The content metadata, which points to the location of the content in the content store, is updated as part of the transactions in which all threads operating in Alfresco must partake. Consequently, the behaviour of the content stores is essentially transactional as long as unused content eventually gets cleaned out of the stores.
<configRoot>/alfresco/content-services-context.xml
The contentService bean's store property points to the store to be used by the server.
The default store is a FileContentStore.
<configRoot>/alfresco/extension/replicating-content-services-context.xml.sample
This file is not linked into the application context by default. It contains examples of the various bean configurations pertaining to content replication. Use this file as a template when extending the server High Availability configurations.
<configRoot>/alfresco/repository.properties
The Alfresco data root is configured in this file (as used by the default fileContentStore bean).
ReplicatingContentStore
This store has no storage of its own, but is configured to work on top of a primary store and one or more secondary stores. It supports replication of content from the secondary stores into the primary store during reads (inbound replication) and/or replication from the primary store to the secondary stores during writes (outbound replication). The outbound replication can be configured to be either synchronous (within the writing transaction) or asynchronous.
Further examples will be provided with the appropriate configurations.
ContentStoreReplicator
This is not a content store, but acts as a dedicated process synchronizing, unidirectionally, a primary content store with a secondary content store. This background process can be used to ensure that a backup store remains up to date with a live store. It can be used in conjunction with other components modifying the stores.
Rsync
For file-based content stores, it is possible to use a 3rd party filesystem replication tool such as Rsync. The net effect is the same.
Lucene Indexes
Although the indexes contain derived data, the Alfresco server relies heavily on the transactional indexing capabilities built on top of Lucene. The indexes must be kept up to date with the current state of the persistence layer. Components are provided that can be configured to maintain or recover indexes where necessary – once again, depending on the particular server configuration.
<configRoot>/alfresco/repository.properties
The Alfresco data root acts as the root location for the Lucene indexes.
<configRoot>/alfresco/index-recovery-context.xml
The indexRecoveryComponent is disabled by default, but can be configured to execute once before quitting. This single pass is just a quick check to ensure that the state of the indexes is correct relative to the persistent data. This component was designed to provide the ability for backup servers to keep their indexes up to date with a database being changed by another server or process. For this reason, the system-wide L2 cache behaviour can be overridden as appropriate. Some examples will help illustrate appropriate configurations.
The Database
Specific database configurations are out of the scope of this document, but the behaviour required given the required server configuration is not. DBAs must be au fait with the clustering and replication capabilities of the database of choice. There are also some 3rd party database replication and clustering libraries available.
<configRoot>/alfresco/repository.properties
The default database connection details can be found here.
<configRoot>/alfresco/core-services-context.xml
The default dataSource bean is defined here using the properties set in repository.properties. This data source will not be used when a JNDI-provided JtaTransactionManager is in use.
<configRoot>/alfresco/hibernate-context.xml
All database interaction is handled by Hibernate. The data source provided to the Hibernate session factory will determine how the database is accessed.
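As an illustration of how the data source could be supplied by the application server rather than by repository.properties, the sketch below overrides the dataSource bean with a JNDI lookup using Spring's JndiObjectFactoryBean. The JNDI name java:/AlfrescoDS is a hypothetical example; it assumes a data source has already been bound under that name in the application server.

```xml
<!-- Sketch only: replaces the default dataSource bean with a JNDI lookup -->
<bean id="dataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
    <property name="jndiName">
        <!-- Hypothetical JNDI name; use the name bound in your application server -->
        <value>java:/AlfrescoDS</value>
    </property>
</bean>
```

Because the bean keeps the id dataSource, any bean that refers to it, such as the Hibernate session factory, would pick up the container-managed data source without further changes.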
Content Store Replication
Asynchronous Background Replication
The simplest component to use to enable content store replication is the ContentStoreReplicator, which asynchronously ensures that all content from the source store is copied to the target store. It is possible to configure a replicator to operate from A to B and another to operate from B to A.
<configRoot>/alfresco/extension/replicating-content-services-context.xml.sample
<!-- Secondary file store that receives the replicated content -->
<bean id="backupContentStore"
      class="org.alfresco.repo.content.filestore.FileContentStore">
    <constructor-arg>
        <value>${dir.contentstore}/../backups/alfresco</value>
    </constructor-arg>
</bean>
<!-- Copies content, in one direction, from the source store to the target store -->
<bean id="contentStoreReplicator"
      class="org.alfresco.repo.content.replication.ContentStoreReplicator"
      depends-on="fileContentStore, backupContentStore" >
    <property name="sourceStore">
        <ref bean="fileContentStore" />
    </property>
    <property name="targetStore">
        <ref bean="backupContentStore" />
    </property>
</bean>
<!-- Quartz trigger that starts the replicator on a schedule -->
<bean id="contentStoreBackupTrigger" class="org.alfresco.util.CronTriggerBean">
    <property name="jobDetail">
        <bean class="org.springframework.scheduling.quartz.JobDetailBean">
            <property name="jobClass">
                <value>org.alfresco.repo.content.replication.ContentStoreReplicator$ContentStoreReplicatorJob</value>
            </property>
            <property name="jobDataAsMap">
                <map>
                    <entry key="contentStoreReplicator">
                        <ref bean="contentStoreReplicator" />
                    </entry>
                </map>
            </property>
        </bean>
    </property>
    <property name="scheduler">
        <ref bean="schedulerFactory" />
    </property>
    <!-- Every day at 03:00 -->
    <property name="cronExpression">
        <value>0 0 3 * * ?</value>
    </property>
</bean>
In-process Replication
The ReplicatingContentStore serves two primary aims: Replication of content (both inbound and outbound) and simultaneous access to multiple content stores.
The primary ContentStore is the default location from which content will be read and to which content will be written. The secondary stores are used to access content in the event that it doesn't exist in the primary store. The order of the secondary stores is the order in which they will be searched.
Replication
Replication can be inbound, outbound, or both.
<configRoot>/alfresco/extension/replicating-content-services-context.xml.sample
<property name="inbound">
    <value>true</value>
</property>
If content is not present in the primary store, but can be found in one of the secondary stores, then the content is copied into the primary store. This can be used to access content that has been archived to a slow-access store but must remain accessible to the repository.
<property name="outbound">
    <value>true</value>
</property>
<property name="transactionService">
    <ref bean="transactionComponent" />
</property>
<property name="outboundThreadPoolExecutor">
    <ref bean="threadPoolExecutor" />
</property>
Newly written content is copied from the primary store to all the secondary stores. If the outboundThreadPoolExecutor property is set, then the content will be replicated outwards asynchronously; otherwise, replication occurs within the same transaction that closed the stream into the primary store. Either way, when outbound replication is enabled, the transactionService must be supplied.
Simultaneous Access
Regardless of whether replication is being used, the ReplicatingContentStore can be used to access multiple content stores. The order of access is the primaryStore followed by the secondaryStores in the order in which they appear in the configuration file.
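Putting the property fragments from this section together, a complete bean definition might look like the sketch below. The bean id replicatingContentStore and the fully qualified class name are assumptions (the class is assumed to live in the same package as ContentStoreReplicator); the property names are those described in the text, and fileContentStore and backupContentStore are the stores defined earlier.

```xml
<!-- Sketch only: bean id and class package are assumptions -->
<bean id="replicatingContentStore"
      class="org.alfresco.repo.content.replication.ReplicatingContentStore">
    <!-- Default store for reads and writes -->
    <property name="primaryStore">
        <ref bean="fileContentStore" />
    </property>
    <!-- Searched in list order when content is missing from the primary store -->
    <property name="secondaryStores">
        <list>
            <ref bean="backupContentStore" />
        </list>
    </property>
    <!-- Copy content found in a secondary store into the primary store on read -->
    <property name="inbound">
        <value>true</value>
    </property>
    <!-- Copy newly written content out to the secondary stores -->
    <property name="outbound">
        <value>true</value>
    </property>
    <property name="transactionService">
        <ref bean="transactionComponent" />
    </property>
    <!-- Omit this property for synchronous, in-transaction outbound replication -->
    <property name="outboundThreadPoolExecutor">
        <ref bean="threadPoolExecutor" />
    </property>
</bean>
```

For the server to use this store, the contentService bean's store property would have to point to it instead of the default fileContentStore.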
Standalone Configuration
Index Recovery Component
<configRoot>/alfresco/index-recovery-context.xml
<bean id="indexRecoveryComponent" class="org.alfresco.repo.node.index.FullIndexRecoveryComponent" parent="indexRecoveryComponentBase">
    <property name="executeFullRecovery">
        <value>true</value>
    </property>
    <property name="runContinuously">
        <value>true</value>
    </property>
    <property name="waitTime">
        <value>1000</value>
    </property>
    <property name="l2CacheMode">
        <value>IGNORE</value>
    </property>
</bean>
If the Lucene indexes are corrupted or need to be rebuilt, then the executeFullRecovery property can be enabled. When not running continuously, the process will make a sequential pass through all persisted data and reindex missing data. If the indexes are corrupted, they can be deleted and this option will ensure that they get rebuilt.
By default, this component just ensures that any outstanding full text indexing is completed shortly after server startup. This is activated by a Quartz job bean, indexRecoveryTrigger, in <configRoot>/alfresco/scheduled-jobs-context.xml.
Content Store
<configRoot>/alfresco/content-services-context.xml
<bean id="fileContentStore"
      class="org.alfresco.repo.content.filestore.FileContentStore">
    <constructor-arg>
        <value>${dir.contentstore}</value>
    </constructor-arg>
</bean>
The file store is given a location against which all content URLs are relative. The ${dir.contentstore} placeholder is substituted by Spring.
<configRoot>/alfresco/repository.properties
dir.root=c:/temp/alfresco
dir.contentstore=${dir.root}/contentstore
Override the properties or bean to put content in a well-known location. By default, the store root is relative to the execution location and should be changed for production environments.
Database
<configRoot>/alfresco/repository.properties
# Database configuration
db.driver=org.gjt.mm.mysql.Driver
db.name=alfresco
db.url=jdbc:mysql:///${db.name}
db.username=alfresco
db.password=alfresco
Cold Backup
The simplest form of backup for the Alfresco server involves backing up the three critical data stores.
The recommended backup order is:
- Content Stores
  - Missing content (relative to the metadata in the database) is handled by the server.
- Database
  - Any backup of the live database must adhere to a READ_COMMITTED isolation level. No backup process will ever be fast enough to guarantee that all content metadata has content at this stage. For this reason, the indexes and clients have been developed to handle missing content. Doing a text search for "nicm" (Not Indexed Content Missing) will show some of the documents missing from the store. Otherwise the clients will highlight the missing documents when they are accessed.
- Indexes
  - These can be rebuilt, so backing them up is just a way to save time when starting a server up on restored data. The index recovery component will not only ensure that changes more current than the backed up index are added, but will remove any stale data from the indexes too. If the indexes are corrupted, they can be deleted completely and, with time, the indexRecoveryComponent will completely rebuild the indexes.
  - Use the indexBackupTrigger to ensure that a clean and consistent set of indexes is available for backup. If a copy of the live indexes is made, then there is a possibility of having corrupt backups of the indexes.
All three components are resilient to any missing data.
<configRoot>/alfresco/repository.properties
The default location for storage of both content and indexes is defined by the dir.root property.
<configRoot>/alfresco/core-services-context.xml
<bean id="luceneIndexBackupComponent"
      class="org.alfresco.repo.search.impl.lucene.LuceneIndexerAndSearcherFactory$LuceneIndexBackupComponent">
    <property name="transactionService">
        <ref bean="transactionComponent" />
    </property>
    <property name="factory">
        <ref bean="luceneIndexerAndSearcherFactory" />
    </property>
    <property name="nodeService">
        <ref bean="nodeService" />
    </property>
    <property name="targetLocation">
        <value>${dir.root}/backup-lucene-indexes</value>
    </property>
</bean>
The luceneIndexBackupComponent (>V1.0.0) bean is able to copy the Lucene files whilst holding an in-VM lock of the Lucene indexes. Set the target location to copy the indexes to. The target location is overwritten each time this process is run.
<configRoot>/alfresco/scheduled-jobs-context.xml
<bean id="indexBackupTrigger" class="org.alfresco.util.TriggerBean">
    ...
    <property name="hour">
        <value>03</value>
    </property>
    <property name="minute">
        <value>00</value>
    </property>
    <property name="repeatInterval">
        <value>86400000</value>
    </property>
</bean>
Warm Backup Server
A warm backup server is ready to run after a few minor configuration changes. Content is replicated from the master server's content store (or a backup of it), the indexes are kept up to date locally and the database is replicated using native database mechanisms.
In this example, the server is set to run in read-only mode. This allows full read functionality whilst the backup server is running. Whether or not the server should run in read-only mode is dependent on the behaviour of the slave database. If the slave database allows transactions to be committed, then care must be taken to ensure that no write operations are performed.
Server
<configRoot>/alfresco/domain/transaction.properties
server.transaction.mode.default=PROPAGATION_REQUIRED, readOnly
server.transaction.allow-writes=false
Make all transactions read-only and disable writes from components such as the system bootstrap importer.
<configRoot>/alfresco/domain/hibernate-cfg.properties
hibernate.cache.use_second_level_cache=false
Disable the Hibernate L2 cache so that the server always has full visibility of the latest database changes.
Content Store
Configure a continuously-running content store replicator to pull content from the shared backup store into the primary store on the replicated server.
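One way to approximate this, reusing only the bean classes shown earlier in this document, is a replicator whose source and target are reversed and whose trigger fires frequently. The sketch below makes several assumptions: the bean ids sharedBackupContentStore, contentStorePullReplicator and contentStorePullTrigger are hypothetical, the path /mnt/alfresco-backup/contentstore stands in for wherever the shared backup store is mounted, and a five-minute schedule is used in place of a truly continuous process.

```xml
<!-- Hypothetical: the shared backup store as seen from the backup server -->
<bean id="sharedBackupContentStore"
      class="org.alfresco.repo.content.filestore.FileContentStore">
    <constructor-arg>
        <value>/mnt/alfresco-backup/contentstore</value>
    </constructor-arg>
</bean>
<!-- Pull content from the shared backup store into the local primary store -->
<bean id="contentStorePullReplicator"
      class="org.alfresco.repo.content.replication.ContentStoreReplicator">
    <property name="sourceStore">
        <ref bean="sharedBackupContentStore" />
    </property>
    <property name="targetStore">
        <ref bean="fileContentStore" />
    </property>
</bean>
<!-- Start the replicator every five minutes (interval is an assumption) -->
<bean id="contentStorePullTrigger" class="org.alfresco.util.CronTriggerBean">
    <property name="jobDetail">
        <bean class="org.springframework.scheduling.quartz.JobDetailBean">
            <property name="jobClass">
                <value>org.alfresco.repo.content.replication.ContentStoreReplicator$ContentStoreReplicatorJob</value>
            </property>
            <property name="jobDataAsMap">
                <map>
                    <entry key="contentStoreReplicator">
                        <ref bean="contentStorePullReplicator" />
                    </entry>
                </map>
            </property>
        </bean>
    </property>
    <property name="scheduler">
        <ref bean="schedulerFactory" />
    </property>
    <property name="cronExpression">
        <value>0 0/5 * * * ?</value>
    </property>
</bean>
```

A shorter interval reduces the window in which the backup server lacks recent content, at the cost of more frequent scans of the shared store.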
Lucene Indexes
<configRoot>/alfresco/index-recovery-context.xml
<bean id="indexRecoveryComponent" class="org.alfresco.repo.node.index.FullIndexRecoveryComponent" parent="indexRecoveryComponentBase">
    <property name="executeFullRecovery">
        <value>true</value>
    </property>
    <property name="runContinuously">
        <value>true</value>
    </property>
    <property name="waitTime">
        <value>1000</value>
    </property>
    <property name="l2CacheMode">
        <value>IGNORE</value>
    </property>
</bean>
When running continuously against a changing database, the Hibernate L2 cache must never be used.
Hot Backup Server
A hot backup server is always ready to run.
The setup could be exactly the same as the warm backup server, with the exception that it is not read-only. Depending on the access allowed by the database, access to the server may have to be restricted until the server is live, i.e. until the database is ready for use. In a failover environment, the server might be permanently ready for use but invisible until the failover server exposes it.
Currently, the failover characteristics of the CIFS server are under investigation.
Server
<configRoot>/alfresco/domain/transaction.properties
server.transaction.mode.default=PROPAGATION_REQUIRED
server.transaction.allow-writes=true
The server must remain writable in case it is brought online.
<configRoot>/alfresco/domain/hibernate-cfg.properties
hibernate.cache.use_second_level_cache=true
This must be enabled for general use once the server goes live. Until the server goes live, the L2 cache will remain unpopulated. Once the SwarmCache adapters are present, it would be feasible to have all servers (live and backup) run against this type of cache even in the failover environment.
Content Store
This is the same as for the warm backup server. If the content store is not shared with the live server, then a ContentStoreReplicator can be configured to pull the content in asynchronously.
Another option is to use a ReplicatingContentStore with the secondary content store being the common backup store.
Lucene Indexes
<configRoot>/alfresco/index-recovery-context.xml
<bean id="indexRecoveryComponent" class="org.alfresco.repo.node.index.FullIndexRecoveryComponent" parent="indexRecoveryComponentBase">
    <property name="executeFullRecovery">
        <value>true</value>
    </property>
    <property name="runContinuously">
        <value>true</value>
    </property>
    <property name="waitTime">
        <value>10000</value>
    </property>
    <property name="l2CacheMode">
        <value>IGNORE</value>
    </property>
</bean>
All access to the persistence layer will bypass the L2 cache, effectively leaving it completely unchanged while running as a hot backup.
Once the server is made live, the process will still run, but will cease performing any work.
Clustered Server
Although there are many ways to configure the Alfresco server to run in a cluster, the essential mechanisms of clustered database, clustered caches, shared content and updating indexes remain the same. In the diagram above, the server has been configured to share content via a backup content store, cluster the caches and database, and keep the indexes up to date with an indexRecoveryComponent.
Lucene Indexes
Configure the index recovery component to run continuously against the L2 cache in NORMAL mode.
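A sketch of such a configuration, adapted from the indexRecoveryComponent definitions shown for the backup servers, follows. Only the l2CacheMode value differs from those examples. The executeFullRecovery and waitTime values are carried over from them as assumptions and may need tuning for a cluster.

```xml
<bean id="indexRecoveryComponent" class="org.alfresco.repo.node.index.FullIndexRecoveryComponent" parent="indexRecoveryComponentBase">
    <!-- Carried over from the backup server examples (assumption) -->
    <property name="executeFullRecovery">
        <value>true</value>
    </property>
    <!-- Keep checking for changes made by other servers in the cluster -->
    <property name="runContinuously">
        <value>true</value>
    </property>
    <!-- Carried over from the backup server examples (assumption) -->
    <property name="waitTime">
        <value>1000</value>
    </property>
    <!-- Use the clustered L2 cache rather than bypassing it -->
    <property name="l2CacheMode">
        <value>NORMAL</value>
    </property>
</bean>
```

NORMAL mode is reasonable here because the clustered L2 cache is kept consistent across servers, unlike the backup scenarios where the database changes behind the cache.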
Content Store
In this example, the backup content store is visible to all servers in the cluster. Configure the ReplicatingContentStore to:
- Replicate out to the backup store either synchronously or asynchronously, depending on the server usage.
- Replicate in if the backup store is on a network.
Database
All databases in the cluster must replicate their data transactionally.
Caches
In this example, the clustered cache being used will be the JBoss TreeCache. It is a fully transactional cache, and must therefore have direct access to a TransactionManager. When running in the JBoss appserver, this is readily available via JNDI.
<configRoot>/alfresco/hibernate-context.xml
<bean id="jndiTransactionManagerFactory" class="org.springframework.jndi.JndiObjectFactoryBean" >
    <property name="jndiName">
        <value>java:/TransactionManager</value>
    </property>
    <property name="proxyInterface">
        <value>javax.transaction.TransactionManager</value>
    </property>
</bean>
<bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager" >
    <property name="userTransactionName">
        <null/>
    </property>
    <property name="transactionManager">
        <ref bean="jndiTransactionManagerFactory" />
    </property>
</bean>
Configure a JtaTransactionManager to look up the TransactionManager from the standard location in the JBoss server JNDI tree.
<bean id="sessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
    <property name="dataSource">
        <ref bean="dataSource" />
    </property>
    <property name="jtaTransactionManager" >
        <ref bean="jndiTransactionManagerFactory" />
    </property>
    ...
Hibernate sessions will look up the TransactionManager to participate in the transactions.
<configRoot>/alfresco/domain/hibernate-cfg.properties
...
hibernate.cache.provider_class=org.hibernate.cache.TreeCacheProvider
...
cache.strategy=transactional
The TreeCache for the L2 cache will be constructed with a lookup of the common TransactionManager. The L2 cache will therefore be fully transactional.
<configRoot>/treecache.xml
<attribute name="CacheMode">REPL_SYNC</attribute>
<configRoot>/alfresco/cache-context.xml
<bean name="userToAuthorityCache" class="org.alfresco.repo.cache.TreeCacheAdapter">
    <property name="cache">
        <bean class="org.jboss.cache.TreeCache" init-method="start">
            <property name="transactionManagerLookup">
                <bean class="org.alfresco.repo.transaction.TransactionManagerJndiLookup">
                    <property name="jndiName">
                        <value>java:/TransactionManager</value>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
    <property name="regionName">
        <value>userToAuthorityCache</value>
    </property>
</bean>
As the TreeCache is already fully transactional, there is no need to wrap access to the cache with a transactional adapter cache. The userToAuthorityCache can be defined directly, being passed a fully configured TreeCache. Override all caches to use the TreeCacheAdapter as appropriate.





