2-Way Contact Synchronization

Analysis Overview

At the request of partners and customers attempting to implement a Contact synchronization service between SmartOffice and their internal systems with their own Contact Database, we analyzed five Ebix projects that included a middleware service accomplishing 2-Way Contact Synchronization through SmartIntegrator’s XML Engine.

The five services are built on different technologies and architectures, but all have very similar business requirements. This analysis attempts to compare and contrast the process and functionality design across the projects.

2WaySync1.jpg

Defining Scope

We found six factors that primarily influenced the scope of the requirements and design. Before committing to implementing a similar project, we recommend carefully considering the following concepts and deciding how your project will approach each.

Conflict Resolution Interface

Most of the services produce a Conflict Report listing Contacts determined to be duplicates or to have insufficient data. Some of the services treat a difference in any data element between matching Contacts as a conflict and let the user decide, field by field, how each conflict is resolved, either as a manual process or through configuration.

The sophistication of your conflict resolution strategy will affect the scope of your work more than any other factor. You will need to balance the power you want to give users against the difficulty of training them on complicated workflows and configurations, and against the resources you are willing to commit to the project.

System of Record Settings

How you determine the “system of record” for matching Contacts drives the design for comparing matched records and applying updates. The simplest design among the services analyzed treats one system as the ultimate system of record and uses its Contact to completely overwrite the matching Contact in the other system.

On the complex end of the spectrum, some of the services have system of record rules at the field level (e.g. the foreign system is always correct for Addresses and SmartOffice is always correct for Birth Date). In most of the services, a populated value is never overwritten with a NULL value from the other system. Some of the services compare the Created On and Modified On time stamps of the records to determine which data is more current and apply changes on that basis.
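
The rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the `FIELD_OWNER` table, and the `modified_on` key are hypothetical and do not come from any of the analyzed services. The order of precedence shown (never overwrite with NULL, then field-level owner, then most recent time stamp) is one possible combination, not a prescribed one.

```python
from datetime import datetime

# Hypothetical field-level system of record rules (assumption for illustration).
FIELD_OWNER = {"address": "foreign", "birth_date": "smartoffice"}


def merge_field(field, so_value, foreign_value, so_modified, foreign_modified):
    """Return the surviving value for a single field."""
    # A populated value is never overwritten by an empty (NULL) one.
    if foreign_value is None:
        return so_value
    if so_value is None:
        return foreign_value
    # Field-level system of record rule, if one is configured.
    owner = FIELD_OWNER.get(field)
    if owner == "foreign":
        return foreign_value
    if owner == "smartoffice":
        return so_value
    # No explicit rule: the more recently modified record wins.
    return so_value if so_modified >= foreign_modified else foreign_value


def merge_contact(so, foreign):
    """Merge two matched Contact dicts that each carry a 'modified_on' stamp."""
    fields = (set(so) | set(foreign)) - {"modified_on"}
    return {
        f: merge_field(f, so.get(f), foreign.get(f),
                       so["modified_on"], foreign["modified_on"])
        for f in sorted(fields)
    }
```

Keeping the rules in a small table such as `FIELD_OWNER` makes it possible to expose them as configuration rather than code, at the cost of more user training.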

Synchronization Criteria

Which Contact records you choose to synchronize from each system determines the data volume your project will handle. Knowing the expected volume for each synchronization session will inform your scalability design and throughput metrics.

The synchronization criteria and volume will also influence what user workflows are feasible and how fuzzy the matching algorithms can be without compromising your throughput goals.

Delete Logs

Being able to reference a log of Contact records deleted from a system is extremely useful. This gives you the option to translate that deletion into the Contact being deleted or removed from the Synchronization Criteria of the other system. This prevents the frustrations of users not being able to figure out how to permanently delete the Contact from one or both systems. [i]

Without this option, handling record deletion in general requires a more complex design.

Foreign ID Capture

Being able to write a persistent, unique identifier for a record from one system into the other system is another powerful tool. This will allow you to streamline the matching algorithm, and it provides a workaround for identifying when a record has been deleted in the absence of the Delete Log.

Not having this option will require that less dependable components of your matching algorithm always be used. You will either need to implement fuzzier matching logic (compromising performance) or accept a higher rate of logical matching failures.
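
As a rough sketch of that trade-off, the Python below tries a one-shot match on a stored foreign ID first and only falls back to a less dependable comparison when no ID has been captured. The record layout (`id`, `foreign_id`, `name`, `birth_date`) and the choice of name plus birth date as the fallback key are assumptions made for illustration; they are not taken from the analyzed services.

```python
def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def find_match(source, targets):
    """Return the target record matching `source`, or None if there is no safe match."""
    # Preferred path: one-shot match on the foreign ID written in an earlier session.
    for target in targets:
        if target.get("foreign_id") == source["id"]:
            return target

    # Fallback path: less dependable match on name and birth date,
    # considered only for targets that carry no foreign ID yet.
    key = (normalize(source["name"]), source.get("birth_date"))
    candidates = [
        t for t in targets
        if not t.get("foreign_id")
        and (normalize(t["name"]), t.get("birth_date")) == key
    ]
    # An ambiguous result is treated as "no match" so it can be reported as a conflict.
    return candidates[0] if len(candidates) == 1 else None
```

Using the same function for both directions of the synchronization helps keep the matching logic consistent.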

Workflow Triggers and On-Demand Synchronization

Determining what initiates the synchronization of a Contact, a subset of Contacts, or the full set of Contacts has a large impact on scope. Some services have an automated full synchronization scheduled to occur at specified time intervals. These services usually have the fewest conflict resolution options and user workflows, as everything happens in the background.

The more sophisticated systems have user workflow triggers that tell the synchronization application an event has occurred and the Contact needs to be synchronized immediately. This allows for real-time data synchronization between the two databases. [ii]

Another approach implemented in some of the systems is an On-Demand initiation. They provide the user with a way to manually initiate synchronization of a Contact record.

Typical Process

The overall workflow of the systems analyzed points to the following steps and requirements as best practices.

System of Record Pushes First

Most of the services analyzed view SmartOffice as the system of record and for this reason move data from SmartOffice to the external system first. Having the system of record push first simplifies the field-level contact handling. The data in the secondary system is overwritten in the first stage, so when data is moved in the other direction we know that the proper data has survived.

Process Delete Logs First

To prevent the unnecessary handling of data that is going to be deleted anyway, and to prevent the repeated resurrection of records users are trying to remove from both systems, the removal of deleted records from the target system should be the first step in a synchronization process.

We advocate determining which records have been deleted from the source system, applying the same standard record matching algorithm (RMA) to find these records in the target system, and then deleting each from the target system. This should be done before processing the normal records from the source system.
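
The ordering can be illustrated with a minimal Python sketch. It assumes an in-memory dictionary standing in for the target system, keyed by the source system's ID, so the "match" is a simple key lookup rather than a full RMA. Both the data layout and the function name are hypothetical.

```python
def apply_deletes_then_upserts(deleted_ids, source_records, target):
    """Process the delete log first, then the normal records.

    `target` is a dict mapping the source system's ID to the target record
    (an in-memory stand-in for the real target system).
    """
    deleted = set(deleted_ids)

    # Step 1: remove deleted records from the target before anything else.
    for source_id in deleted:
        target.pop(source_id, None)

    # Step 2: process the normal records, skipping anything in the delete log
    # so a deleted Contact is not resurrected in the same session.
    for record in source_records:
        if record["id"] in deleted:
            continue
        target[record["id"]] = dict(record)

    return target
```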

One RMA for Both Directions

We recommend making the RMA for each direction of 2-Way Synchronization as identical as possible. Inconsistencies in the matching logic between directions are the primary source of data duplication in the synchronization process.

SYNC Method

When pushing data into SmartOffice, leveraging the SYNC method is a very effective way to limit the number of request/response transactions needed to process the data set. Using the SYNC method will also simplify your design, as it reduces the number of temporary variables that need to be handled by the synchronization application. We acknowledge it is generally impossible to put the full RMA in a single SYNC request (e.g. Object ID matching is not supported in SYNC), but we still encourage you to leverage it to the extent you can.

In the projects analyzed, SYNC requests for the full data sets are made in batches of around fifty records. Often the requests are multithreaded in order for the project to hit elapsed-time goals.
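
A batching and multithreading arrangement of this kind might look like the following Python sketch. The `send_batch` callable is a placeholder for whatever code builds and posts one SYNC request; its name and signature are assumptions, and no actual SmartIntegrator call is shown. The batch size of fifty reflects the figure above, while the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(records, size=50):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def push_in_batches(records, send_batch, batch_size=50, workers=4):
    """Send each batch through `send_batch` (hypothetical) on a thread pool.

    Results are returned in batch order, one result per batch.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send_batch, batched(records, batch_size)))
```

For example, `push_in_batches(contacts, post_sync_request)` would call a user-supplied `post_sync_request` once per batch of up to fifty Contacts.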

Always Write Foreign ID and Add To Sync Set

Whenever a Contact is matched or inserted in either system, the source system’s ID should be written to the target record, even if the requirements dictate that the target record should not otherwise be updated. Tagging the target record with the ID will provide a unique one-shot match on all future synchronization sessions, allowing the synchronization application to skip more complex and time-consuming logic.

Similarly, whenever a record is matched or inserted during synchronization, the target record should also be modified so that it joins the data set processed in the other direction. Failing to do this often violates the overall system of record requirements: a Contact that is targeted in one direction but missing from the source set in the other direction is only ever synchronized one way.
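
In code, the two rules amount to a small tagging step that runs on every match or insert, whether or not any other field is updated. The sketch below is illustrative only; the `foreign_id` and `in_sync_set` field names are hypothetical stand-ins for whatever your target system uses to store the ID and mark membership in the synchronization criteria.

```python
def tag_target(target, source_id, updates=None):
    """Tag a matched or inserted target record.

    `updates` may be None when requirements say the target data must not change;
    the foreign ID and sync-set membership are written regardless.
    """
    if updates:
        target.update(updates)
    target["foreign_id"] = source_id   # enables a one-shot match next session
    target["in_sync_set"] = True       # includes the record in the reverse direction
    return target
```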

SEARCH then GET (and Multithread)

Unless you can reasonably assume you will have fewer than 2,000 Contacts, do not attempt to use a single SEARCH request to retrieve all of the data when SmartOffice is the source. This is a very common design mistake because it is simple to implement and appears to work with small data sets. The problem is that response pagination starts at 2,000 records, which requires SmartOffice to maintain a persistent session so your synchronization application can retrieve page after page of data. This session has a memory buffer limit, and the session data is flushed when the limit is reached, ending the response data prematurely.

Instead, use the SEARCH request to retrieve only the Object ID of each Contact record, then follow it with GET requests that use the Object IDs to retrieve the rest of the data for each Contact. This returns all of your data without violating the buffer limit. It can also improve elapsed time for the overall process, because the GET requests can be made simultaneously on multiple threads, whereas the SEARCH page requests would have been single-threaded.
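
The pattern can be sketched in Python as follows. The two callables, `search_ids` and `get_contact`, are hypothetical wrappers around your own SEARCH and GET request code; their names and signatures are assumptions, and the actual XML requests are not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_contacts(search_ids, get_contact, workers=8):
    """Retrieve Contacts with an ID-only SEARCH followed by parallel GETs.

    search_ids:  callable returning an iterable of Object IDs (hypothetical wrapper
                 around a SEARCH request that asks for the Object ID only).
    get_contact: callable taking one Object ID and returning the full Contact
                 (hypothetical wrapper around a GET request).
    """
    object_ids = list(search_ids())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of object_ids in its results.
        return list(pool.map(get_contact, object_ids))
```

The worker count is arbitrary here and would need tuning against your own elapsed-time goals and the load the server can tolerate.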

Launch Conflict Resolution

The final phase of each synchronization process is to launch the conflict resolution workflow. As Contact records are identified as being in conflict in the previous phases, they are usually added to a conflict log that is referenced for this final phase. Having the user address conflicts immediately as they are identified in each phase is usually undesirable for usability reasons.
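
One way to structure this deferral is sketched below in Python. The `ConflictLog` class, the phase callables, and `launch_resolution` are all hypothetical names introduced for illustration; the point is only that each phase appends to a shared log and the resolution workflow starts once, after the last phase.

```python
from dataclasses import dataclass, field


@dataclass
class ConflictLog:
    """Collects conflicts during the session instead of interrupting the user."""
    entries: list = field(default_factory=list)

    def add(self, phase, contact_id, reason):
        self.entries.append(
            {"phase": phase, "contact_id": contact_id, "reason": reason}
        )


def run_session(phases, launch_resolution):
    """Run each (name, phase) pair in order, then launch conflict resolution once.

    Each phase is a callable taking (name, log) and recording conflicts via log.add.
    """
    log = ConflictLog()
    for name, phase in phases:
        phase(name, log)
    # Conflict resolution is the final step, and only runs if there is work to do.
    if log.entries:
        launch_resolution(log.entries)
    return log
```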

Record Matching Algorithm

Below is an example of the logical workflow for processing an individual Contact record from an external system against SmartOffice. This example represents a cross-section of the workflows from the systems analyzed. This workflow follows the best practices listed above and should provide a starting point for design on any synchronization project.

2WaySync2.jpg

[i] There is a table of logically deleted Contact records in SmartOffice that can be used as a Deletion Log. As of April 25th, 2012, an enhancement is in development to expose this table through a method on the SmartIntegrator XML Engine.

[ii] As of April 25th, 2012 there is no Workflow Trigger feature in SmartOffice. This is an enhancement scheduled to be implemented in SmartOffice v8.



Topic revision: 29 Oct 2012, dustin
 
