Integration Middleware Brick
Univ of Hawaii - ITS Technical Architecture - Brick
Integration Middleware
Primary Architect: Michael Hodges
Background:
Middleware is required for efficiently sharing data between applications in many scenarios. From a simplified middleware perspective, there are two types of applications and four data sharing strategies. Each data sharing strategy has its strengths and weaknesses.
Application Types:
Publishers
Applications that make business data available to other applications.
Generally, the publishers are canonical and are considered to be the Systems of Record (SoR). Each of the ERP applications (e.g.: Banner, KFS, PeopleSoft HR) is an SoR.
Consumers
Applications that access or acquire business data from SoRs and other applications.
Common Data Sharing Strategies:
Enterprise Service Bus (ESB) and/or Message Broker (MB)
Data is published, queued, and consumed, effectively decoupling the applications that publish the data from the applications that consume the data.
Data publication is event driven, resulting in near real-time sharing of data updates.
Data can be published once and consumed multiple times.
This architecture is highly scalable, conserves the publisher application’s CPU and I/O resources, and protects the publisher application from poorly written queries.
Standardizes the code needed to consume the data. Open-source MB clients are readily available for nearly every programming language.
Decouples the publishing from the consuming applications. There is substantially diminished need to coordinate or otherwise anticipate planned and unplanned application outages.
The ESB or MB infrastructure is generally implemented to be redundant and highly available.
Application Programming Interface (API) and/or Application Web Service (WS)
A custom application interface that allows a consumer application to request data.
The consumer application must periodically poll the publisher API and may require custom logic to determine which data updates (deltas) have occurred.
The applications are considered to be coupled.
Each request impacts the publisher application workload.
The consumer application may throw an unexpected error if the publisher is not available.
Design flaws in the consumer application may impact the performance of the publisher application.
Direct database access (DB)
The consumer application includes custom logic to directly access the publisher database.
The applications are considered to be tightly coupled.
Each request impacts the publisher application workload.
The consumer application may throw an unexpected error if the publisher is not available.
Design flaws in the consumer application may impact the performance of the publisher application.
Any changes to the publisher application design (data, business logic) require software modifications and regression testing for the consumer application(s).
Batch Extract, Transform, Load (Batch ETL) process
A custom, recurring batch-process step extracts data from the publisher application, transforms it, and saves it as an extract file.
A second custom, recurring batch-process step reads the extract file, transforms the data, and updates the consumer application.
Custom semaphores are required to coordinate the file exports and imports in order to ensure the integrity of the process.
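As a minimal sketch of the message broker strategy described above, the following in-memory Python example shows publish-once, consume-many delivery using one queue per registered consumer. The class name `InMemoryBroker`, the topic name, and the consumer names are hypothetical illustrations; this is not the UH Message Broker or the API of any real broker client.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker (hypothetical): one queue per registered consumer per topic."""

    def __init__(self):
        # topic -> consumer name -> queue of pending messages
        self._queues = defaultdict(dict)

    def register(self, topic, consumer):
        # A consumer must register before messages are queued for it.
        self._queues[topic].setdefault(consumer, deque())

    def publish(self, topic, message):
        # The publisher writes once; the broker copies to every consumer queue.
        for queue in self._queues[topic].values():
            queue.append(message)

    def consume(self, topic, consumer):
        # Each consumer drains its own queue whenever it happens to be available.
        queue = self._queues[topic][consumer]
        messages = list(queue)
        queue.clear()
        return messages


broker = InMemoryBroker()
broker.register("student.enrollment", "email-lists")
broker.register("student.enrollment", "instructor-roster")

# The publisher does not know or care which consumers are currently running.
broker.publish("student.enrollment", {"student": "A123", "action": "add"})
broker.publish("student.enrollment", {"student": "A123", "action": "drop"})

first = broker.consume("student.enrollment", "email-lists")
# This consumer was not reading during publication and still receives both events.
second = broker.consume("student.enrollment", "instructor-roster")
```

Because each consumer reads from its own queue, a slow or offline consumer adds no load to the publisher, which is the decoupling property that the API and DB strategies lack.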
Use-Cases and Factors for Selecting an Integration Solution:
Near real-time data synchronization (e.g.: under one minute)
Data updates immediately trigger events that publish the data to a message queue where it is immediately available for one or more applications to consume. A minimal amount of network latency can be anticipated.
The consumer applications do not need additional logic to detect the data updates.
Periodic data synchronization (e.g.: hourly, nightly)
An MB queue may be utilized and read periodically.
A Batch ETL process may be used.
High volume updates (e.g.: large initial data loads, large periodic data reloads)
An MB queue may be utilized and read periodically.
A Batch ETL process may be used.
Application decoupling (e.g.: workload decoupling, application logic decoupling)
Decoupling applications ensures that:
planned and unplanned outages for one application do not affect the availability of the related applications;
peak workloads for one do not impact related applications; and
business logic changes in one do not require recoding in the others.
Publish once, consume many times
Once a message is published, any number of registered consumer applications can consume the message before it is retired.
Transactional data processing (e.g.: direct SoR database updates)
Database design provides for bundling database updates into transactions to ensure data integrity. If a transaction is not committed in its entirety, the partially updated data is rolled back.
State checking (e.g.: a direct SoR access for current student class enrollment status)
Query the SoR data to determine current state, such as student enrollment in a class, employee leave status, etc.
Familiarity (e.g.: ease of developing source code to utilize a data sharing strategy)
The newer the technology, the less familiar the developer community will be with it.
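The transactional and state-checking use-cases above can be sketched with Python's standard `sqlite3` module standing in for an SoR database. The `enrollment` table, the student ID, and the course names are hypothetical; the point is only that a failed multi-statement update leaves no partial change behind, and that current state is read with a direct query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment (student TEXT, course TEXT, UNIQUE(student, course))"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("A123", "ICS111"), ("A123", "ICS211")],
)
conn.commit()


def swap_course(conn, student, old, new):
    """Drop one course and add another as a single all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "DELETE FROM enrollment WHERE student = ? AND course = ?",
                (student, old),
            )
            conn.execute("INSERT INTO enrollment VALUES (?, ?)", (student, new))
        return True
    except sqlite3.IntegrityError:
        return False


def courses(conn, student):
    """State check: query the current enrollment directly."""
    rows = conn.execute(
        "SELECT course FROM enrollment WHERE student = ? ORDER BY course",
        (student,),
    )
    return [row[0] for row in rows]


# Fails on the INSERT (already enrolled in ICS211), so the DELETE is rolled back too.
ok = swap_course(conn, "A123", "ICS111", "ICS211")
after_failed = courses(conn, "A123")

# Succeeds: both statements are committed together.
ok2 = swap_course(conn, "A123", "ICS111", "ICS212")
after_ok = courses(conn, "A123")
```

A message queue cannot offer this guarantee across publisher and consumer, which is why the recommendations table steers transactional updates and state checks toward the API/WS and DB strategies.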
Data Sharing Recommendations Table:
Integration Use-Cases & Factors | MB | API/WS | DB | Batch ETL |
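Near real-time data synchronization | Yes | No(2) | No(2) | No |
Periodic data synchronization | Yes | Yes | Yes | Yes |
High volume updates | Yes | Yes | Yes | Yes |
Application decoupling | Yes | No | No | Yes |
Publish once, consume many times | Yes | No | No | No(4) |
Transactional data processing | No | Yes(6) | Yes(5) | No |
State checking | No | Yes | Yes | No |
Familiarity | No(1) | No(3) | Yes | Yes |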
Table footnotes:
(1) This is relatively new technology for our community. It has been introduced via the UH App Developer meetings and IAM documentation for developers. HCC, ORS, MIS and IAM thus far have become familiar with the MB.
(2) Brokered event-triggered data synchronization can be considered near real-time and scalable. Direct high frequency polling of an application is not scalable and is not recommended.
(3) Writing custom APIs, while not difficult, is not a typical programming task.
(4) Batch ETLs tend to be written one per pair of applications.
(5) However, it is unusual for another application to update the data of a System of Record.
(6) Yes, if the API/WS is designed accordingly. The API/WS decides what’s considered to be a transaction, not the consumer application.
Experimental | |
Strategic (3-5 Years) | |
Tactical (1-2 Years) | |
Containment | |
Retirement | |
Notes
The UH Message Broker is in Production for the following:
Banner, message producer
HCC, message consumer
KFS, message consumer
myGrant, message consumer
SECE, message producer
UHIMS, message producer and consumer
Emerging Trends
Banner Event Processor (BEP), developed by Scott Masuno
Consumer apps for student and course information
UHM requires auto-generated email lists based on student educational objectives.
HCC is interested in providing instructors with dynamic student add/drop information as events unfold.
The BEP can replace the Luminis Message Gateway functionality.
Enterprise Service Bus
An ESB is a layer above the Message Broker. It can support multiple message brokers and multiple routes, so that, for example, a single message can be published to multiple destinations rather than to a single Message Broker. Standardized as well as custom ETL logic, such as data transformation to JSON, can be associated with routes and destinations as needed. So far we have not determined a need for the additional complexity.
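The routing idea can be illustrated with a small Python sketch in which one published message is delivered to several destinations, each with an optional transformation. The `ServiceBus` class, the topic name, and the use of plain lists as stand-ins for message brokers are all hypothetical; no real ESB product or API is implied.

```python
import json


class ServiceBus:
    """Toy ESB (hypothetical): routes a topic to many destinations with transforms."""

    def __init__(self):
        # topic -> list of (destination, transform) pairs
        self._routes = {}

    def add_route(self, topic, destination, transform=None):
        # With no transform given, the message is passed through unchanged.
        passthrough = lambda message: message
        self._routes.setdefault(topic, []).append(
            (destination, transform or passthrough)
        )

    def publish(self, topic, message):
        # One publication fans out along every route registered for the topic.
        for destination, transform in self._routes.get(topic, []):
            destination.append(transform(message))


# Plain lists stand in for two separate message brokers.
broker_a, broker_b = [], []

bus = ServiceBus()
bus.add_route("course.update", broker_a)
# This route carries ETL logic: the message is serialized to JSON on the way.
bus.add_route(
    "course.update",
    broker_b,
    transform=lambda message: json.dumps(message, sort_keys=True),
)

bus.publish("course.update", {"crn": "12345", "seats": 3})
```

After the single `publish` call, `broker_a` holds the original dictionary and `broker_b` holds its JSON string form, which is the extra routing and transformation layer that a bare Message Broker does not provide.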
Change History
Brick first approved January 2015
Definitions
Experimental | Someone in ITS is currently investigating or experimenting with this technology. |
Strategic | ITS will be investing in this technology for 3-5 years. |
Tactical | ITS will be investing in this technology for 1-2 years. |
Containment | ITS will continue to use this technology for existing systems, but will no longer invest in this technology and/or grow its use. |
Retirement | ITS has a firm plan (and timeline) to retire this technology. |