Integration Middleware Brick

Univ of Hawaii - ITS Technical Architecture - Brick   

Integration Middleware

Primary Architect:  Michael Hodges

Background:

In many scenarios, middleware is required to share data efficiently between applications.  From a simplified middleware perspective, there are two types of applications and four data sharing strategies, each with its own strengths and weaknesses.

Application Types:

  1. Publishers

    1. Applications that make business data available to other applications.  

    2. Generally the publishers are canonical and are considered to be the Systems of Record (SoR).  Each of the ERP applications (e.g.: Banner, KFS, PeopleSoft HR) is a SoR.

  2. Consumers

    1. Applications that access or acquire business data from SoRs and other applications.

Common Data Sharing Strategies:

  1. Enterprise Service Bus (ESB) and/or Message Broker (MB)

    1. Data is published, queued, and consumed, effectively decoupling the applications that publish the data from the applications that consume the data.

    2. Data publication is event driven, resulting in near real-time sharing of data updates.

    3. Data can be published once and consumed multiple times.  

      1. This is a highly scalable architecture, conserves the publisher application’s CPU and I/O resources, and protects the publisher application from poorly written queries.

      2. It standardizes the code needed to consume the data.  Open-source MB clients are readily available for nearly every programming language.

    4. Decouples the publishing from the consuming applications.  There is substantially diminished need to coordinate or otherwise anticipate planned and unplanned application outages.

    5. The ESB or MB infrastructure is generally implemented to be redundant and highly available.

  2. Application Programming Interface (API) and/or Application Web Service (WS)

    1. A custom application interface that allows a consumer application to request data.

    2. The consumer application must periodically poll the publisher API and may require custom logic to determine which data updates (deltas) have occurred.

    3. The applications are considered to be coupled.  

      1. Each request impacts the publisher application workload.

      2. The consumer application may throw an unexpected error if the publisher is not available.

      3. Design flaws in the consumer application may impact the performance of the publisher application.

  3. Direct database access (DB)

    1. The consumer application includes custom logic to directly access the publisher database.  

    2. The applications are considered to be tightly coupled.  

      1. Each request impacts the publisher application workload.

      2. The consumer application may throw an unexpected error if the publisher is not available.

      3. Design flaws in the consumer application may impact the performance of the publisher application.

      4. Any changes to the publisher application design (data, business logic) require software modifications and regression testing for the consumer application(s).

  4. Batch Extract, Transform, Load (Batch ETL) process

    1. A custom, recurring batch-process step extracts data from the publisher application, transforms it, and saves it as an extract file.  

    2. A second custom, recurring batch-process step reads the extract file, transforms the data, and updates the consumer application.

    3. Custom semaphores are required to coordinate the file exports and imports in order to ensure the integrity of the process.
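
The Batch ETL coordination described in strategy 4 can be sketched as follows. This is a minimal illustration, assuming the semaphore is a marker file written next to the extract; the file names, the `extract` and `load` helpers, and the sample rows are all hypothetical and not taken from any UH system.

```python
import csv
import os
import tempfile

def extract(rows, extract_path):
    """Step 1: write the extract file, then create a marker ("semaphore") file."""
    tmp_path = extract_path + ".tmp"
    with open(tmp_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name"])
        writer.writerows(rows)
    os.replace(tmp_path, extract_path)          # rename, so a partial file is never visible
    open(extract_path + ".ready", "w").close()  # semaphore: the extract is complete

def load(extract_path, target):
    """Step 2: import only when the semaphore exists; return True if a load ran."""
    ready_path = extract_path + ".ready"
    if not os.path.exists(ready_path):
        return False                            # extract missing or still in progress
    with open(extract_path, newline="") as f:
        for row in csv.DictReader(f):
            target[row["id"]] = row["name"].upper()  # the "transform" step
    os.remove(ready_path)                       # consume the semaphore: no re-import
    return True

# Hypothetical sample data; a dict stands in for the consumer application.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "students.csv")
consumer_table = {}

first_attempt = load(path, consumer_table)      # no extract yet, nothing happens
extract([("1", "ana"), ("2", "kai")], path)
second_attempt = load(path, consumer_table)     # semaphore present, load runs
third_attempt = load(path, consumer_table)      # semaphore already consumed
```

The import step refuses to run until the export step has finished, and it never processes the same extract twice. Each pair of applications needs its own version of this logic.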

Use-Cases and Factors for Selecting an Integration Solution:

  1. Near Real-time data synchronization (e.g.: under one minute)

    1. Data updates immediately trigger events that publish the data to a message queue where it is immediately available for one or more applications to consume.  A minimal amount of network latency can be anticipated.

    2. The consumer applications do not need additional logic to detect the data updates.

  2. Periodic data synchronization (e.g.: hourly, nightly)

    1. An MB queue may be utilized and read periodically.

    2. A Batch ETL process may be used.

  3. High volume updates (e.g.: large initial data loads, large periodic data reloads)

    1. An MB queue may be utilized and read periodically.

    2. A Batch ETL process may be used.

  4. Application decoupling (e.g.: workload decoupling, application logic decoupling)

    1. Decoupling applications ensures that:

      1. planned and unplanned outages for one application do not affect the availability of the related applications;

      2. peak workloads for one do not impact related applications; and

      3. business logic changes in one do not require recoding in the others.

  5. Publish once, consume many times

    1. Once a message is published, any number of registered consumer applications can consume the message before it is retired.

  6. Transactional data processing (e.g.: direct SoR database updates)

    1. Database design provides for bundling database updates into transactions to ensure data integrity.  If a transaction is not committed in its entirety, the partially updated data is backed out.

  7. State checking (e.g.: a direct SoR access for current student class enrollment status)

    1. Query the SoR data to determine current state, such as student enrollment in a class, employee leave status, etc.

  8. Familiarity (e.g.: ease of developing source code to utilize a data sharing strategy)

    1. The newer the technology, the less familiar the developer community will be with it.
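
Use-cases 1, 2, and 5 can be illustrated with a toy in-memory broker. This is only a sketch of the idea, assuming one queue per registered consumer. `MiniBroker` is a hypothetical stand-in and not the RabbitMQ client API, and the consumer names, event names, and course codes are invented for the example.

```python
from collections import deque

class MiniBroker:
    """Toy in-memory stand-in for a message broker (hypothetical, not RabbitMQ)."""

    def __init__(self):
        self.queues = {}                # consumer name -> queue of pending messages

    def register(self, consumer):
        self.queues[consumer] = deque()

    def publish(self, message):
        # The publisher does this work once, however many consumers are registered.
        for queue in self.queues.values():
            queue.append(message)

    def consume(self, consumer):
        # Each consumer drains its own queue at its own pace.
        queue = self.queues[consumer]
        messages = list(queue)
        queue.clear()
        return messages

broker = MiniBroker()
broker.register("email_lists")          # hypothetical consumer that reads right away
broker.register("instructor_portal")    # hypothetical consumer that reads periodically

broker.publish({"event": "enrollment.add", "student": "S1", "course": "ICS 111"})
broker.publish({"event": "enrollment.drop", "student": "S2", "course": "ICS 111"})

near_real_time = broker.consume("email_lists")      # two messages so far
broker.publish({"event": "enrollment.add", "student": "S3", "course": "ICS 211"})
periodic = broker.consume("instructor_portal")      # picks up the backlog of three
```

The publisher never learns how many consumers exist or when they read. A consumer that is offline simply finds its backlog waiting when it returns.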

 

Data Sharing Recommendations Table:

 

Integration Use-Cases & Factors          | MB     | API/WS | DB     | Batch ETL
-----------------------------------------+--------+--------+--------+----------
1. Near Real-time data synchronization   | Yes    | No(2)  | No(2)  | No
2. Periodic data synchronization         | Yes    | Yes    | Yes    | Yes
3. High volume updates                   | Yes    | Yes    | Yes    | Yes
4. Application decoupling                | Yes    | No     | No     | Yes
5. Publish once, consume many times      | Yes    | No     | No     | No(4)
6. Transactional data processing         | No     | Yes(6) | Yes(5) | No
7. State checking                        | No     | Yes    | Yes    | No
8. Familiarity                           | No(1)  | No(3)  | Yes    | Yes

Table footnotes:

  1. This is relatively new technology for our community.  It has been introduced via the UH App Developer meetings and IAM documentation for developers.  So far, HCC, ORS, MIS, and IAM have become familiar with the MB.

  2. Brokered, event-triggered data synchronization can be considered near real-time and scalable.  Direct high-frequency polling of an application is not scalable and is not recommended.

  3. Writing custom APIs, while not difficult, is not a typical programming task.

  4. Batch ETLs tend to be written one per pair of applications.

  5. However, it is unusual for another application to update the data of a System of Record.

  6. Yes, if the API/WS is designed accordingly.  The API/WS decides what’s considered to be a transaction, not the consumer application.
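
Footnote 6 can be made concrete with a small sketch in which a hypothetical API operation, `enroll`, owns the transaction boundary. It uses Python's standard `sqlite3` module with an in-memory database. The table layout, the seat-count rule, and the sample data are assumptions made for illustration, not the design of any System of Record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT, "
             "UNIQUE (student, course))")
conn.execute("CREATE TABLE seats (course TEXT PRIMARY KEY, "
             "open_seats INTEGER CHECK (open_seats >= 0))")
conn.execute("INSERT INTO seats VALUES ('ICS 111', 1)")
conn.commit()

def enroll(student, course):
    """Hypothetical API operation: both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO enrollment VALUES (?, ?)", (student, course))
            conn.execute("UPDATE seats SET open_seats = open_seats - 1 "
                         "WHERE course = ?", (course,))
        return True
    except sqlite3.IntegrityError:
        return False

first = enroll("S1", "ICS 111")     # takes the last open seat
second = enroll("S2", "ICS 111")    # would drive the seat count negative; backed out

enrolled = [row[0] for row in conn.execute("SELECT student FROM enrollment")]
open_seats = conn.execute("SELECT open_seats FROM seats").fetchone()[0]
```

The consumer only calls `enroll` and sees success or failure. It cannot leave the data half updated, because the two statements are grouped inside the API operation rather than in consumer code.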

 

 

Experimental

  • Message Broker: RabbitMQ Clusters for High Availability

Strategic (3-5 Years)

  • Message Broker: RabbitMQ

  • Web Services: RESTful (for publishing and consuming)

Tactical (1-2 Years)

  • Message Broker: RabbitMQ version 3.x

  • Batch ETL: Custom Programming (e.g. SQL Script to Extract)

  • API: Application Specific 

  • Web Services: SOAP (for consuming)

Containment

  • Direct Database (e.g. Oracle DBLink)

  • Batch ETL (e.g. Oracle Warehouse Builder)

  • Web Services: SOAP (for publishing)

Retirement

  • Message Broker: Luminis Message Broker & Gateway

Notes

  • The UH Message Broker is in Production for the following:

    • Banner, message producer

    • HCC, message consumer

    • KFS, message consumer

    • myGrant, message consumer

    • SECE, message producer

    • UHIMS, message producer and consumer

Emerging Trends

  • Banner Event Processor (BEP), developed by Scott Masuno

    • Consumer apps for student and course information

      • UHM requires auto-generated email lists based on student educational objectives.

      • HCC is interested in providing instructors with dynamic student add/drop information as events unfold.

      • The BEP can replace the Luminis Message Gateway functionality.

  • Enterprise Service Bus

    • This is a layer above the Message Broker that supports multiple message brokers and multiple routes, so that a single message can, for example, be published to multiple destinations rather than to a single Message Broker.  Standardized as well as custom ETL logic, such as data transformation to JSON, can be associated with routes and destinations as needed.  So far we have not determined a need for the additional complexity.
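
The routing idea behind an Enterprise Service Bus can be sketched in a few lines. This is a toy model under stated assumptions: `MiniServiceBus`, the route predicates, and the destination names are hypothetical, and plain Python lists stand in for the message brokers behind each route.

```python
import json

class MiniServiceBus:
    """Toy routing layer; each destination is a list standing in for a broker."""

    def __init__(self):
        self.routes = []    # (predicate, transform, destination) triples

    def add_route(self, predicate, transform, destination):
        self.routes.append((predicate, transform, destination))

    def publish(self, message):
        # One publish call may reach several destinations, each with its own transform.
        delivered = 0
        for predicate, transform, destination in self.routes:
            if predicate(message):
                destination.append(transform(message))
                delivered += 1
        return delivered

def to_json(message):
    """Per-route transformation: serialize the message as JSON text."""
    return json.dumps(message, sort_keys=True)

def identity(message):
    """Per-route transformation: pass the message through unchanged."""
    return message

student_broker, audit_broker = [], []   # hypothetical destinations

bus = MiniServiceBus()
bus.add_route(lambda m: m["type"].startswith("student."), to_json, student_broker)
bus.add_route(lambda m: True, identity, audit_broker)

count = bus.publish({"type": "student.enrolled", "id": "S1"})
```

Here a single message reaches two destinations, one of them as JSON text. Route configuration like this is the additional complexity that has to be weighed against using a single message broker directly.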

 


Change History

  • Brick first approved January 2015



Definitions

Experimental

Someone in ITS is currently investigating or experimenting with this technology.

Strategic

ITS will be investing in this technology for 3-5 years.

Tactical

ITS will be investing in this technology for 1-2 years.

Containment

ITS will continue to use this technology for existing systems, but will no longer invest in this technology and/or grow its use.

Retirement

ITS has a firm plan (and timeline) to retire this technology.

 

