Enterprise Data Integration
There are several approaches to Enterprise Data Integration on existing enterprise landscapes, shown below in order of increasing cost for the business.
Higher levels bring more scalability and performance. On the other hand, complexity increases, as does the setup and maintenance effort. As with anything, it's about hitting the optimal spot for your particular business scenario.
Note: organizations fortunate enough to design an enterprise architecture from the ground up (or to completely revamp an existing one) should consider strategies like Data Virtualization or Data Lakes, which reduce or eliminate the need to physically move data between systems and lead to the Single Source of Truth (SSOT) concept.
Level 1 - Manual
A person in a data manager role controls the integration, for example by using custom code or other low-level means (like files) to connect sources to targets.
Pros: Low cost, Flexibility.
Cons: Scalability issues, Error prone.
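As a rough illustration of what "custom code plus files" can look like, here is a minimal sketch that converts a CSV export from one system into JSON Lines for another. The field names and the sample data are made up for the example, and a real hand-off would also need validation and error handling.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text: str) -> str:
    """Convert a CSV export into JSON Lines (one JSON object per row)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical export from a source system (in practice, read from a file).
source_export = "customer_id,name\n1,Acme\n2,Globex\n"

# Payload a person would then hand over to the target system.
target_payload = csv_to_json_lines(source_export)
print(target_payload)
```

Every new source or target needs another script like this one, which is where the scalability and error-proneness issues come from.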
Level 2 - Application
Using applications to directly access and manipulate data in the various sources and targets. Examples include SQL scripts, data import/export utilities, database replication, data virtualization, message brokers, and event-driven architecture.
Pros: Simple, Reuses existing tools.
Cons: Scalability and Data Quality issues, Can become complex to set up and manage as the number of connections grows.
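A SQL script is probably the simplest case at this level. The sketch below uses Python's built-in sqlite3 module with two in-memory databases standing in for a source and a target; the table and column names are assumptions made for the example, and real systems would use their own drivers and connection details.

```python
import sqlite3

# The main in-memory database plays the role of the source system.
conn = sqlite3.connect(":memory:")
# A second in-memory database, attached as "target", plays the target system.
conn.execute("ATTACH DATABASE ':memory:' AS target")

conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE target.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main.customers (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# The integration step itself: one SQL statement copies rows source -> target.
conn.execute(
    "INSERT INTO target.customers (id, name) "
    "SELECT id, name FROM main.customers"
)
conn.commit()

copied = conn.execute(
    "SELECT id, name FROM target.customers ORDER BY id"
).fetchall()
print(copied)
```

Note that nothing here checks or cleans the data on the way through, which is one reason data quality shows up in the cons.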
Level 3 - Middleware
Using specialized software that connects applications and transfers data. Optionally it can also transform and cleanse data. Examples include Data Integration platforms or tools for ETL (extract, transform, load), ELT (extract, load, transform) and CDC (change data capture).
Pros: Scalable and performant, Unified interface to multiple sources and targets, Handles complex data transformations (mostly true for ETL tools).
Cons: Complex to set up and manage, Requires additional Software/Hardware, More expensive.
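To make the extract, transform, load sequence concrete, here is a toy sketch in plain Python. It is not how any particular product works; the function names, the sample rows and the dictionary standing in for a target store are all assumptions for illustration.

```python
from typing import Dict, List


def extract() -> List[Dict[str, str]]:
    """Pretend to pull raw rows from a source system (hard-coded here)."""
    return [
        {"id": "1", "email": " Ann@Example.com "},
        {"id": "", "email": "no-id@example.com"},  # bad row: missing id
        {"id": "2", "email": "bob@example.com"},
    ]


def transform(rows: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Cleanse rows: drop those without an id, normalize types and emails."""
    cleaned = []
    for row in rows:
        if not row["id"]:
            continue  # skip rows that cannot be keyed in the target
        cleaned.append(
            {"id": int(row["id"]), "email": row["email"].strip().lower()}
        )
    return cleaned


def load(rows: List[Dict[str, object]], target: Dict[int, Dict[str, object]]) -> int:
    """Write rows into the target, keyed by id, and return how many were loaded."""
    for row in rows:
        target[row["id"]] = row
    return len(rows)


warehouse: Dict[int, Dict[str, object]] = {}  # stand-in for a real target store
loaded = load(transform(extract()), warehouse)
print(loaded, warehouse)
```

Middleware products wrap these same three steps with connectors, scheduling, monitoring and retry logic, which is where the extra cost and setup effort go.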
Application level integration tends to work well in hybrid environments (on-prem plus cloud).
Middleware works well when integrating legacy with modern applications using either on-prem or private/public cloud, although availability may be limited. For example, ETL tools like Google Cloud Data Fusion, AWS Glue or Azure Data Factory are only available on their respective public clouds.
Conclusion
There's no single best approach; it needs to be tailored to the specific scenario and requirements. Don't forget to read opinions from everyday users and try to ignore the marketing hype. When using a public cloud product, don't forget to factor in the usage cost, as it tends to add up quickly with large data transfers.
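A quick back-of-the-envelope calculation shows how transfer costs add up. The price per GB below is a placeholder, not a quote from any provider, so plug in the numbers from your own vendor's pricing page.

```python
def monthly_transfer_cost(gb_per_day: float, price_per_gb: float, days: int = 30) -> float:
    """Estimate monthly data transfer cost: volume per day x days x unit price."""
    return round(gb_per_day * days * price_per_gb, 2)


# Hypothetical inputs: 500 GB moved per day at a placeholder 0.09 per GB.
estimate = monthly_transfer_cost(gb_per_day=500, price_per_gb=0.09)
print(f"Estimated monthly transfer cost: {estimate:,.2f}")
```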