EXTRACT, TRANSFORM & LOAD

_____
Integrations Services, Azure Data Factory, ETL Automation & BIML Experts Reduce complexity, standardise & automate, Find how we shave 80% off your ETL / ELT workload.

Data Engineering & Integration

With every data analytics & reporting project, there is always the requirement to load data from external applications and systems into the analytics platform. We call this Data Integration or in a more modern parlance Data Engineering.

There is a vibrant ecosystem of tools available for ETL & ELT, Microsoft SSIS (SQL Server Integration Services), Talend, Matillion, IBM Data Stage, Informatica and others have been available for many years. The more modern ones are Azure Data Factory & Matillion that are cloud-only or cloud-first solutions.

Our knowledge of both ETL and ELT helps us design the best solution for your business. Your organisation must pick the best tools and employ the correct methodology for project success.

ETL Vs ELT

Bringing your business data into a data warehouse usually involves one of two different methods. ETL and ELT. While on the face of it, these two topologies look pretty similar but depending on volumes of data, they could perform very differently.

Both use the same three steps. Below we briefly look at these stages and how their order can impact your business.

Extract – getting the raw data from your different data sources, for example, financial or manufacturing databases.

Transform – converting the data into a standard format the target database (Data Warehouse) and BI tools will use. Transforming data may include cleaning and fixing data issues or enriching the dataset.

Load –  Loading the newly transformed data into the Data Warehouse ready for consumption by the Business Intelligence Platform.

Data Integration

ETL – Extract, Transform & Load

ETL is the more traditional model for data warehousing. It has been the standard, having been around for well over two decades. It is a well tested and trusted method to extract data from your business systems to power your reports.

The main advantage of ETL is that it’s relatively easy to use. Once you’ve decided on the format of your data, the graphical front ends of most ETL tools, make it pretty straight forward map out.

However, it does come with some issues. By far the biggest is the time taken for the data to be available to the end-user and coupled with relative inflexibility once the data warehouse is live may mean that ETL is a potentially poor choice for a company with big data.

The best use case for ETL is for companies that are ingesting relational data from existing database systems that have relatively low data volumes and data that is slowly changing.

ELT  – Extract, Load & Transform

ELT is the newer version of the model which has the last two steps switched.  That doesn’t sound like much of a difference, but in terms of large scale businesses, it can be huge.

ELT means “Extract” and “Load” the raw data and loading it to a storage system for later processing (Transform). You do not have to worry at that stage about business rules, data cleansing, take the data and save it.

There are many systems geared up to store massive amounts of quickly arriving unstructured data sets. Amazon S3, Azure Storage, Hadoop etc. can be used as Data Lakes and we will talk more about those below.

By taking advantage of cloud technologies volumes of data are less of an issue, meaning you can scale data volumes easily and quickly. However, as the datasets get more substantial, the cost of the cloud infrastructure increases due to additional storage and processing requirements.

How can we help?

  • Current ETL/ELT Review
  • Proof of Concept Projects
  • Design Scoping
  • Data Integration & ETL/ELT Development
  • ETL/ELT Automation
  • Microsoft Azure Implementation & Management
  • Experience in Finance, Manufacturing, Healthcare, Retail & Charity sectors.