Why do I need ETL

BI data management (part 1): data preparation through the ETL process

Operational company data, such as those generated in day-to-day business, support a company's business processes. The data from, for example, ERP, accounting or merchandise management systems are exclusively related to the control and monitoring of day-to-day business. Because of this alignment, they reflect a temporary level of information and are normally updated over time by subsequent transactions.

Analytical BI applications are only based directly on the operational databases in exceptional cases. As a rule, they require topic-related, integrated data collections - for example by customer, product or organizational unit. For this, on the one hand, the mostly aggregated data material desired from the management's point of view must be stored permanently over long periods of time. On the other hand, large amounts of data must be consolidated from several operational databases, which are then stored in the data warehouse.

In order to combine and process data from several operational data sources, they are converted into management-relevant information by means of targeted conversion actions. This action is carried out in three steps as a so-called ETL process. ETL means in detail:

  • extraction (Extract) the relevant data from various sources

  • transformation (Transform) the data into the schema and format of the target database

  • load (Load) the data into the data warehouse

The creation of the ETL process is often the most complex step in data warehouse development. It should not be neglected, because a solidly structured data warehouse is only possible with a high-quality database.