SAP : Overview of Data Warehouse Architecture

Introduction

The Data Warehouse Architecture can be defined as a structural representation of the concrete functional arrangement based on which a Data Warehouse is constructed that should include all its major pragmatic components, which is typically enclosed with four refined layers, such as the Source layer where all the data from different sources are situated, the Staging layer where the data undergoes ETL processing, the Storage layer where the processed data are stored for future exercises, and the presentation layer where the front-end tools are employed as per the users’ convenience.

Layers in Data Warehousing with BI

Data Warehousing Architecture has three layers

  1. Top Layer
  2. Middle Layer
  3. Bottom Layer

Top Layer

  • This Layer consists frontend side of the architecture which used by clients
  • Logics and transformations which are applied on information in data warehouse are used here for business purpose.
  • There are several tools which are used for report generation and Displaying the dashboards.
  • Data Mining is applied on this layer
  • This Layer consists frontend side of the architecture which used by clients
  • Logics and transformations which are applied on information in data warehouse are used here for business purpose.
  • There are several tools which are used for report generation and Displaying the dashboards.
  • Data Mining is applied on this layer
  • All Requirement Analysis document, cost, and all features that determine a profit-based Business deal is done based on these tools, which use the Data Warehouse information.

Middle Layer

  • This layer contains servers like OLAP Servers.
  • OLAP is used to provide data to business analysts for analysis purpose.
  • OLAP server stands for Online Analytical Processing Server.
  • This layer interacts with information present in bottom layer and passes it to tools in top layer for various purpose.
  • Mostly Relational or Multi Dimensional OLAP is used in Data warehouse architecture.

Bottom Layer

This is the most important layer in warehousing. It consists ETL tools, Data Sources and Data Warehouses.

1. Data Sources

The Data Sources consists of the Source Data that is acquired and provided to the Staging and ETL tools for further process.

2. ETL Tools

  • ETL tools are very important because they help in combining Logic, Raw Data, and Schema into one and loads the information to the Data Warehouse Or Data Marts.
  • Sometimes, ETL loads the data into the Data Marts, and then information is stored in Data Warehouse. This approach is known as the Bottom-Up approach.
  • The approach where ETL loads information to the Data Warehouse directly is known as the Top-down Approach.

Difference Between Top-down Approach and Bottom-up Approach

Top-Down ApproachBottom-Up Approach
Provides a definite and consistent view of information as information from the data warehouse is used to create Data MartsReports can be generated easily as Data marts are created first, and it is relatively easy to interact with data marts.
Strong model and hence preferred by big companiesNot as strong, but the data warehouse can be extended, and the number of data marts can be created
Time, Cost and Maintenance is highTime, Cost and Maintenance are low.

Data Marts

  • Data Mart is also a storage component used to store data of a specific function or part related to a company by an individual authority.
  • Datamart gathers the information from Data Warehouse, and hence we can say data mart stores the subset of information in Data Warehouse.
  • Data Marts are flexible and small in size.

3. Data Warehouse

  • Data Warehouse is the central component of the whole Data Warehouse Architecture.
  • It acts as a repository to store information.
  • Big Amounts of data are stored in the Data Warehouse.
  • This information is used by several technologies like Big Data which require analyzing large subsets of information.
  • Data Mart is also a model of Data Warehouse.

Different Layers of Data Warehouse Architecture

Below are the different layers:

ETL

There are four different types of layers which will always be present in Data Warehouse Architecture.

1. Data Source Layer

  • The Data Source Layer is the layer where the data from the source is encountered and subsequently sent to the other layers for desired operations.
  • The data can be of any type.
  • The Source Data can be a database, a Spreadsheet or any other kinds of text file.
  • The Source Data can be of any format. We cannot expect to get data with the same format considering the sources are vastly different.
  • In Real Life, Some examples of Source Data can be
  • Log Files of each specific application or job or entry of employers in a company.
  • Survey Data, Stock Exchange Data, etc.
  • Web Browser Data and many more.

2. Data Staging Layer

The following steps take place in Data Staging Layer.

Step 1: Data Extraction

The Data received by the Source Layer is feed into the Staging Layer, where the first process that takes place with the acquired data is extraction.

Step 2: Landing Database

  • The extracted data is temporarily stored in a landing database.
  • It retrieves the data once the data is extracted.

Step 3: Staging Area

  • The Data in Landing Database is taken, and several quality checks and staging operations are performed in the staging area.
  • The Structure and Schema are also identified, and adjustments are made to data that are unordered, thus trying to bring about a commonality among the data that has been acquired.
  • Having a place or set up for the data just before transformation and changes is an added advantage that makes the Staging process very important.
  • It makes data processing easier.

Step 4: ETL

  • It is an Extraction, Transformation, and Load.
  • ETL Tools are used for the integration and processing of data where logic is applied to rather raw but somewhat ordered data.
  • This data is extracted as per the analytical nature that is required and transformed to data that is deemed fit to be stored in the Data Warehouse.
  • After Transformation, the data or rather information is finally loaded into the data warehouse.
  • Some examples of ETL tools are Informatica, SSIS, etc.

3. Data Storage Layer

  • The processed data is stored in the Data Warehouse.
  • This Data is cleansed, transformed, and prepared with a definite structure and thus provides opportunities for employers to use data as required by the Business.
  • Depending upon the approach of the Architecture, the data will be stored in Data Warehouse as well as Data Marts. Data Marts will be discussed in the later stages.
  • Some also include an Operational Data Store.

4. Data Presentation Layer

  • This Layer where the users get to interact with the data stored in the data warehouse.
  • Queries and several tools will be employed to get different types of information based on the data.
  • The information reaches the user through the graphical representation of data.
  • Reporting Tools are used to get Business Data, and Business logic is also applied to gather several kinds of information.
  • Meta Data Information and System operations and performance are also maintained and viewed in this layer.

Conclusion

An important point about Data Warehouse is its efficiency. To create an efficient Data Warehouse, we construct a framework known as the Business Analysis Framework.

Visit the main page for SAP tutorials