Services
- Categories Overview
  01 Strategize Create a clear strategy to leverage data, analytics, and AI for your business goals.
  02 Build Build your future-proof data infrastructure with us.
  03 Activate Activate your investments in data to create measurable value.
  Services
  We’re FELD M – The Data Collective, your consulting and implementation partner for successful data, analytics, and AI projects that create measurable business value.
Tools
- Tools
  We’re FELD M – The Data Collective, your consulting and implementation partner for successful data, analytics, and AI projects that create measurable business value.
Insights
- Insights
  We’re FELD M – The Data Collective, your consulting and implementation partner for successful data, analytics, and AI projects that create measurable business value.
Careers
About us

We’re FELD M – The Data Collective, your consulting and implementation partner for successful data, analytics, and AI projects that create measurable business value.

Deutsch

Back to overview

Development of a global online and offline data archive in the Azure Cloud

Data engineering & architecture

Client

International retailer

Industry

Retail

Tools we used

The data integration and architecture project focused on the following key points:

Processing of more than 500 GB of data daily
ETL pipeline in Python and Spark without lock-in effects on the platform
Integration of Adobe Analytics raw data from more than 40 markets

Introduction
High-performance IT architecture and automated monitoring
A scalable ETL pipeline to unlock the potential of Adobe Clickstream raw data
Contact us

An international retailer with high brand awareness and huge online and offline customer bases asked for support from FELD M in designing and developing an IT architecture for storing and handling all online raw data across more than 40 markets. The solution should process, enrich, and deliver more than 500 GB of data daily in a high-performance database for further analyses.

FELD M was tasked with overcoming a number of problems: Global reports were time-consuming and needed a lot of manual effort. It was impossible to drill down into the details of the available online data. Data Scientists and Analysts were able to access the data via the Adobe interface only, which limited their capabilities, especially for long-term analyses. The traditional in-house data warehouse (DWH) was unable to deal with massive volume of raw data. Due to the existence of several parallel architectures, there was no single source of truth.

High-performance IT architecture and automated monitoring

Over a period of 18 months, FELD M accompanied the project. The tasks ranged from the conception and implementation of the appropriate IT architecture for data processing and enrichment to the translation of the Web Analytics raw data and the dashboard implementation. The ultimate goal was to realize a cost-effective solution to process the massive volumes of data via the ETL pipeline and visualize this data in automated performance dashboards, including an alerting and monitoring system.

A scalable ETL pipeline to unlock the potential of Adobe Clickstream raw data

To avoid a complete lock-in effect on the cloud platform, we set up an ETL pipeline in Python and Spark. The tasks of the ETL pipeline were to archive and process the historical data of past years, to link them with additional data sources (e.g. product data) and to store the required aggregations in a database (SQL Server). The PowerBI dashboards for cross-market analyses were based on these data.

The resulting cost-effective, high-performance solution can store, process, and aggregate more than 500 GB of data daily. The client now has a uniform data basis including various levels of pre-aggregations for analyses by the Data Science and Analytics departments.

Have a similar project?

Let's find out together how we can help!

Contact us

Similar case studies

Data science & AI

Media

More accurate subscription predictions through data science

Tools we used

Read now
Data engineering & architecture

E-commerce

Data integration with a modern data platform for seamless analytics

Tools we used

Read now
Data engineering & architecture

A uniform platform for all social media dashboards

Tools we used

Read now
Data engineering & architecture

Future-proof with a modernized data landscape

Read now
Data engineering & architecture

Web App operator monetizes its platform

Tools we used

Read now
Data engineering & architecture

E-commerce

Rapid prototyping to integrate TV and web data

Read now