Consulting and Advisory

Data Processing

Making existing data processing workflows near real-time.

Overview

The customer is a global programmatic media buying service with live operations in 50 countries. It builds software tools to deliver optimal return on its clients’ advertising goals. The company runs high-load, big data systems: ~6B bid requests per day, ~100M targeted impressions per day, and ~8 PB of datacenter storage.

Services

Consulting services in the following areas:

  • Lambda Architecture
  • Modern data processing ecosystem, e.g. Apache Spark, Apache Parquet, etc.

Challenge

The customer had built successful MapReduce workflows that process terabytes of historical data daily. However, these workflows had two main drawbacks: they offered no near real-time capability, with a delay of up to 24 hours before analytics were updated; and they could not perform calculations on unified data from different input sources.

Solution

Lohika introduced a lambda architecture designed to take advantage of both batch and stream processing. We combined fast access to historical data with real-time streaming data using Spark (Core, SQL, Streaming), Apache Parquet, etc. This made it possible to provide low-latency responses based on near real-time data, and to produce near real-time results from two or more input sources.
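The idea behind a lambda architecture can be illustrated with a minimal sketch. The example below is not the customer's implementation and deliberately uses plain Python instead of Spark so that it runs anywhere. All names (`Event`, `build_batch_view`, `SpeedLayer`, `query`), the two sources ("bids" and "impressions"), and the `win_rate` metric are hypothetical and chosen only for illustration. It shows a batch view recomputed from historical data, a speed view updated per incoming event, and a serving step that merges both and unifies the two input sources.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    source: str    # hypothetical input source: "bids" or "impressions"
    campaign: str  # hypothetical grouping key
    count: int


def build_batch_view(historical):
    """Batch layer: recompute totals from the full historical dataset."""
    view = Counter()
    for event in historical:
        view[(event.source, event.campaign)] += event.count
    return view


@dataclass
class SpeedLayer:
    """Speed layer: incrementally absorb events the last batch run has not seen."""
    view: Counter = field(default_factory=Counter)

    def ingest(self, event):
        self.view[(event.source, event.campaign)] += event.count

    def reset(self):
        # Called once a new batch view covers the events held here.
        self.view.clear()


def query(batch_view, speed_view, campaign):
    """Serving layer: merge both views and combine the two input sources."""
    merged = batch_view + speed_view
    bids = merged[("bids", campaign)]
    impressions = merged[("impressions", campaign)]
    win_rate = impressions / bids if bids else 0.0
    return {"bids": bids, "impressions": impressions, "win_rate": win_rate}
```

In a Spark-based setup, the batch view would typically be computed by a batch job over historical files (for example in Parquet), and the speed view by a streaming job; the merge at query time follows the same pattern as `query` above.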

Results

In 2 weeks, the Lohika team:

  • Got acquainted with the existing solution, e.g. load, data (volume, structure, type, etc.), input sources, output artifacts, etc.
  • Prepared a technical proposal for the new, improved solution
  • Successfully completed a POC based on Lambda Architecture to validate the future architecture

As a result, the customer received a clear roadmap for gradually evolving its existing data processing workflows.
