Saltar a contenido

OLTP πŸ‘‰ OLAP

  1. OLTP (Online Transaction Processing)
  2. OLAP (Online Analytical Processing)
  3. Ejemplo bΓ‘sico

OLTP (Online Transaction Processing) y OLAP (Online Analytical Processing) son dos tipos de sistemas de procesamiento de datos que se utilizan en diferentes contextos y tienen caracterΓ­sticas y objetivos distintos

CaracterΓ­stica OLTP OLAP
PropΓ³sito Procesamiento de transacciones en tiempo real AnΓ‘lisis de datos histΓ³ricos y toma de decisiones
Operaciones CRUD (Crear, Leer, Actualizar, Eliminar) Agregaciones, sumas, promedios, etc.
Volumen de Datos PequeΓ±as cantidades por transacciΓ³n Grandes volΓΊmenes de datos histΓ³ricos
Concurrencia Alta (muchos usuarios concurrentes) Baja (menos usuarios concurrentes)
Modelo de Datos Normalizado Desnormalizado
Integridad y Aislamiento CrΓ­tico Menos crΓ­tico
Ejemplos de Uso Bancos, comercio electrΓ³nico, sistemas de reserva Informes de ventas, anΓ‘lisis financiero, BI

Voyager.png


Lectura Recomendada

Why your transaction database can’t do analytics OLAP vs OLTP explained

  • TΓ­tulo: Data Warehouse vs Data Lake vs Data Mart: The Ultimate Guide With Real Examples
  • Fuente: Medium (se necesita suscripciΓ³n)
  • Autor: Prem Vishnoi
  • Fecha publicaciΓ³n: 2026-01-25

Data Architecture

How do you store 230 million customers’ viewing history, preferences, and behavior data in a way that lets you:

  • Process 500+ billion events per day
  • Generate personalized recommendations in milliseconds
  • Run complex analytics on years of historical data
  • Handle both real time streaming and batch analysis

The answer? They do not use just One type of database.

They use a combination of data warehouses, data lakes, and specialized data stores each doing what it does best.

Netflix is not special, Every company dealing with serious data faces the same architectural decisions.

In this article, I will break down:

  • Why your regular database can’t handle analytics OLAP vs OLTP
  • The three main data storage options warehouse, lake, mart
  • Real world examples from companies you know
  • How to choose the right one for your needs

Part 1: Why You Need two Types of Databases OLAP vs OLTP

OLTP vs OLAP

Imagine you own a coffee shop, you have two very different information needs:

Need 1: Did this customer pay?

  • Needs to be answered right now
  • Involves one customer, one transaction
  • Must be 100% accurate you are handling money
  • Happens thousands of times per day

Need 2: What’s our best selling drink this quarter?

  • Can wait a few seconds or minutes
  • Involves thousands of transactions
  • Needs to aggregate and compare data
  • Happens a few times per week

These two needs require fundamentally different database architectures.

OLTP

Think of it as your digital cash register, It handles the day to day operations:

OLAP

Why You Can not Use One for Both

Using OLTP for analytics:

SELECT region, SUM(sales), AVG(profit_margin)
FROM transactions
WHERE date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY region
ORDER BY SUM(sales) DESC

This query might scan 50 million rows, While it runs:

  • The database slows down
  • Regular transactions time out
  • Customers can’t check out
  • Your CFO gets the report but your business loses money

Using OLAP for transactions:

  • OLAP systems are not designed for rapid writes
  • No transaction guarantees ACID compliance
  • Data might be hours or days old
  • You would be charging customers based on stale data

The solution? Use both, OLTP handles operations, OLAP handles analytics. Data flows from OLTP to OLAP usually nightly or in real-time.

OLAP vs OLTP: Quick Comparison

Part 2: The Three Data Storage Options

Now that you understand why analytics needs its own system, let’s look at your options:

Data Warehouse

A Data Warehouse is like a well organized library where every book is cataloged, indexed, and easy to find:
- Every book has a catalog number
- Books are organized by subject
- You can find any book in seconds
- No random papers lying around
- Librarian checks everything before shelving

Key Point:

  • Structured data only: Everything fits into neat tables
  • Pre processed: Data is cleaned and transformed before storage
  • Schema on Write:You define the structure BEFORE loading data
  • Optimized for queries : Designed for fast SQL performance
  • Historical: Stores years of data for trend analysis

When to Use a Data Warehouse

  • Business intelligence and reporting
  • Executive dashboards
  • Regulatory compliance reports
  • Customer analytics
  • Sales performance tracking

Real Example: Coca Cola

Coca Cola uses a data warehouse to consolidate sales data from 200+ countries, They can answer questions like:

  • How did Sprit perform in Southeast Asia last quarter?
  • Which regions show declining Diet Coke sales?
  • What’s the correlation between temperature and beverage sales?

All this data is structured, cleaned, and ready for instant querying.

Data Lake: The Storage Warehouse

A data lake is like a massive storage warehouse where you can dump anything boxes, furniture, documents, random stuff and sort it out later.

  • Any Data Type: Structured, semi structured, unstructured
  • Raw Storage: Data is stored as-is, no preprocessing required
  • Schema On Read: You define structure when you read the data
  • Massive Scale: Can handle petabytes at low cost
  • Flexibility: Perfect for data science and machine learning
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚                      DATA LAKE                              β”‚
    β”‚                (The Storage Warehouse)                      β”‚
    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
    β”‚                                                             β”‚
    β”‚      Throw anything in β€” boxes, furniture, whatever         β”‚
    β”‚      No need to label or organize upfront                   β”‚
    β”‚      Massive space at cheap prices                          β”‚
    β”‚      Sort through it when you need something                β”‚
    β”‚      Some stuff might be junk β€” that's okay                 β”‚
    β”‚                                                             β”‚
    β”‚   In data terms:                                            β”‚
    β”‚   β€’ Store raw data: logs, images, videos, JSON, XML         β”‚
    β”‚   β€’ No preprocessing required                               β”‚
    β”‚   β€’ Extremely cost-effective for large volumes              β”‚
    β”‚   β€’ Process data only when needed (ELT)                     β”‚
    β”‚   β€’ May contain duplicates or unverified data               β”‚
    β”‚                                                             β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    

When to Use a Data Lake

They do not know exactly how they will use all this data, so they store it raw. Later, data scientists can explore it for new features like discover Weekly or Wrapped.

Data Mart: The Department Store Section

A data mart is like a specific section of a department store electronics, clothing, furniture.

It’s a subset of the Data Warehouse tailored for a specific team or use case.

When to Use a Data Mart

Real Example: Walmart

Quick Comparison: Warehouse vs Lake vs Mart

Real World Case Studies

Case Study 1: Netflix The Hybrid Approach

The Challenge: Netflix processes 500+ billion events per day from 230 million subscribers across 190 countries.

The Solution: Netflix uses a combination:

The Result:

  • Personalized recommendations that drive 80% of viewer activity
  • A/B testing at massive scale
  • Real time quality monitoring for streams

Key Takeaway: Netflix doesn’t choose one option.

They use all three based on the use case.

Case Study 2: Airbnb Data Lake Evolution

The Challenge: Airbnb needed to democratize data access across 6000+ employees while maintaining quality.

The Solution: They built a data lake called β€œDataportal” with:

  • Raw zone: All events stored as-is
  • Curated zone: Cleaned, validated datasets
  • Consumption zone: Ready to query tables for analysts

Technology stack:

  • Amazon S3 (storage)
  • Apache Spark (processing)
  • Presto (querying)
  • Airflow (orchestration)

The Result:

  • 200+ data sources integrated
  • 10000+ datasets available
  • Self service analytics for all employees

Key Takeaway: A data lake isn’t just a dumping ground you need governance layers.

Case Study 3: Uber Real-Time + Historical

The Challenge: Uber needs to:

  • Match riders with drivers in real time (OLTP)
  • Analyze trip patterns for pricing (OLAP)
  • Train ML models for ETA prediction (Data Lake)

The Solution:

The Result:

  • 100 million+ trips analyzed daily
  • Dynamic pricing based on real-time demand
  • Accurate ETAs within 2 minutes

Key Takeaway: Different use cases require different architectures β€” there’s no one-size-fits-all.

Case Study 4: Target Retail Analytics

The Challenge: Target wanted to predict what customers need before they know they need it.

The Solution:

  • Data warehouse: Transactional data (purchases, returns)
  • Data lake: Clickstream, mobile app, social media

Data marts:

  • Marketing mart (customer segments)
  • Inventory mart (stock optimization)
  • Store ops mart (staffing, layout)

Famous Example: Target’s pregnancy prediction model analyzed purchase patterns (unscented lotion, vitamins, cotton balls) to identify pregnant customers and send targeted coupons sometimes before the customers told their families!

Key Takeaway: Combining structured warehouse data with unstructured lake data unlocks powerful insights.

How to Choose: Decision Framework

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚               WHICH STORAGE DO YOU NEED?                    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                             β”‚
β”‚   START HERE                                                β”‚
β”‚       β”‚                                                     β”‚
β”‚       β–Ό                                                     β”‚
β”‚   Is your data structured (tables, rows, columns)?          β”‚
β”‚       β”‚                                                     β”‚
β”‚       β”œβ”€ NO β†’ Do you need to store it anyway?               β”‚
β”‚       β”‚          β”‚                                          β”‚
β”‚       β”‚          β”œβ”€ YES β†’ DATA LAKE (store raw)             β”‚
β”‚       β”‚          └─ NO β†’ Don't store it                     β”‚
β”‚       β”‚                                                     β”‚
β”‚       └─ YES β†’ Who needs access?                            β”‚
β”‚                  β”‚                                          β”‚
β”‚                  β”œβ”€ Whole company β†’ DATA WAREHOUSE          β”‚
β”‚                  β”‚                                          β”‚
β”‚                  └─ One team β†’ DATA MART                    β”‚
β”‚                      β”‚                                      β”‚
β”‚                      └─ (Or create mart FROM warehouse)     β”‚
β”‚                                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Simple Rules of Thumb

Choose a Data Warehouse when:

  • You need to answer business questions with SQL
  • Data quality and consistency are critical
  • Multiple teams need access to the same data
  • You’re building dashboards and reports

Choose a Data Lake when:

  • You have lots of unstructured data logs, images, videos
  • You’re doing machine learning or data science
  • You don’t know how you’ll use the data yet
  • Cost per GB is a major concern

Choose a Data Mart when:

  • One team needs specialized access
  • You want to improve query performance
  • You need to restrict data access
  • You are doing a specific project

Most companies use ALL THREE:

Raw Data β†’ Data Lake β†’ Data Warehouse β†’ Data Marts
                              ↓
                      Analytics & Reports

The Modern Data Stack (2026)

Contenido relacionado

πŸ‘‰ Concepto Pila de Datos
πŸ‘‰ Data Pipeline: ETL vs ELT
πŸ‘‰ IntroducciΓ³n a Modern Data Stack
πŸ‘‰ Panorama IngenierΓ­a de Datos

Here’s what a typical Enterprise Architecture looks like today:

Key Takeaways

Final Thought