Design a Ride-Sharing App (Uber) — System Design

A ride-share platform like Uber or Lyft is a real-time two-sided marketplace: it must continuously ingest GPS streams from millions of drivers, match riders to drivers with low latency, and handle financial transactions across a global fleet — all at 150 million monthly active users and 15 million trips per day. The deceptively simple act of "hail a ride" hides a formidable distributed systems problem: you need a near-real-time geospatial index that can handle millions of concurrent moving objects, a matching algorithm that optimizes globally rather than greedily, a reliable state machine for the trip lifecycle, surge pricing that responds to supply-demand imbalances in seconds, and a payment pipeline that is financially correct without slowing down the booking experience.

⚡ Quick Takeaways

GPS pipeline = at-most-once — one lost 5-second packet is negligible; guaranteed delivery overhead not worth it for location data.
In-memory index + Cassandra — in-memory spatial index for sub-millisecond matching lookups; Cassandra for durable location history, partitioned by city and product type.
Bipartite graph matching — batch incoming requests and solve globally for minimum average pickup time, not greedy nearest-driver per request.
Geohash / S2 cells for geo-indexing — encode lat/lng into hierarchical cells; query nearby drivers by looking up cells at the right precision level.
Consistent hashing (Ringpop) — distributes driver location data evenly across nodes with automatic repartitioning.
Trip state machine — explicit states (REQUESTED → ACCEPTED → EN_ROUTE → IN_TRIP → COMPLETED) prevent illegal transitions and make retries safe.
Surge pricing via supply/demand ratio — computed per geohash cell every 60–120 s; multiplier caps prevent PR disasters.
Async payment — payment deduction and driver payout are off the critical trip-completion path to avoid processing delays degrading the booking experience.

tldr

GPS packets flow every ~5s into a message queue (at-most-once delivery). An in-memory spatial index backed by Cassandra stores driver locations partitioned by city and product type. Matching uses batched bipartite graph optimization — not greedy nearest-driver — to minimize global average pickup time. Geohash or Google S2 cells enable O(1) radius queries. The trip is an explicit state machine. Surge pricing is a ratio computed per geohash cell. Payment is handled asynchronously via a third-party payment processor.

Step 1 — Clarify Requirements

Before diving in, scope the problem explicitly. Ride-share is broad — define exactly what you will and won't design.

Functional requirements

Display available driver locations on a map in real time.
Rider requests a trip — system matches them to a nearby available driver.
ETA and upfront price estimate displayed before booking.
Real-time location tracking during the trip (rider sees driver moving).
Trip completion: fare calculation, payment, receipt.
Driver earnings dashboard and trip history.
Driver onboarding with background check.

Non-functional requirements

Low latency — driver location updates visible within ~5 s; matching response under 2 s.
High availability — the marketplace must be reachable even under partial failures; a down matching service means zero revenue.
Consistency — a driver must not be double-matched; a trip must have exactly one driver.
Global scale — 150 million monthly active users, 15 million trips per day across hundreds of cities.
Geo-awareness — queries are inherently spatial; the system must index and search by location efficiently.

interview tip

State out of scope explicitly: fraud detection, driver incentive programs, split fares, scheduled rides. This lets you focus the interview time on the genuinely hard parts — GPS pipeline, matching, and the trip state machine.

Step 2 — Capacity Estimation

Back-of-the-envelope math anchors your architecture choices. Assume the Uber scale: 150M MAU, 15M trips/day.

GPS write load

Active drivers at peak: assume 1M concurrent drivers globally during evening rush.
Each driver sends a GPS packet every 5 s → 1M ÷ 5 = 200,000 location writes/sec at peak.
Each packet is ~100 bytes (driver_id, lat, lng, timestamp, heading) → ~20 MB/s of GPS ingestion.
This is a bursty stream — a message queue is mandatory to absorb spikes.

Trip and matching load

15M trips/day ÷ 86,400 s ≈ ~174 trips/sec average; peak (Friday 6 pm) 3–5× → ~700 trip requests/sec.
Each trip request triggers one matching computation, one ETA query, and at least two location reads (rider + drivers in radius).
Matching is CPU-bound; plan for dedicated matching service instances, not shared with GPS ingestion.

Storage

GPS history: 200K writes/sec × 100 bytes × 86,400 s ≈ ~1.7 TB of raw location data per day. Only the last known position needs to be in memory; history goes to Cassandra.
Trip records: 15M trips × ~1 KB (trip metadata, fare, route summary) ≈ ~15 GB/day, totally manageable in SQL.
In-memory spatial index: 1M concurrent drivers × 200 bytes per entry ≈ ~200 MB — trivially fits on a single node but is sharded for availability.

note

The GPS write load (200K/s) is the dominant data-plane concern. The trip request rate (700/s) is comparatively modest. Your architecture must handle the former without letting it starve the latter.

Step 3 — API Design

Key REST endpoints across the ride-share services:

HTTP

# Driver — location update (high-frequency, at-most-once)
POST /driver/location
     { driver_id, lat, lng, heading, timestamp }

# Rider — request a trip
POST /trips
     { rider_id, pickup_lat, pickup_lng, dropoff_lat, dropoff_lng, product_type }
     → returns { trip_id, estimated_eta_s, estimated_fare, driver_info }

# Trip status (polled or WebSocket push)
GET  /trips/{trip_id}
     → { status, driver_location, eta_s, fare_cents }

# Driver — accept/decline a match
POST /trips/{trip_id}/accept   { driver_id }
POST /trips/{trip_id}/decline  { driver_id }

# Driver — trip lifecycle events
POST /trips/{trip_id}/arrived    # driver at pickup
POST /trips/{trip_id}/start      # trip begins
POST /trips/{trip_id}/complete   # trip ends, triggers fare + payment

The POST /driver/location endpoint is called at very high frequency and uses at-most-once delivery — the server accepts and moves on; no acknowledgement retry. The trip lifecycle endpoints are idempotent — calling /complete twice on an already-completed trip is a no-op at the state machine level.

Step 4 — Driver and Rider Registration

The User Service handles authentication and profile management for both sides of the marketplace. Driver onboarding involves an additional Background Checker that validates accident history, license validity, SSN, and vehicle status before a driver is activated. Frequently accessed user profiles are cached in Redis to reduce database load on high-traffic lookup paths.

Driver profiles carry additional state beyond what a rider profile needs: vehicle details (make, model, license plate, color), current availability status (OFFLINE / AVAILABLE / IN_TRIP), and product type eligibility (UberX, Comfort, Black, Pool). This eligibility is stored in SQL and cached; the matching service reads it to filter the driver pool before running the assignment algorithm.

Step 5 — GPS Ingestion Pipeline

Every active driver transmits a GPS packet approximately every 5 seconds, containing their user_id, current location, and a timestamp. At 15 million trips per day and assuming ~1M concurrent active drivers at peak, this produces ~200,000 writes per second — a high-volume, bursty write stream that a synchronous HTTP endpoint cannot absorb without being a bottleneck.

A message queue (Kafka or a purpose-built low-latency alternative) decouples the GPS producers from the location processing consumers, absorbing variability in QPS. Drivers write to the queue without waiting for confirmation; the consumers — the location update workers — drain the queue and update the in-memory spatial index and Cassandra asynchronously.

The delivery guarantee is at-most-once. A single lost GPS packet is acceptable because the next one arrives 5 seconds later — the cost of guaranteed delivery (retry loops, deduplication, higher latency) adds complexity for a negligible benefit. The contrast with payments is instructive: payment events use at-least-once because losing a payment message has real financial consequences.

GPS data pre-processing

Raw GPS readings are noisy — consumer-grade sensors have 3–5 m accuracy, and satellites occasionally produce wildly incorrect fixes. Before updating the spatial index, the location workers apply:

Outlier rejection — discard readings that imply impossible speed (e.g., >200 km/h between two consecutive packets).
Map matching — snap the lat/lng to the nearest road segment via the map service API, so the driver icon appears on the road rather than floating 10 m into a building.
Kalman filtering (optional, higher fidelity) — smooth the GPS trace using a Kalman filter to reduce jitter on the rider's live-tracking map.

why not WebSocket for GPS ingestion?

WebSockets maintain persistent connections — great for pushing data to clients, but 1M persistent inbound connections from driver apps would saturate connection table memory on a single gateway. The GPS write pattern (fire-and-forget every 5 s) is better served by short HTTP POSTs that a load balancer can distribute across stateless ingestion workers.

Step 6 — Geospatial Indexing

The core spatial operation is: "given a rider at (lat, lng), find all available drivers within radius R." Doing this efficiently at scale requires a purpose-built geospatial index rather than a brute-force scan of all driver records.

Geohash

Geohash encodes a (lat, lng) pair into a short alphanumeric string by recursively bisecting the world into a 2D grid. Each additional character halves the cell size. A 6-character geohash represents a ~1.2 km × 0.6 km cell. To find nearby drivers, you look up the target cell plus the 8 adjacent cells at the appropriate precision — nine lookups return all candidates within roughly 1–2 km.

concept

-- Geohash precision table
length  cell width  cell height  use case
  4     ~39 km      ~20 km       city-level
  5     ~4.9 km     ~4.9 km      neighborhood
  6     ~1.2 km     ~0.61 km     block-level (good for driver search)
  7     ~153 m      ~153 m       fine-grained ETA

-- Finding nearby drivers: look up target cell + 8 neighbors
nearby_cells = geohash_neighbors(rider_geohash, precision=6)
-- returns 9 cells; query in-memory index for AVAILABLE drivers in those cells

Google S2 cells

Google S2 maps the sphere onto a cube, then uses a Hilbert curve to produce hierarchical cell IDs with excellent spatial locality. S2 cells at level 13 are roughly 1.3 km², similar to a 6-char geohash. S2 is what Uber uses internally (via their H3 library, a hexagonal variant); it handles polar distortions better than geohash and supports efficient cap/polygon queries. For an interview, either is acceptable — geohash is simpler to explain, S2/H3 is more production-correct.

QuadTree

A QuadTree recursively subdivides the 2D plane into four quadrants. Nodes that contain too many objects are split. QuadTrees adapt naturally to non-uniform distributions — dense urban cores get fine granularity, sparse rural areas stay coarse. The downside is that updates (a driver moving between cells) require tree restructuring, which is expensive compared to the hash-lookup approach of geohash. QuadTrees are more commonly seen in gaming spatial indexing than in ride-share production systems, but are a valid interview answer with the right caveats.

Index Type	Strengths	Weaknesses
Geohash	Simple string prefix lookup; Redis native support; easy to shard by prefix	Cell boundary artifacts; non-uniform cell shapes at high latitudes
Google S2 / H3	Uniform cell areas; excellent spatial locality; polygon/cap queries	More complex library; less widely understood
QuadTree	Adaptive to density; hierarchical aggregation	Expensive updates; complex tree balancing; in-memory only
R-Tree (PostGIS)	Mature SQL integration; handles polygons natively	SQL DB is too slow for 200K GPS writes/sec; not suitable for hot path

Step 7 — Location Storage Architecture

Driver location data requires both high-speed lookup for matching and durable storage for trip history. These are two different access patterns that demand two different storage tiers:

Layer	Technology	Purpose
Active index	In-memory search index	Sub-millisecond reads and writes for real-time driver lookup during matching
Persistent store	Cassandra (NoSQL)	Durable trip and location records; horizontally scalable writes
Partition key	DriverUUID + time bucket	Enables efficient time-range queries per driver
Sharding	City / segment / product type	Isolates load by market; UberX, UberPool, etc. have distinct supply pools
Distribution	Consistent hashing (Ringpop)	Distributes data evenly across worker nodes; auto-repartitioning via Airflow

The in-memory index is partitioned by geohash prefix (or S2 cell ID prefix). Each shard is owned by a set of worker nodes determined by consistent hashing — Uber's Ringpop library implements this ring, allowing nodes to be added or removed without reshuffling all data. When a driver sends a location update, the ingestion worker hashes the driver's current geohash to find the owning shard, then updates that shard's in-memory map.

Cassandra is chosen for the persistent tier because it handles high write throughput with eventual consistency — the exact profile of GPS history. The partition key is (driver_id, date_hour_bucket) so all location records for a driver in a time window are co-located on one node, making "reconstruct this driver's route during trip X" a single partition read.

Pickup time optimization and driver matching

Step 8 — The Matching Algorithm

The intuitive approach — assign each incoming rider request to the nearest available driver — is greedy and globally suboptimal. If two riders are close to the same driver, the greedy approach gives the driver to the first requester, potentially leaving the second rider stranded with a much farther driver.

The correct framing is a bipartite graph matching problem: riders on one side, available drivers on the other, with edge weights representing pickup time. The system batches incoming requests within a short window and solves for the assignment that minimizes global average pickup time across all riders in the batch — not just the first rider in queue.

concept

-- Greedy (suboptimal):
rider_A → nearest driver D1  (2 min)
rider_B → next driver D2     (8 min)
avg pickup = 5 min

-- Bipartite matching (optimal):
rider_A → driver D2          (4 min)
rider_B → driver D1          (4 min)
avg pickup = 4 min

-- Algorithm: Hungarian algorithm or min-cost flow
-- Batch window: typically 2–5 seconds
-- Edge weights: ETA from routing API (not Euclidean distance)

Matching pipeline in detail

Geo lookup — query the in-memory spatial index for all AVAILABLE drivers within a search radius (e.g., 3 km) of the rider's pickup point. Filter by product type (UberX, Comfort, etc.).
ETA scoring — for each candidate driver, call the routing service (or use a lightweight straight-line estimate with a road factor) to get a realistic pickup ETA. Actual road ETA beats Euclidean distance significantly in grid-heavy cities like Manhattan.
Batch solve — collect all pending rider requests in the current batch window. Build the bipartite graph with ETA as edge weights. Solve using the Hungarian algorithm (O(n³)) or a min-cost flow approximation for large batches.
Offer dispatch — send a match offer to the assigned driver. If the driver declines or doesn't respond within ~15 s, the rider is re-queued for the next batch. Drivers who repeatedly decline are deprioritized in future batches.
Lock the driver — once a driver accepts, atomically flip their status from AVAILABLE to IN_TRIP in the spatial index to prevent double-matching.

why ETA over distance?

A driver 0.5 km away on the wrong side of a river may take 15 minutes; a driver 1.5 km away with a clear road takes 3 minutes. Using straight-line distance as the edge weight produces systematically wrong matches in real cities. Routing API calls are expensive — cache frequent city pairs or use pre-computed road factor heuristics for the scoring phase, then confirm with a full routing call after a match is found.

Step 9 — Trip State Machine

A trip is best modeled as an explicit state machine. Each state transition is triggered by an event — from driver or rider app, or from an internal timeout — and must be idempotent so that replaying a network-retried event does not advance the state incorrectly.

states

REQUESTED
  │ match found (bipartite algorithm)
  ▼
DRIVER_ASSIGNED
  │ driver accepts
  ▼
DRIVER_EN_ROUTE         ← driver heading to pickup
  │ driver arrives
  ▼
DRIVER_ARRIVED          ← driver waiting at pickup
  │ rider enters vehicle / driver taps "start"
  ▼
IN_TRIP                 ← ride in progress
  │ driver taps "complete" at destination
  ▼
COMPLETED               ← fare computed, payment triggered

  From any pre-COMPLETED state:
  REQUESTED / DRIVER_ASSIGNED → CANCELLED_BY_RIDER / NO_DRIVER_FOUND
  DRIVER_EN_ROUTE / DRIVER_ARRIVED → CANCELLED_BY_DRIVER / CANCELLED_BY_RIDER

Storing the state in a SQL row guarded by an application-level transition check prevents illegal hops (e.g., moving directly from REQUESTED to IN_TRIP). Each transition also publishes an event to Kafka so that downstream services — notification service (send "driver arriving" push), analytics service (record event), payment service (trigger charge on COMPLETED) — can react without being coupled to the trip service directly.

Timeout handling

If a driver does not accept within ~15 s of being offered a match, the trip service re-queues the rider and marks the driver as deprioritized for the next matching cycle.
If no driver is found after N retries over ~2 min, the trip moves to NO_DRIVER_FOUND and the rider is notified to try again later.
A distributed scheduler (e.g., Celery beat or a simple Kafka delayed message) drives these timeout transitions without a polling loop on the hot database.

Step 10 — ETA and Routing

Rather than building a proprietary mapping stack, ride-share platforms partner with external map providers (Google Maps, MapBox, OpenStreetMap) for road graph data, routing, and ETA calculation. A critical challenge is that GPS coordinates have no concept of roads — raw latitude/longitude must be snapped to the nearest road segment via map-matching before displaying a vehicle icon on-screen or computing ETA.

ETA computation layers

Road graph — a weighted directed graph where nodes are road intersections and edges are road segments with travel-time weights derived from historical speed data and real-time traffic.
Dijkstra / A* — shortest-path algorithms over the road graph; A* uses a geographic heuristic to prune the search space and run faster than Dijkstra in practice.
Traffic adjustment — real-time traffic feeds (Google Traffic, HERE) modulate edge weights continuously; a road that takes 2 min at noon may take 12 min at 5 pm.
ETA caching — popular city-pair ETAs are cached with short TTLs (30–60 s) since traffic changes frequently. Exact per-driver ETAs are computed fresh but use the cached road graph.

The matching service uses a lightweight ETA approximation (straight-line distance × road factor) for the scoring phase to avoid N routing API calls per batch. Once a match is confirmed, the full routing API computes the accurate ETA shown to the rider.

Step 11 — Surge Pricing

When demand exceeds supply in a geographic area, surge pricing raises fares to attract more drivers to the area and reduce demand from price-sensitive riders, rebalancing the market. Implementing surge pricing is a real-time data pipeline problem:

Computing the surge multiplier

Every 60–120 seconds, the pricing service reads from the spatial index: for each geohash cell at precision 5–6 (neighborhood level), count pending_requests (riders waiting) and available_drivers (AVAILABLE status).
Compute the supply/demand ratio: ratio = pending_requests / max(available_drivers, 1).
Map the ratio to a multiplier via a lookup table (e.g., ratio < 0.5 → 1.0×; 0.5–1.0 → 1.2×; 1.0–2.0 → 1.5×; >2.0 → 2.0×). Cap the multiplier (e.g., 3.0×) to avoid PR disasters during emergencies.
Write the multiplier to a Redis key per geohash cell with a short TTL (120 s). All pricing lookups hit Redis, not the spatial index.

edge case

What happens at geohash cell borders? A rider near the edge of a high-surge cell might be a few meters from a zero-surge cell. The system should aggregate multipliers from the rider's cell plus neighbors and take the weighted average, or simply use the rider's cell. The exact boundary behavior is a product policy decision — disclose surge clearly to riders regardless.

Upfront fare estimation

The fare shown before the rider books is an estimate based on: base_fare + (distance_km × per_km_rate + duration_min × per_min_rate) × surge_multiplier. The final fare uses actual GPS-tracked distance and time. If the actual fare significantly exceeds the estimate (e.g., detour by driver), rider protections cap the overage.

Step 12 — Map Services

Map services provide three critical functions: road graph data for routing, ETA computation, and the map tiles displayed in the app UI. Ride-share platforms partner with external providers rather than build these in-house — the investment required to maintain a global road graph (OpenStreetMap editing, traffic data partnerships, map rendering pipeline) is enormous and not a core competency.

Key integration points:

Geocoding / reverse geocoding — convert human-readable addresses ("Pier 39, San Francisco") to (lat, lng) and back. Used for pickup/dropoff entry in the rider app.
Directions API — given origin and destination, return turn-by-turn directions and ETA. Called once per confirmed match to show the driver their navigation route.
Map matching API — snap raw GPS traces to road segments. Called by the GPS pre-processing pipeline and by the map display layer to show a smooth driver icon.
Map tiles — raster or vector tiles for the app background map. Served via CDN from the map provider; the ride-share app does not re-host tiles.

Step 13 — Data Model

Key tables stored in SQL (trip records, user accounts) and in Cassandra (location history):

SQL

CREATE TABLE trips (
  id              BIGINT PRIMARY KEY,
  rider_id        BIGINT NOT NULL,
  driver_id       BIGINT,
  status          VARCHAR(32),   -- REQUESTED|DRIVER_ASSIGNED|...
  pickup_lat      DECIMAL(9,6),
  pickup_lng      DECIMAL(9,6),
  dropoff_lat     DECIMAL(9,6),
  dropoff_lng     DECIMAL(9,6),
  product_type    VARCHAR(16),   -- UBERX|COMFORT|BLACK|POOL
  surge_multiplier DECIMAL(4,2),
  estimated_fare_cents INT,
  actual_fare_cents    INT,
  requested_at    TIMESTAMP,
  completed_at    TIMESTAMP
);

CREATE TABLE drivers (
  id              BIGINT PRIMARY KEY,
  availability    VARCHAR(16),  -- OFFLINE|AVAILABLE|IN_TRIP
  product_types   TEXT,          -- comma-separated eligible types
  vehicle_plate   VARCHAR(16),
  rating          DECIMAL(3,2),
  last_lat        DECIMAL(9,6),
  last_lng        DECIMAL(9,6),
  last_geohash    VARCHAR(12),
  updated_at      TIMESTAMP
);

Cassandra CQL

-- GPS location history (write-heavy, time-series)
CREATE TABLE driver_location_history (
  driver_id   UUID,
  time_bucket TEXT,    -- e.g. '2024-05-24-18' (hourly bucket)
  recorded_at TIMESTAMP,
  lat         DOUBLE,
  lng         DOUBLE,
  heading     SMALLINT,
  PRIMARY KEY ((driver_id, time_bucket), recorded_at)
) WITH CLUSTERING ORDER BY (recorded_at DESC);
-- Partition = one driver × one hour; query = "all GPS for driver D in hour H"

Step 14 — Payment Processing

Three architectural options exist for payment:

Proprietary payment system — full control, high build cost and regulatory burden (PCI-DSS Level 1).
Direct third-party integration (PayPal, Block/Square) — simpler, but tied to one provider.
Unbranded payment gateway via processor partner — Uber's initial approach with Braintree; enables multi-provider flexibility and regional payment processor support without exposing the payment brand.

Payment deductions and driver payout calculations are handled asynchronously to keep them off the critical trip-completion path. When a trip reaches COMPLETED, the trip service publishes a TripCompleted event to Kafka. The payment service consumes this event, computes the final fare (actual distance × actual duration × surge multiplier), charges the rider's saved payment method via the processor API, and triggers driver payout (typically daily batch settlement rather than per-trip to reduce transfer fees).

Payment idempotency

The payment service passes a unique idempotency key (e.g., trip_id + "charge") to the payment processor on every charge call. If the network drops after the charge succeeds but before the response arrives, the retry with the same key is a no-op at the processor — the rider is not double-charged. A webhook from the processor confirms success or failure, which updates the trip record's payment status. A nightly reconciliation job catches any discrepancies between trip records and processor settlements.

edge case

What if the TripCompleted Kafka event is delivered twice (at-least-once semantics)? The payment service must dedup on trip_id — if a charge record already exists for the trip, skip the charge and return the existing result. This is the idempotent consumer pattern: every event consumer must handle re-delivery without side effects.

Step 15 — Scaling and Fault Tolerance

A global ride-share platform must survive regional outages, handle millions of concurrent connections, and route traffic intelligently across a distributed fleet of services.

Sharding the location index

The in-memory spatial index is sharded by city and product type — UberX in San Francisco is a completely separate data set from Comfort in London. Within a city, consistent hashing (Ringpop) distributes geohash prefixes across worker nodes. A driver moving between geohash cells triggers a location update that the hash ring routes to the correct shard automatically. When a node is added or removed, only the keys on the affected arc of the ring are remapped — minimizing data movement compared to naive modulo partitioning.

Multi-region deployment

Each major metro area runs its own matching, GPS ingestion, and spatial index cluster — a rider in Chicago never needs data from London. Cross-region communication is limited to user profile reads (cached aggressively) and global analytics pipelines. This geographic isolation keeps latency low, failure blast radius small, and regulatory data-residency requirements satisfied.

Circuit breakers and graceful degradation

If the routing API (Google Maps) is slow, the matching service falls back to Euclidean-distance estimates. Matches are slightly less accurate but still dispatched.
If the surge pricing service is down, fall back to a cached multiplier (or 1.0× as a safe default). Trips continue; pricing accuracy degrades temporarily.
If Cassandra is unavailable, the GPS pipeline buffers in Kafka and catches up when the store recovers. Only the historical record is delayed — the in-memory index is unaffected.

WebSocket for real-time tracking

During an active trip, the rider's app displays the driver's position moving in real time. This requires a push channel, not polling — polling at 1 s intervals from millions of active trips would generate ~millions of HTTP requests per second for position data. The solution:

A WebSocket gateway maintains persistent connections to active rider and driver apps.
When the GPS pipeline processes a location update for a driver currently in trip, it publishes the new position to a per-trip Pub/Sub channel.
The WebSocket gateway subscribes to that channel and pushes the update to the rider's connection.
The gateway is stateless except for the connection table; it can be horizontally scaled behind a load balancer that supports sticky sessions (or a distributed pub/sub mesh like Redis Pub/Sub).

Step 16 — Key Tradeoffs

Decision	Choice	Trade-off accepted
GPS delivery guarantee	At-most-once	Occasional missed packets; next update arrives in 5 s — negligible impact
Spatial index	In-memory + geohash	Memory cost; must be replicated; not durable — but speed is paramount for matching
Matching algorithm	Bipartite batch solve	Introduces 2–5 s batch window latency; globally better outcomes vs. greedy
Location persistence	Cassandra (eventual)	History is eventually consistent — acceptable; correctness needed for trip records (SQL)
Payment	Async (Kafka + processor)	Payment confirmed after trip; rare failure requires reconciliation job to catch
Map services	External provider	Cost per API call; dependency risk — but avoids enormous proprietary mapping investment
Surge computation	Periodic batch (60–120 s)	Up to 2-min lag on multiplier updates; acceptable given human decision timescales

takeaway

Ride-share design is dominated by two insights: GPS is a high-volume time-series stream (at-most-once, message queue + in-memory geohash index), and matching is a global optimization problem (bipartite graph), not a per-request greedy lookup. The trip state machine keeps lifecycle transitions correct and idempotent. Surge pricing is a periodic batch computation per geohash cell. Payment is intentionally async to keep it off the critical path. Everything else — consistent hashing, Cassandra, multi-region isolation — follows from handling those constraints at 200K GPS writes per second.

🎯 interview hot-takes

Why bipartite graph matching instead of greedy nearest-driver? Greedy assigns each rider to their nearest driver independently, which is locally optimal but globally suboptimal — it can leave nearby riders with much farther drivers. Batching requests and solving the bipartite assignment minimizes average pickup time across all riders.
Why at-most-once for GPS vs. at-least-once for payments? A dropped GPS packet has negligible impact (next arrives in 5 s); a dropped payment event has real financial consequences. Delivery guarantees should match the business cost of a lost message.
Why partition driver location data by city and product type? UberX and UberPool draw from separate driver pools; cross-city supply is irrelevant. Partitioning isolates load by market and makes matching queries fast within each segment.
How do you prevent a driver from being double-matched? The matching service atomically flips the driver's status from AVAILABLE to IN_TRIP in the spatial index before sending the offer. Any concurrent match attempt reads the updated status and skips that driver.
How does the surge multiplier get computed? Every 60–120 s, count pending_requests and available_drivers per geohash cell; compute their ratio; map to a multiplier table; write to Redis with a short TTL. Pricing reads are Redis hits — never a direct spatial index query.

Design a Ride-share App

Step 1 — Clarify Requirements

Functional requirements

Non-functional requirements

Step 2 — Capacity Estimation

GPS write load

Trip and matching load

Storage

Step 3 — API Design

Step 4 — Driver and Rider Registration

Step 5 — GPS Ingestion Pipeline

GPS data pre-processing

Step 6 — Geospatial Indexing

Geohash

Google S2 cells

QuadTree

Step 7 — Location Storage Architecture

Step 8 — The Matching Algorithm

Matching pipeline in detail

Step 9 — Trip State Machine

Timeout handling

Step 10 — ETA and Routing

ETA computation layers

Step 11 — Surge Pricing

Computing the surge multiplier

Upfront fare estimation

Step 12 — Map Services

Step 13 — Data Model

Step 14 — Payment Processing

Payment idempotency

Step 15 — Scaling and Fault Tolerance

Sharding the location index

Multi-region deployment

Circuit breakers and graceful degradation

WebSocket for real-time tracking

Step 16 — Key Tradeoffs