Design a URL Shortener

Problem Statement

Design a URL shortening service like Bit.ly. Users should be able to submit a long URL and receive a short, unique URL back. Anyone with the short URL should be redirected to the original URL.
The service should handle a large volume of reads (redirects) efficiently.

Time: 45 minutes

Functional Requirements

Given a long URL → generate a short URL
Given a short URL → redirect to the original URL
Short URLs should be unique
(Optional) Custom aliases, expiry time

Non-Functional Requirements

Low latency on reads (redirects are read-heavy)
High availability
Short URLs must not collide
Scale: 100M URLs generated/day, 10:1 read/write ratio

Template to Cover

1. Requirements Clarification    (~5 min)
2. Capacity Estimation           (~5 min)
3. API Design                    (~5 min)
4. High Level Design             (~10 min)
5. Deep Dive: Key Components    (~15 min)
6. Bottlenecks & Trade-offs      (~5 min)

Capacity Numbers to Work With

Metric	Value
Writes/day	100M
Reads/day	1B
URL length (short)	7 characters
Data retention	5 years

High Level Design: My Answers

Service Separation

The idea is to have two different services: one is the GET service which fetches the URLs and the other is the shortener service which shortens the URL. The idea with having two different services is to make sure that they can be independently scaled. Since we are heavily skewed in read to write ratio, we would ideally want to separate them so that the read requests do not bog down the writes.

Interviewer note: Strong reasoning. Independent scaling is the core justification and you stated it clearly.

Functional Requirements Clarification

Given a long URL, I am always making sure that there is only one single unique short code. Multiple same URLs will result in the same code.
Giving the user the ability to add a custom alias.
There is a default expiry time of one year. If a user wants a custom expiry that should also be allowed.
Short URL uniqueness will be enforced by a hashing strategy and a collision prevention strategy, adding salt and regenerating the hash on collision.
Short URL length is 7 characters, which gives 62^7 ≈ 3.5 trillion unique URLs, fairly decent for what we want to track today.
Data retention is 5 years. A background job will clean up expired URLs separately.

Interviewer note: Good clarifications. Calling out deduplication as a deliberate design choice (not just a side effect) is the right framing. Expiry handled both as a default and user-configurable.

Short Code Generation Algorithm

The shortener service receives the request, performs a SHA-256 hash of the long URL, converts the hash bytes to an integer, does a base62 encoding, and then takes the first 7 characters of the encoded value.

To prevent hash collisions mapping different URLs to the same short code, we add a salt (appending the attempt number) and regenerate the hash. If we reach the retry limit, we are forced to move to 8 characters which further increases the scope of viable short codes.

An alternative would be a distributed counter in Redis, since Redis is single-threaded, all services get a unique number per request which translates to a unique short code. The issue with that approach is that codes become sequential and enumerable, so obfuscating and preventing reverse engineering would require a separate design strategy.

Interviewer note: The hash approach is well reasoned. Calling out the Redis counter alternative and its trade-off (enumerability, obfuscation complexity) unprompted shows depth. Study the obfuscation strategy (bijective functions, Feistel cipher) further.

API Design & Redirect Strategy

Two APIs: the GET API and the shorten API. The first layer is the application load balancer which redirects the request to the right service.

For redirects, I return a 302 (not 301) because I do not want the response cached in the browser. Since I want click analytics, I always want to drag the user through my server first before redirecting them to the destination page. 301 is “Moved Permanently” and gets cached by browsers, which would make analytics permanently blind for repeat visits from the same browser.

Interviewer note: Initially stated 301 but corrected immediately when probed. The reasoning was right all along: 302 is correct. This is a common gotcha; knowing why matters more than memorising the code.

Caching Strategy

Since I want to handle close to a billion reads per day, I need an in-memory cache (Redis) where the short code is the key and the long URL is the value. The load balancer routes GET requests to the GET service, which first checks Redis. If found, return the 302 redirect directly. If not, fetch from the database, populate Redis, and then return the response. This is a cache-aside (lazy loading) pattern.

Storage estimate: each short code is 7 characters ≈ 7 bytes, each URL ≈ 2KB. Even if we were to cache all 3 trillion possible URLs that would be ~3TB, not unheard of on a single SSD. In practice we cache a fraction of that with LRU eviction.

Interviewer note: Cache-aside pattern correctly identified. Storage estimate is reasonable. Note: the 3TB is the theoretical maximum for the entire keyspace; the realistic working set is much smaller. LRU is the right eviction policy here.

Database Design & Indexing

Using Postgres. Write throughput is 100M/day ≈ 1000 writes/second, which Postgres handles fine (capable of ~10,000 writes/second with good indexing).

Indexing strategy:

Short code column: indexed for fast GET lookups. Hurts write efficiency slightly but that is acceptable.
Long URL column: not indexed. URLs can be 2000+ characters and Postgres B-tree indexes have a key size limit. Instead, deduplication is handled naturally, since SHA-256 is deterministic, the same URL always produces the same short code candidate on attempt 0. We look up by short code (indexed), compare the stored long URL, and deduplication falls out of the existing collision detection logic.
Unique constraint on short code column: enforces uniqueness at the database level, acting as a safety net for race conditions.

Interviewer note: Good recovery on the long URL indexing question. The insight that deterministic hashing makes a separate long URL index unnecessary is sharp. Unique constraint as a DB-level safety net is the right instinct.

High Availability

At the edge: Adding a CDN and edge network for better delivery. Significant copies of data at different edge locations reduce latency and handle regional traffic. This introduces complexity: CDN cache invalidation and asynchronous replication management. Most read traffic gets handled at the edge.

Trade-off with analytics: CDN caching conflicts with the goal of tracking click analytics through the server. The resolution is to move analytics to the edge, using something like CloudFront access logs with live stream logging, capturing analytics from there rather than from server-side requests. This is a real-world pattern but adds infrastructure complexity (log streaming pipeline, e.g. Kinesis).

At the data layer: Postgres has native WAL-based streaming replication. What it lacks natively is automated failover; tools like Patroni or managed services like AWS RDS Multi-AZ handle that. Multiple read replicas serve the read-heavy GET service. Geo-distributed replicas reduce regional latency. Reads always go to replicas, writes go to primary.

Interviewer note: CDN answer initially conflated edge caching with DB replication. Worth being precise about what lives where. The analytics vs. caching conflict was identified and resolved correctly. Postgres replication knowledge was partial: WAL streaming replication is native, automated failover is not. Study Patroni and RDS Multi-AZ specifically.

Replication Lag & Cache Warm-up

Risk: A user shortens a URL, the write goes to primary. A read replica that hasn’t replicated yet serves a GET request for that short code, resulting in a 404.

Mitigation: Write-through cache warm-up at creation time. When a new short code is created and returned to the user, simultaneously warm up Redis. Since the read path hits Redis first, replication lag becomes irrelevant: the short code is in cache before anyone can share the link.

Sequencing: DB write happens first. Only after a successful DB write do we warm up Redis. So DB fail + Redis success is impossible. However, if the DB write succeeds but the Redis warm-up fails, the short code is not in cache; a cache miss falls through to the replica which may not have replicated yet.

Trade-off: 100% availability cannot be guaranteed here. The mitigation is an async retry queue for the Redis warm-up: if it fails, enqueue the short code for retry. The failure window is seconds at most. Acceptable trade-off.

Interviewer note: Identifying the replication lag problem and solving it with write-through warm-up is strong. Correctly sequenced DB-first, Redis-second. Honest acknowledgement of the residual failure window and async retry as mitigation is the right answer.

Areas to Study Further

Postgres WAL replication: how streaming replication works, sync vs async, and what Patroni does specifically for automated failover.
CDN architecture: edge caching mechanics, cache invalidation strategies, and log streaming pipelines (Kinesis, Flink) for edge analytics.
Database sharding: partition key strategy for ~180B rows (100M/day × 5 years). How do you shard Postgres at this scale?
Rate limiting: never discussed. How do you prevent a single user from generating millions of short codes?
Distributed counter obfuscation: the Redis counter alternative deserves deeper study including bijective functions and Feistel cipher for making sequential counters non-enumerable.
Cryptographic primitives: SHA-256 internals (Feistel structure, round functions, sigma operations) revisited from your undergrad cryptography class.

Implementation

from dataclasses import dataclass, field
from typing import List
import uuid
import hashlib
from datetime import timedelta, datetime
from collections import deque

@dataclass
class ShortCode:
    long_url: str
    shortened_code: str #need to add a unique constraint
    created_on: datetime
    expiry_on: datetime
    custom_alias: str

class ShortnerService():
    MAX_RETRIES = 5
    def __init__(self):
        self.short_codes = {}

    def shortenURL(self, url, expiry_on, custom_alias):
        if custom_alias is None:
            for attempt in range(self.MAX_RETRIES):
                salted_url = url + str(attempt)
                num = int.from_bytes(hashlib.sha256(salted_url.encode()).digest())
                encoded = self.base62_encode(num)
                shortened_code = encoded[:7]
                if not expiry_on:
                    expiry_on = datetime.now() + timedelta(days=365) 
                new_code = ShortCode(url, shortened_code, datetime.now(), expiry_on, "")
                if shortened_code in self.short_codes and url == self.short_codes[shortened_code].long_url:
                    return self.short_codes[shortened_code]
                if shortened_code not in self.short_codes:
                    self.short_codes[shortened_code] = new_code
                    return new_code
            raise Exception("Max Retries exceeded")
        else:
            if custom_alias not in self.short_codes:
                if not expiry_on:
                    expiry_on = datetime.now() + timedelta(days=365) 
                new_code = ShortCode(url, custom_alias, datetime.now(), expiry_on, custom_alias)
                self.short_codes[custom_alias] = new_code
                return new_code
            
    def getUrl(self, code):
        if code in self.short_codes and self.short_codes[code].expiry_on > datetime.now():
            return self.short_codes[code].long_url
    
    def base62_encode(self, num: int) -> str:
        alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        if num == 0:
            return alphabet[0]
        result = []
        while num > 0:
            remainder = num % 62
            num = num // 62
            result.append(alphabet[remainder])
        return ''.join(reversed(result))