System Design Cases: 10 Classic Interview Problems

Why it matters: System Design makes up about half of the technical interview at the middle 2 level and above. Companies want to see whether you can design a system from scratch: ask the right questions, estimate the load, choose an architecture, and explain the trade-offs. In 2026 the questions are largely standardized: there are 10-15 classic cases (URL shortener, chat, feed, rate limiter) that keep coming up. Preparing for them is the foundation. Once you understand these 10, you can design almost any system, because together they cover the key building blocks (caching, sharding, async processing, queues, consistency).

Contents

Concept: how to approach a System Design interview
Production practices: 10 cases
Gotchas: what is often missed
Real cases: real-world architectures from industry
Interview questions (general principles)
Practice
Sources

1. Concept: how to approach a System Design interview

1.1 Interview structure (45-60 min)

Standard flow:

Requirements clarification (5-10 min): ask questions, do not jump straight into the design.
Capacity estimation (5 min): RPS, storage, bandwidth.
API design (5 min): REST/gRPC endpoints.
Data model (5 min): SQL/NoSQL schema.
High-level architecture (10 min): boxes-and-arrows diagram.
Deep dive (15-20 min): focus on 1-2 components.
Bottlenecks / scale (5 min): where the design will hit its limits and how to fix that.

1.2 Requirements: functional vs non-functional

Functional (what the system does):

Create a URL → return a short link.
Click counter, analytics.

Non-functional (how well it does it):

100M URLs/day, 10:1 read/write.
Availability 99.99%.
Latency P99 < 100ms.
Durability: no data loss.

1.3 Capacity estimation

Basic numbers (back-of-envelope):

1M = 10^6.
1B = 10^9.
86,400 seconds in a day ≈ 100k (10^5).

Storage:

ASCII character: 1 byte; UTF-8 character: up to 4 bytes.
int8: 1 byte; int64: 8 bytes.
Typical table row: 100-1000 bytes.

Throughput:

HDD: 100 IOPS.
SSD: 100k IOPS.
Network 1Gbps = 125 MB/s.

Latency:

L1 cache: 0.5ns.
RAM: 100ns.
SSD: 100μs.
Network round-trip same DC: 0.5ms.
Cross-region: 50-200ms.

1.4 Architectural patterns

Caching layers:

Client cache (browser).
CDN (CloudFlare, CloudFront).
API Gateway cache.
Application cache (in-memory, Redis).
DB cache (query result).

Sharding strategies:

Range-based (id < 1M → shard 1).
Hash-based (hash(id) % N).
Geographic (US → shard A, EU → shard B).
Consistent hashing: minimal redistribution when a shard is added or removed.

Replication:

Leader-follower (Postgres replica).
Multi-leader (Postgres BDR, CouchDB).
Leaderless (Dynamo, Cassandra).

Async patterns:

Queue (Kafka, RabbitMQ, SQS).
Pub/Sub.
CDC (Change Data Capture).

Consistency:

Strong (CP) — Postgres, etcd.
Eventual (AP): Cassandra, DynamoDB with default (eventually consistent) reads.
Bounded staleness — Cosmos DB.

1.5 Trade-offs to discuss

Choice	Pros	Cons
SQL	ACID, joins	Vertical scale
NoSQL (KV)	Horizontal scale, simple	No joins, no transactions
Sync API	Simple, immediate	Tight coupling
Async (queue)	Loose coupling, retry	Complexity, eventual
Cache aside	Simple	Stale data
Write-through cache	Consistent	Slower writes
Sharding	Scale	Cross-shard queries hard
Read replicas	Scale reads	Lag

2. Production practices: 10 cases

2.1 Case 1: URL Shortener (bit.ly)

Requirements:

Functional: shorten long URL → short link, redirect, analytics.
Non-functional: 100M URLs/day, 10:1 read/write, 99.99% availability, P99 < 100ms.

Capacity:

Writes: 100M / 86400 ≈ 1200 RPS.
Reads: 12000 RPS (10x writes).
Storage: 100M * 500 bytes (URL + metadata) ≈ 50 GB/day, 18 TB/year.
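
The arithmetic above can be reproduced with a few lines of Go. This is only a sketch: the `Estimate` type and the `estimate` function are hypothetical names, and the inputs are the numbers stated in the requirements (100M writes/day, 10:1 read/write, 500 bytes per record).

```go
package main

import "fmt"

const secondsPerDay = 86_400

// Estimate holds back-of-envelope numbers for a write-once, read-many service.
type Estimate struct {
	WriteRPS     float64
	ReadRPS      float64
	BytesPerDay  float64
	BytesPerYear float64
}

// estimate derives throughput and storage from daily writes, the read/write
// ratio, and the average record size in bytes.
func estimate(writesPerDay, readWriteRatio, bytesPerRecord float64) Estimate {
	writeRPS := writesPerDay / secondsPerDay
	bytesPerDay := writesPerDay * bytesPerRecord
	return Estimate{
		WriteRPS:     writeRPS,
		ReadRPS:      writeRPS * readWriteRatio,
		BytesPerDay:  bytesPerDay,
		BytesPerYear: bytesPerDay * 365,
	}
}

func main() {
	e := estimate(100e6, 10, 500)
	fmt.Printf("writes: %.0f RPS, reads: %.0f RPS\n", e.WriteRPS, e.ReadRPS)
	fmt.Printf("storage: %.0f GB/day, %.2f TB/year\n", e.BytesPerDay/1e9, e.BytesPerYear/1e12)
}
```

The exact write rate is about 1157 RPS; rounding it up to 1200 (and reads to 12000) is the usual interview shortcut.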

API:

POST /shorten
  Body: { "url": "https://very-long-url" }
  Resp: { "short": "https://sho.rt/abc123" }

GET /:code
  Redirect (301 or 302) to the original URL; browsers cache 301, so prefer 302 if every click must reach the server for analytics

Data model (Postgres):

CREATE TABLE urls (
    code VARCHAR(7) PRIMARY KEY,
    original TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    user_id BIGINT,
    expires_at TIMESTAMP
);
CREATE TABLE clicks (
    id BIGSERIAL,
    code VARCHAR(7),
    timestamp TIMESTAMP,
    user_agent TEXT,
    ip INET,
    referrer TEXT
);

Architecture:

[Client] → [CDN/LB] → [API Gateway] → [App Servers]
                                          ↓
                              ┌───────────┼───────────┐
                              v           v           v
                          [Cache: Redis] [DB: Postgres] [Kafka]
                                                          ↓
                                                  [Analytics Pipeline]
                                                          ↓
                                                  [ClickHouse]

Short code generation:

Base62 (a-z, A-Z, 0-9): 62 characters.
7 chars: 62^7 ≈ 3.5 trillion URLs.
Generation: counter-based (centralized counter + base62) or hash + collision check.
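
A minimal sketch of the counter-based variant, assuming the counter value is already allocated elsewhere (for example by a database sequence). The function names and the alphabet order are illustrative choices, not a fixed standard.

```go
package main

import (
	"fmt"
	"strings"
)

// The symbol order is arbitrary; any fixed permutation of 62 symbols works.
const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// encodeBase62 turns a numeric ID into a short code.
func encodeBase62(n uint64) string {
	if n == 0 {
		return "0"
	}
	var buf []byte
	for n > 0 {
		buf = append(buf, alphabet[n%62])
		n /= 62
	}
	// Digits were produced least-significant first, so reverse them.
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return string(buf)
}

// decodeBase62 is the inverse; it rejects characters outside the alphabet.
func decodeBase62(code string) (uint64, error) {
	var n uint64
	for i := 0; i < len(code); i++ {
		idx := strings.IndexByte(alphabet, code[i])
		if idx < 0 {
			return 0, fmt.Errorf("invalid base62 character %q", code[i])
		}
		n = n*62 + uint64(idx)
	}
	return n, nil
}

func main() {
	for _, id := range []uint64{0, 61, 62, 3521614606207} {
		code := encodeBase62(id)
		back, _ := decodeBase62(code)
		fmt.Println(id, code, back)
	}
}
```

The last ID in the demo is 62^7 - 1, the largest value that still fits in 7 characters. Sequential counters produce guessable codes, which is worth mentioning as a trade-off against the hash-based approach.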

Caching:

Hot URLs (Pareto: 20% URLs = 80% reads) в Redis.
Cache key: url:CODE, value: original_url.
TTL: 24h, evict LRU.
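
The read path with this cache can be sketched as cache-aside. In the sketch below both the cache and the database are plain in-memory maps standing in for Redis and Postgres; `URLStore` and `Resolve` are hypothetical names, and TTL and LRU eviction are left out to keep it short.

```go
package main

import "fmt"

// URLStore pairs a cache with a backing store. Both are plain maps here,
// standing in for Redis and Postgres.
type URLStore struct {
	cache   map[string]string
	db      map[string]string
	dbReads int // counts how often the backing store was hit
}

// Resolve implements cache-aside: try the cache, fall back to the DB on a
// miss, then populate the cache so the next read is served from memory.
func (s *URLStore) Resolve(code string) (string, bool) {
	if url, ok := s.cache["url:"+code]; ok {
		return url, true
	}
	s.dbReads++
	url, ok := s.db[code]
	if !ok {
		return "", false
	}
	s.cache["url:"+code] = url
	return url, true
}

func main() {
	s := &URLStore{
		cache: map[string]string{},
		db:    map[string]string{"abc123": "https://example.com/very-long-url"},
	}
	for i := 0; i < 3; i++ {
		url, _ := s.Resolve("abc123")
		fmt.Println(url, "db reads so far:", s.dbReads)
	}
}
```

Only the first lookup reaches the database; the remaining reads are served from the cache.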

Analytics:

Write click → Kafka (async, does not block the redirect).
Pipeline: Kafka → ClickHouse for time-series data.
Real-time counts: Redis counters (INCR) for totals, HyperLogLog for approximate unique visitors.

Bottlenecks:

DB writes: shard by code prefix.
Read hotspots: cache aside.
Hot URLs: pre-warm.

2.2 Case 2: Distributed Rate Limiter

Requirements:

Per user/IP/API key rate limit.
10M users, 1000 requests per minute limit.
P99 < 10ms.
Multi-region.

Algorithms:

Token bucket:

A bucket holds up to N tokens and refills at R tokens per second.
Request consumes 1 token.
Burst allowed up to bucket size.

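// Requires imports "sync" and "time"; min is a builtin since Go 1.21.
type TokenBucket struct {
    mu         sync.Mutex // guards the fields below; Allow may be called concurrently
    capacity   float64    // maximum burst size
    tokens     float64    // tokens currently available
    refillRate float64    // tokens added per second
    last       time.Time  // time of the previous refill
}

// Allow refills the bucket for the elapsed time, then tries to take one token.
func (b *TokenBucket) Allow() bool {
    b.mu.Lock()
    defer b.mu.Unlock()
    now := time.Now()
    elapsed := now.Sub(b.last).Seconds()
    b.tokens = min(b.capacity, b.tokens+elapsed*b.refillRate)
    b.last = now
    if b.tokens >= 1 {
        b.tokens--
        return true
    }
    return false
}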

Leaky bucket:

Queue requests, process at fixed rate.
Smooths traffic, no burst.

Sliding window log:

Stores the timestamps of requests within the window.
Number of timestamps in the window vs the limit = the check.
Memory: O(N) per user.
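
A minimal single-process sketch of the sliding window log. The `SlidingLog` type and the `replay` helper are hypothetical names; a distributed version would typically keep the timestamps in a Redis sorted set instead of a slice.

```go
package main

import (
	"fmt"
	"time"
)

// SlidingLog keeps the timestamps of accepted requests within the window.
type SlidingLog struct {
	limit  int
	window time.Duration
	hits   []time.Time // accepted request times, oldest first
}

// Allow drops timestamps that have left the window, then admits the request
// only if fewer than limit remain.
func (s *SlidingLog) Allow(now time.Time) bool {
	cutoff := now.Add(-s.window)
	i := 0
	for i < len(s.hits) && !s.hits[i].After(cutoff) {
		i++
	}
	s.hits = s.hits[i:]
	if len(s.hits) >= s.limit {
		return false
	}
	s.hits = append(s.hits, now)
	return true
}

// replay feeds requests at the given offsets (milliseconds from an arbitrary
// start) through a fresh limiter and returns each decision.
func replay(limit, windowMs int, offsetsMs []int) []bool {
	s := &SlidingLog{limit: limit, window: time.Duration(windowMs) * time.Millisecond}
	start := time.Unix(0, 0)
	out := make([]bool, 0, len(offsetsMs))
	for _, ms := range offsetsMs {
		out = append(out, s.Allow(start.Add(time.Duration(ms)*time.Millisecond)))
	}
	return out
}

func main() {
	// Limit 2 per second: the third request is rejected, and the fourth is
	// admitted once the first timestamp has left the window.
	fmt.Println(replay(2, 1000, []int{0, 100, 200, 1050}))
}
```

Passing `now` explicitly instead of calling `time.Now()` inside `Allow` keeps the limiter deterministic and easy to test.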

Sliding window counter (hybrid):

Current + previous bucket counts.
Weighted average.
O(1) memory.
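
The weighted average can be sketched as follows. This is one common formulation, not the only one: the previous window's count is weighted by the fraction of it that still overlaps the sliding window. `WindowCounter` and `replay` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// WindowCounter approximates a sliding window with two fixed-window counters.
type WindowCounter struct {
	limit    float64
	window   time.Duration
	curStart time.Time // start of the current fixed window
	cur      float64   // requests accepted in the current window
	prev     float64   // requests accepted in the previous window
}

func (w *WindowCounter) Allow(now time.Time) bool {
	start := now.Truncate(w.window)
	if !start.Equal(w.curStart) {
		if start.Sub(w.curStart) == w.window {
			w.prev = w.cur // rolled over by exactly one window
		} else {
			w.prev = 0 // idle for longer than one window
		}
		w.cur = 0
		w.curStart = start
	}
	// Fraction of the current window that has already elapsed.
	elapsed := float64(now.Sub(start)) / float64(w.window)
	estimate := w.prev*(1-elapsed) + w.cur
	if estimate+1 > w.limit {
		return false
	}
	w.cur++
	return true
}

// replay feeds requests at the given offsets (milliseconds from the Unix
// epoch) through a fresh limiter and returns each decision.
func replay(limit, windowMs int, offsetsMs []int) []bool {
	w := &WindowCounter{limit: float64(limit), window: time.Duration(windowMs) * time.Millisecond}
	start := time.Unix(0, 0)
	out := make([]bool, 0, len(offsetsMs))
	for _, ms := range offsetsMs {
		out = append(out, w.Allow(start.Add(time.Duration(ms)*time.Millisecond)))
	}
	return out
}

func main() {
	// 10 per minute: fill the first window, then probe halfway into the next.
	offsets := []int{}
	for i := 0; i <= 10; i++ {
		offsets = append(offsets, i*1000)
	}
	for i := 0; i < 6; i++ {
		offsets = append(offsets, 90000)
	}
	fmt.Println(replay(10, 60000, offsets))
}
```

Halfway into the second window the previous 10 requests still count as 5, so only 5 more are admitted. The estimate assumes requests were spread evenly over the previous window, which is the price paid for O(1) memory.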

Architecture (distributed):

[Client] → [API Gateway] → [Rate Limiter Middleware] → [Backend]
                                  ↓
                          [Redis Cluster]
                          (atomic INCR + EXPIRE)

Redis Lua script (atomic):

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local current = redis.call('INCR', key)
if current == 1 then
    redis.call('EXPIRE', key, window)
end
if current > limit then
    return 0
end
return 1

Per-user/IP/key:

Key format: rate:USER_ID:ENDPOINT.
Separate limits per tier (free: 100/min, paid: 10,000/min).

Edge case: Redis down → fail open (allow) or fail closed (deny)?

Critical API call (auth): fail closed.
Bulk endpoint: fail open.

Response:

429 Too Many Requests.
Retry-After: 30 header.
X-RateLimit-Remaining: 0 header.

2.3 Case 3: Distributed Cache (Memcached-like)

Requirements:

Key-value store, in-memory.
100M keys, 1KB average value.
Total: 100 GB RAM.
1M QPS.
P99 < 1ms.

Architecture:

[Client] → [Consistent Hash Ring] → [Cache Node 1] [Cache Node 2] [Cache Node 3]
                                          (LRU)         (LRU)         (LRU)

Consistent hashing:

Hash ring from 0 to 2^32 - 1.
Each node is mapped to multiple positions (virtual nodes, 100-200 per node).
The key is hashed; the owner is the next node clockwise.
Adding or removing a node moves only about K/N keys (K keys, N nodes).

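// Requires imports "fmt", "hash/crc32", and "sort".
type Ring struct {
    nodes  map[uint32]string // virtual-node hash → physical node
    sorted []uint32          // virtual-node hashes in ascending order
}

// hash can be any uniform 32-bit hash; CRC32 is an assumption made here.
func hash(s string) uint32 { return crc32.ChecksumIEEE([]byte(s)) }

// AddNode places `virtual` points for the node on the ring.
// r.nodes must be initialized with make before the first call.
func (r *Ring) AddNode(node string, virtual int) {
    for i := 0; i < virtual; i++ {
        h := hash(fmt.Sprintf("%s-%d", node, i))
        r.nodes[h] = node
        r.sorted = append(r.sorted, h)
    }
    sort.Slice(r.sorted, func(i, j int) bool { return r.sorted[i] < r.sorted[j] })
}

// GetNode returns the first node clockwise from the key's hash.
func (r *Ring) GetNode(key string) string {
    if len(r.sorted) == 0 {
        return "" // empty ring
    }
    h := hash(key)
    idx := sort.Search(len(r.sorted), func(i int) bool { return r.sorted[i] >= h })
    if idx == len(r.sorted) {
        idx = 0 // wrap around to the start of the ring
    }
    return r.nodes[r.sorted[idx]]
}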

LRU eviction:

Doubly-linked list + hash map.
O(1) get/put/evict.
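
A compact sketch of that structure using the standard `container/list` package. The `LRU` type is illustrative and not goroutine-safe; a real cache node would add locking or shard the map.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key, value string
}

// LRU combines a doubly-linked list (recency order) with a map (O(1) lookup).
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key → node in order
}

func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value and marks the key as most recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates a key and evicts the least recently used entry
// when the cache grows past its capacity.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewLRU(2)
	c.Put("a", "1")
	c.Put("b", "2")
	c.Get("a")      // "a" becomes most recently used
	c.Put("c", "3") // evicts "b", the least recently used key
	_, okA := c.Get("a")
	_, okB := c.Get("b")
	fmt.Println("a present:", okA, "b present:", okB)
}
```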

TTL:

Lazy expiration (check on get).
Background sweeper (sample-based, like Redis).

Replication:

Primary + N replicas.
Read from any, write to primary.
Or consistent hash to N nodes.

Bottlenecks:

Hot keys → replicate to multiple nodes.
Network — colocate cache and app.

2.4 Case 4: Notification System

Requirements:

Channels: email, SMS, push.
1M notifications/day per channel.
Rate limiting per user (max 10 emails/day).
Templates.
Delivery guarantees.

Architecture:

[Service A] [Service B]  → [Notification API]
[Service C]                       ↓
                          [Kafka: notifications]
                                  ↓
                  ┌───────────────┼───────────────┐
                  v               v               v
            [Email Worker]  [SMS Worker]   [Push Worker]
                  ↓               ↓               ↓
              [SendGrid]      [Twilio]        [FCM/APNS]
                  ↓               ↓               ↓
            [Status]         [Status]        [Status]
                  └───────────────┴───────────────┘
                                  ↓
                          [Notifications DB]

API:

POST /notifications
  Body: {
    "user_id": "...",
    "template": "order_shipped",
    "channels": ["email", "push"],
    "params": { "order_id": "..." }
  }

Templates:

Stored in a database or in files.
Multi-language.
HTML + plain text.

Rate limiting per user:

Aggregate in Redis: notif:USER_ID:email:DAY → counter.
Skip if exceeds.

Idempotency:

Notification ID = hash(user_id + template + params + day).
Deduplicate in the DB (unique constraint on the ID).
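
One way to derive such an ID is sketched below. The choice of SHA-256, the field quoting, and the `notificationID` name are assumptions; the important properties are that the result does not depend on map iteration order and that field boundaries are unambiguous.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// notificationID derives a deterministic ID from the fields that define
// "the same notification". The day is passed as a string such as "2026-05-01".
func notificationID(userID, template string, params map[string]string, day string) string {
	// Sort parameter names so that map iteration order cannot change the hash.
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	// %q quotes each field, so "ab"+"c" and "a"+"bc" hash differently.
	fmt.Fprintf(h, "%q%q%q", userID, template, day)
	for _, k := range keys {
		fmt.Fprintf(h, "%q=%q", k, params[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	params := map[string]string{"order_id": "42", "carrier": "DHL"}
	fmt.Println(notificationID("user-1", "order_shipped", params, "2026-05-01"))
	fmt.Println(notificationID("user-1", "order_shipped", params, "2026-05-02"))
}
```

The same notification on the same day always yields the same ID, so a unique constraint on that column rejects duplicates; the next day the ID changes and the notification may be sent again.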

Retry logic:

Exponential backoff: 1m, 5m, 30m, 4h, 24h.
After 5 retries — dead letter queue.

Delivery tracking:

Webhooks from SendGrid/Twilio.
Update notification status in DB.
Metrics: delivery rate, bounce rate.

2.5 Case 5: Chat System (WhatsApp/Telegram)

Requirements:

1B users.
100M concurrent connections.
Messages delivered <1s.
Group chats of up to 1000 members.
E2EE optional.

Architecture:

[Mobile/Web Client]
        ↓ WebSocket
[Gateway (LB)] → sticky sessions
        ↓
[WebSocket Server] (presence + message receive)
        ↓
[Kafka: messages]
        ↓
[Message Processor]
        ↓
[Cassandra: messages] [Redis: online users]
        ↓
[Push notifier] → [APNS/FCM]

Connection management:

WebSocket persistent.
100M connections → ~1000 servers (100k connections each).
Sticky session (client → same server).

Message storage (Cassandra):

CREATE TABLE messages (
    chat_id UUID,
    timestamp TIMESTAMP,
    message_id UUID,
    sender_id UUID,
    body TEXT,
    PRIMARY KEY ((chat_id), timestamp, message_id)
) WITH CLUSTERING ORDER BY (timestamp DESC);

Partition by chat_id, sorted by timestamp → fast fetch recent messages.

Online presence (Redis):

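SADD online_users USER_ID on connect.
SREM online_users USER_ID on disconnect.
Heartbeat: set members cannot carry their own TTL, so track liveness with a per-user key refreshed on every heartbeat (for example SET presence:USER_ID 1 EX 30).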

Group chats (1000 members):

Fan-out on write: copy the message to 1000 inboxes (heavy write).
Fan-out on read: store the message once, members fetch it (heavy read).
Hybrid: fan-out on write for small groups, fan-out on read for large groups.

E2EE (Signal protocol):

The client encrypts the message with the recipient's public key.
The server only relays ciphertext (it never sees plaintext).
Forward secrecy through ephemeral keys.

Notification:

If the user is offline → push through APNS/FCM.

2.6 Case 6: Twitter Feed

Requirements:

300M users, 50M tweets/day.
Feed: tweets from followed users in reverse chronological order.
P99 feed < 200ms.

Two approaches:

Fan-out write (push):

User tweets → write to followers’ timelines.
Pros: read fast.
Cons: hot celebrities (Justin Bieber 100M followers → 100M writes).

Fan-out read (pull):

User tweets → just store.
Read feed: query last tweets of followed users.
Pros: write fast.
Cons: reads are slow for users who follow 1000 people.

Hybrid (what Twitter actually uses):

Normal users — fan-out write.
Celebrities (>1M followers) — fan-out read (special handling).
Feed = merge(precomputed timeline + celebrities tweets).
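
The merge step can be sketched as a two-pointer merge of two lists that are each already sorted newest-first. The `Tweet` fields and the `mergeFeed` name are hypothetical; in the design above the first list would come from the Redis timeline and the second from a query over followed celebrities.

```go
package main

import "fmt"

type Tweet struct {
	ID        int64
	AuthorID  int64
	Timestamp int64 // Unix seconds
}

// mergeFeed merges two slices that are each sorted newest-first and returns
// at most limit tweets, also newest-first. limit must be non-negative.
func mergeFeed(timeline, celebrity []Tweet, limit int) []Tweet {
	feed := make([]Tweet, 0, limit)
	i, j := 0, 0
	for len(feed) < limit && (i < len(timeline) || j < len(celebrity)) {
		takeTimeline := j >= len(celebrity) ||
			(i < len(timeline) && timeline[i].Timestamp >= celebrity[j].Timestamp)
		if takeTimeline {
			feed = append(feed, timeline[i])
			i++
		} else {
			feed = append(feed, celebrity[j])
			j++
		}
	}
	return feed
}

func main() {
	timeline := []Tweet{{1, 10, 100}, {2, 11, 80}, {3, 10, 60}}
	celebrity := []Tweet{{4, 99, 90}, {5, 99, 70}}
	for _, t := range mergeFeed(timeline, celebrity, 4) {
		fmt.Println(t.ID, t.Timestamp)
	}
}
```

Because both inputs are pre-sorted, the merge costs O(limit) per request regardless of how many followers the celebrity has.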

Architecture:

Tweet → [Tweet Service] → [Tweets DB]
                              ↓
                        [Fan-out service]
                              ↓
                     ┌────────┴────────┐
                     v                 v
           [Normal: write inbox]  [Celeb: skip]
                     ↓
              [Redis Timeline]
                     ↑
Feed ← [Feed Service] ← [Celeb merger]

Timeline storage (Redis sorted set):

Key: timeline:USER_ID.
Score: timestamp.
Member: tweet_id.
Capped at last 800 tweets (paginate older from DB).

Ranking (ML):

Engagement prediction.
Personalization.

2.7 Case 7: Search (Google-lite)

Requirements:

1B documents indexed.
1000 QPS search.
P99 < 200ms.
Ranking.

Architecture:

[Crawler] → [Storage (S3)] → [Indexer] → [Index (Elasticsearch)]
                                              ↓
[Query] → [Query Service] → [Index shards] → [Ranker] → [Results]

Inverted index:

word → [doc1, doc2, doc3, ...]
"golang" → [doc_42, doc_100, doc_2003]

Search:

Tokenize query.
Lookup each token → posting lists.
Intersect (AND) or union (OR).
Score each result (TF-IDF, BM25).
Top-K.
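
The steps above can be sketched with an in-memory index. Scoring here is a deliberate simplification (summed term frequency rather than TF-IDF or BM25), and the `Index` type and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Index maps term → docID → term frequency (a posting list with counts).
type Index map[string]map[int]int

// tokenize lowercases and splits on whitespace; real tokenizers also handle
// punctuation, stemming, and stop words.
func tokenize(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

func (idx Index) Add(docID int, text string) {
	for _, term := range tokenize(text) {
		if idx[term] == nil {
			idx[term] = make(map[int]int)
		}
		idx[term][docID]++
	}
}

// Search returns up to k documents that contain every query term (AND),
// ordered by summed term frequency, ties broken by docID.
func (idx Index) Search(query string, k int) []int {
	terms := tokenize(query)
	if len(terms) == 0 {
		return nil
	}
	scores := make(map[int]int)
	for docID, tf := range idx[terms[0]] {
		scores[docID] = tf
	}
	for _, term := range terms[1:] {
		postings := idx[term]
		for docID := range scores {
			if tf, ok := postings[docID]; ok {
				scores[docID] += tf
			} else {
				delete(scores, docID) // intersect: drop docs missing this term
			}
		}
	}
	docs := make([]int, 0, len(scores))
	for docID := range scores {
		docs = append(docs, docID)
	}
	sort.Slice(docs, func(a, b int) bool {
		if scores[docs[a]] != scores[docs[b]] {
			return scores[docs[a]] > scores[docs[b]]
		}
		return docs[a] < docs[b]
	})
	if len(docs) > k {
		docs = docs[:k]
	}
	return docs
}

func main() {
	idx := Index{}
	idx.Add(1, "Go is fast")
	idx.Add(2, "golang go go tutorial")
	idx.Add(3, "python tutorial")
	fmt.Println(idx.Search("go", 10))
	fmt.Println(idx.Search("go tutorial", 10))
}
```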

Sharding:

Document-based (each shard owns N docs).
Term-based (each shard owns N terms) — rare.

Crawler:

Distributed (multiple workers).
Politeness (rate limit per domain).
Deduplication.
robots.txt respect.

Ranking signals:

TF-IDF / BM25.
PageRank (link structure).
ML signals (clicks, dwell time).

In production you would typically use Elasticsearch, which provides an inverted index and ranking out of the box.

2.8 Case 8: Payment System

Requirements:

Transactional (no money lost).
Idempotent.
1000 TPS.
Audit log.
Reconciliation.

Critical principles:

Strong consistency — DB transactions.
Idempotency — re-submit safe.
Audit log — every change recorded.
Reconciliation — periodic balance check.

Architecture:

[Client] → [Payment API] → [Payment Service]
                                ↓
                       ┌────────┴────────┐
                       v                 v
                  [Postgres]      [Event Store]
                  (accounts)         (Kafka)
                       ↓
                 [Stripe/PSP]

Idempotency:

POST /payments
  Idempotency-Key: client-generated-uuid
  Body: { ... }

→ Server checks if key exists in last N hours.
→ If yes, return previous result.
→ If no, process and store.
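
The check-then-process flow can be sketched with an in-memory store. This is a simplification under stated assumptions: `IdempotencyStore` is a hypothetical type, the lock is held while processing (which serializes all payments), and expiry after N hours is omitted. A production version would rely on a unique constraint in the database instead.

```go
package main

import (
	"fmt"
	"sync"
)

type Result struct {
	Status int
	Body   string
}

// IdempotencyStore remembers the result produced for each idempotency key.
type IdempotencyStore struct {
	mu      sync.Mutex
	results map[string]Result
}

func NewIdempotencyStore() *IdempotencyStore {
	return &IdempotencyStore{results: make(map[string]Result)}
}

// Do runs process at most once per key and replays the stored result for
// every repeated request with the same key.
func (s *IdempotencyStore) Do(key string, process func() Result) Result {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.results[key]; ok {
		return r
	}
	r := process()
	s.results[key] = r
	return r
}

func main() {
	store := NewIdempotencyStore()
	charges := 0
	pay := func() Result {
		charges++ // runs under the store's lock, so no extra synchronization
		return Result{Status: 201, Body: "payment accepted"}
	}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			store.Do("client-generated-uuid", pay)
		}()
	}
	wg.Wait()
	fmt.Println("charges executed:", charges)
}
```

All 100 concurrent requests receive the same result, and the charge itself runs once.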

Database schema:

CREATE TABLE accounts (
    id UUID PRIMARY KEY,
    balance NUMERIC(18,2) NOT NULL CHECK (balance >= 0),
    version BIGINT NOT NULL DEFAULT 0  -- optimistic lock
);

CREATE TABLE transfers (
    id UUID PRIMARY KEY,
    idempotency_key TEXT UNIQUE,
    from_account UUID,
    to_account UUID,
    amount NUMERIC(18,2),
    status TEXT,
    created_at TIMESTAMP
);

CREATE TABLE audit_log (
    id BIGSERIAL,
    entity TEXT,
    entity_id UUID,
    action TEXT,
    before JSONB,
    after JSONB,
    actor TEXT,
    timestamp TIMESTAMP
);

Transfer (atomic):

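BEGIN;
-- Debit only if funds suffice. The application must check that exactly one row
-- was updated and ROLLBACK otherwise, or account B is credited from nothing.
UPDATE accounts SET balance = balance - 100 WHERE id = 'A' AND balance >= 100;
UPDATE accounts SET balance = balance + 100 WHERE id = 'B';
INSERT INTO transfers (...) VALUES (...);
INSERT INTO audit_log (...) VALUES (...);
COMMIT;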

Saga for multi-step flows (multiple services):

Choreography (events).
Orchestration (central coordinator).
Compensating transactions on failure.

Reconciliation:

A daily job sums all ledger entries.
Compare the sums with the stored balances.
Alert on any difference.
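
A sketch of the comparison step. Amounts are kept as integer cents to avoid floating-point drift, which is an assumption of this example (the schema above uses NUMERIC(18,2)); `LedgerEntry` and `reconcile` are hypothetical names.

```go
package main

import "fmt"

type LedgerEntry struct {
	Account string
	Cents   int64 // positive = credit, negative = debit
}

// reconcile sums the ledger per account and returns, for every account that
// disagrees, the difference (stored balance minus ledger sum).
func reconcile(balances map[string]int64, ledger []LedgerEntry) map[string]int64 {
	sums := make(map[string]int64)
	for _, e := range ledger {
		sums[e.Account] += e.Cents
	}
	diffs := make(map[string]int64)
	for account, balance := range balances {
		if d := balance - sums[account]; d != 0 {
			diffs[account] = d
		}
	}
	// Ledger activity for an account with no stored balance is also a mismatch.
	for account, sum := range sums {
		if _, ok := balances[account]; !ok && sum != 0 {
			diffs[account] = -sum
		}
	}
	return diffs
}

func main() {
	ledger := []LedgerEntry{
		{"A", 100000}, {"A", -10000}, // A: deposit 1000.00, send 100.00
		{"B", 100000}, {"B", 10000}, // B: deposit 1000.00, receive 100.00
	}
	balances := map[string]int64{"A": 90000, "B": 120000} // B is off by 100.00
	fmt.Println(reconcile(balances, ledger))
}
```

An empty result means the books balance; any entry in the map is what the alert should report.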

2.9 Case 9: GPS Tracking / Logistics

Requirements:

1M vehicles, each reporting its location every 5 seconds.
200k events/second.
Real-time dashboard.
Geofencing.
Historical playback.

Architecture:

[Vehicles] → [MQTT broker] → [Kafka]
                                ↓
                  ┌─────────────┼─────────────┐
                  v             v             v
            [Time-series DB] [Stream]    [Cache]
            (TimescaleDB)    Processor   (Redis last location)
                                ↓
                        [Dashboard WebSocket]

Time-series storage:

TimescaleDB — Postgres + hypertable.
ClickHouse — column store, fast aggregations.
InfluxDB — alternative.

Real-time (WebSocket):

Subscribe to vehicle updates.
Push location changes.

Geofencing:

Predefined polygons.
Check if location inside (spatial query).
Trigger event (entered/exited).
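
The inside check and the entered/exited trigger can be sketched with the ray-casting algorithm. Coordinates are treated as planar, which is a reasonable approximation only for small fences; `Point`, `inside`, and `transition` are hypothetical names. In production this would usually be a spatial query (for example PostGIS) rather than application code.

```go
package main

import "fmt"

// Point uses X for longitude and Y for latitude, treated as planar coordinates.
type Point struct{ X, Y float64 }

// inside reports whether p lies within the polygon using ray casting: count
// how many edges a horizontal ray from p crosses; an odd count means inside.
func inside(p Point, poly []Point) bool {
	in := false
	j := len(poly) - 1
	for i := range poly {
		a, b := poly[i], poly[j]
		crosses := (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < (b.X-a.X)*(p.Y-a.Y)/(b.Y-a.Y)+a.X
		if crosses {
			in = !in
		}
		j = i
	}
	return in
}

// transition compares the previous and current state and names the event.
func transition(wasInside, isInside bool) string {
	switch {
	case !wasInside && isInside:
		return "entered"
	case wasInside && !isInside:
		return "exited"
	default:
		return ""
	}
}

func main() {
	fence := []Point{{0, 0}, {10, 0}, {10, 10}, {0, 10}}
	track := []Point{{-5, 5}, {5, 5}, {6, 6}, {15, 5}}
	was := false
	for _, p := range track {
		is := inside(p, fence)
		if ev := transition(was, is); ev != "" {
			fmt.Println(ev, "at", p)
		}
		was = is
	}
}
```

Keeping the previous state per vehicle (for example next to the last location in Redis) is what turns a stateless point-in-polygon check into enter/exit events.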

Historical query:

SELECT * FROM vehicle_locations
WHERE vehicle_id = 'V123'
  AND time BETWEEN '2026-05-01' AND '2026-05-08'
ORDER BY time;

2.10 Case 10: Job Scheduler (a Sidekiq equivalent for Go)

Requirements:

Schedule background jobs.
Cron support.
Retry with backoff.
10000 jobs/sec.
Visibility (job status).

Architecture:

[Producer] → [Queue (Redis/PostgreSQL)] → [Worker Pool]
                                              ↓
                                          [Job DB]
                                              ↓
                                          [Retry Queue]

Queue backend options:

Redis (asynq, machinery) — fast, simple.
PostgreSQL (river, neoq) — durable, transactional.
Kafka — high throughput.
RabbitMQ.

Worker pool:

type Worker struct {
    queue chan Job
}

func (w *Worker) Start(n int) {
    for i := 0; i < n; i++ {
        go func() {
            for job := range w.queue {
                w.process(job)
            }
        }()
    }
}

Retry with backoff:

func (w *Worker) process(job Job) {
    if err := execute(job); err != nil {
        job.Attempts++
        if job.Attempts < 5 {
            delay := time.Duration(math.Pow(2, float64(job.Attempts))) * time.Minute
            scheduleRetry(job, delay)
        } else {
            sendToDLQ(job)
        }
    }
}

Cron:

Master process schedules.
Distributed lock (Redis SETNX) — only one master.
Alternative: k8s CronJob.

Visibility:

Job status (pending, running, completed, failed) stored in the DB.
Dashboard.

Idempotency:

Job ID — unique.
Worker checks if already executed.

3. Gotchas

3.1 ⚠️ Jumping into the design without requirements

Drawing boxes right away is an anti-pattern. Always clarify first: scale, latency, consistency, and durability requirements.

3.2 ⚠️ Skipping capacity estimation

Without numbers it is unclear whether you need sharding, in-memory storage, or async processing. Calculate RPS, storage, and bandwidth.

3.3 ⚠️ Single point of failure

Every component should be redundant. Database → primary + replicas. App servers → multiple instances behind a load balancer.

3.4 ⚠️ Hot keys / hot shards

Even with sharding, load can be uneven (a celebrity on Twitter). Plan for hot keys: replication, separate handling.

3.5 ⚠️ Synchronous APIs between services

A calls B calls C calls D: every hop adds round-trip latency, and a failure anywhere propagates back up the chain. Use async where possible.

3.6 ⚠️ Premature optimization

“Use Cassandra” with no justification. Start with a single Postgres instance, then justify NoSQL from the requirements (scale, data model).

3.7 ⚠️ Forgetting idempotency

The network may retry a request, so side effects can happen twice. Always design APIs to be idempotent (idempotency keys, conditional updates).

3.8 ⚠️ Misunderstanding CAP

CAP is not “pick 2 out of 3”. It says that during a network partition you must choose between C and A. Under normal conditions you can have both.

3.9 ⚠️ Cache invalidation

“There are only two hard things in CS: cache invalidation, naming things, and off-by-one errors”. Think about TTLs, write-through vs cache-aside, and pub/sub invalidation.

3.10 ⚠️ Not discussing eventual consistency

If you use a queue or async processing, the user will observe eventual consistency. State the acceptable window (1 s, 1 min) and the UX implications.

3.11 ⚠️ Ignoring security

Auth, rate limiting, input validation, and encryption should be mentioned even if you do not deep dive into them.

3.12 ⚠️ Monitoring & observability

Every production system needs metrics, logs, traces, and alerts. Leaving them out counts against you.

3.13 ⚠️ Disaster recovery

What if a region goes down? Are there backups? What are the RPO/RTO targets?

3.14 ⚠️ Wrong scale numbers

“1M users → we need Kafka”. 1M requests per day is about 12 RPS. Postgres is fine.

4. Real cases

4.1 bit.ly architecture

From bit.ly's public talks:

Postgres for canonical URLs.
Redis for the hot cache.
Kafka for click events.
Vertica/ClickHouse for analytics.

4.2 WhatsApp architecture

Erlang/OTP (built for concurrency).
FreeBSD on the servers.
2M connections per server.
~50 engineers serving roughly a billion users.

4.3 Discord chat scale

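MongoDB → Cassandra (public write-up in 2017), then Cassandra → ScyllaDB (public write-up in 2023).
Elixir + Erlang OTP.
500M+ registered accounts as publicly reported (daily active users are far fewer).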

4.4 Stripe payments

DocDB core (Stripe's in-house document database built on MongoDB), per Stripe's engineering blog.
Strict ACID.
Idempotency keys API-level.
API uptime 99.999%.

4.5 Uber dispatcher

Real-time matching of riders and drivers.
Geo-sharded.
H3 indexing (Uber’s hexagonal grid).

4.6 Netflix architecture

Eureka — service discovery.
Hystrix — circuit breakers.
Zuul — API gateway.
Chaos Monkey — disaster simulation.

4.7 Twitter timeline

Manhattan (in-house KV store).
Earlybird (real-time search).
Hybrid fan-out (described above).

4.8 GitHub scale

Monolith! Rails.
MySQL + Vitess for sharding.
Spokes — Git replication.

4.9 Avito architecture

Avito (Russia) has a well-documented architecture (conference talks):

Go microservices.
Kafka between services.
ClickHouse for analytics.
Vertica for the DWH.

4.10 VK (VKontakte)

KPHP (PHP compiler).
Custom database engines.
100M+ users.
One of the largest social networks in the world.

5. Interview questions (general principles)

Q1: What is the first step in a System Design interview? A: Clarify requirements (functional + non-functional). Never jump into the design without understanding scale, latency, and consistency.

Q2: How do you do capacity estimation? A: Work from the basic numbers: RPS (requests per day / 86,400), storage (writes per day × record size × retention in days), bandwidth (RPS × payload size). These numbers drive the choice of stack.

Q3: SQL vs NoSQL: how do you choose? A: SQL for transactional, structured data where joins and ACID are critical. NoSQL for horizontal scale, denormalized data, and high-QPS simple queries. Neither is better or worse; they are different tools.

Q4: What is the CAP theorem? A: In a distributed system, during a network partition you choose between Consistency and Availability. CP (Postgres, etcd) rejects requests during a partition but stays consistent. AP (Cassandra, Dynamo) keeps serving but may return stale data.

Q5: Strong vs eventual consistency? A: Strong: every read sees the latest write (Postgres). Eventual: all nodes eventually converge (Cassandra). Strong is more expensive; eventual is easier to scale.

Q6: What is consistent hashing? A: A hashing scheme in which adding or removing a node remaps only about K/N keys (K keys, N nodes). Used in caches and distributed databases (Cassandra, Dynamo).

Q7: Sharding strategies? A: Range-based (by id range), hash-based (hash % N), geographic (by region), consistent hashing. Trade-offs: range makes range queries easy but creates hotspots; hash gives a uniform distribution but makes joins and range queries hard.

Q8: Fan-out on write vs on read? A: Fan-out on write copies data to every downstream consumer (Twitter timeline): reads are fast, writes are expensive. Fan-out on read queries at read time: writes are fast, reads are expensive. Use a hybrid for the celebrity pattern.

Q9: Caching: which patterns exist? A: Cache-aside: the app reads the cache and, on a miss, reads the DB and fills the cache. Read-through: the cache itself loads from the DB on a miss. Write-through: writes go through the cache to the DB synchronously. Write-behind: write to the cache, flush to the DB asynchronously. Refresh-ahead: refresh entries before they expire.

Q10: What is a circuit breaker? A: A fault-tolerance pattern. After N failures the circuit “trips” (opens) and all calls fail fast without reaching the dependency. After a timeout it becomes half-open and lets a trial call through. If that call succeeds, the circuit closes. Examples: Hystrix, go-circuit.
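
The state machine from Q10 can be sketched in a few dozen lines. This is an illustration, not the API of Hystrix or any Go library: `Breaker`, `Call`, and `scenario` are hypothetical names, time is passed in explicitly to keep the example deterministic, and the type is not goroutine-safe.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

var ErrOpen = errors.New("circuit open")

type Breaker struct {
	maxFailures int           // consecutive failures that trip the circuit
	cooldown    time.Duration // how long to stay open before a trial call
	state       State
	failures    int
	openedAt    time.Time
}

// Call runs fn unless the circuit is open and still cooling down.
func (b *Breaker) Call(now time.Time, fn func() error) error {
	if b.state == Open {
		if now.Sub(b.openedAt) < b.cooldown {
			return ErrOpen // fail fast without calling the dependency
		}
		b.state = HalfOpen // cooldown elapsed: allow one trial call
	}
	if err := fn(); err != nil {
		b.failures++
		if b.state == HalfOpen || b.failures >= b.maxFailures {
			b.state, b.openedAt = Open, now
		}
		return err
	}
	b.state, b.failures = Closed, 0
	return nil
}

// scenario drives a breaker (2 failures to trip, 10 s cooldown) through a
// failing dependency that later recovers, returning the state after each call.
func scenario() []State {
	b := &Breaker{maxFailures: 2, cooldown: 10 * time.Second}
	start := time.Unix(0, 0)
	fail := func() error { return errors.New("dependency down") }
	ok := func() error { return nil }
	steps := []struct {
		at time.Duration
		fn func() error
	}{
		{0, fail}, {time.Second, fail}, {2 * time.Second, ok}, {12 * time.Second, ok},
	}
	var states []State
	for _, s := range steps {
		b.Call(start.Add(s.at), s.fn)
		states = append(states, b.state)
	}
	return states
}

func main() {
	fmt.Println(scenario()) // states are printed as numbers: 0=Closed, 1=Open, 2=HalfOpen
}
```

In the scenario the third call is rejected even though the dependency has recovered, because the cooldown has not elapsed; the fourth call is the half-open trial that closes the circuit.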

Q11: What is idempotency and why does it matter? A: Executing the same request multiple times produces the same result. Critical for retries (network failure → retry). Implementation: idempotency keys, conditional updates (compare-and-swap), upserts.

Q12: Saga pattern? A: A distributed transaction implemented as a sequence of local transactions, with compensating actions on failure. Either choreography (event-driven) or orchestration (central coordinator).

Q13: What is CQRS? A: Command Query Responsibility Segregation. Separate models for writes (commands) and reads (queries), so each can be optimized independently. Often combined with event sourcing.

Q14: Event sourcing? A: Store events (state changes) rather than the current state. Current state = replay of the events. Pros: audit log, time travel. Cons: complexity, event versioning.

Q15: What is back-pressure? A: A mechanism that protects a slow consumer from a fast producer: bounded queues, blocking writes, dropping messages. Without it you get OOM and cascading failures.

Q16: What is a retry storm? A: After a failure all clients retry at the same time, which amounts to a DDoS on the recovering system. Protection: exponential backoff with jitter, circuit breakers.
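
Exponential backoff with jitter from Q16 can be sketched as follows. The variant shown is often called “full jitter” (a uniform delay between zero and the exponential ceiling); the constants and the function names are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	baseDelay = 100 * time.Millisecond
	maxDelay  = 10 * time.Second
)

// ceiling returns min(maxDelay, baseDelay * 2^attempt) without overflowing.
func ceiling(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// backoff applies full jitter: a uniform random delay in [0, ceiling(attempt)].
// Randomizing spreads retries out so clients do not all return at once.
func backoff(attempt int, rnd *rand.Rand) time.Duration {
	return time.Duration(rnd.Int63n(int64(ceiling(attempt)) + 1))
}

// sampleDelays draws one delay per attempt from a seeded generator.
func sampleDelays(seed int64, attempts int) []time.Duration {
	rnd := rand.New(rand.NewSource(seed))
	delays := make([]time.Duration, attempts)
	for attempt := range delays {
		delays[attempt] = backoff(attempt, rnd)
	}
	return delays
}

func main() {
	for attempt, d := range sampleDelays(time.Now().UnixNano(), 8) {
		fmt.Printf("attempt %d: wait %v (ceiling %v)\n", attempt, d, ceiling(attempt))
	}
}
```

Without the random component every client that failed at the same moment would retry at the same moment, which recreates the spike the backoff was meant to avoid.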

Q17: Multi-region: how do you design it? A: Active-passive (failover) or active-active. Trade-offs: latency (cross-region 100ms+), consistency (eventual, or conflict resolution), cost. Approaches: geo-DNS routing, asynchronous replication, multi-leader DB.

Q18: GDPR / data residency requirements? A: Store EU data in the EU. Multi-region deployment. Data subject rights (delete, export). Audit logs. Encryption at rest and in transit.

Q19: How do you scale a relational DB? A: 1) Vertical (bigger box). 2) Read replicas. 3) Caching. 4) Sharding. 5) Partitioning tables. 6) Move analytics to OLAP (Snowflake, ClickHouse).

Q20: What is a DDD bounded context? A: DDD stands for Domain-Driven Design. A bounded context is a logical boundary within which the domain model is consistent. Microservices often map to bounded contexts. Aggregates, entities, value objects, and events live inside a context.

Q21: How many servers do you need for 1M RPS? A: It depends on the endpoint. A simple Go API on a single instance handles roughly 50-100k RPS, so 1M RPS means 10-20 instances plus an LB, a DB cluster, and a cache. This is why capacity estimation matters.

Q22: Which components appear in almost every production architecture? A: 1) CDN/LB. 2) API Gateway / reverse proxy. 3) App servers. 4) Database (SQL/NoSQL). 5) Cache (Redis). 6) Queue (Kafka). 7) Object storage (S3). 8) Search (Elasticsearch). 9) Observability (Prometheus, Loki, Tempo). 10) Auth (OAuth/IdP).

Q23: What are microservices? When should you not use them? A: One service per bounded context, independently deployed. Pros: independent scaling, team autonomy. Cons: distributed complexity, latency, harder debugging. Do NOT use them with a small team, a simple domain, or no proven need for scale.

Q24: Async vs sync communication? A: Sync (REST/gRPC): immediate response, tight coupling. Async (queue): loose coupling and retries, but eventual consistency. Use async wherever an immediate response is not needed.

Q25: What does eventual consistency mean for the user experience? A: Data you just wrote may only become visible a few seconds later. UX techniques: optimistic UI (show immediately, roll back on failure), polling, WebSocket updates.

Q26: Polling vs WebSocket vs SSE? A: Polling: simple, but adds latency. WebSocket: bidirectional, real-time. SSE: server → client only, simpler than WebSocket. Choose by use case.

Q27: Database connection pool sizing? A: Rule of thumb: pool size = (CPU cores × 2) + disk count. Postgres defaults to max_connections = 100. Under load, measure: if requests queue for a connection, increase the pool; if connections sit idle, decrease it.

Q28: What is write amplification? A: 1 logical write → many physical writes. LSM trees (Cassandra, RocksDB) rewrite data during compaction, which causes write amplification. The trade-off is read efficiency vs write efficiency.

Q29: ETL vs ELT? A: ETL (Extract, Transform, Load): data is transformed before it is loaded into the warehouse. ELT: load raw data, transform inside the warehouse. ELT is the modern approach (BigQuery, Snowflake) because it uses the warehouse's compute power.

Q30: How do you handle a data migration without downtime? A: 1) Dual write (old + new). 2) Backfill historical data. 3) Switch reads to the new store. 4) Stop writes to the old store. 5) Decommission the old store. This follows the strangler pattern.

6. Practice

URL Shortener implementation: build the full service in Go with Postgres + a Redis cache. Load-test with k6 at 1000 RPS.
Rate limiter: implement 3 algorithms (token bucket, sliding window log, sliding window counter) and compare them.
Distributed cache: implement a consistent hash ring + an LRU node. Test key redistribution when a node is added or removed.
Chat WebSocket: a simple chat with WebSocket + Redis for presence + Postgres for history.
Twitter feed: implement a fan-out-on-write timeline in a Redis sorted set.
Search: build an inverted index from text documents. Compare with Elasticsearch.
Payment idempotency: an API with Idempotency-Key. Test that 100 parallel requests yield the same result.
Job scheduler: implement an asynq-like queue in Go with retry + DLQ.
Mock interview: record yourself doing a 45-minute system design session (one of the cases above). Review and improve.
Read papers: Dynamo (Amazon), Chubby (Google), Bigtable, to understand the fundamentals.

7. Sources

“System Design Interview” — Alex Xu (V1 + V2).
“Designing Data-Intensive Applications” — Martin Kleppmann (must-read).
System Design Primer: https://github.com/donnemartin/system-design-primer.
High Scalability blog: http://highscalability.com/.
AWS Well-Architected Framework: https://aws.amazon.com/architecture/well-architected/.
“Site Reliability Engineering” Google book: https://sre.google/sre-book/.
“Microservices Patterns” — Chris Richardson, Manning.
Google research papers: Bigtable, Chubby, Spanner, Zanzibar.
InfoQ architecture articles: https://www.infoq.com/architecture-design/.
ByteByteGo newsletter: https://blog.bytebytego.com/.