System Design Cases: 10 Classic Interview Problems

Why it matters: System Design is half of the technical interview at the middle 2 level and above. Companies want to see whether you can design a system from scratch: ask the right questions, estimate the load, choose an architecture, and explain the trade-offs. As of 2026 the questions are largely standardized: 10-15 classic cases (URL shortener, chat, feed, rate limiter) keep recurring, so preparing for them is the baseline. Once you understand these 10, you can design almost any system, because they cover the key building blocks (caching, sharding, async, queues, consistency).

  1. Concept: how to approach a System Design interview
  2. Production practices: 10 cases
  3. Gotchas: what is often missed
  4. Real cases: real architectures from industry
  5. Interview questions (general principles)
  6. Practice
  7. Sources

Standard flow:

  1. Requirements clarification (5-10 min): ask questions, do not jump into the design.
  2. Capacity estimation (5 min): RPS, storage, bandwidth.
  3. API design (5 min): REST/gRPC endpoints.
  4. Data model (5 min): SQL/NoSQL schema.
  5. High-level architecture (10 min): boxes-and-arrows diagram.
  6. Deep dive (15-20 min): focus on 1-2 components.
  7. Bottlenecks / scale (5 min): where the system will hit its limits and how to address that.

Functional (what the system does):

  • Create a URL → return a short link.
  • Click counter, analytics.

Non-functional (how it does it):

  • 100M URLs/day, 10:1 read/write.
  • Availability 99.99%.
  • Latency P99 < 100ms.
  • Durability: no data loss.

Basic numbers (back-of-envelope):

  • 1M = 10^6.
  • 1B = 10^9.
  • 86400 seconds in a day ≈ 100k (rounded up for quick math).

Storage:

  • 1 byte char ASCII, 4 bytes UTF-8 worst case.
  • 1 byte int8, 8 bytes int64.
  • A typical table row is 100-1000 bytes.

Throughput:

  • HDD: 100 IOPS.
  • SSD: 100k IOPS.
  • Network 1Gbps = 125 MB/s.

Latency:

  • L1 cache: 0.5ns.
  • RAM: 100ns.
  • SSD: 100μs.
  • Network round-trip same DC: 0.5ms.
  • Cross-region: 50-200ms.

Caching layers:

  • Client cache (browser).
  • CDN (CloudFlare, CloudFront).
  • API Gateway cache.
  • Application cache (in-memory, Redis).
  • DB cache (query result).

Sharding strategies:

  • Range-based (id < 1M → shard 1).
  • Hash-based (hash(id) % N).
  • Geographic (US → shard A, EU → shard B).
  • Consistent hashing: minimal redistribution when a shard is added or removed.
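
To see why plain hash(id) % N placement is painful to resize, here is a minimal sketch that counts how many keys change shard when N grows from 4 to 5. It assumes integer ids used directly as the hash value; `movedKeys` is a hypothetical helper name, not part of any library.

```go
package main

import "fmt"

// movedKeys counts how many of the ids 0..total-1 land on a different shard
// when the shard count changes from oldN to newN under plain id % N placement.
// Assumption: the id itself plays the role of the hash value.
func movedKeys(total, oldN, newN int) int {
	moved := 0
	for id := 0; id < total; id++ {
		if id%oldN != id%newN {
			moved++
		}
	}
	return moved
}

func main() {
	total := 1000
	moved := movedKeys(total, 4, 5)
	fmt.Printf("moved %d of %d keys (%.0f%%)\n",
		moved, total, 100*float64(moved)/float64(total))
}
```

Going from 4 to 5 shards relocates 800 of 1000 keys here, while consistent hashing would move roughly 1/N of them. That gap is the usual argument for consistent hashing when shards are added or removed often.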

Replication:

  • Leader-follower (Postgres replica).
  • Multi-leader (e.g. CouchDB, multi-primary MySQL setups).
  • Leaderless (Dynamo, Cassandra).

Async patterns:

  • Queue (Kafka, RabbitMQ, SQS).
  • Pub/Sub.
  • CDC (Change Data Capture).

Consistency:

  • Strong (CP) — Postgres, etcd.
  • Eventual (AP): Cassandra, DynamoDB (default reads).
  • Bounded staleness — Cosmos DB.

Trade-offs:

Choice              | Pros                     | Cons
SQL                 | ACID, joins              | Vertical scale
NoSQL (KV)          | Horizontal scale, simple | No joins, no transactions
Sync API            | Simple, immediate        | Tight coupling
Async (queue)       | Loose coupling, retry    | Complexity, eventual
Cache aside         | Simple                   | Stale data
Write-through cache | Consistent               | Slower writes
Sharding            | Scale                    | Cross-shard queries hard
Read replicas       | Scale reads              | Lag

Case 1: URL shortener

Requirements:

  • Functional: shorten long URL → short link, redirect, analytics.
  • Non-functional: 100M URLs/day, 10:1 read/write, 99.99% availability, P99 < 100ms.

Capacity:

  • Writes: 100M / 86400 ≈ 1200 RPS.
  • Reads: 12000 RPS (10x writes).
  • Storage: 100M * 500 bytes (URL + metadata) ≈ 50 GB/day, 18 TB/year.
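
The arithmetic above can be reproduced with a few lines of Go. This is only a sketch of the estimation step; the `Estimate` type and `estimate` function are illustrative names, and the inputs are the figures from the requirements (100M writes/day, 10:1 read/write, 500 bytes per record).

```go
package main

import "fmt"

const secondsPerDay = 86_400

// Estimate holds back-of-envelope numbers derived from the daily write volume.
type Estimate struct {
	WriteRPS       float64
	ReadRPS        float64
	StoragePerDay  float64 // bytes
	StoragePerYear float64 // bytes
}

// estimate derives RPS and storage from writes/day, the read:write ratio
// and the average record size in bytes.
func estimate(writesPerDay, readWriteRatio, bytesPerRecord float64) Estimate {
	writeRPS := writesPerDay / secondsPerDay
	perDay := writesPerDay * bytesPerRecord
	return Estimate{
		WriteRPS:       writeRPS,
		ReadRPS:        writeRPS * readWriteRatio,
		StoragePerDay:  perDay,
		StoragePerYear: perDay * 365,
	}
}

func main() {
	e := estimate(100e6, 10, 500)
	fmt.Printf("writes: %.0f RPS, reads: %.0f RPS\n", e.WriteRPS, e.ReadRPS)
	fmt.Printf("storage: %.0f GB/day, %.2f TB/year\n",
		e.StoragePerDay/1e9, e.StoragePerYear/1e12)
}
```

The exact write rate is about 1157 RPS; rounding it up to 1200 (and reads to 12000) is the usual interview shortcut and leaves a little headroom.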

API:

POST /shorten
Body: { "url": "https://very-long-url" }
Resp: { "short": "https://sho.rt/abc123" }
GET /:code
Redirect 301 to original URL (use 302 if every click must reach the server for analytics, since browsers cache 301 responses)

Data model (Postgres):

CREATE TABLE urls (
    code       VARCHAR(7) PRIMARY KEY,
    original   TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    user_id    BIGINT,
    expires_at TIMESTAMP
);
CREATE TABLE clicks (
    id         BIGSERIAL,
    code       VARCHAR(7),
    timestamp  TIMESTAMP,
    user_agent TEXT,
    ip         INET,
    referrer   TEXT
);

Architecture:

[Client] → [CDN/LB] → [API Gateway] → [App Servers]
                                           │
                   ┌───────────────────────┼───────────────────┐
                   v                       v                   v
            [Cache: Redis]          [DB: Postgres]          [Kafka]
                                                               ↓
                                                     [Analytics Pipeline]
                                                               ↓
                                                         [ClickHouse]

Short code generation:

  • Base62 (a-z, A-Z, 0-9): 62 characters.
  • 7 chars: 62^7 ≈ 3.5 trillion URLs.
  • Generation: counter-based (centralized counter + base62) or hash + collision check.
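
The counter-based option can be sketched as follows: take the next value from a counter and encode it in base62. The alphabet order (digits, then lowercase, then uppercase) and the function name `encodeBase62` are assumptions made for illustration; any fixed ordering of the 62 characters works.

```go
package main

import "fmt"

// alphabet order is an arbitrary choice for this sketch.
const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// encodeBase62 converts a counter value into its base62 representation.
func encodeBase62(n uint64) string {
	if n == 0 {
		return alphabet[:1]
	}
	var buf [11]byte // 62^11 > 2^64, so 11 digits always suffice
	i := len(buf)
	for n > 0 {
		i--
		buf[i] = alphabet[n%62]
		n /= 62
	}
	return string(buf[i:])
}

func main() {
	for _, id := range []uint64{0, 61, 62, 125, 3521614606207} {
		fmt.Printf("%d -> %s\n", id, encodeBase62(id))
	}
}
```

The last value in `main` is 62^7 − 1, the largest counter that still fits into 7 characters. Sequential counters produce guessable codes, so real systems often shuffle or offset the counter before encoding.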

Caching:

  • Hot URLs (Pareto: 20% of URLs get 80% of reads) in Redis.
  • Cache key: url:CODE, value: original_url.
  • TTL: 24h, evict LRU.

Analytics:

  • Write click → Kafka (async, does not block the redirect).
  • Pipeline: Kafka → ClickHouse for time-series.
  • Realtime counts: Redis counters (INCR) for clicks, HyperLogLog for approximate unique visitors.

Bottlenecks:

  • DB writes: shard by code prefix.
  • Read hotspots: cache aside.
  • Hot URLs: pre-warm.

Case 2: Rate limiter

Requirements:

  • Per user/IP/API key rate limit.
  • 10M users, 1000 requests per minute limit.
  • P99 < 10ms.
  • Multi-region.

Algorithms:

Token bucket:

  • Bucket with N tokens, refills R per second.
  • Request consumes 1 token.
  • Burst allowed up to bucket size.
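import (
	"sync"
	"time"
)

// TokenBucket allows bursts up to capacity and refills at refillRate tokens per second.
type TokenBucket struct {
	mu         sync.Mutex // Allow may be called from many goroutines
	capacity   float64
	tokens     float64
	refillRate float64 // tokens per second
	last       time.Time
}

// Allow refills the bucket for the elapsed time, then tries to take one token.
func (b *TokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	elapsed := now.Sub(b.last).Seconds()
	b.tokens = min(b.capacity, b.tokens+elapsed*b.refillRate) // built-in min requires Go 1.21+
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}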

Leaky bucket:

  • Queue requests, process at fixed rate.
  • Smooths traffic, no burst.

Sliding window log:

  • Stores the timestamps of requests within the window.
  • The number of timestamps in the window is compared against the limit.
  • Memory: O(N) per user.

Sliding window counter (hybrid):

  • Current + previous bucket counts.
  • Weighted average.
  • O(1) memory.
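
A minimal sketch of the sliding window counter, assuming a single process and no concurrency (a shared limiter would need a mutex or Redis). The type and function names are illustrative. The previous window's count is weighted by how much of it still overlaps the sliding window.

```go
package main

import (
	"fmt"
	"time"
)

// SlidingWindowCounter approximates a sliding window with two fixed buckets.
// Not safe for concurrent use.
type SlidingWindowCounter struct {
	limit     float64
	window    time.Duration
	currStart time.Time // start of the current fixed window
	currCount float64
	prevCount float64
}

func NewSlidingWindowCounter(limit int, window time.Duration) *SlidingWindowCounter {
	return &SlidingWindowCounter{limit: float64(limit), window: window}
}

// Allow reports whether a request arriving at time now fits under the limit.
func (s *SlidingWindowCounter) Allow(now time.Time) bool {
	start := now.Truncate(s.window)
	switch {
	case start.Equal(s.currStart):
		// still in the same fixed window
	case start.Sub(s.currStart) == s.window:
		// moved to the next window: current becomes previous
		s.prevCount, s.currCount = s.currCount, 0
	default:
		// first call or a long gap: both buckets are stale
		s.prevCount, s.currCount = 0, 0
	}
	s.currStart = start

	elapsed := float64(now.Sub(start)) / float64(s.window) // fraction of current window used
	estimated := s.prevCount*(1-elapsed) + s.currCount
	if estimated+1 > s.limit {
		return false
	}
	s.currCount++
	return true
}

// simulate sends 20 requests at the start of one window and 20 more
// halfway through the next, returning how many were allowed each time.
func simulate() (first, second int) {
	rl := NewSlidingWindowCounter(10, time.Minute)
	base := time.Unix(600, 0)
	for i := 0; i < 20; i++ {
		if rl.Allow(base) {
			first++
		}
	}
	later := base.Add(90 * time.Second)
	for i := 0; i < 20; i++ {
		if rl.Allow(later) {
			second++
		}
	}
	return first, second
}

func main() {
	first, second := simulate()
	fmt.Println("allowed in first burst:", first)
	fmt.Println("allowed halfway through next window:", second)
}
```

With a limit of 10, the first burst admits 10 requests. Halfway through the next window the previous count still carries 50% weight, so only 5 more pass, which is the smoothing that a plain fixed window lacks.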

Architecture (distributed):

[Client] → [API Gateway] → [Rate Limiter Middleware] → [Backend]
                                      ↓
                               [Redis Cluster]
                            (atomic INCR + EXPIRE)

Redis Lua script (atomic):

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local current = redis.call('INCR', key)
if current == 1 then
    redis.call('EXPIRE', key, window)
end
if current > limit then
    return 0
end
return 1

Per-user/IP/key:

  • Key format: rate:USER_ID:ENDPOINT.
  • Separate limits per tier (free 100/min, paid 10000/min).

Edge case: Redis is down → fail open (allow) or fail closed (deny)?

  • Critical API call (auth): fail closed.
  • Bulk endpoint: fail open.

Response:

  • 429 Too Many Requests.
  • Retry-After: 30 header.
  • X-RateLimit-Remaining: 0 header.

Case 3: Distributed cache

Requirements:

  • Key-value store, in-memory.
  • 100M keys, 1KB average value.
  • Total: 100 GB RAM.
  • 1M QPS.
  • P99 < 1ms.

Architecture:

[Client] → [Consistent Hash Ring] → [Cache Node 1]  [Cache Node 2]  [Cache Node 3]
                                         (LRU)           (LRU)           (LRU)

Consistent hashing:

  • Hash ring from 0 to 2^32 − 1.
  • Each node mapped to multiple positions (virtual nodes, 100-200).
  • Key hashed, find next node clockwise.
  • Add/remove node: minimal redistribution (about 1/N of the keys).
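import (
	"fmt"
	"hash/crc32"
	"sort"
)

// Ring maps keys to nodes via consistent hashing with virtual nodes.
type Ring struct {
	nodes  map[uint32]string // hash → node
	sorted []uint32          // ring positions in ascending order
}

func NewRing() *Ring { return &Ring{nodes: make(map[uint32]string)} }

// hash uses CRC32 as one possible choice; any uniform 32-bit hash works.
func hash(s string) uint32 { return crc32.ChecksumIEEE([]byte(s)) }

// AddNode places `virtual` positions for the node on the ring.
func (r *Ring) AddNode(node string, virtual int) {
	for i := 0; i < virtual; i++ {
		h := hash(fmt.Sprintf("%s-%d", node, i))
		r.nodes[h] = node
		r.sorted = append(r.sorted, h)
	}
	sort.Slice(r.sorted, func(i, j int) bool { return r.sorted[i] < r.sorted[j] })
}

// GetNode returns the first node clockwise from the key's hash ("" if the ring is empty).
func (r *Ring) GetNode(key string) string {
	if len(r.sorted) == 0 {
		return ""
	}
	h := hash(key)
	idx := sort.Search(len(r.sorted), func(i int) bool { return r.sorted[i] >= h })
	if idx == len(r.sorted) {
		idx = 0 // wrap around the ring
	}
	return r.nodes[r.sorted[idx]]
}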

LRU eviction:

  • Doubly-linked list + hash map.
  • O(1) get/put/evict.
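
The doubly-linked list + hash map combination can be sketched with the standard `container/list` package. This is a single-threaded illustration with string keys and values; the `LRU` type and its method names are chosen for the example.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element stores.
type entry struct {
	key   string
	value string
}

// LRU is a fixed-capacity cache; not safe for concurrent use.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key → element in order
}

func NewLRU(capacity int) *LRU {
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the value and marks the key as most recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates a key and evicts the least recently used one if full.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	cache := NewLRU(2)
	cache.Put("a", "1")
	cache.Put("b", "2")
	cache.Get("a")      // "a" becomes most recently used
	cache.Put("c", "3") // evicts "b"
	_, ok := cache.Get("b")
	fmt.Println("b still cached:", ok)
}
```

Every operation touches the map once and moves at most one list element, which is where the O(1) bound comes from.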

TTL:

  • Lazy expiration (check on get).
  • Background sweeper (sample-based, like Redis).

Replication:

  • Primary + N replicas.
  • Read from any, write to primary.
  • Or consistent hash to N nodes.

Bottlenecks:

  • Hot keys → replicate to multiple nodes.
  • Network — colocate cache and app.

Case 4: Notification system

Requirements:

  • Channels: email, SMS, push.
  • 1M notifications/day per channel.
  • Rate limiting per user (max 10 emails/day).
  • Templates.
  • Delivery guarantees.

Architecture:

[Service A]
[Service B] → [Notification API]
[Service C]           ↓
            [Kafka: notifications]
      ┌───────────────┼───────────────┐
      v               v               v
[Email Worker]   [SMS Worker]   [Push Worker]
      ↓               ↓               ↓
 [SendGrid]       [Twilio]       [FCM/APNS]
      ↓               ↓               ↓
  [Status]        [Status]        [Status]
      └───────────────┴───────────────┘
                      ↓
             [Notifications DB]

API:

POST /notifications
Body: {
  "user_id": "...",
  "template": "order_shipped",
  "channels": ["email", "push"],
  "params": { "order_id": "..." }
}

Templates:

  • Database or files.
  • Multi-language.
  • HTML + plain text.

Rate limiting per user:

  • Aggregate in Redis: notif:USER_ID:email:DAY → counter.
  • Skip if exceeds.

Idempotency:

  • Notification ID hash(user_id + template + params + day).
  • Deduplicate in the DB (e.g. a unique constraint on the notification ID).
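
One way to derive such a notification ID is sketched below. SHA-256, the zero-byte separator, and the `notificationID` function name are assumptions for illustration; the source only specifies which fields go into the hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// notificationID derives a deterministic ID from user, template, params and day.
// The same logical notification on the same day always maps to the same ID.
func notificationID(userID, template string, params map[string]string, day string) (string, error) {
	// encoding/json sorts map keys, so equal params serialize identically.
	p, err := json.Marshal(params)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	for _, part := range [][]byte{[]byte(userID), []byte(template), p, []byte(day)} {
		h.Write(part)
		h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	id, err := notificationID("u1", "order_shipped",
		map[string]string{"order_id": "42"}, "2026-05-01")
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Inserting this ID into a column with a unique constraint turns a duplicate send into a constraint violation that the worker can treat as "already handled".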

Retry logic:

  • Exponential backoff: 1m, 5m, 30m, 4h, 24h.
  • After 5 retries — dead letter queue.

Delivery tracking:

  • Webhooks from SendGrid/Twilio.
  • Update notification status in DB.
  • Metrics: delivery rate, bounce rate.

Case 5: Chat (messenger)

Requirements:

  • 1B users.
  • 100M concurrent connections.
  • Messages delivered <1s.
  • Group chats of up to 1000 members.
  • E2EE optional.

Architecture:

[Mobile/Web Client]
↓ WebSocket
[Gateway (LB)] → sticky sessions
[WebSocket Server] (presence + message receive)
[Kafka: messages]
[Message Processor]
[Cassandra: messages] [Redis: online users]
[Push notifier] → [APNS/FCM]

Connection management:

  • WebSocket persistent.
  • 100M connections → ~1000 servers (100k connections each).
  • Sticky session (client → same server).

Message storage (Cassandra):

CREATE TABLE messages (
    chat_id    UUID,
    timestamp  TIMESTAMP,
    message_id UUID,
    sender_id  UUID,
    body       TEXT,
    PRIMARY KEY ((chat_id), timestamp, message_id)
) WITH CLUSTERING ORDER BY (timestamp DESC);

Partition by chat_id, sorted by timestamp → fast fetch recent messages.

Online presence (Redis):

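  • SADD online_users USER_ID on connect.
  • SREM online_users USER_ID on disconnect.
  • Heartbeat TTL: set members cannot expire individually, so keep a per-user key (e.g. SET presence:USER_ID 1 EX 30) refreshed by heartbeats to catch dropped connections.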

Group chats (1000 members):

  • Fan-out on write: copy the message to 1000 inboxes (heavy write).
  • Fan-out on read — single message store, members fetch (heavy read).
  • Hybrid: small groups fan-out write, large groups fan-out read.

E2EE (Signal protocol):

  • The client encrypts the message with the recipient's public key.
  • The server only relays (no plaintext).
  • Forward secrecy via ephemeral keys.

Notification:

  • If the user is offline → push via APNS/FCM.

Case 6: News feed (Twitter-like)

Requirements:

  • 300M users, 50M tweets/day.
  • Feed: tweets from followed users in reverse chronological order.
  • P99 feed < 200ms.

Two approaches:

Fan-out write (push):

  • User tweets → write to followers’ timelines.
  • Pros: read fast.
  • Cons: hot celebrities (Justin Bieber 100M followers → 100M writes).

Fan-out read (pull):

  • User tweets → just store.
  • Read feed: query last tweets of followed users.
  • Pros: write fast.
  • Cons: read slow for users following 1000 people.

Hybrid (the approach Twitter has described publicly):

  • Normal users — fan-out write.
  • Celebrities (>1M followers) — fan-out read (special handling).
  • Feed = merge(precomputed timeline + celebrities tweets).
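
The merge step of the hybrid approach can be sketched like this. The `Tweet` fields and `mergeFeed` are illustrative names; the sketch assumes both inputs are already loaded in memory and simply sorts the union, whereas a production system would more likely do a k-way merge over pre-sorted lists.

```go
package main

import (
	"fmt"
	"sort"
)

// Tweet carries only the fields needed for ordering in this sketch.
type Tweet struct {
	ID        int64
	AuthorID  int64
	Timestamp int64 // unix seconds
}

// mergeFeed combines the precomputed timeline (fan-out on write) with
// celebrity tweets fetched at read time, newest first, capped at limit.
func mergeFeed(timeline, celebrity []Tweet, limit int) []Tweet {
	merged := make([]Tweet, 0, len(timeline)+len(celebrity))
	merged = append(merged, timeline...)
	merged = append(merged, celebrity...)
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Timestamp != merged[j].Timestamp {
			return merged[i].Timestamp > merged[j].Timestamp
		}
		return merged[i].ID > merged[j].ID // stable tie-break
	})
	if len(merged) > limit {
		merged = merged[:limit]
	}
	return merged
}

func main() {
	timeline := []Tweet{{ID: 1, AuthorID: 10, Timestamp: 100}, {ID: 2, AuthorID: 11, Timestamp: 90}}
	celebrity := []Tweet{{ID: 3, AuthorID: 99, Timestamp: 95}}
	for _, t := range mergeFeed(timeline, celebrity, 10) {
		fmt.Printf("tweet %d by %d at %d\n", t.ID, t.AuthorID, t.Timestamp)
	}
}
```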

Architecture:

Tweet → [Tweet Service] → [Tweets DB]
               ↓
       [Fan-out service]
      ┌────────┴────────┐
      v                 v
[Normal: write inbox]  [Celeb: skip]
      ↓
[Redis Timeline]
      ↓
Feed ← [Feed Service] ← [Celeb merger]

Timeline storage (Redis sorted set):

  • Key: timeline:USER_ID.
  • Score: timestamp.
  • Member: tweet_id.
  • Capped at last 800 tweets (paginate older from DB).

Ranking (ML):

  • Engagement prediction.
  • Personalization.

Case 7: Search engine

Requirements:

  • 1B documents indexed.
  • 1000 QPS search.
  • P99 < 200ms.
  • Ranking.

Architecture:

[Crawler] → [Storage (S3)] → [Indexer] → [Index (Elasticsearch)]
[Query] → [Query Service] → [Index shards] → [Ranker] → [Results]

Inverted index:

word → [doc1, doc2, doc3, ...]
"golang" → [doc_42, doc_100, doc_2003]

Search:

  1. Tokenize query.
  2. Lookup each token → posting lists.
  3. Intersect (AND) or union (OR).
  4. Score each result (TF-IDF, BM25).
  5. Top-K.
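
Steps 1-3 can be sketched with an in-memory inverted index. Scoring (TF-IDF/BM25) and top-K are left out to keep the example short, and the `Index` type, `tokenize`, and `SearchAND` are names invented for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// Index maps a token to the set of document IDs that contain it.
type Index map[string]map[int]struct{}

// tokenize lowercases the text and splits on anything that is not a letter or digit.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// Add indexes one document.
func (idx Index) Add(docID int, text string) {
	for _, tok := range tokenize(text) {
		if idx[tok] == nil {
			idx[tok] = make(map[int]struct{})
		}
		idx[tok][docID] = struct{}{}
	}
}

// SearchAND returns the sorted IDs of documents containing every query token.
func (idx Index) SearchAND(query string) []int {
	tokens := tokenize(query)
	if len(tokens) == 0 {
		return nil
	}
	var result []int
	for doc := range idx[tokens[0]] {
		inAll := true
		for _, t := range tokens[1:] {
			if _, ok := idx[t][doc]; !ok {
				inAll = false
				break
			}
		}
		if inAll {
			result = append(result, doc)
		}
	}
	sort.Ints(result)
	return result
}

func main() {
	idx := Index{}
	idx.Add(1, "Go is fast")
	idx.Add(2, "Golang and Go tutorials")
	idx.Add(3, "Rust is fast")
	fmt.Println(idx.SearchAND("go fast"))
	fmt.Println(idx.SearchAND("fast"))
}
```

A real engine stores posting lists as sorted, compressed arrays and intersects them starting from the shortest list; the map-of-sets here trades that efficiency for brevity.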

Sharding:

  • Document-based (each shard owns N docs).
  • Term-based (each shard owns N terms) — rare.

Crawler:

  • Distributed (multiple workers).
  • Politeness (rate limit per domain).
  • Deduplication.
  • robots.txt respect.

Ranking signals:

  • TF-IDF / BM25.
  • PageRank (link structure).
  • ML signals (clicks, dwell time).

In production this is usually Elasticsearch, which has a built-in inverted index and ranking.

Case 8: Payment system

Requirements:

  • Transactional (no money lost).
  • Idempotent.
  • 1000 TPS.
  • Audit log.
  • Reconciliation.

Critical principles:

  1. Strong consistency — DB transactions.
  2. Idempotency — re-submit safe.
  3. Audit log — every change recorded.
  4. Reconciliation — periodic balance check.

Architecture:

[Client] → [Payment API] → [Payment Service]
┌────────┴────────┐
v v
[Postgres] [Event Store]
(accounts) (Kafka)
[Stripe/PSP]

Idempotency:

POST /payments
Idempotency-Key: client-generated-uuid
Body: { ... }
→ Server checks if key exists in last N hours.
→ If yes, return previous result.
→ If no, process and store.
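
The check-then-process flow can be sketched with an in-memory store. This is a deliberately simplified illustration: it holds one lock for the whole operation, never expires keys (the "last N hours" window is omitted), and a real service would persist keys in the database within the same transaction as the payment. `IdempotencyStore` and `Do` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// IdempotencyStore remembers the result produced for each idempotency key.
type IdempotencyStore struct {
	mu      sync.Mutex
	results map[string]string
}

func NewIdempotencyStore() *IdempotencyStore {
	return &IdempotencyStore{results: make(map[string]string)}
}

// Do runs fn at most once per key. Repeated calls with the same key
// return the stored result and report replayed = true.
func (s *IdempotencyStore) Do(key string, fn func() string) (result string, replayed bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.results[key]; ok {
		return r, true
	}
	r := fn()
	s.results[key] = r
	return r, false
}

func main() {
	store := NewIdempotencyStore()
	charges := 0
	charge := func() string {
		charges++
		return fmt.Sprintf("charge #%d accepted", charges)
	}
	for i := 0; i < 3; i++ {
		res, replayed := store.Do("client-generated-uuid", charge)
		fmt.Println(res, "| replayed:", replayed)
	}
}
```

The three calls in `main` charge once and replay the stored response twice, which is the behavior a client retrying after a timeout depends on.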

Database schema:

CREATE TABLE accounts (
    id      UUID PRIMARY KEY,
    balance NUMERIC(18,2) NOT NULL CHECK (balance >= 0),
    version BIGINT NOT NULL DEFAULT 0 -- optimistic lock
);
CREATE TABLE transfers (
    id              UUID PRIMARY KEY,
    idempotency_key TEXT UNIQUE,
    from_account    UUID,
    to_account      UUID,
    amount          NUMERIC(18,2),
    status          TEXT,
    created_at      TIMESTAMP
);
CREATE TABLE audit_log (
    id        BIGSERIAL,
    entity    TEXT,
    entity_id UUID,
    action    TEXT,
    before    JSONB,
    after     JSONB,
    actor     TEXT,
    timestamp TIMESTAMP
);

Transfer (atomic):

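BEGIN;
-- Debit only if funds are sufficient. The application must check the affected
-- row count: 0 rows means insufficient funds, so ROLLBACK instead of continuing.
UPDATE accounts SET balance = balance - 100, version = version + 1
WHERE id = 'A' AND balance >= 100;
UPDATE accounts SET balance = balance + 100, version = version + 1
WHERE id = 'B';
INSERT INTO transfers (...) VALUES (...);
INSERT INTO audit_log (...) VALUES (...);
COMMIT;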

Saga for multi-step flows (multiple services):

  • Choreography (events).
  • Orchestration (central coordinator).
  • Compensating transactions on failure.

Reconciliation:

  • A daily job sums all ledger entries.
  • Compare the sums with the balances.
  • Alert on any difference.

Case 9: Vehicle tracking (real-time location)

Requirements:

  • 1M vehicles, a location update every 5 seconds.
  • 200k events/second.
  • Real-time dashboard.
  • Geofencing.
  • Historical playback.

Architecture:

[Vehicles] → [MQTT broker] → [Kafka]
┌─────────────┼─────────────┐
v v v
[Time-series DB] [Stream] [Cache]
(TimescaleDB) Processor (Redis last location)
[Dashboard WebSocket]

Time-series storage:

  • TimescaleDB — Postgres + hypertable.
  • ClickHouse — column store, fast aggregations.
  • InfluxDB — alternative.

Real-time (WebSocket):

  • Subscribe to vehicle updates.
  • Push location changes.

Geofencing:

  • Predefined polygons.
  • Check if location inside (spatial query).
  • Trigger event (entered/exited).
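
The "is the location inside the polygon" check is commonly done with ray casting. The sketch below treats latitude/longitude as planar coordinates, which is an assumption that holds only for small fences away from the poles and the antimeridian; in practice a spatial index or PostGIS would do this. `Point`, `contains`, and `transition` are names chosen for the example.

```go
package main

import "fmt"

// Point is a location; coordinates are treated as planar in this sketch.
type Point struct{ Lat, Lon float64 }

// contains reports whether p lies inside the polygon using ray casting:
// count how many edges a ray from p crosses; an odd count means inside.
func contains(polygon []Point, p Point) bool {
	inside := false
	j := len(polygon) - 1
	for i := range polygon {
		a, b := polygon[i], polygon[j]
		if (a.Lat > p.Lat) != (b.Lat > p.Lat) &&
			p.Lon < (b.Lon-a.Lon)*(p.Lat-a.Lat)/(b.Lat-a.Lat)+a.Lon {
			inside = !inside
		}
		j = i
	}
	return inside
}

// transition turns two consecutive inside/outside states into a geofence event.
func transition(wasInside, isInside bool) string {
	switch {
	case !wasInside && isInside:
		return "entered"
	case wasInside && !isInside:
		return "exited"
	default:
		return ""
	}
}

func main() {
	fence := []Point{{0, 0}, {0, 10}, {10, 10}, {10, 0}}
	prev := contains(fence, Point{Lat: 5, Lon: 15}) // outside
	curr := contains(fence, Point{Lat: 5, Lon: 5})  // inside
	fmt.Println("event:", transition(prev, curr))
}
```

Keeping the previous inside/outside state per vehicle (for example next to the last location in Redis) is what lets the stream processor emit entered/exited events instead of re-reporting every position.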

Historical query:

SELECT * FROM vehicle_locations
WHERE vehicle_id = 'V123'
AND time BETWEEN '2026-05-01' AND '2026-05-08'
ORDER BY time;

Case 10: Job scheduler

Requirements:

  • Schedule background jobs.
  • Cron support.
  • Retry with backoff.
  • 10000 jobs/sec.
  • Visibility (job status).

Architecture:

[Producer] → [Queue (Redis/PostgreSQL)] → [Worker Pool]
[Job DB]
[Retry Queue]

Queue backend options:

  • Redis (asynq, machinery) — fast, simple.
  • PostgreSQL (river, neoq) — durable, transactional.
  • Kafka — high throughput.
  • RabbitMQ.

Worker pool:

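// Worker drains jobs from a shared channel.
type Worker struct {
	queue chan Job
}

// Start launches n goroutines; each exits when the queue channel is closed.
func (w *Worker) Start(n int) {
	for i := 0; i < n; i++ {
		go func() {
			for job := range w.queue {
				w.process(job)
			}
		}()
	}
}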

Retry with backoff:

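// process runs a job and retries with exponential backoff: 2, 4, 8, 16 minutes.
// execute, scheduleRetry and sendToDLQ are application-specific helpers.
func (w *Worker) process(job Job) {
	if err := execute(job); err != nil {
		job.Attempts++
		if job.Attempts < 5 {
			delay := time.Duration(1<<job.Attempts) * time.Minute // 2^attempts minutes
			scheduleRetry(job, delay)
		} else {
			sendToDLQ(job) // give up after 5 failed attempts
		}
	}
}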

Cron:

  • Master process schedules.
  • Distributed lock (Redis SETNX) — only one master.
  • Alternative: k8s CronJob.

Visibility:

  • Job status (pending, running, completed, failed) in the DB.
  • Dashboard.

Idempotency:

  • Job ID — unique.
  • Worker checks if already executed.

Drawing boxes right away is an antipattern. Always clarify first: scale, latency, consistency, and durability requirements.

Without numbers it is unclear whether you need sharding, in-memory storage, or async processing. Calculate RPS, storage, and bandwidth.

Every component should be redundant. Database → primary + replicas. App servers → multiple instances behind an LB.

Even with sharding, load can be uneven (a celebrity on Twitter). Plan for hot keys: replication, separate handling.

A calls B calls C calls D: every hop adds round-trip latency, and a failure anywhere propagates up the whole chain. Use async where possible.

“Use Cassandra” without justification. Start with a single Postgres, then justify NoSQL from the requirements (scale, data model).

The network may retry, so side effects can happen twice. Always design APIs to be idempotent (idempotency keys, conditional updates).

CAP is not “pick 2 out of 3”. It means “when a network partition happens, choose C or A”. Under normal conditions you can have both consistency and availability.

“There are only two hard things in CS: cache invalidation, naming things, and off-by-one errors”. Think about TTL, write-through vs cache-aside, and pub/sub invalidation.

If you use a queue or async processing, the user will see eventual consistency. State the acceptable window (1 sec, 1 min) and the UX implications.

Auth, rate limiting, input validation, and encryption should be mentioned even if you do not deep dive into them.

Every production system needs metrics, logs, traces, and alerts. Not mentioning them counts against you.

What if a region goes down? Backups? RPO/RTO targets?

“1M users → you need Kafka”. 1M users per day, at one request each, is about 12 RPS. Postgres is fine.


From bit.ly's public talks:

  • Postgres for canonical URLs.
  • Redis for the hot cache.
  • Kafka for click events.
  • Vertica/ClickHouse for analytics.
  • Erlang/OTP (Concurrency!).
  • FreeBSD on the servers.
  • 2M connections per server.
  • ~50 engineers serving on the order of a billion users.
  • Cassandra → ScyllaDB (the migration story is public).
  • Elixir + Erlang OTP.
  • 500M+ daily users.
  • Postgres core.
  • Strict ACID.
  • Idempotency keys API-level.
  • API uptime 99.999%.
  • Real-time matching of riders and drivers.
  • Geo-sharded.
  • H3 indexing (Uber’s hexagonal grid).
  • Eureka — service discovery.
  • Hystrix — circuit breakers.
  • Zuul — API gateway.
  • Chaos Monkey — disaster simulation.
  • Manhattan (in-house KV store).
  • Earlybird (real-time search).
  • Hybrid fan-out (described above).
  • Monolith! Rails.
  • MySQL + Vitess for sharding.
  • Spokes — Git replication.

Avito (Russia) has a well-documented architecture (conference talks):

  • Go microservices.
  • Kafka between services.
  • ClickHouse for analytics.
  • Vertica for DWH.
  • KPHP (PHP compiler).
  • Custom database engines.
  • 100M+ users.
  • One of the largest in the world.

5. Interview questions (general principles)

Q1: What is the first step in a System Design interview? A: Clarify requirements (functional and non-functional). Never jump into the design without understanding scale, latency, and consistency.

Q2: How do you do capacity estimation? A: Work from the basic numbers: RPS (requests per day / 86400), storage (writes per second × record size × retention period), bandwidth (RPS × payload size). These numbers drive the choice of stack.

Q3: SQL vs NoSQL: how do you choose? A: SQL for transactional, structured data, joins, and cases where ACID is critical. NoSQL for horizontal scale, denormalized data, and simple queries at high QPS. Neither is "better" or "worse"; they are different tools.

Q4: What is the CAP theorem? A: In a distributed system, during a network partition you choose between Consistency and Availability. CP (Postgres, etcd): refuses requests during a partition but stays consistent. AP (Cassandra, Dynamo): keeps serving but may return stale data.

Q5: Strong vs eventual consistency? A: Strong: every read sees the latest write (Postgres). Eventual: all nodes eventually sync (Cassandra). Strong is more expensive; eventual is easier to scale.

Q6: What is consistent hashing? A: A hashing scheme in which adding or removing a node redistributes only about 1/N of the keys. Used in caches and distributed databases (Cassandra, Dynamo).

Q7: Sharding strategies? A: Range-based (by id range), hash-based (hash % N), geographic (by region), consistent hashing. Trade-offs: range gives easy range queries but creates hotspots; hash is uniform but makes joins hard.

Q8: Fan-out on write vs on read? A: Fan-out on write copies data to every downstream consumer (Twitter timeline): reads are fast, writes are expensive. Fan-out on read queries at read time: writes are fast, reads are expensive. Use a hybrid for the celebrity pattern.

Q9: Caching: which patterns exist? A: Cache-aside (lazy loading): the app reads the cache and, on a miss, reads the DB and fills the cache. Read-through: the same, but the cache layer loads from the DB itself. Write-through: the write goes through the cache to the DB. Write-behind: write to the cache, then asynchronously to the DB. Refresh-ahead: pre-populate before expiry.

Q10: What is a circuit breaker? A: A fault-tolerance pattern. After N failures the circuit "trips" and all calls fail fast without reaching the dependency. After a timeout it becomes half-open and lets a trial call through. If that succeeds, the circuit closes. Examples: Hystrix, go-circuit.
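
The closed → open → half-open cycle from Q10 can be sketched as a small state machine. This is a single-goroutine illustration, not the API of Hystrix or any other library; `Breaker`, `ErrOpen`, and `errBackend` are names invented here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	ErrOpen    = errors.New("circuit open")
	errBackend = errors.New("backend failure") // stand-in for a real downstream error
)

// Breaker trips after maxFailures consecutive failures and fails fast
// until cooldown has passed. Not safe for concurrent use.
type Breaker struct {
	maxFailures int
	cooldown    time.Duration
	failures    int
	open        bool
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the circuit is open and still cooling down.
func (b *Breaker) Call(fn func() error) error {
	if b.open && time.Since(b.openedAt) < b.cooldown {
		return ErrOpen // fail fast, the dependency is not called
	}
	// Either closed, or half-open: the cooldown elapsed and this is a trial call.
	if err := fn(); err != nil {
		b.failures++
		if b.open || b.failures >= b.maxFailures {
			b.open = true
			b.openedAt = time.Now() // (re)start the cooldown
		}
		return err
	}
	b.failures = 0
	b.open = false // a success closes the circuit
	return nil
}

func main() {
	b := NewBreaker(2, time.Minute)
	failing := func() error { return errBackend }
	for i := 1; i <= 3; i++ {
		fmt.Printf("call %d: %v\n", i, b.Call(failing))
	}
}
```

In `main`, the first two calls reach the failing backend and the third is rejected with `ErrOpen` without calling it.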

Q11: What is idempotency and why does it matter? A: The same request executed multiple times produces the same result. Critical for retries (network failure → retry). Implementation: idempotency keys, conditional updates (compare-and-swap), upserts.

Q12: Saga pattern? A: A distributed transaction built from several local transactions, with compensating actions on failure. Choreography (event-driven) or orchestration (central coordinator).

Q13: What is CQRS? A: Command Query Responsibility Segregation. Separate models for writes (commands) and reads (queries), which lets you optimize each independently. Often combined with event sourcing.

Q14: Event sourcing? A: Store events (state changes) rather than the current state. Current state = replay of the events. Pros: audit log, time travel. Cons: complexity, event versioning.

Q15: What is back-pressure? A: A mechanism that protects a slow consumer from a fast producer: bounded queue length, blocking writes, dropping. Without it you get OOM and cascading failures.

Q16: What is a retry storm? A: On a failure all clients retry at the same time, which acts like a DDoS on the recovering system. Protection: exponential backoff with jitter, circuit breaker.
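
Exponential backoff with jitter from Q16 can be sketched in two small functions. The "full jitter" variant (sleep a random duration between 0 and the exponential ceiling) is one common choice, used here as an assumption; the function names and the injectable random source are illustrative.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoffCeiling returns base * 2^attempt, capped at limit.
// Doubling in a loop avoids overflow for large attempt numbers.
func backoffCeiling(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < limit; i++ {
		d *= 2
	}
	if d > limit {
		d = limit
	}
	return d
}

// fullJitter picks a delay in [0, ceiling) so that clients retrying
// after the same failure do not all come back at the same moment.
// randFloat must return values in [0, 1), e.g. rand.Float64.
func fullJitter(ceiling time.Duration, randFloat func() float64) time.Duration {
	return time.Duration(float64(ceiling) * randFloat())
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		ceiling := backoffCeiling(attempt, time.Second, 30*time.Second)
		fmt.Printf("attempt %d: ceiling %v, sleep %v\n",
			attempt, ceiling, fullJitter(ceiling, rand.Float64))
	}
}
```

Passing the random source as a function keeps the delay calculation deterministic under test while production code uses `rand.Float64`.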

Q17: Multi-region: how do you design it? A: Active-passive (failover) or active-active. Trade-offs: latency (cross-region 100ms+), consistency (eventual or conflict resolution), cost. Approaches: geo-DNS routing, eventual sync, multi-leader DB.

Q18: GDPR/data residency requirements? A: Store EU data in the EU. Multi-region deployment. Data subject rights (delete, export). Audit logs. Encryption at rest and in transit.

Q19: How do you scale a relational DB? A: 1) Vertical (bigger box). 2) Read replicas. 3) Caching. 4) Sharding. 5) Partitioning tables. 6) Move analytics to OLAP (Snowflake, ClickHouse).

Q20: What is a DDD bounded context? A: Domain-Driven Design. A bounded context is a logical boundary within which the domain model is consistent. Microservices often map to bounded contexts. Aggregates, entities, value objects, and events live inside a context.

Q21: How many servers are needed for 1M RPS? A: It depends on the endpoint. A simple Go API on a single instance handles roughly 50-100k RPS, so 1M RPS means about 10-20 instances plus an LB, a DB cluster, and a cache. This is why capacity estimation matters.

Q22: Which components appear in any production architecture? A: 1) CDN/LB. 2) API Gateway / reverse proxy. 3) App servers. 4) Database (SQL/NoSQL). 5) Cache (Redis). 6) Queue (Kafka). 7) Object storage (S3). 8) Search (Elasticsearch). 9) Observability (Prometheus, Loki, Tempo). 10) Auth (OAuth/IdP).

Q23: What are microservices? When should you not use them? A: A service per bounded context, independently deployed. Pros: independent scaling, team autonomy. Cons: distributed complexity, latency, debugging. Do NOT use them with a small team, a simple domain, or no proven need for scale.

Q24: Async vs sync communication? A: Sync (REST/gRPC): immediate response, tight coupling. Async (queue): loose coupling and retries, but eventual consistency. Use async where an immediate response is not needed.

Q25: What does eventual consistency mean for user experience? A: Data that was just written may only appear after a few seconds. UX options: optimistic UI (show immediately, roll back on failure), polling, WebSocket for updates.

Q26: Polling vs WebSocket vs SSE? A: Polling: simple, but adds latency. WebSocket: bidirectional, real-time. SSE: server → client only, simpler than WebSocket. Choose by use case.

Q27: Database connection pool sizing? A: Rule of thumb: pool size = (CPU cores × 2) + disk count. The Postgres default is 100 connections. Under load, measure: queue waits → increase, idle connections → decrease.

Q28: What is write amplification? A: One logical write turns into many physical writes. LSM trees (Cassandra, RocksDB) are compacted, which causes write amplification. Trade-off: read efficiency vs write efficiency.

Q29: ETL vs ELT? A: ETL is Extract, Transform, Load: transform before loading into the warehouse. ELT loads raw data and transforms it inside the warehouse. ELT is the modern approach (BigQuery, Snowflake) because it uses the warehouse's own compute power.

Q30: How do you handle data migration without downtime? A: 1) Dual write (write to old + new). 2) Backfill historical data. 3) Switch reads to new. 4) Stop writes to old. 5) Deprecate. This is the strangler pattern.


  1. URL shortener: implement it fully in Go with Postgres + Redis cache. Test with k6 at 1000 RPS.

  2. Rate limiter: implement 3 algorithms (token bucket, sliding window log, sliding window counter) and compare them.

  3. Distributed cache: implement a consistent hash ring + LRU node. Test redistribution when a node is added or removed.

  4. Chat WebSocket: a simple chat with WebSocket + Redis for presence + Postgres for history.

  5. Twitter feed: implement a fan-out-on-write timeline in a Redis sorted set.

  6. Search: build an inverted index from text documents. Compare with Elasticsearch.

  7. Payment idempotency: an API with Idempotency-Key. Test that 100 parallel requests give the same result.

  8. Job scheduler: implement an asynq-like queue in Go with retry + DLQ.

  9. Mock interview: record yourself doing a 45-minute system design session (one of the cases above). Review and improve.

  10. Read papers: Dynamo (Amazon), Chubby (Google), Bigtable, to understand the fundamentals.


  1. “System Design Interview” — Alex Xu (V1 + V2).
  2. “Designing Data-Intensive Applications” — Martin Kleppmann (must-read).
  3. System Design Primer: https://github.com/donnemartin/system-design-primer.
  4. High Scalability blog: http://highscalability.com/.
  5. AWS Well-Architected Framework: https://aws.amazon.com/architecture/well-architected/.
  6. “Site Reliability Engineering” Google book: https://sre.google/sre-book/.
  7. “Microservices Patterns” — Chris Richardson, Manning.
  8. Google research papers: Bigtable, Chubby, Spanner, Zanzibar.
  9. InfoQ architecture articles: https://www.infoq.com/architecture-design/.
  10. ByteByteGo newsletter: https://blog.bytebytego.com/.