System Design

Quick notes

User <--> DNS
User <--> Web server <--> Database
Vertical scaling
Horizontal scaling
1. Load balancers
2. Multiple servers
3. Database replication
  1. Master/slave relationships
  2. Sharding
    Resharding data
    Celebrity problem
    Join and de-normalization
Performance improvements
1. CDN
2. Cache
Stateless web tier
Data centers
Message queue
Logging, metrics, automation
L1 and L2 cache
Mutex lock/unlock
Branch mispredict
Disk seek
Back of the envelope estimation
Cloud microservices
API gateway
1. Rate limiting
2. SSL termination
3. Authentication
4. IP whitelisting
5. Serving static content
6. etc.
Rate limiting algorithms
1. Token bucket
2. Leaking bucket
3. Fixed window counter
4. Sliding window log
5. Sliding window counter
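As an illustration of the first algorithm in the list above, here is a minimal token bucket sketch. The class name, the `capacity` and `refill_rate` parameters, and the injectable `clock` are assumptions made for this example, not part of any particular library.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at a fixed rate, up to a fixed capacity."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size (hypothetical parameter)
        self.refill_rate = refill_rate  # tokens added per second (hypothetical parameter)
        self.clock = clock              # injectable clock, useful for testing
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1):
        """Consume `cost` tokens and return True if available, else return False."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill lazily, based on the time elapsed since the previous call.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket created with `TokenBucket(capacity=2, refill_rate=1)` lets a burst of two requests through and then roughly one request per second. This version is single-process only; sharing the counters between servers is the topic of the next list.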
Rate limiter in a distributed environment
1. Race condition
2. Synchronization issue
  1. Sticky sessions
  2. Redis
3. Performance optimization
4. Monitoring
Consistent hashing
1. The rehashing problem
2. Hotspot key problem
3. Virtual nodes
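The three items above can be sketched together as a hash ring with virtual nodes. This is a minimal sketch under stated assumptions: the `HashRing` name, the `vnodes` default, and the use of MD5 as the ring hash are choices made for illustration.

```python
import bisect
import hashlib


def _hash(key):
    # MD5 is used here only for its even spread, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical node (hypothetical default)
        self._ring = []       # sorted hashes of all virtual nodes
        self._owner = {}      # virtual-node hash -> physical node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            self._owner[h] = node
            bisect.insort(self._ring, h)

    def remove_node(self, node):
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            if self._owner.get(h) == node:
                del self._owner[h]
                self._ring.remove(h)

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # Walk clockwise to the first virtual node past the key, wrapping around.
        idx = bisect.bisect_right(self._ring, _hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]
```

Adding a server only moves the keys that now fall just before one of its virtual nodes, which is what avoids the rehashing problem of `hash(key) % n`. More virtual nodes per server spread the load more evenly at the cost of a larger ring.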
Key-value store
1. Single server key-value store
2. Distributed key-value store
  1. CAP (Consistency, Availability, Partition tolerance)
  2. CAP Theorem
    CP, AP, CA
3. System components
  1. Data partition
  2. Data replication
  3. Consistency
    Quorum consensus
    N, W, R
    Consistency models
    Strong consistency
    Weak consistency
    Eventual consistency
  4. Inconsistency resolution
    Versioning
    Detection and reconciliation
    Vector clock
  5. Handling failures
  6. System architecture diagram
  7. Write path
  8. Read path
  9. Gossip protocol
  10. Hinted handoff
  11. Anti-entropy protocol
    Merkle tree
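The vector clock listed under inconsistency resolution above can be sketched as a plain dictionary from server name to counter. The function names and the server names (`s0`, `s1`, `s2`) are hypothetical; this is one simple way to detect conflicting versions, not a full reconciliation scheme.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if version `a` has seen everything `b` has (component-wise a >= b)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions as 'equal', 'after', 'before' or 'conflict'."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "conflict"  # concurrent writes: neither version descends from the other


def merge(a, b):
    """Clock for a reconciled version: the component-wise maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}
```

Usage: two replicas that each update the same ancestor version independently produce clocks for which `compare` returns `"conflict"`. The client (or server) resolves the values, then writes back with `increment(merge(a, b), node)`.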
Unique ID generator
1. Multi master replication
2. UUID
3. Ticket server
4. Twitter snowflake approach
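A rough sketch of the snowflake approach, assuming the commonly described 64-bit layout: 1 unused sign bit, 41 bits of millisecond timestamp, 5 bits of datacenter ID, 5 bits of machine ID, and 12 bits of per-millisecond sequence. The class name, the `clock_ms` hook, and the `EPOCH_MS` value are assumptions for illustration.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch in ms (assumed value; any fixed past instant works)
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, datacenter_id, machine_id, clock_ms=None):
        if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        self._clock_ms = clock_ms or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (self.datacenter_id << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )
```

IDs from one generator are strictly increasing, and IDs from different machines stay unique as long as each `(datacenter_id, machine_id)` pair is assigned once. Unlike UUIDs, the result fits in 64 bits and sorts roughly by creation time.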

Applications

URL Shortener

Use cases
1. URL shortening
2. URL redirecting
3. High availability
Back of the envelope estimation
1. Number of URLs to store: used to choose the hash function
2. Storage requirement
Design
1. 301 redirect: permanently redirected
2. 302 redirect: temporarily redirected
3. Hash functions
  1. Hash + collisions resolution
  2. Base-n conversion of unique ids
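The base-n conversion option above can be sketched with base 62. The alphabet order (digits, then lowercase, then uppercase) is an assumption; any fixed ordering of the 62 characters works as long as encoder and decoder agree.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode(n):
    """Convert a non-negative integer ID to a base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode(s):
    """Convert a base-62 string back to the integer ID."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this alphabet, `encode(11157)` gives `"2TX"`. Because each unique ID maps to exactly one string, this approach needs no collision resolution, but the short URL length grows with the ID and the next short URL is easy to guess. Seven characters cover about 3.5 trillion IDs.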

Web crawler

Use cases
1. Search engine indexing
2. Web archiving
3. Web mining
4. Web monitoring
Characteristics of a good web crawler
1. Scalability: efficiency, parallelization, etc.
2. Robustness: Handle edge cases like traps, bad HTML, unresponsive servers, etc.
3. Politeness: Do not send too many requests to the same host within a short interval.
4. Extensibility: Support new content types, such as images, with minimal changes.
Queries per second
1. Peak QPS
2. Average web page size
3. Storage requirements
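The three estimates above follow from a few inputs. Every number below (pages per month, page size, peak factor, retention period) is an assumed input chosen for illustration, not a measured figure.

```python
PAGES_PER_MONTH = 1_000_000_000  # assumption: pages downloaded per month
AVG_PAGE_SIZE_KB = 500           # assumption: average web page size
PEAK_FACTOR = 2                  # assumption: peak traffic is twice the average
RETENTION_YEARS = 5              # assumption: how long content is kept

SECONDS_PER_MONTH = 30 * 24 * 3600


def estimate():
    average_qps = PAGES_PER_MONTH / SECONDS_PER_MONTH
    peak_qps = average_qps * PEAK_FACTOR
    # 1 TB = 1e9 KB (decimal units)
    monthly_storage_tb = PAGES_PER_MONTH * AVG_PAGE_SIZE_KB / 1e9
    total_storage_pb = monthly_storage_tb * 12 * RETENTION_YEARS / 1000
    return {
        "average_qps": average_qps,
        "peak_qps": peak_qps,
        "monthly_storage_tb": monthly_storage_tb,
        "total_storage_pb": total_storage_pb,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.0f}")
```

With these assumed inputs the sketch gives roughly 386 average QPS, 772 peak QPS, 500 TB of new storage per month, and 30 PB over five years.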
Design
1. Seed URLs
2. URL Frontier
3. HTML downloader
4. DNS resolver
5. Content parser
6. Content seen?
7. Content storage
8. URL Extractor
9. URL Filter
10. URL seen?
11. URL storage
12. Web crawler workflow
13. Priority
  1. Front queues: manage prioritization
  2. Back queues: manage politeness
14. Freshness
15. Performance optimization
  1. Distributed crawl
  2. Cache DNS resolver: Keep mapping and update periodically using cron jobs.
  3. Locality
  4. Short timeout
16. Robustness
  1. Consistent hashing: To distribute load among downloaders
  2. Save crawl states and data
  3. Exception handling
  4. Data validation
17. Extensibility
18. Avoid problems
  1. Redundant content
  2. Spider traps
  3. Data noise
19. Server side rendering
20. Filtering
21. Analytics

Notification system

Requirements
1. Types: Push notification, SMS, email
2. Real time?: Soft real time?
3. Supported devices: iOS, Android, laptop/desktop
4. Trigger notifications?
5. Opt-out?
6. How many?
High level: types, contact info gathering, sending/receiving flow
APNS, FCM, SMS, Email
Design
1. Notification servers
2. Cache
3. DB
4. Message queues
5. Workers
6. Third-party servers
7. iOS, Android, SMS, Email

News Feed

High level: Feed publishing, feed building
1. User <--> Load balancer <--> Web servers <--> [Post service, fanout service, notification service]
2. Post service <--> Post cache
3. News feed service <--> News feed cache
4. Post cache <--> Post DB
Fanout service
1. Fanout on write
  1. Hotkey problem
2. Fanout on read
Cache architecture
1. News feed: news feed
2. Content: hot cache, normal
3. Social Graph: follower, following
4. Action: liked, replied, others
5. Counters: like counters, reply counter, other counters
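To contrast the two fanout models listed under the fanout service above, here is a small in-memory sketch. The `FeedService` class, its method names, and the `feed_limit` cap are hypothetical; a real system would back these dictionaries with the caches and databases named in the outline.

```python
from collections import defaultdict, deque
from itertools import count


class FeedService:
    def __init__(self, feed_limit=100):
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(seq, author, text)], oldest first
        # Precomputed feeds used by fanout on write, newest first, capped in size.
        self.feed_cache = defaultdict(lambda: deque(maxlen=feed_limit))
        self._seq = count()                # monotonically increasing post sequence

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def publish(self, author, text):
        """Fanout on write: push the new post into every follower's cached feed."""
        post = (next(self._seq), author, text)
        self.posts[author].append(post)
        for follower in self.followers[author]:
            self.feed_cache[follower].appendleft(post)
        return post

    def feed_from_cache(self, user, limit=10):
        """Read side of fanout on write: the feed is already built."""
        return list(self.feed_cache[user])[:limit]

    def feed_on_read(self, user, limit=10):
        """Fanout on read: gather and sort followed authors' posts at request time."""
        posts = [p for author in self.following[user] for p in self.posts[author]]
        posts.sort(key=lambda p: p[0], reverse=True)
        return posts[:limit]
```

`publish` does work proportional to the number of followers, which is where the hotkey problem comes from for accounts with very many followers. `feed_on_read` moves that cost to read time instead, so a common compromise is to push for most users and pull for the few heavily followed ones.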

Chat system

Requirements
1. A one-on-one chat with low delivery latency
2. Small group chat
3. Online status
4. Multiple device support. The same account can be logged in to multiple devices at the same time.
5. Push notifications
Sender <--> Chat service [store message, relay message] <--> receiver
HTTP is a client-initiated connection
Server-initiated connection
1. Polling
2. Long polling
3. WebSocket
Apache ZooKeeper
User online status

Search autocomplete

Back of the envelope estimation: Peak QPS, storage
High-level design: Data gathering service, query service
Trie
Stream processing: Apache Hadoop MapReduce, Apache Spark Streaming, Apache Storm, Apache Kafka
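The trie mentioned above can be sketched as follows for the query service. This is a simplified version that walks the whole subtree under the prefix on every lookup; the class names, the `count` field, and the `top_k` method are assumptions for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.count = 0      # how often the query ending at this node was seen


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query, count=1):
        """Record `count` occurrences of `query`."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def top_k(self, prefix, k=5):
        """Return up to k most frequent queries starting with `prefix`."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        matches = []
        stack = [(node, prefix)]
        while stack:  # iterative depth-first walk of the subtree
            current, text = stack.pop()
            if current.count:
                matches.append((text, current.count))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Highest count first; ties broken alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [text for text, _ in matches[:k]]
```

Walking the subtree is fine for small data sets. At scale, a common optimization is to cache the top-k results at each node and to rebuild the trie offline from aggregated query logs, which is where the data gathering service fits in.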

YouTube

Back of the envelope estimation: Average video size, daily storage space need, CDN cost, etc.
Transcoding
MPEG-DASH

Proximity Service

Two-dimensional search: Evenly divided grid, Geohash, Quadtree, Google S2 (Hilbert Curve)
Blue/green deployment
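Of the two-dimensional search options listed above, Geohash is the easiest to sketch: it repeatedly halves the longitude and latitude ranges, interleaves the resulting bits, and writes every 5 bits as one base-32 character. The function name and the default `precision` are assumptions for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Points that share a long geohash prefix are close together, so a nearby search can become a prefix query on an indexed string column. The reverse does not always hold: two points on either side of a cell boundary can be close yet share no prefix, which is why neighboring cells are usually queried as well.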

Quick terms

High availability
High scalability
Automatic scaling
Heterogeneity
Tunable consistency
Low latency
Asynchronous
High speed networks
Bloom filter
Dedupe mechanism
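A Bloom filter, listed above, is one way to implement a dedupe check such as the crawler's "URL seen?" step in a small, fixed amount of memory. The sketch below assumes SHA-256 with a per-hash prefix as the hash family; the `size_bits` and `num_hashes` defaults are arbitrary illustrative values.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)  # bit array, initially all zero

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely not added; True means possibly added."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

The filter never reports a false negative, but it can report a false positive, so a "seen" answer may need confirmation against the authoritative store when a skipped item would be costly.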

Questions

Which DB to use: relational or non-relational?
How do non-relational databases differ from each other?
How do load balancers work?
Latency vs throughput?
ACID properties?
Memcached and Redis?
Anycast?
Why use vector clocks and not the Twitter snowflake approach for versioning?

Quick links

Should you go Beyond Relational Databases?:
Caching Strategies and How to Choose the Right One:
Active-Active for Multi-Regional Resiliency:
What it takes to run Stack Overflow:
Better Rate Limiting With Redis Sorted Sets:
How we built rate limiting capable of scaling to millions of domains:


Last updated 1 year ago