Catalog
- Clarify the requirements
- Capacity Estimation
- System APIs
- High-level System Design
- Data Storage
- Scalability
Step1: Clarify the requirements
Clarify requirements and goals of the system
- Requirements
- Traffic size (e.g., Daily Active Users)
Nobody expects you to design a complete system in 30-40 minutes
Discuss the functionalities and align with the interviewer on which components to focus on
Type1: Functional Requirement
- Tweet
- a. Create
- b. Delete
- Timeline/Feed
- a. Home
- b. User
- Follow a user
- Like a tweet
- Search tweets
...
Type2: Non-Functional Requirement
- Consistency
- Every read receives the most recent write or an error
- Sacrifice: Eventual consistency
- Availability
- Every request receives a response, without the guarantee that it contains the most recent write
- Scalable
- Performance: low latency
- Partition tolerance (Fault Tolerance)
- The system continues to operate despite an arbitrary number of messages being dropped by the network between nodes
Step2: Capacity Estimation
Assumption:
- 200 million DAU, 100 million new tweets per day
- Each user per day: visits the home timeline 5 times and other users' timelines 3 times
- Each timeline/page has 20 tweets
- Each tweet has size 280 bytes, metadata 30 bytes
- per photo: 200KB, 20% of tweets have images
- per video: 2MB, 10% of tweets have video, 30% of videos will be watched
Storage Estimate
- Write size daily:
- Text:
- 100M new tweets * (280+30) bytes/tweet = 31GB/day
- Image:
- 100M new tweets * 20% * 200KB/photo = 4TB/day
- Video:
- 100M new tweets * 10% * 2MB/video = 20TB/day
- Total:
- 31GB + 4TB + 20TB ≈ 24TB/day
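The storage arithmetic above can be reproduced with a short script. This is only a sketch under the assumptions listed in this step; the constant and function names are made up for illustration, and decimal units (1KB = 1,000 bytes) are assumed.

```python
# Back-of-envelope daily write volume from the Step2 assumptions.
# Decimal units are assumed: 1 KB = 1,000 bytes.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

NEW_TWEETS_PER_DAY = 100_000_000
TWEET_BYTES = 280
METADATA_BYTES = 30
PHOTO_BYTES = 200 * KB
VIDEO_BYTES = 2 * MB
IMAGE_PERCENT = 20   # share of tweets that carry an image
VIDEO_PERCENT = 10   # share of tweets that carry a video


def daily_write_bytes() -> dict[str, int]:
    """Return bytes written per day, split by content type."""
    text = NEW_TWEETS_PER_DAY * (TWEET_BYTES + METADATA_BYTES)
    image = NEW_TWEETS_PER_DAY * IMAGE_PERCENT // 100 * PHOTO_BYTES
    video = NEW_TWEETS_PER_DAY * VIDEO_PERCENT // 100 * VIDEO_BYTES
    return {"text": text, "image": image, "video": video,
            "total": text + image + video}


if __name__ == "__main__":
    for name, size in daily_write_bytes().items():
        print(f"{name:>5}: {size / TB:.3f} TB/day")
```

Text is roughly three orders of magnitude smaller than media, so media storage dominates the total.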
Bandwidth Estimate (Social Networking => read heavy)
Daily Read Tweets Volume:
- 200M * (5 home visit + 3 user visit) * 20 tweets/page = 32B tweets/day
Daily Read Bandwidth
- Text: 32B * 280 bytes / 86400s ≈ 100MB/s
- Image: 32B * 20% * 200KB / 86400s ≈ 15GB/s
- Video: 32B * 10% * 30% * 2MB / 86400s ≈ 22GB/s
- Total: ≈ 37GB/s
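As a cross-check, the read-bandwidth estimate can be derived directly from the Step2 assumptions. The sketch below assumes decimal units and a uniform load over 86,400 seconds; real traffic is peaky, so treat the result as an average. All names are illustrative.

```python
# Average read bandwidth from the Step2 assumptions (decimal units).
KB, MB, GB = 10**3, 10**6, 10**9
SECONDS_PER_DAY = 86_400

DAU = 200_000_000
VISITS_PER_USER = 5 + 3      # home timeline + other users' timelines
TWEETS_PER_PAGE = 20
TWEET_BYTES = 280
PHOTO_BYTES = 200 * KB
VIDEO_BYTES = 2 * MB


def read_bandwidth() -> dict[str, float]:
    """Return average bytes per second read, split by content type."""
    tweets_read = DAU * VISITS_PER_USER * TWEETS_PER_PAGE   # 32B tweets/day
    with_image = tweets_read * 20 // 100                    # 20% have images
    # 10% of tweets carry a video and 30% of those videos are watched.
    videos_watched = tweets_read * 10 // 100 * 30 // 100
    text = tweets_read * TWEET_BYTES / SECONDS_PER_DAY
    image = with_image * PHOTO_BYTES / SECONDS_PER_DAY
    video = videos_watched * VIDEO_BYTES / SECONDS_PER_DAY
    return {"text": text, "image": image, "video": video,
            "total": text + image + video}


if __name__ == "__main__":
    for name, rate in read_bandwidth().items():
        print(f"{name:>5}: {rate / GB:.2f} GB/s")
```

Evaluated exactly, this gives about 104MB/s for text, 14.8GB/s for images and 22.2GB/s for video, so media again dominates.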
Step3: System APIs
postTweet(userToken, string tweet)
deleteTweet(userToken, string tweetId)
likeOrUnlikeTweet(userToken, string tweetId, bool like)
readHomeTimeLine(userToken, int pageSize, opt string pageToken)
readUserTimeLine(userToken, int pageSize, opt string pageToken)
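To make the API shapes concrete, here is a minimal in-memory sketch of four of the calls. It is an illustration rather than a reference design: the snake_case names are Python renderings of the signatures above, `user_token` is treated directly as the user id (no real authentication), and the page token is assumed to be a plain offset. `readHomeTimeLine` is left out because it depends on the follow graph and the fan-out strategy discussed in Step4.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Tweet:
    tweet_id: str
    author: str
    text: str
    likes: set = field(default_factory=set)


class InMemoryTweetService:
    """Toy stand-in for the tweet service; user_token doubles as the user id."""

    def __init__(self) -> None:
        self._tweets: dict[str, Tweet] = {}   # insertion order = creation order
        self._ids = count(1)

    def post_tweet(self, user_token: str, tweet: str) -> str:
        tweet_id = str(next(self._ids))
        self._tweets[tweet_id] = Tweet(tweet_id, user_token, tweet)
        return tweet_id

    def delete_tweet(self, user_token: str, tweet_id: str) -> bool:
        found = self._tweets.get(tweet_id)
        if found is None or found.author != user_token:
            return False                      # only the author may delete
        del self._tweets[tweet_id]
        return True

    def like_or_unlike_tweet(self, user_token: str, tweet_id: str,
                             like: bool) -> int:
        likes = self._tweets[tweet_id].likes
        if like:
            likes.add(user_token)
        else:
            likes.discard(user_token)
        return len(likes)

    def read_user_timeline(self, user_token: str, page_size: int,
                           page_token: Optional[str] = None):
        # Newest first; the page token is an offset encoded as a string.
        own = [t for t in reversed(list(self._tweets.values()))
               if t.author == user_token]
        start = int(page_token) if page_token else 0
        end = start + page_size
        next_token = str(end) if end < len(own) else None
        return own[start:end], next_token
```

Offset tokens are the simplest option but can skip or repeat items when tweets are inserted between requests; a token based on the last seen tweet id is a common alternative.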
Step4: High-Level System Design:
- post tweets
- user timeline(push/pull mode)
https://medium.com/@winapp/read-fast-with-fan-out-write-f25257117297
Home Timeline (cont'd)
Fan out on write
- Not efficient for users with a huge number of followers (like Taylor Swift)
Hybrid Solution
- Non-hot users:
- fan out on write (push)
- Hot users:
- fan out on read (pull): at timeline request time, read the hot users' tweets from the tweets cache and merge them with the precomputed results from non-hot users
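The hybrid approach can be sketched in a few lines. Everything here is an assumption made for illustration: the class and method names, the follower threshold that marks a user as hot (set tiny so the example triggers it), and the use of a global sequence number as a stand-in for a timestamp.

```python
from collections import defaultdict, deque
from itertools import count

HOT_FOLLOWER_THRESHOLD = 2   # hypothetical cutoff; real systems use far larger values
TIMELINE_LENGTH = 20         # tweets per timeline page


class HybridTimeline:
    def __init__(self) -> None:
        self.followers = defaultdict(set)     # author -> users following them
        self.following = defaultdict(set)     # user -> authors they follow
        self.user_tweets = defaultdict(list)  # author -> entries, oldest first
        # Precomputed home timelines, filled by pushes from non-hot authors.
        self.home_cache = defaultdict(lambda: deque(maxlen=TIMELINE_LENGTH))
        self._seq = count(1)                  # stand-in for a timestamp

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_hot(self, author: str) -> bool:
        return len(self.followers[author]) >= HOT_FOLLOWER_THRESHOLD

    def post(self, author: str, text: str) -> int:
        entry = (next(self._seq), author, text)
        self.user_tweets[author].append(entry)
        if not self.is_hot(author):
            # Push path: copy the tweet into every follower's home timeline.
            for follower in self.followers[author]:
                self.home_cache[follower].append(entry)
        return entry[0]

    def home_timeline(self, user: str) -> list:
        # A set removes duplicates if an author turned hot after earlier pushes.
        merged = set(self.home_cache[user])
        for author in self.following[user]:
            if self.is_hot(author):
                # Pull path: fetch the hot author's recent tweets at read time.
                merged.update(self.user_tweets[author][-TIMELINE_LENGTH:])
        return sorted(merged, reverse=True)[:TIMELINE_LENGTH]   # newest first
```

The write cost for a hot author is a single append regardless of follower count; the price is a small merge on every read by each of their followers.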
Step5: Data Storage
principles
- SQL database:
- e.g., user table
- NoSQL database:
- e.g., timelines
- File system:
- media file: image, audio, video
Step6: Scalability
- Identify potential bottlenecks
- Discuss solutions, focusing on tradeoffs
- Data sharding
- data store, cache
- Load balancing
- user <-> application server
- application server <-> cache server
- application server <-> db
- Data caching
- read heavy
Sharding
Why?
- impossible to store/process all data on a single machine
How?
- Break large tables into smaller shards on multiple servers
Pros
- Horizontal scaling
Cons
- Complexity(distributed query, resharding...)
Option 1: shard by tweets' creation time
Pros:
- Limited shards to query
Cons:
- Hot/Cold data issue
- New shards fill up quickly
Option 2: Shard by hash(userId): store all the data of user on a single shard
Pros:
- Simple
- Query user timeline is straightforward
Cons:
- Home timeline still needs to query multiple shards
- Non-uniform distribution of storage
- Hot users
- Availability
Option 3: Shard by hash(tweetId)
Pros:
- uniform distribution
- high availability
Cons:
- need to query all shards in order to generate user/home timeline(cache solution)
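The difference between Option 2 and Option 3 shows up in how many shards a user-timeline query has to touch. The sketch below assumes a fixed cluster of 8 shards and simple hash-modulo routing; the function names are hypothetical. A stable hash (MD5 here) is used because Python's built-in `hash()` for strings is randomized per process.

```python
import hashlib

NUM_SHARDS = 8   # hypothetical cluster size


def stable_hash(key: str) -> int:
    """Hash that is identical across processes and machines."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_by_user(user_id: str) -> int:
    """Option 2: all of a user's data lives on one shard."""
    return stable_hash(user_id) % NUM_SHARDS


def shard_by_tweet(tweet_id: str) -> int:
    """Option 3: tweets are spread across shards regardless of author."""
    return stable_hash(tweet_id) % NUM_SHARDS


def shards_for_user_timeline(user_id: str, strategy: str) -> set[int]:
    """Shards that must be queried to build one user's timeline."""
    if strategy == "user":
        return {shard_by_user(user_id)}
    if strategy == "tweet":
        # The author's tweets could be anywhere, so every shard is queried.
        return set(range(NUM_SHARDS))
    raise ValueError(f"unknown strategy: {strategy}")
```

With plain modulo routing, changing `NUM_SHARDS` remaps most keys, which is one source of the resharding complexity listed under Cons; consistent hashing is a common way to reduce that movement.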
Caching
Why?
- social networks have heavy read traffic
- queries can be slow and costly
How?
- store hot/precomputed data in memory so reads can be much faster
Timeline service
- user timeline: user_id -> {tweet_id}
- home timeline: user_id -> {tweet_id}
- tweets: tweet_id -> tweet
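Each of these mappings can sit behind a read-through cache. The sketch below assumes an LRU eviction policy, which is one possible caching policy and not something these notes prescribe; the class name and the `load` callback are illustrative.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LRUCache:
    """Read-through cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable, load: Callable[[Hashable], Any]) -> Any:
        if key in self._data:
            self._data.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = load(key)                     # fall back to the data store
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)    # drop the least recently used
        return value


if __name__ == "__main__":
    tweet_store = {"t1": "hello", "t2": "world"}   # stands in for the database
    tweets = LRUCache(capacity=1)
    tweets.get("t1", tweet_store.__getitem__)      # miss, loaded from the store
    tweets.get("t1", tweet_store.__getitem__)      # hit, served from memory
    print(tweets.hits, tweets.misses)
```

The same class could hold `user_id -> {tweet_id}` timelines; only the `load` function changes.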
Topics:
- caching policy
- sharding
- performance
ref
https://www.youtube.com/watch?v=PMCdWr6ejpw&list=PLLuMmzMTgVK4RuSJjXUxjeUt3-vSyA1Or&index=1