Design Instagram

Similar: Twitter/Facebook feeds

Scenarios

Use cases, i.e. functional requirements ( 2-3 min)

  • Users can send a new post: image/video + description

  • Users can follow other users

  • Users can view a feed of posts from following users

  • Non-goals: search user/post, hashtag, location, mention others, visibility, trending, etc.

Non-functional requirements

  • high availability, near 0 downtime

  • reliable, no lost data

  • latency: 200ms for loading feed

  • security & privacy

Estimations (2 min)

user scale:

  • 100m DAU

  • Read feeds:

    • every user open app 3 times, in total 10 requests to server

    • 100m * 10 / 24h = 12k qps -> peak: *3 = 36k qps

  • Write(post) feeds:

    • 1m posts every day

    • 12 qps -> peak: 36 qps

storage

  • photo size: 200kb -> ~200GB everyday

  • Next 5 years: 200G * 365 = 71T * 5 = 350TB

memory(cache)

networking

Services

  • User service(register, login, profile)

  • Relationship service

  • Feed service(read)

  • Post service(write)

  • Media service(serve read/write)

  • Advance: fanout, search, recommendation, trending, notification, logging, monitoring, analysis

Media Service

Topic: design file systems

Feed generation service

  • Store N(500) posts in memory. Fine tune the N. If a user request more than that, generate more on demand.

    • Use LRU cache to pre-generate feed for active users. Smarter: based on usage pattern to pre-generate feeds, e.g. a user usually active on Friday night.

  • Pull model

    • cons

      • high latency

      • heavy read on hot users

      • resources wastage if there is no new data in some users/nodes.

  • Push model

    • cons

      • celebrity user cause heavy load on fanout service

      • waste resources for generating feeds for inactive users

  • Hybrid model

    • Fanout for non-celebrity users, push posts to follower's feed table.

    • Pull celebrity posts when generating feeds for a user.

  • Feed ranking

    • Simple: creation_time

    • Advance: ranking score, sort feed by score + time desc.

Storage

Table schemas, DB choices

  • Post table

    • schema: post_id, user_id, content, creation_time

      • PK on post_id, secondary on user_id

    • shard on user_id

      • pros: faster query

      • cons: hot spot

    • shard on post_id

      • pros: has to query all servers -> high latency

    • shard on post id and creation time

      • post_id: timestamp + auto-increase id

  • Feed table

    • Schema: user_id, post_id, owner_id, content, post_creation_time

      • PK: user_id, secondary: post_creation_time

    • Handle mutable post: mark old post invalid, Feed Generator delete/invalid post in Feed Table, Feed service send tombstone to client.

    • Alternative schema: don't store content, user_id, post_id, post_creation_time

      • pros:

        • smaller data, no duplicate data stored, maybe fit in memory?

        • tolerate mutable post

      • cons:

        • need to fetch posts by list of post_id, then merge. lost the benefit of Feed table.

  • Relationship table

    • Graph DB

    • MySQL: A follows B,user_id, following_user_id

      • Composite PK on both columns.

  • User table

Scalability

  • DB replicas

    • Replicate to next 2 nodes in consistent hash ring

    • Leader-follower pattern

      • leader election

  • Data sharding

    • Global unique ID generator

    • Hot spot issue

    • shard on PostId

      • latency issue

    • shard on UserId

      • hot spot

      • availability: single point of failure

    • plan for growth: consistent sharding

  • Server replicas, redundancy

Advanced topics

Like count

  • API: likePost(postId)

  • Like Table schema: id, userId, postId, timestamp

  • Process: 1) send to Like service; 2) check if user already liked it before by querying DB or cache in memory. 3) periodically update <postId, count> to Post table.

  • Scalability

    • shard on userId.

    • quick response: maintain cache <userId, list of historical liked post>. or bloom filter + DB query.

    • every 1s/10s, aggregate <postId, likeCount>, send to Message Queue.

    • another aggregation service listen to the queue. maintain cache <postId, likeCount>, periodically batch update Post table.

    • Key point: buffered aggregator to reduce the load on DB.

Follow & unfollow

Reference

Last updated