Design Cache

  • 15

    dooby_do 10 months ago

    I was asked this problem in a Google interview for an L5 position.

    reply
    • -4

      kaidul 9 months ago

      Out of curiosity, did you make it?

      reply
  • 4

    kumar955 10 months ago

    I think in most interviews the candidate is expected to calculate storage, RAM size, upstream and downstream bandwidth, number of servers, etc. (I mean we cannot assume most of these things; only one or two, like data size, are given).

    reply
  • 1

    venknar 10 months ago

    It is somewhat the same as designing a distributed hash table.

    reply
    • 0

      rui_tong 3 months ago

      What's the difference?

      reply
  • 3

    deep_saxena 10 months ago

    How is the QPS calculated here? I am not getting the exact calculation. Can somebody explain it?

    reply
    • 2

      venknar 10 months ago

      The way QPS is calculated is as follows: earlier we mentioned that each machine would have 72 GB of RAM. To serve 30 TB of cache, the number of machines required would be 30 TB / 72 GB, which is close to 420. Assume that we have 420 machines to serve 30 TB of distributed cache. Now, regarding QPS, the requirement was 10M.

      reply
    • 4

      venknar 10 months ago

      Now the per-machine QPS would be 10M / 420 ≈ 23,000 QPS, so each machine should be able to handle roughly 23,000 QPS. The approach is similar to how we decided the number of machines from the per-machine RAM and the total cache size; likewise, the per-machine QPS is the total QPS divided by the number of machines.

      reply
    • 1

      venknar 10 months ago

      Next, assuming a machine has to serve 23,000 QPS and has 4 cores, we calculate the per-request time as: CPU time available per query = 4 * 1000 * 1000 / 23,000 microseconds ≈ 174 us (note that everything is converted to microseconds). So the machine has to answer each query within 174 us of CPU time. This is how the per-query time budget is derived. Then, based on the read/write traffic and the latency numbers in https://gist.github.com/jboner/2841832, the estimate is further refined by increasing the number of machines.
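
      Spelling out the arithmetic in these comments (all figures are the ones already assumed above: 30 TB of total cache, 72 GB of RAM per machine, 10M total QPS, 4 cores per machine):

        machines needed         = 30 TB / 72 GB per machine     ≈ 420
        QPS per machine         = 10,000,000 / 420              ≈ 23,000
        CPU budget per machine  = 4 cores * 1,000,000 us/sec    = 4,000,000 us of CPU time per second
        CPU time per query      = 4,000,000 us / 23,000 queries ≈ 174 us
        queries per core        = 23,000 / 4                    ≈ 6,000 per second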

      reply
      • 0

        skcoder 10 months ago

        I am coming up with 1 sec / 23,000 queries ≈ 43 us per query, so ~43 us of service time per query is required. I don't see why we would calculate this based on CPU time, or why my number is so far off. The only thing I can think of is that if I multiply by 4 CPUs, each CPU services one query every ~174 us (the number above), but why would we calculate this per CPU rather than per server?

        reply
        • 0

          skcoder 10 months ago

          OK, I figured it out. Each CPU has to return the query in 174us, but each CPU has to service ~6K queries/sec, not 23K. 23K is per machine.

          reply
      • 0

        amandeep_gautam_181 3 months ago

        How do we get this: 4 * 1000 * 1000 as CPU time?

        reply
  • 3

    daniele_broccolo 10 months ago

    I clicked on one of the links for the implementation of the least recently used cache... the timer started, but I did not have time to complete the task. I think you should give a warning before the problem pages open.

    reply
    • 0

      catlover 10 months ago

      Yeah, that's right. There should be a warning. It also causes problems when landing from Google search results.

      reply
  • 1

    amitvc 10 months ago

    What about sharding algorithms? How does the caller know which server to go to for a specific key? For example, memcache uses hashing to determine which server a key resides on.

    reply
    • 1

      kousik_nath 10 months ago

      You can use consistent hashing to determine the exact shard for a key. But in case of a cache server failure, consistent hashing alone can't recover the data. You must have replica cache servers so that disaster recovery can happen from the replicated cache data. You can google consistent hashing; there are a lot of articles about it.
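
      A minimal sketch of the ring lookup described above, assuming a sorted map of hashed server positions (the class name and hash function are illustrative, not from this discussion; production systems typically add virtual nodes and replicate each key onto the next N servers on the ring):

        import java.util.SortedMap;
        import java.util.TreeMap;

        class ConsistentHashRing {
            // ring position -> cache server id
            private final TreeMap<Integer, String> ring = new TreeMap<>();

            void addServer(String serverId) {
                ring.put(hash(serverId), serverId);
            }

            void removeServer(String serverId) {
                ring.remove(hash(serverId));
            }

            // A key is owned by the first server at or after its position, wrapping around.
            String serverFor(String key) {
                if (ring.isEmpty()) throw new IllegalStateException("no cache servers");
                SortedMap<Integer, String> tail = ring.tailMap(hash(key));
                return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
            }

            private int hash(String s) {
                // placeholder hash; a real deployment would use a stronger, well-distributed hash
                return s.hashCode() & 0x7fffffff;
            }
        }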

      reply
      • 1

        amitvc 10 months ago

        Thanks, Kousik. I am aware of consistent hashing. My question was more that sharding isn't discussed anywhere in the problem.

        reply
        • 1

          rashmiiiith 10 months ago

          I don't know how relevant it is to your question, but this paper is worth reading.

          reply
        • 0

          rashmiiiith 10 months ago

          http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

          reply
          • 1

            amitvc 10 months ago

            Thank you. I believe Cassandra also uses consistent hashing

            reply
  • 1

    velupula 9 months ago

    Partitioning and fault tolerance are important topics that should have been covered in this design.

    reply
  • 0

    aniket15 7 months ago

    Latency is always an important metric for distributed and scalable systems. Can there be scenarios where it's not that important?

    reply
    • 0

      snehasis_ghosh 3 months ago

      Something like Dropbox / Google Drive.

      reply
      • 1

        snehasis_ghosh 3 months ago

        Oops, hit enter too quickly. For example, suppose I am sharing a folder with many people and I accidentally added a document I didn't intend to be public. When I delete that file from the folder, I expect that nobody can access it from that moment on; everything else is secondary. If others have to wait a few seconds before they can access the folder, that's OK. In this case Consistency > Availability, and if guaranteeing consistency introduces some latency, so be it.

        reply
  • 0

    Titan99 6 months ago

    I have a doubt; please correct me if I am wrong. We divided 30 TB of data across 420 machines with 72 GB each. Isn't this also called sharding? And then we again split each machine's data into 16 GB shards? Aren't we doing sharding twice, seen from a higher level?

    reply
    • 0

      shino_kurian 5 months ago

      The case under discussion is either 420 machines of 72 GB each or 1875 machines of 16 GB each, not both at once.
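
      For reference, the arithmetic behind those two options (using the 30 TB total from the discussion above) works out as:

        30 TB / 72 GB per machine ≈ 417, i.e. roughly 420 machines
        30 TB / 16 GB per machine = 1875 machines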

      reply
  • 1

    mag 6 months ago

    What is the actual meaning of "distributed" here? All data spread across multiple nodes in a cluster (sharding)? Or the case where all data can reside on one node and we implement master-slave or master-master replication? What is the goal: availability or fastest access? As per my understanding, performance is the primary goal of a cache system. To achieve that, if we are designing a very large hash table that cannot reside on a single node and we split it across multiple nodes, would you call that a distributed or a sharded cache system?

    reply
  • 0

    niufei8888 5 months ago

    This tutorial rocks!

    reply
  • 0

    rkouj 4 months ago

    Scale is not more than 60k per second.

    reply
  • -2

    Clove 4 months ago

    Has anyone solved the above problem?

    reply
  • -1

    haandol 4 months ago

    This is just amazing.

    reply
  • 0

    tehmine_grigoryan 2 months ago

    Thank You!!!

    reply
  • 0

    jin_park 2 months ago

    Wow. Amazing stuff!

    reply
  • 0

    abhishekseth8887 about 2 months ago

    I guess this question asks about a non-persistent distributed cache.

    reply
  • 0

    praveen2789 about 1 month ago

    When we are working with this, the map and the linked list all live in the same JVM. How are we spreading them across the JVMs of different servers?

    reply
  • 0

    a_bhishek 10 days ago

    1.

    reply
    • 0

      a_bhishek 10 days ago

      We must have a dynamic way of handling the user's most probable searches on Twitter; therefore the best approach would be a Least Recently Used (LRU) replacement policy, using a linked list and a queue.
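
      A minimal single-node LRU sketch in Java, assuming the eviction policy described above (the class name and capacity are illustrative; LinkedHashMap's internal doubly linked list plays the role of the queue/linked list mentioned):

        import java.util.LinkedHashMap;
        import java.util.Map;

        // Evicts the least recently used entry once capacity is exceeded.
        class LruCache<K, V> extends LinkedHashMap<K, V> {
            private final int capacity;

            LruCache(int capacity) {
                // accessOrder = true: iteration order goes from least to most recently accessed
                super(16, 0.75f, true);
                this.capacity = capacity;
            }

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        }

        // Usage: both get() and put() refresh an entry's recency.
        // LruCache<String, String> cache = new LruCache<>(1000);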

      reply