Highly Available Database

  • -14

    cesar_augusto_guzman_alvarez 12 months ago

    nice course. I only think it is a little bit long.

    reply
    • 1

      ran_yu 2 months ago

      I think it's worthwhile.

      reply
  • -4

    jeffery 12 months ago

    100T +

    reply
    • 1

      jeffery 12 months ago

      Sorry mistype, I wanted to leave this on the note area.

      reply
  • 0

    henry_henry 12 months ago

    How exactly do peer-to-peer systems work? Doesn't there still have to be a high-level router that routes traffic to one of the systems, which represents a SPOF? Any good reads on these?

    reply
    • 1

      tusharbisht 12 months ago

      The major idea behind a peer-to-peer system is that all the nodes are similar in nature/power. Here we use the idea in the sense that multiple nodes can be owners of a piece of data D. Depending on the settings, more than one node has to go down for this data to become unavailable. The high-level router you talk about can consist of multiple machines, since it doesn't require any actual information about the data other than where it might be stored (a small sketch of the multiple-owners idea follows this thread).

      reply
      • 0

        henry_henry 12 months ago

        I see, I was originally a bit confused about consistent hashing with peer-to-peer systems. Out of curiosity, how do we get rid of a "single point of failure"? Will there not always be one SPOF regardless of what system we create? In a general sense, if a system goes down at any level of the architecture, doesn't there have to be a router of some sort that directs traffic?

        reply
        • 0

          kuzmin 10 months ago

          You could have DNS-level routing that points to multiple load balancers.

          reply
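A minimal sketch of the multiple-owners idea from this thread, assuming a consistent-hash ring: each key is owned by P successive nodes, so no single node is a point of failure for that key's data. The node names, hash function, and replication factor below are illustrative, not from the article.

import hashlib
from bisect import bisect_right

def ring_hash(value):
    # Hash a node name or key onto the ring (illustrative choice of hash).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.p = replication_factor
        self.nodes = sorted(nodes, key=ring_hash)
        self.points = [ring_hash(n) for n in self.nodes]

    def owners(self, key):
        # The P nodes responsible for `key`, walking clockwise from its hash.
        start = bisect_right(self.points, ring_hash(key)) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.p)]

ring = Ring(["M1", "M2", "M3", "M4", "M5"], replication_factor=3)
print(ring.owners("user:42"))  # three owners; any one of them can serve the key

A front-end router only needs the ring membership to forward a request, not any per-key state, so the router itself can be replicated as well.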
  • 0

    henry_henry 12 months ago

    I see, I was originally a bit confused about consistent hashing with peer-to-peer systems. Out of curiosity, how do we get rid of a "single point of failure"? Will there not always be one SPOF regardless of what system we create? In a general sense, if a system goes down at any level of the architecture, doesn't there have to be a router of some sort that directs traffic?

    reply
    • 0

      henry_henry 12 months ago

      Meant to write this as a reply

      reply
      • 0

        munir_mehta 11 months ago

        I am assuming you are talking about the load balancer that directs traffic in a peer-to-peer system, and you're worried about that going down. A highly available system has a redundant load balancer as well, which will come online if it doesn't hear back from the master. A highly available system has everything redundant, from the load balancers down to the hubs and switches.

        reply
  • 0

    sarang 11 months ago

    For a read to be consistent (i.e., return the latest write), we need to keep W + R > P (see the sketch after this thread).

    reply
    • 0

      sarang 11 months ago

      First things first: what is P? Is it the number of shards for a key?

      reply
      • 0

        rajputnr 11 months ago

        P is the replication factor here, as explained above.

        reply
      • 0

        sumit_007 11 months ago

        Yes, it is the number of shards that contain the data for a particular key.

        reply
        • 0

          kartikeya_singh 10 months ago

          P is the number of peers (machines) in a shard; shards don't inter-replicate data.

          reply
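A tiny illustration of the quorum condition discussed in this thread, taking P as the replication factor (the number of replicas holding a key), W as the write quorum and R as the read quorum; the concrete numbers are only examples.

def read_is_consistent(p, w, r):
    # A read overlaps the latest successful write on at least one replica
    # if and only if W + R > P.
    return w + r > p

print(read_is_consistent(p=5, w=3, r=3))  # True:  3 + 3 > 5, reads see the latest write
print(read_is_consistent(p=5, w=2, r=3))  # False: 2 + 3 = 5, a read may return stale data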
  • 0

    sumit_007 11 months ago

    Are P, W and R predefined? Or do they change dynamically as machines are added/removed?
    Ex: For a total of 15 machines, I keep 'P' as 5, and 'W' and 'R' both as 3.
    Q1. What is the criterion for deciding 'P' as 5 out of a total of 15 machines? Any good articles/papers?
    Q2. How do we handle it if the machines left available go below 'W'? What does sacrificing data consistency mean here?
    Q3. After adding 10 more machines, will the values of P, W, and R remain the same?

    reply
    • 1

      tusharbisht 11 months ago

      Q. Are P, W and R predefined? Or do they change dynamically as machines are added/removed?
      A. P, W and R should be defined according to the use case for the system. For example, for a system with high reads and low writes, we can keep R as 1 and W as P (hypothetically) so that we spend less time on reads, while it is okay to spend more time on writes. P, W and R should be adjusted so that they reflect how important the data is (P) and what the ratio of reads to writes is.

      Q. How do we handle it if the machines left available go below 'W'? What does sacrificing data consistency mean here?
      A. Sacrificing data consistency means that if we are only able to perform a write on a subset of W nodes, then while reading from R nodes we may get an outdated value. For example, assume P = 5 and W = R = 3, and name the machines M1, M2 .. M5. Suppose that before writing a value, machines M1, M2 and M3 die. For this write, we will have to make do with M4 and M5. After some time, M1, M2 and M3 come back up. Now, if we try to read the earlier value and only read from M1, M2 and M3, we won't have the latest value for the data, and hence we are compromising consistency (a small simulation of this scenario follows this thread).

      Q. After adding 10 more machines, will the values of P, W, and R remain the same?
      A. Yes. P, W and R are parameters of a single data value itself. More machines means more data to store with the same configuration.

      reply
      • 0

        karthik579 about 1 month ago

        From my understanding of the article, when a read request is received by a node, that node acts as the coordinator and identifies the replica nodes for the request. For Q2, the write has happened on M4 and M5. When a read request is received on, say, M1, it acts as the coordinating node, identifies the replica nodes as M4 and M5, and sends the response back.

        reply
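A small simulation of the M1..M5 scenario described above, assuming P = 5 replicas: a write that only reaches 2 of them can be missed by a later read that contacts 3 others, because the effective W + R is not greater than P. The data layout here is made up purely for illustration.

# Five replicas all start with version 1 of the value.
replicas = {"M%d" % i: {"value": "old", "version": 1} for i in range(1, 6)}

# M1, M2 and M3 are down, so the write only lands on M4 and M5 (effective W = 2).
for node in ("M4", "M5"):
    replicas[node] = {"value": "new", "version": 2}

# M1..M3 come back; a read that happens to pick only them (R = 3) sees version 1.
read_set = ["M1", "M2", "M3"]
latest = max((replicas[n] for n in read_set), key=lambda rec: rec["version"])
print(latest["value"])  # "old" -- stale, since 2 (W) + 3 (R) is not greater than 5 (P)

This is also where the coordinator-node point above helps: if the coordinator knows which replicas actually acknowledged the write, it can route reads to them until the other replicas catch up.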
  • 1

    kaidul 10 months ago

    The size of the value for a key can increase, as mentioned earlier, but the solution wasn't discussed :O

    reply
  • 1

    tersduz88 9 months ago

    You say "we would need to compromise with consistency if we have availability and partition tolerance." This is incorrect: we don't need to sacrifice anything if there is no network partition. In other words, you don't need to sacrifice consistency/availability unless there is a network partition.

    reply
    • 1

      ScammyPrinceOfNigeria 6 months ago

      Not sure why this is downvoted, but this is correct. The tradeoff is encountered ONLY when a partition happens. The CAP theorem states that only 2 are guaranteed (come what may); it doesn't mean the third is sacrificed at all times.

      reply
  • 0

    alik_rogotner 6 months ago

    great

    reply
  • 1

    xiaomi_mao 5 months ago

    One thing I don't get: earlier it mentioned 10K QPS and that the size of one value can increase up to 1 GB, but it doesn't say how to tackle that. Did I miss anything? Just reading 1 GB from SSD will take around 1 second.

    reply
    • 0

      samira_allahverdiyeva 5 months ago

      Moreover, as I understood it, 1 GB is just one value. Does that mean the total can be 10K * 1 GB = 10K GB? For 1 node? Please, someone, explain it if you know.

      reply
      • 0

        sunnyk 4 months ago

        My understanding - 1 GB is the max, so not all values would actually be 1 GB. Also, the total is the QPS, i.e. 10K queries need to be served per second. If a machine cannot cope with the demand, we can always increase the number of machines so that the per-machine QPS becomes manageable.

        reply
    • 0

      sunnyk 4 months ago

      One thing I know is that the QPS of a machine is further spread across its CPU cores, but your question is still valid - the bottleneck would be reading 1 GB of data from disk, and that takes a while. My assumption is that in this case, we just need to increase the number of machines so that the QPS per machine goes down enough that each machine can fetch data in the required time (a rough back-of-envelope sketch follows this thread).

      reply
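A rough back-of-envelope sketch of the numbers being discussed in this thread; the SSD throughput and per-machine QPS figures are assumptions, not values from the article.

qps_total      = 10_000          # queries per second across the cluster
max_value_size = 1 * 1024**3     # worst-case 1 GB value
ssd_throughput = 500 * 1024**2   # assumed ~500 MB/s sequential read

# Worst-case time to read a single 1 GB value off one SSD.
print("worst-case single read: ~%.1f s" % (max_value_size / ssd_throughput))

# If one machine can comfortably serve, say, 100 QPS of typical (much smaller)
# values, the cluster needs on the order of this many machines.
per_machine_qps = 100            # assumed
print("machines needed: ~%d" % (qps_total / per_machine_qps))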