Highly Available Database

  • -24

    cesar_augusto_guzman_alvarez about 1 year ago

    Nice course. I only think it is a little bit long.

    reply
    • 4

      ran_yu 4 months ago

      I think it's worth it.

      reply
  • -4

    jeffery about 1 year ago

    100T +

    reply
    • 1

      jeffery about 1 year ago

      Sorry, mistyped; I wanted to leave this in the notes area.

      reply
  • 0

    henry_henry about 1 year ago

    How exactly do peer-to-peer systems work? Doesn't there still have to be a high-level router that routes traffic to one of the systems, which represents a SPOF? Any good reads on these?

    reply
    • 1

      tusharbisht about 1 year ago

      The main idea behind a peer-to-peer system is that all the nodes are similar in nature/power. Here we use the idea in the sense that multiple nodes can be owners of a piece of data D. Depending on the settings, more than one node has to go down for this data to become unavailable. The high-level router you mention can consist of multiple machines, since it doesn't require any actual information about the data other than where it might be stored. (A small consistent-hashing sketch follows this thread.)

      reply
      • 0

        henry_henry about 1 year ago

        I see, I was originally a bit confused about consistent hashing using peer-to-peer systems. Out of curiosity, how do we get rid of the "single point of failure"? Will there not always be one SPOF regardless of what system we create? In a general sense, if a system goes down at any level of an architecture, doesn't there have to be a router of some sort that directs traffic?

        reply
        • 0

          kuzmin about 1 year ago

          You could have DNS-level routing that points to multiple load balancers.

          reply
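
    To make the peer-to-peer ownership idea in the thread above concrete, here is a minimal consistent-hashing sketch in Python. The node names, the replica count P, and the MD5-based hash are illustrative assumptions, not the course's implementation; the point is only that any node can deterministically compute which peers own a key, so no single router has to hold that mapping.

    import hashlib
    from bisect import bisect_right

    def _hash(key: str) -> int:
        # Map any string onto the hash ring (MD5 assumed, for illustration).
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class Ring:
        def __init__(self, nodes, replicas=3):
            self.replicas = replicas                      # P: copies kept per key
            self.ring = sorted((_hash(n), n) for n in nodes)

        def owners(self, key):
            # Walk clockwise from the key's position on the ring; the first
            # P distinct nodes encountered are the owners of this key.
            idx = bisect_right(self.ring, (_hash(key),))
            owners, i = [], idx
            while len(owners) < min(self.replicas, len(self.ring)):
                node = self.ring[i % len(self.ring)][1]
                if node not in owners:
                    owners.append(node)
                i += 1
            return owners

    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replicas=3)
    print(ring.owners("user:42"))   # the 3 peers responsible for this key
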
  • 0

    henry_henry about 1 year ago

    I see, I was originally a bit confused about consistent hashing using peer-to-peer systems. Out of curiosity, how do we get rid of the "single point of failure"? Will there not always be one SPOF regardless of what system we create? In a general sense, if a system goes down at any level of an architecture, doesn't there have to be a router of some sort that directs traffic?

    reply
    • 0

      henry_henry about 1 year ago

      Meant to write this as a reply

      reply
      • 0

        munir_mehta about 1 year ago

        I am assuming you are talking about the load balancer that directs traffic in a peer-to-peer system and are worried about it going down. A highly available system has a redundant load balancer as well, which comes online if it doesn't hear back from the master. A highly available system has everything redundant, from the load balancers down to the hubs and switches.

        reply
  • 1

    sarang about 1 year ago

    For a read to be consistent (return the latest write), we need to keep W + R > P.

    reply
    • 0

      sarang about 1 year ago

      First things first, what is P? Is it the number of shards for a key?

      reply
      • 0

        rajputnr about 1 year ago

        P is the replication factor here, as explained above.

        reply
      • 0

        sumit_007 about 1 year ago

        Yes, it is the number of shards that contain data for a particular key.

        reply
        • 0

          kartikeya_singh 12 months ago

          P is the number of peers (machines) in a shard; shards don't replicate data between one another.

          reply
  • 0

    sumit_007 about 1 year ago

    Are P, W and R predefined? Or do they change dynamically as machines are added/removed?
    Ex: For a total of 15 machines, I keep 'P' at 5, and 'W' and 'R' both at 3.
    Q1. What is the criterion for deciding 'P' as 5 out of a total of 15 machines? Any good articles/papers?
    Q2. How do we handle it if the number of machines left available goes below 'W'? What does sacrificing data consistency mean here?
    Q3. After the addition of 10 more machines, will the values of P, W and R remain the same?

    reply
    • 2

      tusharbisht about 1 year ago

      Q. Are P, W and R predefined? Or do they change dynamically as machines are added/removed?
      A. P, W and R should be defined according to the use case for the system. For example, for a system with many reads and few writes, we could keep R as 1 and W as P (hypothetically) so that we spend less time on reads while it is okay to spend more time on writes. P, W and R should be adjusted so that they reflect how important the data is (P) and what the ratio of reads to writes is.

      Q. How do we handle it if the number of machines left available goes below 'W'? What does sacrificing data consistency mean here?
      A. Sacrificing data consistency means that if we are only able to perform a write on a subset of the W nodes, we may get an outdated value when reading from R nodes. For example, assume P = 5 and W = R = 3, and name these machines M1, M2 .. M5. Suppose that before writing a value, machines M1, M2 and M3 die. For this write, we will have to make do with M4 and M5. Some time later, M1, M2 and M3 come back up. Now, if we try to read the earlier value and we only read from M1, M2 and M3, we won't have the latest value for the data, and hence we are compromising consistency. (A small quorum sketch follows this thread.)

      Q. After the addition of 10 more machines, will the values of P, W and R remain the same?
      A. Yes. P, W and R are parameters of a single data value itself. More machines just means more data to store with the same configuration.

      reply
      • 0

        karthik579 4 months ago

        From my understanding of the article, when a read request is received by a node, it acts as the coordinator node and identifies the replica nodes for this request. For Q2, the write has happened on M4 and M5. When a read request is received on, say, M1, it acts as the coordinating node, identifies the replica nodes as M4 and M5, and sends the response back.

        reply
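
    To illustrate the W + R > P quorum rule from the two threads above, here is a minimal simulation sketch in Python. The Node class, the version counters, and the failure handling are illustrative assumptions, not the course's implementation: a write meets quorum only when at least W replicas acknowledge it, a read consults R replicas and returns the freshest version it sees, and keeping W + R > P forces the read set to overlap the write set.

    class Node:
        def __init__(self, name):
            self.name, self.up = name, True
            self.store = {}                        # key -> (version, value)

    def write(nodes, key, value, version, W):
        # Try to write to all P replicas; the write meets quorum only if
        # at least W of them are up to acknowledge it.
        acks = 0
        for n in nodes:
            if n.up:
                n.store[key] = (version, value)
                acks += 1
        return acks >= W

    def read(nodes, key, R):
        # Read from R live replicas and return the freshest version seen.
        live = [n for n in nodes if n.up][:R]
        found = [n.store[key] for n in live if key in n.store]
        return max(found)[1] if found else None

    # P = 5 replicas for this key, W = R = 3, as in the M1..M5 example above.
    replicas = [Node(f"M{i}") for i in range(1, 6)]
    print(write(replicas, "k", "v1", version=1, W=3))   # True: all 5 ack

    # M1, M2 and M3 die before the next write: only 2 acks, quorum not met.
    for n in replicas[:3]:
        n.up = False
    print(write(replicas, "k", "v2", version=2, W=3))   # False

    # M1..M3 come back; a read that consults only them returns the stale v1.
    for n in replicas[:3]:
        n.up = True
    print(read(replicas[:3], "k", R=3))                 # 'v1' (consistency lost)
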
  • 1

    kaidul about 1 year ago

    The size of the value for a key can increase, as mentioned earlier. But the solution wasn't discussed :O

    reply
  • 3

    tersduz88 12 months ago

    You say " we would need to compromise with consistency if we have availability and partition tolerance. ". This is incorrect, we don't need to sacrifice anything if there is no network partition, in other words, you don't need to sacrifice consistency/availability unless there is a network partition.

    reply
    • 2

      ScammyPrinceOfNigeria 8 months ago

      Not sure why this is downvoted, but this is correct. The tradeoff is encountered ONLY when a partition happens. The CAP theorem states that only 2 are guaranteed (come what may); it doesn't mean the third is sacrificed at all times.

      reply
      • 0

        ivs about 1 month ago

        Yes, but I have a doubt. Doesn't it depend on how we build the system initially? We first have a design goal (in this case, providing more 'A' and 'P' than 'C') and then develop the architecture. Even though we may not want to sacrifice 'C', shouldn't this become apparent only when a partition happens and we end up in an inconsistent state for a certain period of time?

        reply
  • 0

    alik_rogotner 8 months ago

    great

    reply
  • 1

    xiaomi_mao 7 months ago

    One thing I don't get: previously it mentioned 10K QPS and that the size of one value can increase up to 1 GB, but it doesn't say how to tackle that. Did I miss anything? Just reading 1 GB from an SSD takes around 1 second.

    reply
    • 0

      samira_allahverdiyeva 7 months ago

      Moreover, as I understood it, 1 GB is just one value. Does that mean the total could be 10K * 1 GB = 10K GB per second? For one node? Please, someone explain if you know.

      reply
      • 0

        sunnyk 6 months ago

        My understanding: 1 GB is the max, so not all values would be 1 GB. Also, the total is the QPS, i.e., 10K queries need to be served per second. If a machine cannot cope with the demand, we can always increase the number of machines so that the QPS per machine becomes manageable.

        reply
    • 1

      sunnyk 6 months ago

      One thing I know is that the QPS of a machine is further spread out across its CPU cores, but your question is still valid: the bottleneck would be reading 1 GB of data from disk, and that takes a while. My assumption is that in this case we just need to increase the number of machines so that the QPS per machine goes down enough for each machine to fetch data in the required time. (A rough back-of-the-envelope sketch follows this thread.)

      reply
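
    As a rough back-of-the-envelope check of the point above, here is a small Python sketch. The throughput figures and the average value size are illustrative assumptions, not numbers from the course; it only estimates how many machines are needed so that per-machine disk bandwidth, rather than the 10K QPS itself, stops being the bottleneck.

    # Hypothetical figures; swap in real measurements for your hardware.
    total_qps       = 10_000            # queries per second across the system
    avg_value_bytes = 10 * 1024**2      # assume ~10 MB average (1 GB is only the max)
    disk_read_bps   = 1 * 1024**3       # ~1 GB/s sequential SSD read per machine

    # Bandwidth the whole system must sustain just to read values off disk.
    required_bps = total_qps * avg_value_bytes

    # Machines needed so each one stays within its own disk bandwidth.
    machines = -(-required_bps // disk_read_bps)        # ceiling division
    print(f"~{machines} machines needed "
          f"(system needs {required_bps / disk_read_bps:.1f}x one disk's bandwidth)")
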
  • 0

    puffitim about 1 month ago

    We are writing multiple copies of the data on different nodes, and since we are doing sharding as well, doesn't this contradict the sharding principle that no two nodes hold the same copy of the data?

    reply
    • 1

      ivs about 1 month ago

      Maybe you are mixing up two concepts: replication (which provides more availability when one node goes down) vs. sharding (which distributes the load across machines for faster retrieval).

      reply
  • 0

    naman_pceit13523 23 days ago

    Yes, I also think that sharding is the best way.

    reply