Monday, June 11, 2012

Configuring data consistency options in Cassandra


Cassandra is built to distribute data across many nodes while staying highly available. In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replicas. Cassandra extends the concept of eventual consistency by offering tunable consistency: for any given read or write operation, the client application decides how consistent the requested data should be. In addition to tunable consistency, Cassandra has a number of built-in repair mechanisms to ensure that data remains consistent across replicas.

Cassandra supports the following write consistency levels.
  • ANY: The lowest consistency with the highest availability. A write must be written to at least one node. If all replica nodes for the given row key are down, the write can still succeed once a hinted handoff has been written.
  • ALL: The highest consistency with the lowest availability. A write must be written to the commit log and memory table on all replica nodes in the cluster for that row key.
  • QUORUM: A good middle ground that gives strong consistency (when reads also use QUORUM) while still tolerating some level of failure. A write must be written to the commit log and memory table on a quorum of replica nodes.
  • ONE: A write must be written to the commit log and memory table of at least one replica node.
  • LOCAL_QUORUM: A write must be written to the commit log and memory table on a quorum of replica nodes in the same data center as the coordinator node. This avoids the latency of inter-data center communication.
  • EACH_QUORUM: A write must be written to the commit log and memory table on a quorum of replica nodes in each data center.
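
The list above refers to "a quorum of replica nodes" without giving its size. Cassandra computes a quorum as (replication factor / 2) + 1, rounded down to a whole number. The following is a minimal sketch of that arithmetic only; the class and method names (QuorumMath, QuorumSize, TolerableFailures) are hypothetical helpers invented for illustration and are not part of any Cassandra client library.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Cassandra client library.
public static class QuorumMath
{
    // Quorum = (replicationFactor / 2) + 1; integer division rounds down.
    public static int QuorumSize(int replicationFactor)
    {
        if (replicationFactor < 1)
            throw new ArgumentOutOfRangeException("replicationFactor");
        return replicationFactor / 2 + 1;
    }

    // Number of replicas that can be down while a QUORUM request can still succeed.
    public static int TolerableFailures(int replicationFactor)
    {
        return replicationFactor - QuorumSize(replicationFactor);
    }
}
```

With a replication factor of 3, a quorum is 2 replicas and one replica may be down. With a replication factor of 6, a quorum is 4 replicas and two may be down.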

Cassandra supports the following read consistency levels.
  • ONE: Only one replica node must successfully respond to the read request. Consider this level if latency is a top priority. There is a higher probability of stale data being read with this consistency level, as the replica contacted for the read may not always have the most recent write.
  • ANY: This level applies to writes only and cannot be used for reads. It is listed here because of its effect on reads: if it is an absolute requirement that a write never fail, you may consider a write consistency level of ANY, but it has the highest probability of a later read not returning the latest written values (see hinted handoff).
  • QUORUM: Returns the record with the most recent timestamp once a quorum of replicas has responded.
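
Read and write levels work together. A read is guaranteed to see the most recent acknowledged write when the number of replicas written plus the number of replicas read is greater than the replication factor, because the two sets of replicas must then overlap. A minimal sketch of that check follows; ConsistencyCheck and IsStronglyConsistent are hypothetical names used only for illustration, not an API of Cassandra or of any client.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Cassandra client library.
public static class ConsistencyCheck
{
    // True when every read set must overlap every write set: W + R > N.
    public static bool IsStronglyConsistent(int writeReplicas, int readReplicas, int replicationFactor)
    {
        if (replicationFactor < 1)
            throw new ArgumentOutOfRangeException("replicationFactor");
        return writeReplicas + readReplicas > replicationFactor;
    }
}
```

With a replication factor of 3, QUORUM writes (2 replicas) combined with QUORUM reads (2 replicas) satisfy the rule, since 2 + 2 > 3. Writes and reads at ONE do not, since 1 + 1 is not greater than 3, which is why stale reads are possible at that level.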

Creating a Cassandra connection configuration in .NET

[TestMethod]
public void CassandraConnectionConfigBuilderTest()
{
    // Placeholder host names; replace with the addresses of real Cassandra nodes.
    var hosts = new[] { "a", "b", "c" };
    var builder = new CassandraConnectionConfigBuilder
    {
        Hosts = hosts,
        Port = 9160,                                 // Cassandra's default Thrift RPC port
        ConsistencyLevel = ConsistencyLevel.QUORUM,  // consistency level requested for this connection
        Timeout = TimeSpan.FromSeconds(100),
        IsFramed = true,                             // use the framed Thrift transport
    };

    // The configuration should expose exactly the values set on the builder.
    var config = new CassandraConnectionConfig(builder);
    CollectionAssert.AreEqual(hosts, config.Hosts);
    Assert.AreEqual(9160, config.Port);
    Assert.AreEqual(ConsistencyLevel.QUORUM, config.ConsistencyLevel);
    Assert.AreEqual(TimeSpan.FromSeconds(100), config.Timeout);
    Assert.IsTrue(config.IsFramed);
}

