Tuesday 10 September 2013

How to set up Apache Cassandra

* Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.

* Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable.

* Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.


1. Install Java
    Java >= 1.6 (OpenJDK and Sun have been tested)
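
To confirm the Java requirement before going further, a small shell check like the one below can help. This is only a sketch and not part of the Cassandra distribution: java_version_ok is a hypothetical helper name, and the parsing assumes the usual version "x.y.z" line printed by java -version.

```shell
# Return 0 if the given Java version string is 1.6 or newer.
# Handles both the old "1.6.0_45" style and the newer "11.0.2" style.
java_version_ok() {
    local version="$1"
    local major="${version%%.*}"   # text before the first dot
    local rest="${version#*.}"     # text after the first dot
    local minor="${rest%%.*}"      # second numeric field
    if [ "$major" -gt 1 ] 2>/dev/null; then
        return 0
    fi
    [ "$major" = "1" ] && [ "$minor" -ge 6 ] 2>/dev/null
}

# Report on whatever java is found on the PATH, if any.
if command -v java >/dev/null 2>&1; then
    detected="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
    if java_version_ok "$detected"; then
        echo "Java $detected is new enough"
    else
        echo "Java '$detected' is too old or could not be parsed"
    fi
else
    echo "java not found on PATH"
fi
```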

This short guide will walk you through getting a basic one node cluster up and running, and demonstrate some simple reads and writes.

  * tar -zxvf apache-cassandra-$VERSION.tar.gz
  * cd apache-cassandra-$VERSION
  * sudo mkdir -p /var/log/cassandra
  * sudo chown -R `whoami` /var/log/cassandra
  * sudo mkdir -p /var/lib/cassandra
  * sudo chown -R `whoami` /var/lib/cassandra

The sample configuration files in conf/ determine the file-system locations Cassandra uses for logging and data storage. You are free to change these to suit your own environment and adjust the path names used here accordingly.
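
To see which locations your copy of the configuration actually uses, you can grep the files in conf/. A minimal sketch, assuming the setting names found in the stock sample files (data_file_directories, commitlog_directory and saved_caches_directory in cassandra.yaml, log4j.appender.R.File in log4j-server.properties). show_cassandra_paths is a hypothetical helper name; check the names against your own files if your version differs.

```shell
# Print the storage and log locations configured in a Cassandra conf directory.
show_cassandra_paths() {
    local conf_dir="${1:-conf}"
    # -A1 also prints the line after each match, because
    # data_file_directories is a YAML list whose entry sits on the next line.
    grep -A1 -E '^(data_file_directories|commitlog_directory|saved_caches_directory):' \
        "$conf_dir/cassandra.yaml"
    grep -E '^log4j\.appender\.R\.File=' "$conf_dir/log4j-server.properties"
}

# Run it only when called from the unpacked apache-cassandra-$VERSION directory.
if [ -f conf/cassandra.yaml ] && [ -f conf/log4j-server.properties ]; then
    show_cassandra_paths conf
fi
```

If the paths printed differ from /var/lib/cassandra and /var/log/cassandra, create and chown those directories instead.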

Now that we're ready, let's start it up!

  *  bin/cassandra -f
  *  /usr/local/cassandra/bin/cassandra -f &     (replace the path with the location where you installed Cassandra)

# ./cassandra     (this alone is enough to start Cassandra; once it is running, start the command line interface with the command further below to read or write data)

Unix: Running the startup script with the -f argument will cause Cassandra to remain in the foreground and log to standard out.
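
When Cassandra is started in the background, the command line client can fail to connect if it is launched before the node is listening. The sketch below polls port 9160, the port the CLI connects to, until it accepts a connection. wait_for_port is a hypothetical helper, and it relies on bash's /dev/tcp redirection, so it assumes bash rather than a plain POSIX sh.

```shell
# Wait until host:port accepts a TCP connection, trying once per second.
# Returns 0 as soon as a connection succeeds, 1 after the given number of attempts.
wait_for_port() {
    local host="$1"
    local port="$2"
    local attempts="${3:-30}"
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example (not executed here):
#   wait_for_port localhost 9160 60 && echo "Cassandra is accepting connections"
```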

Now let's try to read and write some data using the command line client. Cassandra ships with a very basic interactive command line interface. Using the CLI you can connect to remote nodes in the cluster to create or update your schema and set and retrieve records.

bin/cassandra-cli --host localhost
# ./cassandra-cli --host 192.168.1.67     (if the connection succeeds, it displays the messages below)
Connected to: "Test Cluster" on 192.168.1.67/9160
Welcome to Cassandra CLI version 1.0.7

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown]
The command line client is interactive, so if everything worked you should be sitting in front of a prompt...

  Connected to: "Test Cluster" on localhost/9160
  Welcome to cassandra CLI.

As the banner says, you can use 'help;' or '?' to see what the CLI has to offer, and 'quit;' or 'exit;' when you've had enough fun.


  [default@unknown] create keyspace Keyspace1;
  ece86bde-dc55-11df-8240-e700f669bcfc
  [default@unknown] use Keyspace1;
  Authenticated to keyspace: Keyspace1
  [default@Keyspace1] create column family Users with comparator=UTF8Type and default_validation_class=UTF8Type and key_validation_class=UTF8Type;
  737c7a71-dc56-11df-8240-e700f669bcfc

  [default@Keyspace1] set Users[jsmith][first] = 'John';
  Value inserted.
  [default@Keyspace1] set Users[jsmith][last] = 'Smith';
  Value inserted.
  [default@Keyspace1] set Users[jsmith][age] = long(42);
  Value inserted.
  [default@Keyspace1] get Users[jsmith];
  => (column=last, value=Smith, timestamp=1287604215498000)
  => (column=first, value=John, timestamp=1287604214111000)
  => (column=age, value=42, timestamp=1287604216661000)
  Returned 3 results.

If your session looks similar to what's above, congrats, your single node cluster is operational! But what exactly was all of that? Let's break it down into pieces and see.

set Users[jsmith][first] = 'John';
        \      \        \          \
         \      \_ key   \          \_ value
          \               \_ column
           \_ column family


Data stored in Cassandra is associated with a column family (Users), which in turn is associated with a keyspace (Keyspace1). In the example above, we set the value 'John' in the 'first' column for key 'jsmith'.
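
The keyspace / column family / key / column structure can also be written out as a small batch file instead of being typed at the prompt. The sketch below generates such a file from shell. write_cli_script and the file name users.cli are hypothetical, the keyspace and column family are assumed to exist already (as created in the session above), and replaying the file assumes your cassandra-cli version supports the --file option.

```shell
# Write a cassandra-cli batch file that sets and reads back two columns.
# Arguments: output file, keyspace, column family, row key
# (the last three default to the values used in this tutorial).
write_cli_script() {
    local out_file="$1"
    local keyspace="${2:-Keyspace1}"
    local cf="${3:-Users}"
    local key="${4:-jsmith}"
    cat > "$out_file" <<EOF
use ${keyspace};
set ${cf}[${key}][first] = 'John';
set ${cf}[${key}][last] = 'Smith';
get ${cf}[${key}];
EOF
}

# Example (not executed here):
#   write_cli_script users.cli
#   bin/cassandra-cli --host localhost --file users.cli
```

Each generated set line has the same four parts as the diagram: column family, key, column and value.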


Configuration

cassandra.yaml: main Cassandra configuration file
log4j-server.properties: log4j configuration file for Cassandra server.


Optional configuration files

cassandra-topology.properties: used by PropertyFileSnitch


cd /opt/apache-cassandra-1.0.7/conf/
 

vim cassandra.yaml
- seeds: "192.168.1.67"


# - seeds: "cassandra1.ctechz.com,cassandra2.ctechz.com,cassandra3.ctechz.com"

# Setting this to 0.0.0.0 is always wrong.
listen_address: 192.168.1.67
# listen_address: cassandra1.ctechz.com


# Leaving this blank has the same effect it does for ListenAddress,
# (i.e. it will be based on the configured hostname of the node).
rpc_address: 192.168.1.67
# rpc_address: cassandra1.ctechz.com
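
If you would rather script the three edits above than make them by hand in vim, something like the following may work. It is a sketch under assumptions: set_node_address is a hypothetical helper, and it expects listen_address and rpc_address to start at the beginning of a line and the seeds entry to be an indented "- seeds:" line, as in the stock cassandra.yaml. A backup copy is kept next to the file.

```shell
# Set seeds, listen_address and rpc_address in a cassandra.yaml to one IP or host name.
# Arguments: path to cassandra.yaml, address to use
set_node_address() {
    local yaml="$1"
    local ip="$2"
    # -i.bak edits in place and leaves the original as <file>.bak;
    # this spelling is accepted by both GNU and BSD sed.
    # Commented-out lines (starting with #) are left untouched.
    sed -i.bak \
        -e "s|^\( *- seeds: \).*|\1\"$ip\"|" \
        -e "s|^listen_address:.*|listen_address: $ip|" \
        -e "s|^rpc_address:.*|rpc_address: $ip|" \
        "$yaml"
}

# Example (not executed here):
#   set_node_address /opt/apache-cassandra-1.0.7/conf/cassandra.yaml 192.168.1.67
```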



