You might want to run HBase on Cloudera's Virtual Machine to get a quick start on a prototyping setup. In theory you download the VM, start it, and you are ready to go. The main issue, though, is that the current Hadoop Training VM does not include HBase at all (yet?). Apart from that, installing a local HBase instance is a straightforward process.
Here are the steps to get HBase running on Cloudera's VM:
- Download VM
Get it from Cloudera's website.
- Start VM
As the above page states: "To launch the VMWare image, you will either need VMware Player for windows and linux, or VMware Fusion for Mac."
Note: I have Parallels for Mac and wanted to use that. I used Parallels Transporter to convert the "cloudera-training-0.3.2.vmx" to a new "cloudera-training-0.2-cl4-000001.hdd", then created a new VM in Parallels, selecting Ubuntu Linux as the OS and the newly created .hdd as the disk image. Boot up the VM and you are up and running. I gave it a bit more memory for the graphics to be able to switch the VM to 1440x900, which is the native screen resolution of the MacBook Pro I am using.
Finally follow the steps explained on the page above, i.e. open a Terminal and issue:
$ cd ~/git
$ ./update-exercises --workspace
- Pull HBase branch
We are using the brand new HBase 0.20.2 release. Open a new Terminal (or issue a "$ cd .." in the open one), then:
$ sudo -u hadoop git clone http://git.apache.org/hbase.git /home/hadoop/hbase
$ sudo -u hadoop sh -c "cd /home/hadoop/hbase ; git checkout origin/tags/0.20.2"
Note: moving to "origin/tags/0.20.2" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b <new_branch_name>
HEAD is now at 777fb63... HBase release 0.20.2
First we clone the repository, then check out the 0.20.2 release. You will notice that I am using "sudo -u hadoop" because Hadoop itself is started under that account and so I wanted it to match. Also, the default "training" account does not have SSH set up as explained in Hadoop's quick-start guide. When sudo asks for a password, use the default, which is set to "training".
You can ignore the messages git prints out while performing the checkout.
- Build Branch
Continue in Terminal:
$ sudo -u hadoop sh -c "cd /home/hadoop/hbase/ ; export PATH=$PATH:/usr/share/apache-ant-1.7.1/bin ; ant package"
...
BUILD SUCCESSFUL
- Configure HBase
There are a few edits to be made to get HBase running.
$ sudo -u hadoop vim /home/hadoop/hbase/build/conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:8022/hbase</value>
  </property>
</configuration>

$ sudo -u hadoop vim /home/hadoop/hbase/build/conf/hbase-env.sh
# The java implementation to use. Java 1.6 required.
# export JAVA_HOME=/usr/java/jdk1.6.0/
export JAVA_HOME=/usr/lib/jvm/java-6-sun
...
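If you would rather not edit hbase-site.xml by hand, the same file can be written from the Terminal. This is only a minimal sketch: the function name write_hbase_site is made up for this example and is not part of HBase, and it assumes you want exactly the single hbase.rootdir property from this step. Note that it overwrites whatever hbase-site.xml is already in the given directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HBase): writes a minimal
# hbase-site.xml containing only the hbase.rootdir property.
#   $1 = conf directory to write into
#   $2 = value for hbase.rootdir
write_hbase_site() {
    conf_dir="$1"
    rootdir="$2"
    # The heredoc is unquoted so that $rootdir is substituted.
    cat > "$conf_dir/hbase-site.xml" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>$rootdir</value>
  </property>
</configuration>
EOF
}
```

Called as "write_hbase_site /home/hadoop/hbase/build/conf hdfs://localhost:8022/hbase" from a shell running as the hadoop user, it should produce the same configuration as the manual edit in this step.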
- Rev up the Engine!
The final thing is to start HBase:
$ sudo -u hadoop /home/hadoop/hbase/build/bin/start-hbase.sh
$ sudo -u hadoop /home/hadoop/hbase/build/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Version: 0.20.2, r777fb63ff0c73369abc4d799388a45b8bda9e5fd, Thu Nov 19 15:32:17 PST 2009
hbase(main):001:0>
Done!
Let's create a table and check if it was created OK.
hbase(main):001:0> list
0 row(s) in 0.0910 seconds
hbase(main):002:0> create 't1', 'f1', 'f2', 'f3'
0 row(s) in 6.1260 seconds
hbase(main):003:0> list
t1
1 row(s) in 0.0470 seconds
hbase(main):004:0> describe 't1'
DESCRIPTION                                                             ENABLED
 {NAME => 't1', FAMILIES => [{NAME => 'f1', COMPRESSION => 'NONE', VERS true
 IONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => '
 false', BLOCKCACHE => 'true'}, {NAME => 'f2', COMPRESSION => 'NONE', V
 ERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =
 > 'false', BLOCKCACHE => 'true'}, {NAME => 'f3', COMPRESSION => 'NONE'
 , VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMOR
 Y => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0750 seconds
hbase(main):005:0>
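The same check can be scripted instead of typed. The sketch below only generates the commands from the session above; the function name hbase_smoke_commands is hypothetical. Feeding its output to the HBase shell assumes the shell accepts commands on standard input, which is plausible since it is built on JRuby's IRB, but treat that as an assumption for 0.20.2.

```shell
#!/bin/sh
# Hypothetical helper: prints HBase shell commands that create a table,
# then list and describe it, mirroring the interactive session.
#   $1 = table name, remaining arguments = column family names
hbase_smoke_commands() {
    table="$1"
    shift
    families=""
    # Build the quoted, comma-separated family list: , 'f1', 'f2', ...
    for family in "$@"; do
        families="$families, '$family'"
    done
    echo "create '$table'$families"
    echo "list"
    echo "describe '$table'"
    echo "exit"
}
```

On the VM this could be used as "hbase_smoke_commands t1 f1 f2 f3 | sudo -u hadoop /home/hadoop/hbase/build/bin/hbase shell". Generating the text separately from running it means you can inspect the commands first before anything touches HBase.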
Just keep in mind that this is for prototyping only! With such a setup you will only be able to insert a handful of rows. If you overdo it, you will bring it to its knees very quickly. But you can safely use it to play around with the shell, create tables, or get used to the API and test changes in your code.
Finally a screenshot of the running HBase UI:
I just wanted to observe that the VM appliance shipped by Cloudera also opens as expected in VM Fusion, and works as expected, and very well, there, too.
Super helpful walkthrough Lars! Thanks a lot...