Wednesday, September 30, 2015

3D Data visualization using R - configure: error: X11 not found but required. missing required header GL/gl.h

3D Data visualization using R

3D data visualization using R: rglplot, a function from the igraph package, uses the rgl package to plot a graph in 3D. The plot can be zoomed, rotated, shifted, etc., but the coordinates of the vertices are fixed.

Installing the rgl package in R

install.packages("rgl")


Error

configure: using libpng-config
configure: using libpng dynamic linkage
checking for X... no
configure: error: X11 not found but required, configure aborted.
ERROR: configuration failed for package ‘rgl’
* removing ‘/home/bdalab/R/x86_64-pc-linux-gnu-library/3.1/rgl’
Warning in install.packages :

Solution

Install X.Org and the X11 development headers:
sudo apt-get install xorg
sudo apt-get install libx11-dev

Then run install.packages("rgl") again.

Error

In my case, the installation failed again, this time with:
configure: error: missing required header GL/gl.h

Solution

Install the OpenGL (Mesa GLU) development headers:
sudo apt-get install libglu1-mesa-dev
Then run install.packages("rgl") again.
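
To avoid hitting the two configure errors one after the other, the headers can be checked before installing rgl. The following is only a minimal sketch: check_header is a hypothetical helper, and the paths under /usr/include assume a standard Debian/Ubuntu layout. GL/gl.h is mapped to libglu1-mesa-dev because that is the package that fixed the error in this post.

```shell
#!/bin/sh
# Sketch: report which headers needed by rgl's configure step are missing.
# Assumption: headers live under /usr/include (standard Debian/Ubuntu layout).

# check_header HEADER PACKAGE [ROOT]
# Prints whether ROOT/HEADER exists and, if not, the apt-get command to try.
check_header() {
    header=$1
    package=$2
    root=${3:-/usr/include}
    if [ -f "$root/$header" ]; then
        echo "found   $header"
    else
        echo "missing $header -> sudo apt-get install $package"
    fi
}

check_header X11/Xlib.h libx11-dev
check_header GL/gl.h    libglu1-mesa-dev
check_header GL/glu.h   libglu1-mesa-dev
```

The script only reports; it does not install anything, so it is safe to run without sudo.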

Next : RHadoop

Friday, September 4, 2015

Running SQL Query on Hadoop : Apache Hive Alternatives

Hive is a SQL-programmer-friendly tool for running SQL queries on data stored in the Hadoop HDFS file system. When it runs a query, Hive converts the SQL-like query (HiveQL) into MapReduce jobs.

Hive is not the only tool that does this. This post gives a synopsis of the open source alternatives to Hive.


1) Spark SQL (the successor to Shark, SQL on Spark) - probably the best alternative to Hive when running on Spark. Spark SQL is Spark's module for working with structured data.
2) Cloudera Impala - like Hive, but it uses its own execution daemons, which need to be installed on every datanode in the Hadoop cluster. Impala runs BI-style queries on Hadoop.
3) Facebook Presto - like Impala, it needs to be installed on all datanodes. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to petabytes.
4) Apache Drill - schema-free SQL for Hadoop. It supports multiple datastores, including HDFS, MongoDB and HBase.
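
To give a feel for how similar these tools are from the command line, here is a dry-run sketch that only prints the call that would submit one query to each engine; it does not execute anything. The table name web_logs and the helper sql_cli_command are hypothetical. The sketch also assumes the Presto CLI is installed under the name presto and that the connection defaults (server, catalog, schema) are fine. Drill is left out of this sketch.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the CLI call for each engine.
# Assumption: "web_logs" is a hypothetical table used only for illustration.

QUERY='SELECT status, COUNT(*) FROM web_logs GROUP BY status'

# sql_cli_command ENGINE QUERY
# Prints the command line that would run QUERY on ENGINE.
sql_cli_command() {
    engine=$1
    query=$2
    case "$engine" in
        hive)      printf 'hive -e "%s"\n' "$query" ;;
        spark-sql) printf 'spark-sql -e "%s"\n' "$query" ;;
        impala)    printf 'impala-shell -q "%s"\n' "$query" ;;
        presto)    printf 'presto --execute "%s"\n' "$query" ;;
        *)         echo "unknown engine: $engine" >&2; return 1 ;;
    esac
}

for engine in hive spark-sql impala presto; do
    sql_cli_command "$engine" "$QUERY"
done
```

In practice each tool needs its own connection settings, so treat the printed lines as a starting point rather than commands to paste as-is.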