
Wednesday, November 25, 2015

Hidden Gem in the HDP Sandbox: SSH Web Server on Port 4200

If you go to port 4200 on your HDP sandbox, http://sandbox.hortonworks.com:4200/, you get a browser-based SSH terminal (Shell In A Box), so you can log in to the sandbox shell without a separate SSH client.

Wednesday, November 11, 2015

Got access to Zeppelin Hub beta

The sync function to the hub and the collaboration it enables are going to make Spark development so much better now.

 


Tuesday, November 10, 2015

IntelliJ and Maven HDP public Repos to index

If you are trying to add the HDP repos to the IntelliJ Maven repository list, so that autocomplete shows all available versions and not just those in your local repo, you may get an error when you try to run the repository update.


This is a known bug. Thanks to Shane Kumpf for pointing out a fix to get the IntelliJ Maven repo index working with autocomplete:



This is a bug in IntelliJ 14.1 (and many earlier versions).
See IDEA-102693, which includes a zip with the fixed Maven plugin jars. Replace your IntelliJ jars with those from the zip file.
If that doesn't work, take a look at your idea.log (sudo find / -name idea.log to locate it) for any exceptions and research those, and/or post your stack trace on the issue.
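For reference, this is roughly the repository entry I add to the pom.xml so the HDP artifacts resolve. The id and name are my own labels, and the URL is the public Hortonworks releases repo as I remember it, so verify it against the HDP docs:

<repositories>
  <repository>
    <id>HDPReleases</id>
    <name>HDP Releases</name>
    <url>http://repo.hortonworks.com/content/repositories/releases/</url>
  </repository>
</repositories>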


How to search only in Gmail Primary tab Inbox

Provide a search term and filter out the other categories using the - operator:

search_term in:inbox -category:{social promotions updates forums}
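For example, to find messages matching "invoice" (just a sample search term) only in the Primary tab:

invoice in:inbox -category:{social promotions updates forums}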


Monday, November 9, 2015

External jars not getting picked up in Zeppelin or the Spark CLI, an issue with MySQL and Spark dependencies

If you can't get external jars picked up in Zeppelin or the Spark CLI, this will help you. I first tried pointing to local jars, but that did not work; you can resolve it using Maven dependencies instead, as you will see below.

In this example I am trying to use a MySQL JDBC jar in Zeppelin and the CLI and getting this exception:

java.sql.SQLException: No suitable driver found for jdbc:mysql://localhost:3306/hive
    at java.sql.DriverManager.getConnection(DriverManager.java:596)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)

Apparently this is a poorly documented feature of Spark (and not an issue with Zeppelin itself). Here is the code that works for me and solves the issue.

Dependency loading

When your code requires an external library, instead of doing download/copy/restart Zeppelin, you can easily do the following using the %dep interpreter:

- Load libraries recursively from a Maven repository
- Load libraries from the local filesystem
- Add an additional Maven repository
- Automatically add libraries to the Spark cluster (you can turn this off)

The %dep interpreter leverages the Scala environment, so you can write any Scala code here. Here's the usage:
%dep
z.reset() // clean up previously added artifact and repository

// add maven repository
z.addRepo("RepoName").url("RepoURL")

// add maven snapshot repository
z.addRepo("RepoName").url("RepoURL").snapshot()

// add artifact from filesystem
z.load("/path/to.jar")

// add artifact from maven repository, with no dependency
z.load("groupId:artifactId:version").excludeAll()

// add artifact recursively
z.load("groupId:artifactId:version")

// add artifact recursively except comma separated GroupID:ArtifactId list
z.load("groupId:artifactId:version").exclude("groupId:artifactId,groupId:artifactId, ...")

// exclude with pattern
z.load("groupId:artifactId:version").exclude(*)
z.load("groupId:artifactId:version").exclude("groupId:artifactId:*")
z.load("groupId:artifactId:version").exclude("groupId:*")

// local() skips adding artifact to spark clusters (skipping sc.addJar())
z.load("groupId:artifactId:version").local()
Note that the %dep interpreter should be used before %spark, %pyspark, and %sql. Thanks to Ali and Neeraj from HWX for help in solving this issue.
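To tie this back to the MySQL problem above, here is a minimal sketch of the two paragraphs I ended up with. The connector coordinate is a real Maven artifact; the URL, table, user, and password are placeholders for my local Hive metastore database, so adjust them for your setup.

%dep
z.reset()
// pull the MySQL JDBC driver (and its transitive dependencies) from Maven
z.load("mysql:mysql-connector-java:5.1.35")

Then, in a following paragraph (after %dep has run):

%spark
// read a table over JDBC; url, dbtable, user, and password are placeholders
val df = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/hive")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "TBLS")
  .option("user", "hive")
  .option("password", "hive")
  .load()
df.show()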