This post explains how to kerberize an existing Hadoop cluster using Ambari. Kerberos addresses the authentication part of enterprise security (with data protection among the remaining parts).
HDP uses Kerberos, an industry standard for authenticating users and resources and for providing strong identity for users. Apache Ambari can kerberize an existing cluster using either an existing MIT key distribution center (KDC) or Microsoft’s Active Directory.
Configuring Active Directory
To keep this post simple, let's assume an Active Directory (AD) already exists and can communicate with the HDP cluster. HDP requires secure LDAP connectivity, so Active Directory Certificate Services must be installed and configured on the domain controller (DC). The following steps walk through this configuration:
Add necessary roles
Choose the AD server
Select certification authority
Go ahead and install the role.
In Server Manager, click the notification and then click “configure AD certificate services”
Choose the certification authority
If you are generating your own certificates, the Enterprise CA option must be checked. I will choose a
The CA type should be
Create a new private key
Use defaults for Cryptography for CA
Then specify a name for the CA. After choosing a validity period, click Configure.
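Once Certificate Services is configured, it can be worth confirming that the domain controller really answers on the secure LDAP port before moving on. The following is a minimal sketch in Python, not part of the Ambari procedure. It assumes the standard LDAPS port 636; the host name ad01.example.com and the CA file ad-ca.pem in the comments are hypothetical placeholders.

```python
import socket
import ssl

LDAPS_PORT = 636  # standard port for LDAP over TLS


def ldaps_url(host, port=LDAPS_PORT):
    """Build the ldaps:// URL for a domain controller."""
    return f"ldaps://{host}:{port}"


def fetch_ldaps_certificate(host, port=LDAPS_PORT, ca_file=None, timeout=5.0):
    """Open a verified TLS connection and return the server certificate as a dict.

    ca_file is an optional PEM file with the CA certificate; when it is None,
    the system trust store is used. A certificate that cannot be verified
    raises ssl.SSLCertVerificationError.
    """
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example (hypothetical host and CA file, requires network access):
#   cert = fetch_ldaps_certificate("ad01.example.com", ca_file="ad-ca.pem")
#   print(ldaps_url("ad01.example.com"), "valid until", cert["notAfter"])
```

If the call raises a verification error, the cluster hosts most likely do not yet trust the CA created above.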
Create Users and Containers for Cluster
Create a container, a Kerberos admin, and permissions for the cluster
From advanced features, create a container:
Let's call the container HDP.
Similarly, create another container called sandbox
Create a user, sandboxadmin, and delegate control of the container to that user
For the delegated task, choose “create, delete and manage user accounts”
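The containers created here are later referred to by their distinguished names (DNs). As a small illustration of how such a DN is put together, here is a sketch in Python. It assumes the containers are organizational units and uses the hypothetical domain example.com; the helper names are made up for this example, and the sketch does not escape special characters in names.

```python
def domain_to_dn(domain):
    """Turn a DNS domain such as 'example.com' into 'DC=example,DC=com'."""
    return ",".join(f"DC={part}" for part in domain.split("."))


def container_dn(ou_path, domain):
    """Build the DN of an organizational unit.

    ou_path lists the OU names from the outermost to the innermost,
    so a top-level container is a one-element list.
    """
    ous = ",".join(f"OU={name}" for name in reversed(ou_path))
    return f"{ous},{domain_to_dn(domain)}"


# example.com is a placeholder domain, not taken from the original setup.
print(container_dn(["HDP"], "example.com"))
print(container_dn(["sandbox"], "example.com"))
```

The most specific component comes first in a DN, which is why the helper reverses the path before joining it.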
Enabling Kerberos on Existing Cluster
Now the action shifts to the HDP cluster and Ambari. Go to the Admin tab in the Ambari console and enable Kerberos.
When you toggle the Kerberos setting, Ambari warns that the ResourceManager state will be formatted. Since Kerberos is (or should be) enabled during the initial setup of the cluster, this is fine.
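To see whether a cluster is currently kerberized, before or after running the wizard, the Ambari REST API can be queried for the cluster's security type. The sketch below is a hedged example: it assumes Ambari's usual REST layout under /api/v1 with HTTP basic authentication, and the Ambari URL, cluster name, and credentials in the comments are hypothetical.

```python
import base64
import json
import urllib.request


def security_type_request(ambari_url, cluster, user, password):
    """Build a GET request for the cluster's security_type field."""
    url = f"{ambari_url}/api/v1/clusters/{cluster}?fields=Clusters/security_type"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def parse_security_type(body):
    """Extract the security type (for example NONE or KERBEROS) from the JSON body."""
    return json.loads(body)["Clusters"]["security_type"]


def is_kerberized(ambari_url, cluster, user, password):
    """Return True if Ambari reports the cluster's security type as KERBEROS."""
    request = security_type_request(ambari_url, cluster, user, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_security_type(response.read().decode()) == "KERBEROS"


# Example (hypothetical server, cluster name, and credentials):
#   is_kerberized("http://ambari.example.com:8080", "sandbox", "admin", "admin")
```

Because this only reads cluster state, it is safe to run at any point in the process.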
Integrating with Active Directory
Ambari’s security wizard will take you through the options for choosing a KDC (in this case, we will choose Microsoft Active Directory) and a list of prerequisites.
Check all of them and go to the next page. Under Configure Kerberos, provide the relevant values for the KDC. Note that AD is our KDC in this scenario, and your values will differ depending on the AD server name and so on.
The Kadmin host will be the AD host, and the admin user will be the user we created in the previous section:
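Since a typo in these values tends to surface only later in the wizard, it can help to write them down and sanity-check them first. The following Python sketch is one way to do that. Every value is a hypothetical placeholder, the dictionary keys are my own names that loosely mirror the wizard's fields, and the checks encode common conventions rather than rules enforced by Ambari.

```python
from urllib.parse import urlparse

# Hypothetical values; replace them with those of your own AD environment.
kerberos_settings = {
    "kdc_host": "ad01.example.com",
    "realm": "EXAMPLE.COM",
    "ldap_url": "ldaps://ad01.example.com",
    "container_dn": "OU=sandbox,DC=example,DC=com",
    "kadmin_host": "ad01.example.com",
    "admin_principal": "sandboxadmin@EXAMPLE.COM",
}


def check_settings(settings):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if urlparse(settings["ldap_url"]).scheme != "ldaps":
        problems.append("LDAP URL should use the ldaps:// scheme")
    if settings["realm"] != settings["realm"].upper():
        problems.append("realm names are conventionally upper case")
    if not settings["admin_principal"].endswith("@" + settings["realm"]):
        problems.append("admin principal is usually qualified with the realm")
    if settings["kadmin_host"] != settings["kdc_host"]:
        problems.append("with AD as the KDC, the Kadmin host is the AD host")
    return problems


for problem in check_settings(kerberos_settings):
    print("warning:", problem)
```

With the placeholder values above nothing is printed, because none of the checks fails.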
When you click “Next”, the wizard will install Kerberos clients on all the nodes in the cluster:
Confirm the configuration and click “Next”
The services will be stopped momentarily before the cluster is kerberized
That's about it! Once the services are restarted, you should be able to play around with your shiny secure HDP cluster! :)
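A quick way to confirm that the cluster is now secured is to obtain a Kerberos ticket on one of the nodes and check the ticket cache. The Python sketch below wraps the MIT Kerberos client tools kinit and klist, which the wizard installs on the cluster nodes. The principal and keytab path in the comments are hypothetical; use the ones generated for your cluster.

```python
import subprocess


def kinit_command(principal, keytab):
    """Command that obtains a ticket from a keytab, without a password prompt."""
    return ["kinit", "-kt", keytab, principal]


def obtain_ticket(principal, keytab):
    """Run kinit and raise CalledProcessError if authentication fails."""
    subprocess.run(kinit_command(principal, keytab), check=True)


def has_valid_ticket():
    """Return True if the credential cache is readable and not expired.

    `klist -s` prints nothing and reports the result through its exit status.
    """
    return subprocess.run(["klist", "-s"]).returncode == 0


# Example (hypothetical principal and keytab path):
#   obtain_ticket("hdfs-sandbox@EXAMPLE.COM",
#                 "/etc/security/keytabs/hdfs.headless.keytab")
#   print(has_valid_ticket())
```

Without a valid ticket, commands against the secured services are expected to be rejected, so comparing the behaviour before and after kinit makes for a simple smoke test.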