Cloudera Enterprise 6.0 Beta

Installing the Latest CDH 6 Release

This page explains how to do an unmanaged deployment of CDH 6 from the command line. For a managed deployment, see Production Installation of Cloudera Manager and CDH.

Before You Begin Installing CDH 6 Manually

  Note: Running Services

Use the service command to start, stop, and restart CDH components, instead of running scripts in /etc/init.d directly. The service command creates a predictable environment by setting the current working directory to / and removing most environment variables (passing only LANG and TERM). With /etc/init.d, existing environment variables remain in force and can produce unpredictable results. When you install CDH from packages, service is installed as part of the Linux Standard Base (LSB).
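
The effect of that sanitized environment can be approximated with env -i. The sketch below is an illustration only, not the actual implementation of service; the helper name run_clean and the variable LEAKY_SETTING are made up for the example.

```shell
# Approximate what the note describes: run a command from / with only
# LANG and TERM passed through. run_clean is a hypothetical helper, not
# part of the service command itself.
run_clean() {
    ( cd / && env -i LANG="${LANG:-C}" TERM="${TERM:-dumb}" "$@" )
}

# A variable set in the calling shell, standing in for stray settings.
export LEAKY_SETTING=surprise

# Inherited environment: the variable is visible to the child process.
/bin/sh -c 'echo "inherited: ${LEAKY_SETTING:-unset}"'

# Sanitized environment: the variable is not passed on, and the child
# starts with / as its working directory.
run_clean /bin/sh -c 'echo "sanitized: ${LEAKY_SETTING:-unset} (cwd: $(pwd))"'
```

Comparing the two lines of output shows why a script started directly from /etc/init.d can behave differently depending on who starts it and from where.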

Steps to Install CDH 6 Manually

Step 1: Add or Build the CDH 6 Repository

On RHEL-compatible Systems

Use one of the following methods to install CDH 6 on RHEL-compatible systems.

Do this on all the systems in the cluster.

To add the CDH 6 repository:

Download the repo file. Copy the link for your RHEL-compatible system from the table below, and download the file to /etc/yum.repos.d/ on each cluster node.

OS Version           CDH 6 Repository
RHEL 6 Compatible    cloudera-cdh6.repo
RHEL 7 Compatible    cloudera-cdh6.repo

Continue to (Optional) Step 2: Add a Repository Key.

  Note: Clean repository cache.
Before proceeding, clean cached packages and headers to ensure your system repos are up-to-date:
sudo yum clean all
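
The download step might look like the following sketch. The repo file URL is an assumption: it supposes the file sits in the same archive directory as the GPG key used in Step 2, so prefer the link from the table if the two differ. The helper name cdh6_repo_cmd is hypothetical, and it only prints the command so that you can review it before running it.

```shell
# cdh6_repo_cmd RHEL_MAJOR
# Hypothetical helper: prints (does not run) a download command for the
# CDH 6 repo file on a RHEL 6 or RHEL 7 compatible system.
cdh6_repo_cmd() {
    case "$1" in
        6|7) ;;
        *) echo "unsupported RHEL major version: $1" >&2; return 1 ;;
    esac
    # Assumption: the repo file is published next to RPM-GPG-KEY-cloudera.
    base="https://archive.cloudera.com/cdh6/redhat/$1/x86_64/cdh"
    echo "sudo wget '${base}/cloudera-cdh6.repo' -O /etc/yum.repos.d/cloudera-cdh6.repo"
}

# Print the command for a RHEL 7 compatible node.
cdh6_repo_cmd 7
```

Run the printed command on each cluster node, then clean the cache as described in the note above.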

OR: To build a Yum repository:

Follow the instructions at Creating a Local Yum Repository to create your own yum repository:
  • Download the appropriate repo file
  • Create the repo
  • Distribute the repo and set up a web server.

Continue to (Optional) Step 2: Add a Repository Key.

  Note: Clean repository cache.
Before proceeding, clean cached packages and headers to ensure your system repos are up-to-date:
sudo yum clean all
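
Once the local repository is being served, each node needs a repo file that points at it. The following is a minimal sketch, assuming a hypothetical web server repohost.example.com and path /cdh6/; substitute the values from your own setup. The helper write_local_repo and the repository ID cloudera-cdh6-local are also made up for illustration. The function writes to the path you pass so that the result can be reviewed before it is copied into /etc/yum.repos.d/.

```shell
# write_local_repo FILE BASEURL
# Hypothetical helper: writes a yum repo definition for a locally hosted mirror.
# gpgcheck=1 assumes the repository key from Step 2 has been imported.
write_local_repo() {
    cat > "$1" <<EOF
[cloudera-cdh6-local]
name=Local mirror of the CDH 6 repository
baseurl=$2
enabled=1
gpgcheck=1
EOF
}

# repohost.example.com and /cdh6/ are placeholders, not real locations.
repo_file="$(mktemp)"
write_local_repo "$repo_file" "http://repohost.example.com/cdh6/"
cat "$repo_file"
```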

On SLES Systems

Use one of the following methods to download the CDH 6 repository or package on SLES systems.

To add the CDH 6 repository:

  1. Run the following command:
    $ sudo zypper addrepo -f https://archive.cloudera.com/cdh6/sles/12/x86_64/cdh/cloudera-cdh.repo
  2. Update your system package index by running:
    $ sudo zypper refresh

Continue with (Optional) Step 2: Add a Repository Key.

  Note: Clean repository cache.
Before proceeding, clean cached packages and headers to ensure your system repos are up-to-date:
sudo zypper clean --all

OR: To build a SLES repository:

To create your own SLES repository, create a mirror of the CDH SLES directory, then follow these instructions to create a SLES repository from the mirror.

Continue to (Optional) Step 2: Add a Repository Key.

  Note: Clean repository cache.
Before proceeding, clean cached packages and headers to ensure your system repos are up-to-date:
sudo zypper clean --all

On Ubuntu

Use one of the following methods to download the CDH 6 repository or package.

To add the CDH 6 repository:

  • Download the appropriate cloudera.list file by issuing one of the following commands. You can use another HTTP client if wget is not available, but the syntax may be different.
      Important: Ubuntu 14.04 (Trusty)

    For Ubuntu Trusty systems, you must perform an extra step after adding the repository. See "Additional step for Ubuntu Trusty and Debian Jessie" below.

    OS Version Command
    Ubuntu 16 Xenial
    $ sudo wget 'https://archive.cloudera.com/cdh6/ubuntu/xenial/amd64/cdh/cloudera.list' \
        -O /etc/apt/sources.list.d/cloudera.list
    Ubuntu 14 Trusty
    $ sudo wget 'https://archive.cloudera.com/cdh6/ubuntu/trusty/amd64/cdh/cloudera.list' \
        -O /etc/apt/sources.list.d/cloudera.list
    Ubuntu 12 Precise
    $ sudo wget 'https://archive.cloudera.com/cdh6/ubuntu/precise/amd64/cdh/cloudera.list' \
        -O /etc/apt/sources.list.d/cloudera.list
  Note: Clean repository cache.
Before proceeding, refresh the package index to ensure your system repos are up-to-date:
sudo apt-get update

Additional step for Ubuntu Trusty and Debian Jessie

This step ensures that you get the right ZooKeeper package for the current CDH release. You need to prioritize the Cloudera repository you have just added, so that you install the CDH version of ZooKeeper rather than the version that is bundled with Ubuntu Trusty or Debian Jessie.

To do this, create a file at /etc/apt/preferences.d/cloudera.pref with the following contents:
Package: *
Pin: release o=Cloudera, l=Cloudera
Pin-Priority: 501
  Note: You do not need to run apt-get update after creating this file.
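
One way to create the file is with a here-document. This is a sketch: write_cloudera_pin is a hypothetical helper that takes the target directory as an argument, so the file can be staged and inspected before it is installed.

```shell
# write_cloudera_pin DIR
# Writes cloudera.pref with the pin contents given above into DIR.
write_cloudera_pin() {
    cat > "$1/cloudera.pref" <<'EOF'
Package: *
Pin: release o=Cloudera, l=Cloudera
Pin-Priority: 501
EOF
}

# Stage the file in a scratch directory and review it.
staging="$(mktemp -d)"
write_cloudera_pin "$staging"
cat "$staging/cloudera.pref"

# After review, copy it into place (requires root):
#   sudo cp "$staging/cloudera.pref" /etc/apt/preferences.d/cloudera.pref
# Then check which ZooKeeper candidate apt would pick:
#   apt-cache policy zookeeper
```

In the apt-cache policy output, the candidate version should come from the Cloudera repository if the pin is taking effect.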

Continue to (Optional) Step 2: Add a Repository Key.

OR: To build a repository:

If you want to create your own apt repository, create a mirror of the CDH Ubuntu directory and then create an apt repository from the mirror.

Continue to (Optional) Step 2: Add a Repository Key.

(Optional) Step 2: Add a Repository Key

Before installing YARN: Add a repository key on each system in the cluster. Add the Cloudera Public GPG Key to your repository by executing one of the following commands:

  • RHEL 6 compatible:
    sudo rpm --import https://archive.cloudera.com/cdh6/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
  • RHEL 7 compatible:
    sudo rpm --import https://archive.cloudera.com/cdh6/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
  • SLES:
    sudo rpm --import https://archive.cloudera.com/cdh6/sles/12/x86_64/cdh/RPM-GPG-KEY-cloudera
  • Ubuntu:
    OS Version Command
    Ubuntu 16 Xenial
    wget https://archive.cloudera.com/cdh6/ubuntu/xenial/amd64/cdh/archive.key -O archive.key
    sudo apt-key add archive.key
    Ubuntu 14 Trusty
    wget https://archive.cloudera.com/cdh6/ubuntu/trusty/amd64/cdh/archive.key -O archive.key
    sudo apt-key add archive.key
    Ubuntu 12 Precise
    wget https://archive.cloudera.com/cdh6/ubuntu/precise/amd64/cdh/archive.key -O archive.key
    sudo apt-key add archive.key

This key enables you to verify that you are downloading genuine packages.
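
Because the key must be added on every system, it can help to generate the per-host commands first. The sketch below only prints them. The host names are placeholders, the helper print_key_import_cmds is hypothetical, and the use of ssh is an assumption about how you reach the nodes. It uses the RHEL 7 key URL from the list above.

```shell
# Key URL for RHEL 7 compatible systems, as listed above.
KEY_URL='https://archive.cloudera.com/cdh6/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera'

# print_key_import_cmds HOST...
# Hypothetical helper: prints one key import command per host, without
# running anything.
print_key_import_cmds() {
    for host in "$@"; do
        echo "ssh ${host} sudo rpm --import ${KEY_URL}"
    done
}

# node1.example.com and node2.example.com are placeholder host names.
print_key_import_cmds node1.example.com node2.example.com
```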

Step 3: Install CDH 6

  Note: When configuring HA for the NameNode, do not install hadoop-hdfs-secondarynamenode. After completing the HA software configuration, follow the installation instructions under Deploying HDFS High Availability.
  1. Install and deploy ZooKeeper.
      Important: Cloudera recommends that you install (or update) and start a ZooKeeper cluster before proceeding. This is a requirement if you are deploying high availability (HA) for the NameNode.

    Follow instructions under ZooKeeper Installation.

  2. Install each type of daemon package on the appropriate system(s), as follows.

    Where to install    Install commands

    Resource Manager host (analogous to MRv1 JobTracker) running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-yarn-resourcemanager
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-yarn-resourcemanager
      Ubuntu
        sudo apt-get update; sudo apt-get install hadoop-yarn-resourcemanager

    NameNode host running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-hdfs-namenode
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-hdfs-namenode
      Ubuntu
        sudo apt-get install hadoop-hdfs-namenode

    Secondary NameNode host (if used) running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-hdfs-secondarynamenode
      Ubuntu
        sudo apt-get install hadoop-hdfs-secondarynamenode

    All cluster hosts except the Resource Manager running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
      Ubuntu
        sudo apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce

    One host in the cluster running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
      Ubuntu
        sudo apt-get install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver

    All client hosts running:
      RHEL compatible
        sudo yum clean all; sudo yum install hadoop-client
      SLES
        sudo zypper clean --all; sudo zypper install hadoop-client
      Ubuntu or Debian
        sudo apt-get install hadoop-client

  Note: The hadoop-yarn and hadoop-hdfs packages are installed on each system automatically as dependencies of the other packages.
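
The mapping from host role and operating system to install command in step 2 can be captured in a small lookup, which is convenient when preparing many hosts. This is a sketch that only prints commands. The role keywords (resourcemanager, namenode, secondarynamenode, worker, history, client) and the OS keywords (rhel, sles, ubuntu) are hypothetical labels chosen for this example; the package names and commands are those given in step 2. For Ubuntu the sketch always prefixes apt-get update, which step 2 shows only for the Resource Manager host.

```shell
# packages_for_role ROLE
# Prints the packages that step 2 lists for each kind of host.
packages_for_role() {
    case "$1" in
        resourcemanager)   echo "hadoop-yarn-resourcemanager" ;;
        namenode)          echo "hadoop-hdfs-namenode" ;;
        secondarynamenode) echo "hadoop-hdfs-secondarynamenode" ;;
        worker)            echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce" ;;
        history)           echo "hadoop-mapreduce-historyserver hadoop-yarn-proxyserver" ;;
        client)            echo "hadoop-client" ;;
        *)                 return 1 ;;
    esac
}

# install_cmd OS ROLE
# Prints (does not run) the install command for the given OS family and role.
install_cmd() {
    pkgs=$(packages_for_role "$2") || { echo "unknown role: $2" >&2; return 1; }
    case "$1" in
        rhel)   echo "sudo yum clean all; sudo yum install $pkgs" ;;
        sles)   echo "sudo zypper clean --all; sudo zypper install $pkgs" ;;
        ubuntu) echo "sudo apt-get update; sudo apt-get install $pkgs" ;;
        *)      echo "unknown OS family: $1" >&2; return 1 ;;
    esac
}

# Example: the command for a worker host on SLES.
install_cmd sles worker
```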

(Optional) Step 4: Install LZO

This section explains how to install LZO (Lempel–Ziv–Oberhumer) compression. For more information, see Choosing and Configuring Data Compression.
  Note: If upgrading (rather than installing for the first time), remove the old LZO version first. For example, on a RHEL system:
sudo yum remove hadoop-lzo
  1. Add the repository on each host in the cluster. Follow the instructions for your OS version:
    For OS Version Do this
    RHEL 6 compatible Go to this link and save the file in the /etc/yum.repos.d/ directory.
    RHEL 7 compatible Go to this link and save the file in the /etc/yum.repos.d/ directory.
    SLES
    1. Run the following command:
       $ sudo zypper addrepo -f https://archive.cloudera.com/gplextras5/sles/12/x86_64/gplextras/cloudera-gplextras5.repo
    2. Update your system package index by running:
       $ sudo zypper refresh
    Ubuntu Go to this link and save the file as /etc/apt/sources.list.d/gplextras.list.
      Important: Make sure you do not let the file name default to cloudera.list, as that will overwrite your existing cloudera.list.
  2. Install the package on each host as follows:
    For OS version Install commands
    RHEL compatible
    sudo yum install hadoop-lzo
    SLES
    sudo zypper install hadoop-lzo
    Ubuntu
    sudo apt-get install hadoop-lzo
  3. Continue with installing and deploying CDH. As part of the deployment, you will need to do some additional configuration for LZO, as shown under Configuring LZO.
      Important: Be sure to do this configuration after you have copied the default configuration files to a custom location and set alternatives to point to it.
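
The warning in step 1 about overwriting cloudera.list on Ubuntu can be enforced with a small guard. This is a sketch, assuming you have already downloaded the gplextras list file to a local path; save_list_file is a hypothetical helper, and the demonstration runs entirely in a scratch directory that stands in for /etc/apt/sources.list.d/.

```shell
# save_list_file SRC DEST
# Copies SRC to DEST, but refuses if DEST already exists.
save_list_file() {
    if [ -e "$2" ]; then
        echo "refusing to overwrite existing file: $2" >&2
        return 1
    fi
    cp "$1" "$2"
}

# Scratch directory with an existing cloudera.list and a fresh download.
demo="$(mktemp -d)"
echo "existing CDH entry" > "$demo/cloudera.list"
echo "downloaded gplextras entry" > "$demo/download.tmp"

# Saving under the default name is refused; the explicit name succeeds.
save_list_file "$demo/download.tmp" "$demo/cloudera.list" || echo "kept cloudera.list intact"
save_list_file "$demo/download.tmp" "$demo/gplextras.list" && echo "saved gplextras.list"
```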

Step 5: Deploy CDH and Install Components

Page generated March 7, 2018.