Azkaban 2
Getting started
Azkaban needs three components to run: a MySQL database, the web server, and the executor server.
1. Set up the DB.
Azkaban uses MySQL as its database. You will need to download the MySQL JDBC connector jar yourself, since Azkaban doesn’t distribute it. Download it here: http://www.mysql.com/downloads/connector/j/
Once the MySQL DB is set up, you will need to create the tables that Azkaban will use. Extract the azkaban-sql-script archive. Use the *.sql scripts to create the tables.
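The database step can be sketched as a pair of shell helpers. This is a minimal sketch under stated assumptions, not an official procedure: the database name, user, and password mirror the mysql.* defaults listed later in this guide, the granted privileges are an assumption, and the function names bootstrap_sql and create_tables are hypothetical.

```shell
# Assumed values: they mirror the mysql.* defaults in the properties table.
DB_NAME="azkaban2"
DB_USER="azkaban"
DB_PASS="password"

# Print the SQL that creates the database and a MySQL user for Azkaban.
# The privilege list is an assumption; adjust it to your own policy.
bootstrap_sql() {
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${DB_NAME};
CREATE USER '${DB_USER}'@'%' IDENTIFIED BY '${DB_PASS}';
GRANT SELECT, INSERT, UPDATE, DELETE ON ${DB_NAME}.* TO '${DB_USER}'@'%';
EOF
}

# Apply every *.sql file found in the extracted azkaban-sql-script directory.
# The mysql client prompts for the root password once per script.
create_tables() {
  local script_dir="$1"
  local f
  for f in "$script_dir"/*.sql; do
    echo "Applying $f"
    mysql -u root -p "$DB_NAME" < "$f" || return 1
  done
}

# Hypothetical usage:
#   bootstrap_sql | mysql -u root -p
#   create_tables /path/to/extracted/azkaban-sql-script
```

Printing the SQL first makes it easy to review before piping it into mysql.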
2. Set up the Web UI
Extract the azkaban-web-server archive to the install directory. Copy the MySQL JDBC connector jar to the ./extlib directory.
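Copying the connector jar can be sketched as follows. The function name install_jdbc_jar is hypothetical, and both paths are placeholders: pass the jar you downloaded and the external library directory named in the step above.

```shell
# Copy the MySQL JDBC connector jar into an Azkaban external library directory.
#   $1: path to the downloaded connector jar
#   $2: external library directory inside the Azkaban install directory
install_jdbc_jar() {
  local jar="$1" lib_dir="$2"
  if [ ! -f "$jar" ]; then
    echo "no such jar: $jar" >&2
    return 1
  fi
  # Create the directory if the archive did not ship with one, then copy.
  mkdir -p "$lib_dir" && cp "$jar" "$lib_dir/"
}
```

The same helper would apply to the executor server install, which needs the jar as well.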
Azkaban uses SSL, so a keystore must be created. Follow these instructions to configure Azkaban's Jetty server for SSL: http://docs.codehaus.org/display/JETTY/How+to+configure+SSL
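As a rough illustration, a self-signed keystore for testing could be generated with the JDK's keytool. This is a sketch, not the procedure from the linked Jetty page: the alias, the CN=localhost distinguished name, and the one-year validity are assumptions, the function name create_keystore is hypothetical, and the file name and passwords are assumed to correspond to the jetty.keystore, jetty.password, and jetty.keypassword defaults listed later in this guide.

```shell
# Assumed to match the jetty.* defaults; change them for anything but testing.
KEYSTORE_FILE="${KEYSTORE_FILE:-keystore}"
KEYSTORE_PASS="${KEYSTORE_PASS:-password}"
KEY_PASS="${KEY_PASS:-password}"

# Generate a self-signed RSA key pair in a new keystore (requires a JDK).
create_keystore() {
  keytool -genkeypair \
    -alias jetty \
    -keyalg RSA \
    -keystore "$KEYSTORE_FILE" \
    -storepass "$KEYSTORE_PASS" \
    -keypass "$KEY_PASS" \
    -dname "CN=localhost" \
    -validity 365
}

# Hypothetical usage, run from the web server install directory:
#   create_keystore
```

Browsers will warn about a self-signed certificate; a certificate from a trusted authority avoids that.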
In the ./conf directory, there are several settings files. The azkaban.properties file holds Azkaban's general settings. The azkaban-users.xml file is used by the XmlUserManager for authentication, and global.properties holds the properties that are passed as shared properties to every workflow and job.
By default, authentication is handled by the azkaban.user.XmlUserManager class, which reads azkaban-users.xml. You can replace the authentication method by implementing the azkaban.user.UserManager interface.
To use the XmlUserManager, just add a user entry to the XML file. Note that you’ll have to restart the server to pick up new users.
Example:
<azkaban-users>
<user username="azkaban" password="azkaban" roles="admin" groups="azkaban"/>
<role name="admin" permissions="ADMIN" />
</azkaban-users>
For Hadoop, start Azkaban with the environment variable HADOOP_HOME pointing to the Hadoop installation for your cluster.
The settings that must be set are the MySQL settings for connecting to the MySQL DB and the Azkaban executor server settings. The executor port configured here must be the same port that the executor server itself is configured to use.
The following properties can be set.
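Setting the variable before launching Azkaban could look like this. The path /opt/hadoop is a placeholder assumption, not a location taken from this guide.

```shell
# Keep an existing HADOOP_HOME if one is already set; otherwise fall back to
# a placeholder path that you should replace with your Hadoop directory.
export HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"

# Warn early if the path does not exist, before starting Azkaban.
if [ ! -d "$HADOOP_HOME" ]; then
  echo "warning: HADOOP_HOME=$HADOOP_HOME is not a directory" >&2
fi
```

The export has to happen in the same shell session (or startup script) that launches Azkaban.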
General Properties
Property | Description | Default |
azkaban.name | The name of the azkaban instance that will show up in the UI. Useful if you run more than one Azkaban instance. | Local |
azkaban.label | A label to describe the Azkaban instance. | My Local Azkaban |
azkaban.color | Hex value that allows you to set a style color for the Azkaban UI. | #FF3601 (red) |
web.resource.dir | Sets the directory for the UI’s CSS and JavaScript files | src/web |
default.timezone | The timezone that will be displayed by Azkaban. | America/Los_Angeles |
user.manager.class | The user manager that is used to authenticate a user. The default is an XML user manager, but it can be overridden to support other authentication methods, such as JNDI. | azkaban.user.XmlUserManager |
mail.sender | The email address that azkaban uses to send emails. | |
mail.host | The email server host machine | |
mail.user | The email server user name | |
mail.password | The email server password | |
azkaban.should.proxy | Used by the HDFS browser. Set to true if using Hadoop 1.0+ with security turned on. Will soon be removed. | false |
proxy.keytab.location | Used by the HDFS browser. The location of the keytab file to use when Hadoop 1.0+ security is turned on. Will soon be removed. | |
proxy.user | The proxy user | |
Jetty Properties
Property | Description | Default |
jetty.maxThreads | Max request threads | 25 |
jetty.ssl.port | The ssl port | 8443 |
jetty.keystore | The keystore file | keystore |
jetty.password | The jetty password | password |
jetty.keypassword | The keypassword | password |
jetty.truststore | The trust store | keystore |
jetty.trustpassword | The trust password | password |
MySQL Connection Properties
Property | Description | Default |
database.type | The database type. Currently, the only database supported is mysql. | mysql |
mysql.port | The port to the mysql db | 3306 |
mysql.host | The mysql host | localhost |
mysql.database | The mysql database | azkaban2 |
mysql.user | The mysql user | azkaban |
mysql.password | The mysql password | password |
mysql.numconnections | The number of connections that Azkaban web client can open to the database | 100 |
Executor Server Properties
Property | Description | Default |
executor.port | The port for the azkaban executor server | 12321 |
executor.host | The host for azkaban executor server | localhost |
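Pulling the tables above together, a starting configuration might look like the file written by this sketch. Every value is simply the documented default; the function name write_sample_properties and the sample file name are hypothetical, and the mail.* and proxy settings are left out because they have no defaults.

```shell
# Write a sample properties file populated with the documented defaults.
# It writes to the path you pass in, so an existing azkaban.properties is
# only overwritten if you point it there yourself.
write_sample_properties() {
  local target="$1"
  cat > "$target" <<'EOF'
# General
azkaban.name=Local
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
default.timezone=America/Los_Angeles
user.manager.class=azkaban.user.XmlUserManager

# Jetty
jetty.maxThreads=25
jetty.ssl.port=8443
jetty.keystore=keystore
jetty.password=password
jetty.keypassword=password
jetty.truststore=keystore
jetty.trustpassword=password

# MySQL
database.type=mysql
mysql.port=3306
mysql.host=localhost
mysql.database=azkaban2
mysql.user=azkaban
mysql.password=password
mysql.numconnections=100

# Executor server
executor.port=12321
executor.host=localhost
EOF
}

# Hypothetical usage:
#   write_sample_properties conf/azkaban.properties.sample
```

Change the default passwords before using a file like this outside a local test setup.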
Besides the database, there are two pieces of Azkaban that must be installed: the web server (also called the web client in this guide) and the executor server.
3. Set up the Executor Server
Download the azkaban executor tarball and extract it to the executor install directory.
As with the web server, you will need to copy the MySQL JDBC connector jar to the ./extlib directory. You will also need to edit conf/azkaban.properties to set the executor port and the MySQL connection settings.
If you are using Hadoop 1.0 security, you may also need to set the Azkaban proxy properties.
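Because the web server and the executor server each have their own conf/azkaban.properties, it is easy for the executor port to drift between them. The check below is a sketch with hypothetical function names (get_property, check_executor_port); it assumes plain key=value lines with no spaces around the equals sign.

```shell
# Print the value of a key from a key=value properties file.
#   $1: properties file, $2: key
get_property() {
  awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2); exit }' "$1"
}

# Succeed only if both files define the same, non-empty executor.port.
#   $1: web server properties file, $2: executor server properties file
check_executor_port() {
  local web_conf="$1" exec_conf="$2" web_port exec_port
  web_port="$(get_property "$web_conf" executor.port)"
  exec_port="$(get_property "$exec_conf" executor.port)"
  if [ -n "$web_port" ] && [ "$web_port" = "$exec_port" ]; then
    echo "executor.port matches: $web_port"
  else
    echo "executor.port mismatch: web='$web_port' executor='$exec_port'" >&2
    return 1
  fi
}

# Hypothetical usage (both install paths are placeholders):
#   check_executor_port /path/to/web/conf/azkaban.properties \
#                       /path/to/executor/conf/azkaban.properties
```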
4. Running everything
Start both the web server and the executor server to run Azkaban.
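Once both processes have been launched, a quick way to confirm they are accepting connections is to probe their ports. This is a sketch rather than an Azkaban feature: the function name wait_for_port is hypothetical, it relies on bash's /dev/tcp redirection, and the ports in the usage comments are the jetty.ssl.port and executor.port defaults.

```shell
# Wait until a TCP port accepts connections, or give up after a timeout.
#   $1: host, $2: port, $3: timeout in seconds (default 30)
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}"
  local waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # The subshell opens the connection on fd 3 and closes it on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Hypothetical usage after starting both servers:
#   wait_for_port localhost 12321 && echo "executor server is up"
#   wait_for_port localhost 8443  && echo "web server is up"
```

An open port only shows that the process is listening; logging in to the web UI is still the real test.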
Upgrading Azkaban
Running Azkaban as a web server and an executor server in separate processes makes it possible to do a rolling upgrade without shutting down long-lived jobs.
To do this, install the newer executor server and give it a different executor port. You will also need to update the executor server port in the web client's settings. After the web client is restarted, it should point to the new executor server.
Any jobs still running in the old executor should be detected automatically by the web client. Once the old executor has finished running its flows, it should be safe to shut that executor server down.
Note that the scheduler runs in the web server. If you shut down the web server, any jobs scheduled to run while it is down may be skipped.
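The port change at the heart of the upgrade can be scripted. The helper below is a sketch: set_property is a hypothetical name, port 12322 is an arbitrary example, the install paths are placeholders, and the file is assumed to use plain key=value lines with no spaces around the equals sign.

```shell
# Set key=value in a properties file, replacing an existing line for the key
# or appending a new one if the key is not present.
#   $1: properties file, $2: key, $3: new value
set_property() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  awk -F= -v k="$key" -v v="$value" '
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through the original path so its permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Hypothetical upgrade sequence:
#   set_property /path/to/new-executor/conf/azkaban.properties executor.port 12322
#   set_property /path/to/web/conf/azkaban.properties executor.port 12322
#   ...start the new executor server, then restart the web server...
```

Leave the old executor server running on its original port until its flows have finished.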