3/22/2018 7:43:04 PM
several flow trigger service enhancements:
Run context cancellation asynchronously, since cancelling a context may block (see the sketch after this list).
Send the failure email asynchronously when a flow trigger instance fails, since sending email is a blocking operation.
Don't rerun a flow trigger instance in the following state when the web server restarts: the trigger instance itself succeeded, but it failed to trigger the flow execution.
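A minimal sketch of the asynchronous cancellation, assuming a hypothetical DependencyInstanceContext with a blocking cancel() method; the names are illustrative, not Azkaban's actual API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncCancelSketch {

  /** Hypothetical stand-in for the real dependency instance context. */
  interface DependencyInstanceContext {
    void cancel();
  }

  // Dedicated executor so a blocking cancel() cannot stall the service's main thread.
  private final ExecutorService cancelExecutor = Executors.newSingleThreadExecutor();

  /** Submits the cancellation instead of invoking it inline. */
  public void cancelAsync(final DependencyInstanceContext context) {
    this.cancelExecutor.submit(() -> {
      try {
        context.cancel(); // may block; now runs on a worker thread
      } catch (final RuntimeException e) {
        // Log and continue; a failed cancel must not kill the worker thread.
        System.err.println("cancel failed: " + e.getMessage());
      }
    });
  }
}
```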
|
3/22/2018 7:42:01 PM
The initial motivation for keeping the project object as part of the trigger instance and persisting it in the DB was consistency: when a new project zip is uploaded, even if the web server restarts afterwards, the running trigger instance would still pick up the old version of the flow to execute.
However, due to the huge size of the project object, the cost of keeping a project object for each dependency instance/flow trigger in the DB is extremely high (suppose a big project contains hundreds of flows, one of which uses a flow trigger; then each row of execution_dependencies would contain a large blob field of project JSON, making deserialization costly when recovering incomplete trigger instances on web server restart).
So we decided to keep only the project id, instead of the project object, in the database, and to populate the project object of a dependency instance by looking the id up in the project database. The downside is that we might compromise the consistency mentioned above, since looking up by project id will always fetch the latest-uploaded project.
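A minimal sketch of the id-based design, with hypothetical Project, ProjectLoader, and DependencyInstance types standing in for the real model and persistence layer:

```java
/** Hypothetical stand-ins for Azkaban's project model and persistence layer. */
final class Project {
  final int id;
  Project(final int id) { this.id = id; }
}

interface ProjectLoader {
  /** Always returns the latest uploaded version of the project. */
  Project fetchProject(int projectId);
}

final class DependencyInstance {
  // Only the small, stable project id is persisted in execution_dependencies...
  private final int projectId;
  // ...while the heavyweight project object is resolved on demand.
  private Project project;

  DependencyInstance(final int projectId) {
    this.projectId = projectId;
  }

  /**
   * Resolves the project from its id. Note this always yields the
   * latest-uploaded project, which is the consistency trade-off above.
   */
  Project getProject(final ProjectLoader loader) {
    if (this.project == null) {
      this.project = loader.fetchProject(this.projectId);
    }
    return this.project;
  }
}
```

Resolving the project lazily from its id keeps each execution_dependencies row small, at the cost of always observing the latest-uploaded version.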
|
3/21/2018 8:54:17 PM
generate job link URL.
* Address comments.
* Change 'Hadoop/Spark Job Log' button color to red if the job fails.
|
3/21/2018 5:57:47 PM
The removed code is not used even by ReportalMailCreator.java in the azkaban-plugins repo.
|
3/21/2018 5:36:53 PM
better support for very large flows, we need to optimize the flow traversals that occur when loading and verifying a new project. The included unit test shows the issue: without the fixes it takes ~40s (on my laptop) to complete, but with the fixes it takes ~2s.
The first optimization prevents reprocessing nodes when executing constructFlow(). This is achieved by keeping track of all previously visited nodes and checking whether a node has already been visited before processing it.
The second optimization is in azkaban.flow.Flow#initialize. This code also traverses the DAG to set the edges and the levels of the nodes, and again the solution was to optimize the traversal so that we don't reprocess nodes that have already been processed. This was done using a breadth-first ordering so that the nodes at each level are visited only once. A node can exist at multiple levels and will be traversed multiple times in that case, but that is expected behaviour, since we want a node's level to be the MAX level it occupies in the flow.
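A minimal sketch of the level-assignment idea under these assumptions (the DAG is an adjacency map with string node ids; these are not azkaban.flow.Flow's actual fields). A node is re-enqueued only when a path raises its level, so nodes are not reprocessed needlessly while each node still ends up at its MAX level:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class FlowLevels {

  /**
   * Breadth-first pass over a DAG given as an adjacency map (node -> children).
   * Each node keeps the MAX level over all paths reaching it; a node is
   * re-enqueued only when its level actually increases.
   */
  static Map<String, Integer> assignLevels(
      final Map<String, List<String>> children, final Set<String> roots) {
    final Map<String, Integer> level = new HashMap<>();
    final Queue<String> queue = new ArrayDeque<>();
    for (final String root : roots) {
      level.put(root, 0);
      queue.add(root);
    }
    while (!queue.isEmpty()) {
      final String node = queue.poll();
      final int next = level.get(node) + 1;
      for (final String child : children.getOrDefault(node, List.of())) {
        // Only reprocess a child if this path raises its level.
        if (next > level.getOrDefault(child, -1)) {
          level.put(child, next);
          queue.add(child);
        }
      }
    }
    return level;
  }
}
```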
|
3/20/2018 4:22:47 PM
NPM, and moment version number updates
This change is because GitHub flagged the version of moment used by Azkaban as having a CVE. While updating moment, I also updated Node & NPM, as we now have a package-lock.json file to help manage dependencies.
|
3/18/2018 5:03:39 PM
over the execution graph pane"
|
3/13/2018 1:49:42 AM
Constants class (#1626)
* session.ttl config key moved to the Constants class (see the sketch below)
* Add unit tests
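A hypothetical sketch of what centralizing the key looks like; the actual constant name and class layout in Azkaban's Constants class may differ:

```java
// Hypothetical sketch of centralizing a config key in a Constants class.
public final class Constants {

  public static final class ConfigurationKeys {

    // Previously a raw string literal scattered across call sites.
    public static final String SESSION_TTL = "session.ttl";

    private ConfigurationKeys() {
    }
  }

  private Constants() {
  }
}
```

Callers then reference Constants.ConfigurationKeys.SESSION_TTL instead of repeating the raw string, so typos surface at compile time rather than at runtime.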
|
3/12/2018 6:45:25 PM
instance context
The dependency name and trigger instance id should be passed into the dependency instance config and runtime props when creating a dependency instance context, so that the context can obtain both values to identify itself.
E.g.:
A dependency instance context could put the trigger instance id and dependency name into its log messages, so a user can easily scope all logging produced by a particular dependency instance.
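A minimal, self-contained sketch of the idea, with hypothetical prop key names (trigger.instance.id, dependency.name) that may not match Azkaban's real keys:

```java
import java.util.HashMap;
import java.util.Map;

final class DependencyInstanceContextSketch {

  // Hypothetical prop keys; the real key names may differ.
  static final String TRIGGER_INSTANCE_ID_KEY = "trigger.instance.id";
  static final String DEPENDENCY_NAME_KEY = "dependency.name";

  private final String triggerInstanceId;
  private final String dependencyName;

  DependencyInstanceContextSketch(final Map<String, String> runtimeProps) {
    // The creator is expected to have populated both keys when building the props.
    this.triggerInstanceId = runtimeProps.get(TRIGGER_INSTANCE_ID_KEY);
    this.dependencyName = runtimeProps.get(DEPENDENCY_NAME_KEY);
  }

  /** Prefixes every message so logs can be scoped to one dependency instance. */
  void log(final String message) {
    System.out.printf("[trigger=%s dependency=%s] %s%n",
        this.triggerInstanceId, this.dependencyName, message);
  }

  public static void main(final String[] args) {
    final Map<String, String> props = new HashMap<>();
    props.put(TRIGGER_INSTANCE_ID_KEY, "ti-123");
    props.put(DEPENDENCY_NAME_KEY, "upstream-data-ready");
    new DependencyInstanceContextSketch(props).log("dependency check started");
  }
}
```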
|
3/9/2018 8:22:31 PM
by @HappyRay: #1654 (comment).
Refactoring only - no functional changes.
|