azkaban-aplcache

block deactivating executor until all flow preparation work …

3/1/2019 9:20:17 PM

3.69.0

completes (#2133)

This PR makes the deactivation process block until all flow preparation work finishes.

When a new executor is deployed, the old running executor is deactivated before the new one is activated, and only one executor is allowed to delete or hard-link project directories, to avoid race conditions (see #2130). Making the deactivation process block until flow preparation work finishes guarantees that the old executor won't access FlowPreparer#setup after deactivation.
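The blocking behaviour described above can be illustrated with a read/write lock. This is a minimal sketch, not the code from the PR: `PreparationGate`, `prepare`, and `deactivate` are hypothetical names chosen for illustration. Preparation work holds the shared (read) lock, so deactivation, which needs the exclusive (write) lock, waits until every in-flight preparation has returned.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper for illustration; not Azkaban's actual class.
final class PreparationGate {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private boolean active = true; // read under the read lock, written under the write lock

  /** Runs the setup work under the shared lock; returns false if already deactivated. */
  boolean prepare(Runnable setup) {
    lock.readLock().lock();
    try {
      if (!active) {
        return false; // deactivated executors must not start new preparation work
      }
      setup.run();
      return true;
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Blocks until all in-flight prepare() calls have returned, then marks the gate inactive. */
  void deactivate() {
    lock.writeLock().lock();
    try {
      active = false;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Several preparations can still run concurrently under this scheme, since the read lock is shared; only deactivation is exclusive.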

Cheng Ren

Commit: ecc7606

Tree: ee8f754

Parents: c1f2503

Issue #2135 - Improve Getting Started document page Property …

3/1/2019 1:22:30 PM

Overrides table format (#2136)

taox

Commit: c1f2503

Tree: 26a0707

Parents: 87b4cb3

Implement "useExecutor" feature for new dispatching logic …

2/28/2019 10:16:39 PM

(#2129)

* Implement "useExecutor" feature for new Dispatching Logic (Poll model)

When launching a new execution, an Azkaban admin can choose the executor for it by specifying a useExecutor parameter in the request.
New executions are inserted into the execution_flows table in the database. A newly added column in this table, "use_executor", holds the executor id passed as a parameter in the request. Active executors poll executions with (use_executor == null or use_executor == pollingExecutorId). Inactive executors only poll executions with their own id (use_executor == pollingExecutorId).
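The polling rule above can be written as a small predicate. This is a sketch of the rule only; the real selection presumably happens in the database query, and `PollFilter` and `canPoll` are hypothetical names.

```java
// Hypothetical model of the polling rule; names are illustrative only.
final class PollFilter {
  /**
   * @param useExecutor value of the use_executor column, or null if no executor was requested
   * @param pollingExecutorId id of the executor that is polling
   * @param executorActive whether the polling executor is currently active
   */
  static boolean canPoll(Integer useExecutor, int pollingExecutorId, boolean executorActive) {
    if (useExecutor != null) {
      // A pinned execution is only visible to the requested executor, active or not.
      return useExecutor == pollingExecutorId;
    }
    // Unpinned executions are only visible to active executors.
    return executorActive;
  }
}
```

For example, `PollFilter.canPoll(null, 1, false)` is false, while `PollFilter.canPoll(1, 1, false)` is true: an inactive executor can still pick up an execution pinned to it.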

Yeni Bermudez

Commit: 87b4cb3

Tree: 45accea

Parents: ce8191b

Adding an Ajax call endpoint to check if user has WRITE access …

2/27/2019 3:14:44 AM

to the project or not (#2131)

This PR adds an AJAX endpoint that reports whether the user has WRITE access to a given project. This API will be useful for **TuneIn**, which is a part of **Dr.Elephant**. Currently, TuneIn shows many details on the job page, such as suggested parameters for the job and the algorithm, and the user can modify these parameters. But since these properties are used by Azkaban (one of the workflow managers supported by Dr.Elephant), the user must be authorized to change them. For this purpose, Dr.Elephant will call this API with **session_id** (provided by Azkaban after successful authentication) and **project_name** as query params.

The API takes the Azkaban user's session_id and the project name as query params and returns a Boolean value indicating whether the user whose session_id was passed has WRITE access to that project. So, to find out whether a user has WRITE permission in a project, the client calling the API must have that user's session_id. As a result, a client cannot use this API to expose other Azkaban users' access to a project.

There are existing APIs such as `getPermission` and `fetchprojectusers` which return the users of a project and their permissions, but these APIs don't provide information about a `user who is not owner/user of project but is a part of group which has WRITE or greater permissions` in the project. Such a user can also WRITE in the project, but that cannot be determined with these APIs.
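The gap described above (a user who gets WRITE access only through a group) can be sketched as follows. This is a simplified model, not Azkaban's permission classes: `WriteAccessCheck`, the three-value `Permission` enum, and the map-based inputs are all assumptions made for illustration, and treating ADMIN as the only permission greater than WRITE is likewise an assumption.

```java
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical permission model for illustration only.
final class WriteAccessCheck {
  enum Permission { READ, WRITE, ADMIN }

  /** Assumption: WRITE and ADMIN are the permissions that allow writing. */
  static boolean grantsWrite(Permission p) {
    return p == Permission.WRITE || p == Permission.ADMIN;
  }

  /** True if the user can write either directly or through any group the user belongs to. */
  static boolean hasWriteAccess(
      String user,
      Set<String> userGroups,
      Map<String, Permission> userPermissions,
      Map<String, Permission> groupPermissions) {
    Permission direct = userPermissions.get(user);
    if (direct != null && grantsWrite(direct)) {
      return true;
    }
    for (String group : userGroups) {
      Permission viaGroup = groupPermissions.get(group);
      if (viaGroup != null && grantsWrite(viaGroup)) {
        return true; // the case a per-user permission listing would miss
      }
    }
    return false;
  }
}
```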

ShubhamGupta29

Commit: ce8191b

Tree: 3b07043

Parents: 41dd197

Refactor and bug fix on Job History page pagination (#2122) Fixes …

2/27/2019 3:07:23 AM

a pagination issue on the Job History page where it would show an additional empty page if the last page was full. For example:
-with page size 10 and 30 elements, it would show 4 pages, with the last one being empty.
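For the example above, the expected page count is the ceiling of elements divided by page size. A minimal sketch of that arithmetic follows; `PageCount.totalPages` is a hypothetical helper, not code from the PR.

```java
// Hypothetical helper illustrating ceiling division for page counts.
final class PageCount {
  static int totalPages(int totalItems, int pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive");
    }
    if (totalItems < 0) {
      throw new IllegalArgumentException("totalItems must not be negative");
    }
    // Integer ceiling of totalItems / pageSize; a full last page adds no extra page.
    return (totalItems + pageSize - 1) / pageSize;
  }
}
```

With 30 elements and page size 10 this yields 3 pages, and with 31 elements it yields 4.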

Simplifies the implementation by delegating common pagination functionality to the jQuery pagination plugin https://github.com/josecebe/twbs-pagination instead of doing it manually.

With this change we are still loading the entire page every time a user interacts with the pagination controls. Ultimately we want to create an API endpoint that returns pages as data, so that we only need to update the view with the new data, but this will not be done now because we are planning to redesign the existing APIs soon.

Yeni Bermudez

Commit: 41dd197

Tree: d9091fb

Parents: 2273b77

Unify jvm memory settings validation & improve upload error …

2/26/2019 11:02:12 PM

message (#2111)

If there's an empty value like `"job.max.Xmx="` in some properties, upload fails with this error:

> Installation Failed.
> For input string: ""

This doesn't help much because it doesn't even name the problematic property.

This PR improves it by returning a better error message, i.e. one that includes the property name.

Currently, a user would need to grab the stack trace, get the Azkaban source code, and find the line to know which property caused the upload failure.
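The idea of naming the offending property can be sketched as below. This is an illustration under assumptions, not the validation code from the PR: `MemoryPropertyCheck.parseMemory` is a hypothetical helper, and the accepted syntax (an integer with an optional K, M, or G suffix, in multiples of 1024) is assumed for the example.

```java
import java.util.Locale;

// Hypothetical validator; the accepted syntax is an assumption for illustration.
final class MemoryPropertyCheck {
  /** Parses values such as "512M" or "2G" into bytes, naming the property on failure. */
  static long parseMemory(String propertyName, String value) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException(
          "Property '" + propertyName + "' has an empty value");
    }
    String v = value.trim().toUpperCase(Locale.ROOT);
    long multiplier = 1L;
    char last = v.charAt(v.length() - 1);
    if (last == 'K') {
      multiplier = 1024L;
    } else if (last == 'M') {
      multiplier = 1024L * 1024L;
    } else if (last == 'G') {
      multiplier = 1024L * 1024L * 1024L;
    }
    String digits = (multiplier == 1L) ? v : v.substring(0, v.length() - 1);
    try {
      return Long.parseLong(digits) * multiplier;
    } catch (NumberFormatException e) {
      // Wrap the bare "For input string" failure with the property name.
      throw new IllegalArgumentException(
          "Property '" + propertyName + "' has an invalid memory value: \"" + value + "\"", e);
    }
  }
}
```

Calling `MemoryPropertyCheck.parseMemory("job.max.Xmx", "")` then fails with a message that mentions `job.max.Xmx` instead of only the empty input string.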

Juho Autio

Commit: 2273b77

Tree: 5eebe20

Parents: 77fcd8f

Add metrics for submission and time in queue (#2128) * Add …

2/26/2019 7:49:48 PM

metrics for submission and time in queue

Add the following new metrics:
- submit flow success, fail and skip
- queue wait time (time between when a flow is submitted and when an executor starts executing)
- flow setup time (time to set up a flow before executing).

* Add metrics for submission and time in queue

The time that a flow spends in PREPARING state is queue wait time + flow setup time. These metrics
will help give more insight into how much time is spent in preparing state, and in which phases.

Flow submission is when a user requests a flow to be executed, or when a flow is scheduled to run.
Flow submission will add the flow to the queue. Flow dispatch is when the flow is assigned to an
executor; currently this time also includes the time to set up the flow.
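The relationship stated above (time in PREPARING = queue wait time + flow setup time) can be sketched with three timestamps. The class and the exact measurement points (submission, start of setup on the executor, end of setup) are assumptions for illustration; the PR may measure at different points.

```java
// Hypothetical breakdown of PREPARING time; the timestamp boundaries are assumed.
final class PreparingBreakdown {
  final long queueWaitMs; // from submission until an executor starts setting up the flow
  final long setupMs;     // from start of setup until the flow is ready to execute

  PreparingBreakdown(long submitMs, long setupStartMs, long setupEndMs) {
    if (setupStartMs < submitMs || setupEndMs < setupStartMs) {
      throw new IllegalArgumentException("timestamps must be non-decreasing");
    }
    this.queueWaitMs = setupStartMs - submitMs;
    this.setupMs = setupEndMs - setupStartMs;
  }

  /** Total time spent in PREPARING state under the assumptions above. */
  long preparingMs() {
    return queueWaitMs + setupMs;
  }
}
```

A flow submitted at 1000 ms, picked up at 4000 ms, and ready at 4500 ms has a queue wait of 3000 ms, a setup time of 500 ms, and 3500 ms in PREPARING.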

* FlowPreparer refactor (#2130)

This PR refactors FlowPreparer in various aspects:

1. Allow multi-threaded project download for more concurrent flow preparation.
2. Synchronize project cache clean-up and execution directory creation (by hard-linking from the project directory) to avoid complicated race conditions which could arise when multiple threads are deleting/hard-linking the same project. (Note: this doesn't prevent multiple executor processes from interfering with each other and triggering race conditions, so it's important to operationally make sure that only one executor process is setting up flow execution against the shared project directory.)
3. Move project cache cleaning logic to a separate class for better testability.
4. Move from log4j to slf4j.

edwinalu

Commit: 77fcd8f

Tree: e8cc3b1

Parents: a75deeb

FlowPreparer refactor (#2130) This PR refactors FlowPreparer …

2/25/2019 9:52:42 PM

in various aspects:

1. Allow multi-threaded project download for more concurrent flow preparation.
2. Synchronize project cache clean-up and execution directory creation (by hard-linking from the project directory) to avoid complicated race conditions which could arise when multiple threads are deleting/hard-linking the same project. (Note: this doesn't prevent multiple executor processes from interfering with each other and triggering race conditions, so it's important to operationally make sure that only one executor process is setting up flow execution against the shared project directory.)
3. Move project cache cleaning logic to a separate class for better testability.
4. Move from log4j to slf4j.
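Item 2 above can be sketched as a single lock that guards both hard-linking and deletion, so that threads in one process cannot interleave them. This is an illustration, not the FlowPreparer code: `ProjectDirGuard` and its methods are hypothetical, and the sketch assumes a flat project directory on a file system that supports hard links. As the note in item 2 says, an in-process lock like this does nothing for multiple executor processes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; assumes a flat project directory and hard-link support.
final class ProjectDirGuard {
  private final Object lock = new Object();

  /** Hard-links every regular file in projectDir into executionDir; returns the number linked. */
  int linkInto(Path projectDir, Path executionDir) {
    synchronized (lock) {
      try {
        Files.createDirectories(executionDir);
        int linked = 0;
        for (Path src : listFiles(projectDir)) {
          Files.createLink(executionDir.resolve(src.getFileName()), src);
          linked++;
        }
        return linked;
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  /** Deletes the regular files in projectDir and then the directory itself. */
  void delete(Path projectDir) {
    synchronized (lock) {
      try {
        for (Path file : listFiles(projectDir)) {
          Files.delete(file);
        }
        Files.delete(projectDir);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  private static List<Path> listFiles(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.filter(Files::isRegularFile).collect(Collectors.toList());
    }
  }
}
```

Because both methods take the same lock, a clean-up cannot remove files from a project directory while another thread is halfway through linking them.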

Cheng Ren

Commit: a75deeb

Tree: e802a83

Parents: e8e531c

Optimize slow test TriggerInstanceProcessorTest#testProcessTermination …

2/18/2019 7:39:52 PM

(#2116)

Squashing the lowest hanging fruit from #2114.

1st commit: Prove that sendEmailLatch is not incremented in testProcessTermination

* Adding this assertion makes the test fail

2nd commit: Fix mocking of overloaded method -> test passes quickly now

It's actually best to leave the new assertion on CountDownLatch in place even though the problem was fixed, because it will make the test fail, instead of just being slow and succeeding, if the bug with skipping CountDownLatch is ever introduced again.
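One way a test can be slow yet still pass is when a timed wait on a latch expires and the result is ignored. The sketch below shows the kind of check that turns that situation into a failure; it is not the assertion from the PR, and `LatchAssertion.countedDownWithin` is a hypothetical helper.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; a test would assert that the returned value is true.
final class LatchAssertion {
  /** Returns true only if the latch reached zero within the timeout. */
  static boolean countedDownWithin(CountDownLatch latch, long timeoutMs) {
    try {
      // await(timeout, unit) returns false when the timeout elapses first.
      return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

If the mocked method never counts the latch down, the helper returns false after the timeout and the assertion fails, rather than the test continuing as if nothing happened.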

Juho Autio

Commit: e8e531c

Tree: 6ce758f

Parents: 3e5693a

Fix pagination bugs on Flow page: (#2120) -Next and Previous …

2/4/2019 5:47:39 PM

buttons don’t work when clicked
-Invalid pagination is generated in some cases. For example:
  * if the number of executions yields 3 pages in total, when the last page (3) is clicked the pagination displayed will be -1 0 1 2 3.

Separate the implementation of Flow Trigger tab from that of Executions tab by creating a Backbone Model and View for the former.
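The invalid pagination case described above (-1 0 1 2 3 for 3 pages) can be avoided by clamping the window of visible page numbers to the valid range. The following is a sketch of that clamping, not the Backbone code from the PR; `PageWindow.visiblePages` and the `maxVisible` parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating a clamped window of page numbers.
final class PageWindow {
  /** Returns the page numbers to display, centered on current where possible. */
  static List<Integer> visiblePages(int current, int totalPages, int maxVisible) {
    if (totalPages < 0 || maxVisible <= 0) {
      throw new IllegalArgumentException("totalPages must be >= 0 and maxVisible > 0");
    }
    List<Integer> pages = new ArrayList<>();
    if (totalPages == 0) {
      return pages;
    }
    if (current < 1 || current > totalPages) {
      throw new IllegalArgumentException("current must be between 1 and totalPages");
    }
    int size = Math.min(maxVisible, totalPages);
    int start = current - size / 2;
    // Clamp so the window never starts before page 1 or runs past the last page.
    start = Math.max(1, Math.min(start, totalPages - size + 1));
    for (int p = start; p < start + size; p++) {
      pages.add(p);
    }
    return pages;
  }
}
```

With 3 pages in total and page 3 selected, `PageWindow.visiblePages(3, 3, 5)` gives 1 2 3.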

Yeni Bermudez

Commit: 3e5693a

Tree: 48004d3

Parents: c68e1a3