azkaban-aplcache

Add a test to run a v2 flow file with the new DAG engine (#1840) * …

7/9/2018 7:14:20 PM

3.49.0

Add a test to run a v2 flow file with the new DAG engine

This is a step towards building an integration test for the new DAG engine.

Refactored the DagBuilder API to make it easier to use.
Instead of using a DagBuilder class to link nodes, use the names of the
nodes directly.
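To illustrate what linking nodes by name can look like, here is a minimal sketch. It is not the actual DagBuilder API from this commit: the class `NameBasedDagSketch` and its methods `addNode`, `addParentChild`, and `childrenOf` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; NOT Azkaban's DagBuilder API.
final class NameBasedDagSketch {
  // Maps each node name to the names of its children, in insertion order.
  private final Map<String, List<String>> children = new LinkedHashMap<>();

  // Registers a node under a unique name.
  NameBasedDagSketch addNode(String name) {
    if (children.putIfAbsent(name, new ArrayList<>()) != null) {
      throw new IllegalArgumentException("Duplicate node: " + name);
    }
    return this;
  }

  // Links two previously added nodes by name; callers never handle node objects.
  NameBasedDagSketch addParentChild(String parent, String child) {
    if (!children.containsKey(parent) || !children.containsKey(child)) {
      throw new IllegalArgumentException("Unknown node in edge " + parent + " -> " + child);
    }
    children.get(parent).add(child);
    return this;
  }

  // Returns a copy of the child names of the given node (empty if unknown).
  List<String> childrenOf(String name) {
    return new ArrayList<>(children.getOrDefault(name, new ArrayList<>()));
  }

  public static void main(String[] args) {
    NameBasedDagSketch dag = new NameBasedDagSketch()
        .addNode("extract")
        .addNode("load")
        .addParentChild("extract", "load");
    System.out.println(dag.childrenOf("extract")); // prints [load]
  }
}
```

Referring to nodes by name keeps test setup short, since a test can describe a whole flow with strings.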

This facility can also be used in the future to build the tools to run
flows locally for testing purposes.

* Fix a copy and paste error.

Should check that the dagProcessor parameter is non-null.

HappyRay

Commit: e49b22d

Tree: d2fbb59

Parents: 4b9dcce

Revert changes related to project LRU cache (#1841) More …

7/9/2018 7:07:52 PM

improvements need to be made to the project LRU cache feature:

1. The feature needs to be turned off by default.
2. Running the disk usage calculation on our production clusters, whose project dir size is 1.8T, takes >= 10 mins.
3. The cleaning thread is likely to hang, probably due to https://stackoverflow.com/questions/13008526/runtime-getruntime-execcmd-hanging.
4. An alternative design is being considered:
Create a file in each project directory and write the size of the project to the file when the project is created. The project files are not supposed to change after creation. Touch this file each time the project is used. This way, we can have a more efficient LRU algorithm based on last access time, not creation time.

Maintain the total size of the project cache in memory to avoid the overhead of re-calculating it. The size shouldn't change too often. This way we can afford to run the check more frequently.
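
The alternative design above could be sketched roughly as follows. This is an illustration under assumptions, not the planned implementation: the class `ProjectCacheSketch`, the marker file name `___size___`, and all method names are hypothetical, and the access time is passed in explicitly so the behavior is easy to follow.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of the size-file + last-access LRU idea; not Azkaban code.
final class ProjectCacheSketch {
  static final String SIZE_FILE = "___size___"; // assumed marker file name
  private final Path cacheRoot;
  private long totalBytes; // kept in memory, so no disk scan is needed per check

  ProjectCacheSketch(Path cacheRoot) {
    this.cacheRoot = cacheRoot;
  }

  // Called once when a project directory is created: record its size on disk.
  synchronized void onProjectCreated(Path projectDir, long sizeBytes) throws IOException {
    Files.write(projectDir.resolve(SIZE_FILE),
        Long.toString(sizeBytes).getBytes(StandardCharsets.UTF_8));
    totalBytes += sizeBytes;
  }

  // Called each time a project is used: "touch" the marker file.
  void onProjectUsed(Path projectDir, long nowMillis) throws IOException {
    Files.setLastModifiedTime(projectDir.resolve(SIZE_FILE), FileTime.fromMillis(nowMillis));
  }

  synchronized long totalBytes() {
    return totalBytes;
  }

  // Deletes least recently used projects until the cache fits within maxBytes.
  synchronized List<Path> evictUntilWithin(long maxBytes) throws IOException {
    List<Path> markers = new ArrayList<>();
    try (Stream<Path> dirs = Files.list(cacheRoot)) {
      dirs.map(d -> d.resolve(SIZE_FILE)).filter(Files::isRegularFile).forEach(markers::add);
    }
    // Oldest marker modification time first, i.e. least recently used first.
    markers.sort(Comparator.comparingLong(m -> m.toFile().lastModified()));
    List<Path> evicted = new ArrayList<>();
    for (Path marker : markers) {
      if (totalBytes <= maxBytes) {
        break;
      }
      long size = Long.parseLong(
          new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim());
      deleteRecursively(marker.getParent());
      totalBytes -= size;
      evicted.add(marker.getParent());
    }
    return evicted;
  }

  private static void deleteRecursively(Path dir) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts files before the directories that contain them.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

Because the sizes are read from the small marker files and the running total lives in memory, a check only lists the cache root instead of walking every project directory. A real implementation would also need to rebuild the total from the marker files on startup, which this sketch omits.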


Revert the change for now to get the release going; the improvements will be made later.

Cheng Ren

Commit: 4b9dcce

Tree: 883ba0e

Parents: 1c9e645

Solo server to start executor first (#1831) This is the correct …

7/9/2018 6:12:59 PM

dependency order. Otherwise the web server may try to launch an execution that fails because the executor server hasn't started yet.

Also, do not start ExecutorManager's threads (which periodically call executors to get updates) until the web server is started. Those threads used to be started as soon as Guice injection created the ExecutorManager instance (which happens before the executor server is started, even now that the order is fixed).

This is one step towards having everything run in "multi-executor" mode.

As you may know, currently there are two possible modes, controlled by azkaban.use.multiple.executors=true|false:

* multi-executor mode, based on the executors DB table
* single executor mode, based on fixed props executor.host & executor.port

That adds extra complexity, and different branches of code are executed depending on the mode. However, everything can just as well be run as "multi-executor" (with just one executor if that's what you want, and naturally so in the case of solo-server). It also makes it easier to reach better test coverage when testing is always done in the same "mode" that's used in production configurations.

I also noticed that ExecutorManager loads executors from the DB at the time of Guice injection (before the executor server is started). This may need to be changed as well to properly support multi-executor mode on a single server.

Juho Autio

Commit: 1c9e645

Tree: f9f4312

Parents: c851948

Avoid sending metrics for email sending failures caused by …

6/29/2018 6:54:09 PM

invalid address from users. (#1827)

Jamie Sun

Commit: c851948

Tree: 25a7d76

Parents: f8f47f8

Remove unused method fetchActiveFlowByExecId (#1832)

6/29/2018 6:05:29 PM

Juho Autio

Commit: f8f47f8

Tree: 3c1792e

Parents: 4a847bb

Remove the single end node restriction for a flow. (#1821)

6/28/2018 2:02:01 PM

Jamie Sun

Commit: 4a847bb

Tree: d1dd1c7

Parents: 7efe67b

when trigger is still running, endtime should show "-" (#1814) Before …

6/22/2018 10:00:47 PM

endtime was showing the current time while the trigger instance was running.

Cheng Ren

Commit: 7efe67b

Tree: f457cc0

Parents: 21dc5e9

fix node level computation of a flow (#1794) * fix node level …

6/21/2018 5:24:38 PM

computation of a flow

level++ returns the level variable BEFORE it is incremented. This means
the level of a node is never incremented and is always zero.

Hence, we need to increment the value by one before passing it to the
function.
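
The post-increment pitfall described above can be reproduced with a small sketch. This is not the Azkaban code that was fixed: the class `NodeLevelSketch`, its methods, and the three-node chain are hypothetical, and `level + 1` is just one way to pass the incremented value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the level++ bug on a chain a -> b -> c.
final class NodeLevelSketch {
  // Buggy: level++ evaluates to the OLD value, so every node receives level 0.
  static void assignBuggy(String node, int level, Map<String, String> next,
      Map<String, Integer> out) {
    if (node == null) {
      return;
    }
    out.put(node, level);
    assignBuggy(next.get(node), level++, next, out);
  }

  // Fixed: pass the already incremented value to the recursive call.
  static void assignFixed(String node, int level, Map<String, String> next,
      Map<String, Integer> out) {
    if (node == null) {
      return;
    }
    out.put(node, level);
    assignFixed(next.get(node), level + 1, next, out);
  }

  // A simple chain: each node maps to its single successor.
  static Map<String, String> chain() {
    Map<String, String> next = new HashMap<>();
    next.put("a", "b");
    next.put("b", "c");
    return next;
  }

  public static void main(String[] args) {
    Map<String, Integer> buggy = new TreeMap<>();
    assignBuggy("a", 0, chain(), buggy);
    System.out.println(buggy); // prints {a=0, b=0, c=0}

    Map<String, Integer> fixed = new TreeMap<>();
    assignFixed("a", 0, chain(), fixed);
    System.out.println(fixed); // prints {a=0, b=1, c=2}
  }
}
```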

Fixes #1793.

* followup: add tests for node level computation

* followup: refactor test according to review comments

Sami Jaktholm

Commit: 21dc5e9

Tree: ec0fe34

Parents: 41a1e35

Make testCancelAfterJobProcessCreation more reliable (#1810) Issue: The …

6/20/2018 9:15:46 PM

test failed today on trunk

azkaban.jobExecutor.ProcessJobTest > testCancelAfterJobProcessCreation FAILED
    org.junit.ComparisonFailure: expected:<[fals]e> but was:<[tru]e>
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at azkaban.jobExecutor.ProcessJobTest.testCancelAfterJobProcessCreation(ProcessJobTest.java:228)

Cause:

I suspect the Travis CI machine was too busy to cancel
the job before it finished.

Solution:

Increase the running time of the job from 1 second to 5 seconds.
This will not increase the test running time when the test is successful
since the job will be canceled as soon as possible.
However, it will increase the test running time if the cancel logic
doesn't interrupt the job as expected.
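
The timing argument can be illustrated with a simplified stand-in. This sketch uses a sleeping thread instead of a real job process, and the class `CancelTimingSketch` and its method `runAndCancel` are hypothetical; it is not the ProcessJobTest code.

```java
// Hypothetical stand-in for a cancelable job; NOT the actual ProcessJobTest.
final class CancelTimingSketch {
  // Starts a "job" that sleeps for jobMillis, cancels it after cancelAfterMillis,
  // and returns the total elapsed time in milliseconds.
  static long runAndCancel(long jobMillis, long cancelAfterMillis)
      throws InterruptedException {
    Thread job = new Thread(() -> {
      try {
        Thread.sleep(jobMillis);
      } catch (InterruptedException e) {
        // Canceled: exit early instead of finishing the full sleep.
      }
    });
    long start = System.nanoTime();
    job.start();
    Thread.sleep(cancelAfterMillis);
    job.interrupt(); // the "cancel" step
    job.join();
    return (System.nanoTime() - start) / 1_000_000;
  }

  public static void main(String[] args) throws InterruptedException {
    // A longer job leaves more room for a slow cancel without slowing the success path.
    long elapsed = runAndCancel(5000, 100);
    System.out.println("elapsed ms: " + elapsed);
  }
}
```

When the cancel works, the elapsed time stays close to the cancel delay regardless of the job length. Only a broken cancel makes the caller wait for the full job duration.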

Also removed the printout of the stacktrace from the job. This information
doesn't deliver much value and only makes the log noisier.

HappyRay

Commit: 41a1e35

Tree: 3aa03c9

Parents: fb2e72e

adding text to indicate that columns are sortable (#1795) As …

6/20/2018 9:00:02 PM

mentioned in issue #1778, having a tooltip on the table would let the user know that it can be sorted when the header is clicked. This pull request adds the tooltip to the scheduling page and the history page, which are the most used.

This duplicates PR #1780, but from a different branch: I moved these changes from master to sort-text in my repo.

Ryan

Commit: fb2e72e

Tree: 168cf86

Parents: e9dfd50