azkaban-aplcache

uploadExecutableNode improvement (#1166) Include stack …

6/5/2017 5:23:50 PM

trace in error logging.

Juho Autio

Commit: b9b9327

Tree: 7aed30f

Parents: 7c52eda

EndDate Expiring Schedule Introduction (#1110) * Implementing …

6/5/2017 3:40:41 PM

End Date feature to AZ schedules

Today, when we specify a flow schedule in Azkaban, the flow runs on that schedule indefinitely. As #721 proposes, giving a schedule an end date makes flows more flexible to manage. In this PR, I implement the end date feature so that users can specify an expiration date for a schedule. The UI is not changed in this patch (and will not be in the near future), but users can call the API to use the new feature.

The API command sample:
`curl -k -d ajax=scheduleCronFlow -d projectName=wtwt -d flow=azkaban -d endSchedTime=14XXXXXXXXXX --data-urlencode cronExpression="0 * * ? * 6" -b "azkaban.browser.session.id=XXXXXXXXXXXXXX" http://localhost:8081/schedule`

I reuse BasicTimeChecker as part of the expire condition so that old Azkaban flows do not break. I also added one parameter, `endSchedTime`, to the schedule API. If users don't specify this parameter, Azkaban uses the default value of `01/01/2050`, so the schedule runs almost indefinitely. I also added as many unit tests as possible to guard against regressions.
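
The default-end-date behaviour described above could be sketched roughly as follows. This is only an illustration, not the code in BasicTimeChecker: the class `EndDateSketch` and the methods `resolveEndTime` and `isExpired` are hypothetical, and the sketch assumes that `endSchedTime` is an epoch timestamp in milliseconds and that the `01/01/2050` default means midnight UTC.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

class EndDateSketch {
  // Assumed default: 01/01/2050 at midnight UTC, expressed as epoch milliseconds.
  static final long DEFAULT_END_MILLIS =
      LocalDate.of(2050, 1, 1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();

  // Returns the caller's end time, or the far-future default when the parameter is absent.
  static long resolveEndTime(final String endSchedTimeParam) {
    if (endSchedTimeParam == null || endSchedTimeParam.isEmpty()) {
      return DEFAULT_END_MILLIS;
    }
    return Long.parseLong(endSchedTimeParam);
  }

  // A schedule counts as expired once the current time has reached its end time.
  static boolean isExpired(final long nowMillis, final long endMillis) {
    return nowMillis >= endMillis;
  }

  public static void main(final String[] args) {
    final long end = resolveEndTime(null); // no endSchedTime given, so the default applies
    System.out.println(isExpired(System.currentTimeMillis(), end));
  }
}
```

With this shape, existing schedules that never pass `endSchedTime` keep running as before, because the default end time lies far in the future.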

Liang Tang

Commit: 7c52eda

Tree: 2e5e7b9

Parents: e480ea5

Fixing DefaultCharset warnings in azkaban-common (#1164) * …

6/5/2017 3:32:52 PM

Fixing DefaultCharset warnings in azkaban-common

* Used #1045 to base changes off of

Charlie Summers

Commit: e480ea5

Tree: a31dd80

Parents: ce9e4a2

Test cleanup: rename checkEventExists -> assertEvents (#1153)

6/5/2017 12:22:02 PM

Juho Autio

Commit: ce9e4a2

Tree: cbb9fdf

Parents: 6d3b081

Change InteractiveTestJob.testJobs back to private (#1163)

6/5/2017 12:16:13 PM

Juho Autio

Commit: 6d3b081

Tree: ebac953

Parents: a21b1cb

Test stability: FlowRunnerTest2 (#1162) See also #1160. Clear …

6/4/2017 6:03:28 PM

previous test jobs so that .succeedJob() is called on the retried job instance.
So now we can remove all @Ignores.

How does this help?

When a job run is started, the created job type instance gets added into InteractiveTestJob's map. If a job is retried, another instance is created, but for InteractiveTestJob the map key (the job name) is the same.

My commit message says:

Clear previous test jobs so that .succeedJob() is called on the retried job instance.

So occasionally InteractiveTestJob.getTestJob("jobb:innerJobB").succeedJob was called on the original failed instance. The retry attempt's instance was added into InteractiveTestJob's map only after that, so succeedJob was never called on it and it was left in the RUNNING state.
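
The stale-instance problem can be illustrated with a small registry keyed by job name. This is a hypothetical sketch, not the InteractiveTestJob code: `TestJobRegistrySketch`, `FakeJob`, `register` and `clearTestJobs` are names invented for the illustration, and only `getTestJob` and `succeedJob` mirror names from the message above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class TestJobRegistrySketch {
  // Hypothetical stand-in for a test job that stays RUNNING until told to succeed.
  static final class FakeJob {
    final int attempt;
    volatile boolean succeeded;

    FakeJob(final int attempt) {
      this.attempt = attempt;
    }

    void succeedJob() {
      this.succeeded = true;
    }
  }

  // Keyed by job name only, so a retry attempt shares its key with the first attempt.
  private static final ConcurrentMap<String, FakeJob> JOBS = new ConcurrentHashMap<>();

  static void register(final String name, final FakeJob job) {
    JOBS.put(name, job);
  }

  static FakeJob getTestJob(final String name) {
    return JOBS.get(name);
  }

  // Removes earlier instances so a later lookup cannot return a stale attempt.
  static void clearTestJobs(final String... names) {
    for (final String name : names) {
      JOBS.remove(name);
    }
  }

  public static void main(final String[] args) {
    register("jobb:innerJobB", new FakeJob(0)); // first attempt, which fails
    clearTestJobs("jobb:innerJobB");            // drop the failed instance before the retry
    register("jobb:innerJobB", new FakeJob(1)); // retry attempt registers under the same key
    getTestJob("jobb:innerJobB").succeedJob();  // now reaches the retry instance
    System.out.println(getTestJob("jobb:innerJobB").attempt);
  }
}
```

After the clear, a lookup returns nothing until the retry registers itself, so a test can wait for a non-null result instead of racing against the old entry.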

Juho Autio

Commit: a21b1cb

Tree: d405a16

Parents: 0aeced8

Test cleanup: FlowRunnerTestBase (#1154) Simplify and …

6/4/2017 5:16:59 PM

remove duplicate code

Juho Autio

Commit: 0aeced8

Tree: e33def2

Parents: e70459e

Fix JobRunner bug when job is cancelled early (#1148) This …

6/4/2017 5:01:01 PM

bug was revealed by JobRunnerTest#testDelayedExecutionCancelledJob:

If a job was cancelled early, there was no initial "upload" for the ExecutableNode, just an eventual "update". I believe this would also fail when running against the DB, because there would be an UPDATE without a preceding INSERT.

* Added an assert that caught the error before the fix in JobRunner
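
The kind of assert mentioned above might look like the following in-memory store, which rejects an update that has no preceding upload. This is a hypothetical sketch rather than the actual mock loader or JobRunner code: `NodeStoreSketch`, `uploadNode`, `updateNode` and `runJob` are invented names.

```java
import java.util.HashSet;
import java.util.Set;

class NodeStoreSketch {
  private final Set<String> uploaded = new HashSet<>();

  // Analogous to an INSERT: records that a row for this node now exists.
  void uploadNode(final String nodeId) {
    this.uploaded.add(nodeId);
  }

  // Analogous to an UPDATE: fails loudly when the node was never uploaded.
  void updateNode(final String nodeId) {
    if (!this.uploaded.contains(nodeId)) {
      throw new IllegalStateException("update without upload: " + nodeId);
    }
  }

  // Hypothetical job run: the upload happens first, even when the job is cancelled early.
  static void runJob(final NodeStoreSketch store, final String nodeId,
      final boolean cancelledEarly) {
    store.uploadNode(nodeId);
    if (!cancelledEarly) {
      // The real work of the job would run here.
    }
    store.updateNode(nodeId); // always has a preceding upload
  }

  public static void main(final String[] args) {
    runJob(new NodeStoreSketch(), "job1", true);
    System.out.println("early cancel handled without an orphan update");
  }
}
```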

Juho Autio

Commit: e70459e

Tree: 8882bde

Parents: 69c4140

Test stability: FlowRunnerTest2 (#1158) Calling ExecutableFlowBase#isFlowFinished …

6/4/2017 4:56:26 PM

from the test thread wasn't a good idea.

In rare cases it failed with this exception (see #1157 (comment)):

azkaban.execapp.FlowRunnerTest2 > testCancel FAILED
    java.lang.NullPointerException
        at azkaban.executor.ExecutableFlowBase.isFlowFinished(ExecutableFlowBase.java:398)
        at azkaban.execapp.FlowRunnerTestBase.lambda$assertThreadShutDown$0(FlowRunnerTestBase.java:32)
        at azkaban.execapp.FlowRunnerTestBase$$Lambda$9/1157486615.apply(Unknown Source)
        at azkaban.execapp.FlowRunnerTestBase.waitFlowRunner(FlowRunnerTestBase.java:42)
        at azkaban.execapp.FlowRunnerTestBase.assertThreadShutDown(FlowRunnerTestBase.java:31)
        at azkaban.execapp.FlowRunnerTest2.testCancel(FlowRunnerTest2.java:729)

Extract from ExecutableFlowBase (this method is actually deleted in this PR):

    /**
     * Only returns true if the status of all finished nodes is true.
     */
    public boolean isFlowFinished() {
      for (final String end : getEndNodes()) {
        final ExecutableNode node = getExecutableNode(end);

        // THIS IS WHERE THE NPE HAPPENED, so node was null
        if (!Status.isStatusFinished(node.getStatus())) {
          return false;
        }
      }
      return true;
    }

It must be that the NPE happened because this method was being called from the test main thread, and there is no synchronization on the "end nodes". So it was somehow possible for the test to get the name of an end node but then fail to get the ExecutableNode matching it.

I used isFlowFinished() (it existed previously; I didn't add it) in the test because it seemed like a handy way to check for completion, but it turned out not to work reliably. I have now realized that this method is not used anywhere else, so it can simply be removed.

Now checking flow completion in a thread-safe way.

Also delete unused methods in ExecutableFlowBase:

isFlowFinished
findNextJobsToRun
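
One way to check completion from a test thread without reading unsynchronized flow internals is to poll a single thread-safe flag with a timeout. The sketch below is an assumption about the general approach, not the code in this PR: `WaitSketch` and `waitFor` are hypothetical names, and the `AtomicBoolean` stands in for whatever thread-safe signal the flow runner exposes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class WaitSketch {
  // Polls the condition until it holds or the timeout elapses; returns whether it held.
  static boolean waitFor(final BooleanSupplier condition, final long timeoutMillis) {
    final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out before the condition became true
      }
      try {
        Thread.sleep(10);
      } catch (final InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(final String[] args) throws InterruptedException {
    // The worker publishes completion through a thread-safe flag.
    final AtomicBoolean finished = new AtomicBoolean(false);
    final Thread worker = new Thread(() -> finished.set(true));
    worker.start();
    System.out.println(waitFor(finished::get, 2000));
    worker.join();
  }
}
```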

Juho Autio

Commit: 69c4140

Tree: e8c3f77

Parents: 02d36bc

Make ExecutableNode.status volatile (#1159) This gives …

6/4/2017 4:34:25 PM

guarantees, especially for unit tests that need to check statuses in a multi-threaded environment. The FlowRunner thread and the JobRunner threads also benefit from this, although I haven't been able to prove that there were any critical issues there.
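
The visibility guarantee can be illustrated with a minimal sketch. The types here are hypothetical stand-ins, not the real ExecutableNode or Status classes: a status written by one thread is read in a loop by another, and `volatile` is what guarantees that the reading thread eventually sees the write.

```java
class VolatileStatusSketch {
  enum Status { READY, RUNNING, SUCCEEDED }

  static final class Node {
    // volatile: a write by one thread is visible to subsequent reads in other threads.
    private volatile Status status = Status.READY;

    Status getStatus() {
      return this.status;
    }

    void setStatus(final Status status) {
      this.status = status;
    }
  }

  // Starts a runner thread that sets the status, then waits until the change is observed.
  static Status runAndObserve() {
    final Node node = new Node();
    final Thread runner = new Thread(() -> node.setStatus(Status.SUCCEEDED));
    runner.start();
    // Without volatile, this loop would have no guarantee of ever seeing the new value.
    while (node.getStatus() != Status.SUCCEEDED) {
      Thread.yield();
    }
    return node.getStatus();
  }

  public static void main(final String[] args) {
    System.out.println(runAndObserve());
  }
}
```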

Juho Autio

Commit: 02d36bc

Tree: 2f8af42

Parents: 3eb3ddb