7/28/2017 9:46:55 PM
disk space (#1295)
The execution directory can grow relatively large, which can cause problems if the executor is shut down and stale executions are left sitting on the executor, occupying disk space.
This change deletes the execution directory on executor shutdown. The directory is created on executor initialization, so it does not need to be preserved for the executor to start again later.
Note that if the executor is brought down via shutdown(), the execution directory is not deleted until AFTER all flows are done, so partially written logs are not deleted before they are uploaded to the database.
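A minimal sketch of the ordering described above, assuming a hypothetical helper class (ExecutionDirCleaner, onShutdown and awaitFlows are illustrative names, not the actual Azkaban code): wait for running flows first, then remove the execution directory recursively.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not the class used in this change.
final class ExecutionDirCleaner {

  private ExecutionDirCleaner() {}

  // Recursively deletes dir and everything below it. A missing directory is a no-op.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order so that children are deleted before their parent directory.
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }

  // Assumed shutdown sequence: awaitFlows blocks until all running flows are
  // done (and their logs uploaded); only then is the directory removed.
  static void onShutdown(Path executionDir, Runnable awaitFlows) throws IOException {
    awaitFlows.run();
    deleteRecursively(executionDir);
  }
}
```

Because the directory is recreated on executor initialization, nothing in it needs to survive this call.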
|
7/28/2017 1:38:48 PM
execution cleanup (#1292)
Setting the setgid bit on a directory causes all items created within that directory to inherit the group of the directory. So all items users create in their /execution/<exec_id> directory will automatically be a part of the azkaban group. This allows the azkaban cleanup thread to properly remove user-generated files/directories.
The setgid bit is system-specific, so there is no Java standard library API for setting it. The solution I proposed spawns a subprocess that runs the chmod command. I don't think this is very clean, but I haven't been able to find a better way. Does anybody have any ideas?
Another option would be to build this executionDirectory as part of our build process, but that isn't how we do things right now (tested by running deploy on holdem4 jenkins job without any of the other steps and verifying no /executions directory is created until system start).
Note that this change is not covered by tests due to the difficulty of dealing simultaneously with filesystems and subprocesses within a testing environment. If disk space usage increases in the future, it is possible that this change has regressed. In order to confirm that this change is working in production on particularly large clusters (where problems have been seen), I'll be keeping an eye on it when it's released.
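For reference, the subprocess approach described above could look roughly like the sketch below. SetgidHelper and its method names are hypothetical and not the code in this change; the only external assumption is that a chmod binary is on the PATH, where "chmod g+s" sets the setgid bit.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the class used in this change.
final class SetgidHelper {

  private SetgidHelper() {}

  // Builds the command line; "g+s" asks chmod to set the setgid bit.
  static List<String> chmodCommand(Path dir) {
    return Arrays.asList("chmod", "g+s", dir.toAbsolutePath().toString());
  }

  // Runs chmod in a subprocess and fails loudly on a non-zero exit code.
  static void setGroupIdBit(Path dir) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(chmodCommand(dir))
        .inheritIO() // surface chmod's error output in the parent's streams
        .start();
    int exitCode = process.waitFor();
    if (exitCode != 0) {
      throw new IOException("chmod g+s failed with exit code " + exitCode + " for " + dir);
    }
  }
}
```

Keeping the command construction separate from the process launch makes at least that part checkable without touching the filesystem.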
|
7/27/2017 10:46:17 PM
would reduce the memory overhead for session to some extent so that we can increase the size of the session cache further.
|
7/27/2017 8:26:53 PM
Add metrics for sending email success/failure.
* Guicify Emailer class and refactor some test cases.
|
7/27/2017 6:49:31 PM
in DB. (#1288)
|
7/25/2017 6:40:06 PM
to classes' annotations
This patch refactors the Guice usage, mainly moving all singleton
bindings to their respective classes via singleton annotations. The corresponding
tests are added as well. A bit more context is at #1285.
|
7/25/2017 12:31:14 AM
type for killing a job
The action is to kill a job and retry it based on the retry configuration of that job.
Previously, only killing a flow was allowed when an SLA was missed, even if the SLA was set at the job level.
The new action kicks in when a user sets an SLA rule on a job, and enforces the kill action when the SLA is missed. There's no UI change.
Tested manually with the following flows:
jobA(retry num: 2)->jobB(retry num: 2)
SLA rule: if job A doesn't succeed in 1 min, kill the job
SLA rule: if job B doesn't succeed in 1 min, kill the job
jobA->jobB, jobA->jobC, jobB->jobD, B retry number is set to 2.
SLA rule: if job B doesn't succeed in 1 min, kill it.
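The kill-and-retry behaviour exercised by the flows above can be sketched as a small decision function. This is only an illustration under assumed names (JobSlaSketch, Decision, onSlaCheck and the retriesUsed counter are all hypothetical), not the actual implementation.

```java
import java.time.Duration;

// Hypothetical sketch for illustration only; not the class used in this change.
final class JobSlaSketch {

  enum Decision { NONE, KILL_AND_RETRY, KILL_AND_FAIL }

  private JobSlaSketch() {}

  // elapsed:     how long the current attempt of the job has been running
  // slaLimit:    the SLA duration set on the job (1 min in the flows above)
  // succeeded:   whether the job has already succeeded
  // retriesUsed: retries consumed so far (0 on the first attempt)
  // maxRetries:  the job's configured retry number
  static Decision onSlaCheck(Duration elapsed, Duration slaLimit, boolean succeeded,
                             int retriesUsed, int maxRetries) {
    if (succeeded || elapsed.compareTo(slaLimit) <= 0) {
      return Decision.NONE; // SLA not missed, nothing to do
    }
    // SLA missed: kill the job, then retry only if the retry budget allows it.
    return retriesUsed < maxRetries ? Decision.KILL_AND_RETRY : Decision.KILL_AND_FAIL;
  }
}
```

Under these assumptions, a job with retry number 2 would be killed and retried twice before the third SLA miss fails it.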
|
7/24/2017 2:22:05 PM
(#1279)
|
7/23/2017 6:28:08 PM
is intended to prevent the bug that happened in #1283. I simply annotate ProjectManager as a singleton in this patch. When developers guicify other classes that rely on ProjectManager in the future, they will not need to worry about introducing bugs.
|
|