8/10/2017 8:04:13 PM
trigger (#1328)
Previously, TriggerManager did not persist a schedule when it was updated (e.g. when a new SLA rule was added to it). The schedule was persisted only when it was triggered. So if we added an SLA rule to a schedule or did a reschedule, and then restarted the web server, these updates were lost.
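The write-on-update behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual TriggerManager code: `ScheduleStore` and `WriteThroughScheduleManager` are hypothetical names, and the schedule is reduced to a plain string.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the persistence layer; not Azkaban's real loader API.
interface ScheduleStore {
  void save(int scheduleId, String definition);
}

// Hypothetical, simplified manager that persists on every update.
class WriteThroughScheduleManager {
  private final Map<Integer, String> inMemory = new HashMap<>();
  private final ScheduleStore store;

  WriteThroughScheduleManager(ScheduleStore store) {
    this.store = store;
  }

  // Update the in-memory copy and persist it right away, so the change
  // survives a restart even if the schedule never fires in between.
  synchronized void updateSchedule(int scheduleId, String definition) {
    inMemory.put(scheduleId, definition);
    store.save(scheduleId, definition);
  }

  synchronized String get(int scheduleId) {
    return inMemory.get(scheduleId);
  }
}
```

The point of the sketch is only that the save happens in the update path rather than in the trigger path.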
|
8/10/2017 4:52:15 PM
instance. (#1320)
Current features:
- Version
- Used memory
- Max memory: Xmx value
- Database availability: SELECT 1 check
- Map of current executors
Example:
```
{
  "version": "3.33.0-11-g1ac90f03",
  "installationPath": "/full/path/to/azkaban-solo-server/lib/azkaban-web-server-3.33.0-11-g1ac90f03.jar",
  "usedMemory": 65144664,
  "xmx": 3817865216,
  "isDatabaseUp": true,
  "executorStatusMap": {
    "104": {
      "id": 104,
      "host": "abc.def.company.com",
      "port": 15462,
      "isActive": true
    }
  }
}
```
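As a rough illustration of where the two memory fields could come from, the sketch below reads them from the standard `java.lang.Runtime` API. The class and method names (`StatusSketch`, `memoryStatus`) are made up for this example, and it is an assumption that the real endpoint computes the values this way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StatusSketch {
  // Builds a map holding the memory-related fields from the example response.
  static Map<String, Object> memoryStatus() {
    Runtime rt = Runtime.getRuntime();
    Map<String, Object> status = new LinkedHashMap<>();
    // Heap currently in use: allocated heap minus the free part of it.
    status.put("usedMemory", rt.totalMemory() - rt.freeMemory());
    // Upper bound the JVM will attempt to use; normally governed by -Xmx.
    status.put("xmx", rt.maxMemory());
    return status;
  }
}
```

The database check would be a separate step, for instance running `SELECT 1` against the configured connection and reporting whether it succeeded.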
|
8/9/2017 9:40:47 PM
log looks like this:
Starting job 1 attempt 1 at 1502324648910
This leads people to misinterpret it as the first attempt at running the job. This PR changes the message to:
Starting job 1 retry 1 at 1502324648910
|
8/9/2017 5:13:59 PM
related to this class and I don't want to make a PR with both changes, so sending off a quick PR for just running the SaveAction command on the ProcessJob class first.
|
8/9/2017 1:01:32 AM
is to make it easy to add more routes to the web server. A lot of static code exists which makes it difficult to access dependent classes directly.
Refactor details:
- `prepareAndStartServer` and related methods were converted to instance methods.
- The `app` member, which is a static reference to `AzkabanWebServer`, is deprecated. It is not removed because it may be used by downstream code.
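The shape of this refactor can be sketched as below. `WebServerSketch`, `launch` and the `port` field are hypothetical names used only for illustration; only `prepareAndStartServer` and `app` come from the description above, and their real signatures may differ.

```java
class WebServerSketch {
  // Kept only for downstream callers; new code should hold an instance reference.
  @Deprecated
  static WebServerSketch app;

  private final int port;
  private boolean started;

  WebServerSketch(int port) {
    this.port = port;
  }

  // The static entry point stays thin: build the instance, then delegate.
  static WebServerSketch launch(int port) {
    WebServerSketch server = new WebServerSketch(port);
    app = server; // legacy static reference is still populated
    server.prepareAndStartServer();
    return server;
  }

  // Instance method: dependencies are reachable through fields, not statics.
  void prepareAndStartServer() {
    started = true; // stands in for the real server start-up
  }

  boolean isStarted() {
    return started;
  }

  int getPort() {
    return port;
  }
}
```

With start-up logic on the instance, a new route can receive the server's collaborators directly instead of reaching through a static field.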
|
8/8/2017 11:20:23 PM
class (#1316)
No logic has changed
|
8/8/2017 11:08:38 PM
class (#1317)
No logic change introduced.
|
8/8/2017 10:32:46 PM
(#1315)
Refactoring logic into separate smaller methods. The only logic change is a different logging statement compared to the previous one.
|
8/8/2017 7:37:00 PM
Fix for failing to kill a job due to a race condition
Problem:
After a flow starts, kill the flow quickly.
The flow and the job will show as killed. However, the job actually runs
to completion.
Analysis:
The job runner thread runs:
azkaban.jobExecutor.utils.process.AzkabanProcess#run
A jetty thread processes the kill command:
azkaban.jobExecutor.ProcessJob#cancel
azkaban.jobExecutor.utils.process.AzkabanProcess#softKill
azkaban.jobExecutor.utils.process.AzkabanProcess#checkStarted
Here it throws an exception because the job process has not been created
yet at this point.
This exception is caught and ignored.
Users are informed that the kill action is completed and reflected in
the job and flow pages.
Fix:
Synchronize the killing thread and the jobrunner thread.
The jetty thread will wait for the job process to be created if needed
before killing the job process.
If the jetty thread cancels the job before the job runner thread checks the
killed flag, simply set the flag and allow the job runner thread to abort
itself.
====
This is a fix built on the idea proposed in #1289
The previous fix #1253 for the same race condition is not complete.
* Fix a typo in comments
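The synchronization described in the fix can be sketched as follows. This is a simplified model, not the actual ProcessJob or AzkabanProcess code: the class name, the lock, and all flag names are assumptions, and process creation and the soft kill are reduced to boolean flags.

```java
class KillableJobSketch {
  private final Object lock = new Object();
  private boolean killed;           // set by the killing (jetty) thread
  private boolean passedKillCheck;  // set by the job runner thread
  private boolean processCreated;   // set once the process exists
  private boolean processKilled;    // stands in for the real soft kill

  // Job runner thread. Returns false if it aborted because of an earlier kill.
  boolean runJob() {
    synchronized (lock) {
      if (killed) {
        return false; // killed before start: abort without creating a process
      }
      passedKillCheck = true;
    }
    // Process creation would happen here, outside the lock.
    synchronized (lock) {
      processCreated = true;
      lock.notifyAll(); // wake a killer that is waiting for the process
    }
    return true;
  }

  // Killing thread.
  void cancel() {
    synchronized (lock) {
      killed = true;
      if (!passedKillCheck) {
        return; // the runner will see the flag and abort itself
      }
      try {
        while (!processCreated) {
          lock.wait(); // wait for the process to be created before killing it
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      processKilled = true;
    }
  }

  boolean isProcessKilled() {
    synchronized (lock) {
      return processKilled;
    }
  }
}
```

In this model a kill request is never silently dropped: either the runner aborts on the flag, or the killer waits until there is a process to kill.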
|