4/6/2018 5:27:25 PM
jar first (#1706)
The Flow Trigger plugin manager currently prefers a class provided by the Azkaban platform even if the same class exists in the plugin jar. This is the default behavior of URLClassLoader: when loading a class, it searches the parent class loader first (https://docs.oracle.com/javase/7/docs/api/java/net/URLClassLoader.html).
However, LinkedIn's internal dependency plugin jar (Dali: https://engineering.linkedin.com/blog/2017/11/dali-views--functions-as-a-service-for-big-data) contains a class that conflicts with an older version of the same class that the Azkaban platform pulls in from the Hive library, causing dependency initialization to fail. For now, the quick workaround is to implement a custom class loader that searches the child class loader first.
There are also some other alternatives:
1. Make the conflicting classes identical by upgrading the jar on the AZ side (short-term resolution).
2. Let the plugin run in an isolated process and manage its own classpath.
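The child-first workaround described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the loader merged in this PR: the class name `ChildFirstClassLoader` is hypothetical, and delegating `java.*` names to the parent is an assumption added here because the JVM refuses to define those classes from a user class loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical name; a sketch of a class loader that checks the plugin jars
// before delegating to the parent (platform) class loader.
class ChildFirstClassLoader extends URLClassLoader {

  ChildFirstClassLoader(URL[] pluginJars, ClassLoader parent) {
    super(pluginJars, parent);
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      // Reuse a class this loader has already defined.
      Class<?> loaded = findLoadedClass(name);
      if (loaded == null && !name.startsWith("java.")) {
        try {
          // Child first: look in the plugin jars before asking the parent.
          loaded = findClass(name);
        } catch (ClassNotFoundException e) {
          // Not in the plugin jars; fall back to the parent below.
        }
      }
      if (loaded == null) {
        // Standard parent-first lookup as the fallback.
        return super.loadClass(name, resolve);
      }
      if (resolve) {
        resolveClass(loaded);
      }
      return loaded;
    }
  }
}
```

With this ordering, a class present in both the plugin jar and the platform classpath resolves to the plugin's copy, while classes the plugin does not ship still come from the platform.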
|
4/6/2018 3:47:36 PM
Follow-up to #1712. We move the jobtype module from azkaban-plugins to the main AZ repo, including its original tests.
|
3/29/2018 10:09:44 PM
unit tests
* Optimize wait in SleepJavaJob
- no extra wait
- code that better highlights the intention
* Document testing of createDeepHardlink
* Remove sleep in MetricManagerTest
|
3/29/2018 3:20:42 PM
Reportal lives in the azkaban-plugins repo, an old and seldom-maintained development environment. Developing there is inconvenient:
* Ant is old and not an industry standard. It requires a lot of debugging when something goes wrong.
* Ant cannot be set up in IntelliJ. Without IDE support, walking through and viewing code is extremely inefficient.
This PR moves the reportal codebase to the main AZ repo, which should make reportal development much easier. Formatting and minor refactoring were done through an IntelliJ plugin. More refactoring is expected in future PRs.
|
3/28/2018 10:07:06 PM
3.45.1
was changed to use POST for all requests, but I missed that azkaban-web also makes calls to /serverStatistics, /jmx & /stats via the gateway, not just /executor.
The problem wasn't seen in manual tests with AzkabanSingleServer because it doesn't use multi-executor mode.
|
3/23/2018 4:18:43 PM
3.45.0
to using POST with form params instead of GET with URL params, so that a longer list of execution ids can be passed.
Also deletes some unused code.
Tested manually that it works like this:
run AzkabanSingleServer in IDEA with debugger
start a flow via the UI (http://localhost:8081/)
set breakpoint to check that requests come to azkaban.execapp.ExecutorServlet and are handled successfully
set breakpoint to check that ExecutorManager can get the execution updates
(initial PR & discussion in #1655)
Builds on #1707 to validate the fix.
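To illustrate the GET-to-POST change, the sketch below builds a form-encoded request body that can carry many execution ids without hitting URL length limits. It is only an illustration: the class name `ExecutorFormBody` and the parameter names `action` and `executionId` are assumptions, not the names used by the actual gateway code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; parameter names are assumptions for illustration.
class ExecutorFormBody {

  // Builds an application/x-www-form-urlencoded body with one
  // "executionId" pair per id, preceded by the "action" pair.
  static String build(String action, List<Integer> executionIds) {
    List<String> pairs = new ArrayList<>();
    pairs.add(encode("action") + "=" + encode(action));
    for (Integer id : executionIds) {
      pairs.add(encode("executionId") + "=" + encode(String.valueOf(id)));
    }
    return String.join("&", pairs);
  }

  private static String encode(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

For example, `ExecutorFormBody.build("update", Arrays.asList(1, 2, 3))` returns `action=update&executionId=1&executionId=2&executionId=3`. The body would be written to the request stream with `Content-Type: application/x-www-form-urlencoded` rather than appended to the URL.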
|
3/22/2018 9:24:51 PM
(#1705)
|
3/22/2018 7:43:04 PM
Several flow trigger service enhancements:
* Cancel the context asynchronously, since cancelling a context may block.
* Send the failure email asynchronously when a flow trigger instance fails, since sending email is a blocking operation.
* A flow trigger instance in the following case won't be rerun when the web server restarts: the flow trigger instance succeeds but fails to trigger the flow execution.
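The asynchronous handling described above can be sketched with a plain `ExecutorService`. This is a minimal illustration under assumptions: the class and method names (`AsyncTriggerActions`, `cancelAsync`, `sendFailureEmailAsync`) are hypothetical, and the real service may use a different executor or threading setup.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names; a sketch of handing blocking work to a thread pool
// so the calling thread is not held up.
class AsyncTriggerActions {

  private final ExecutorService executor = Executors.newCachedThreadPool();

  // Runs a potentially blocking context cancellation off the caller's thread.
  Future<?> cancelAsync(Runnable blockingCancel) {
    return executor.submit(blockingCancel);
  }

  // Runs a blocking email send off the caller's thread.
  Future<?> sendFailureEmailAsync(Runnable blockingSend) {
    return executor.submit(blockingSend);
  }

  // Stops accepting new tasks; already submitted tasks still run.
  void shutdown() {
    executor.shutdown();
  }
}
```

The caller gets a `Future` back immediately and can ignore it or use it to check completion later.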
|
3/22/2018 7:42:01 PM
The initial motivation for keeping the project object as part of the trigger instance and persisting it into the db was consistency: when a new project zip is uploaded, even if the web server restarts afterwards, the running trigger instance would still pick up the old version of the flow to execute.
However, because the project object is huge, the cost of keeping it for each dependency instance/flow trigger in the db is extremely high. Suppose a big project contains hundreds of flows, one of which uses a flow trigger: each row of execution_dependencies will then contain a large blob field of project JSON, making deserialization costly when recovering incomplete trigger instances as the web server restarts.
So we decided to keep only the project id instead of the project object in the database, and to populate the project object of a dependency instance by looking up the id in the project database. The downside is that we might compromise the consistency described above, since looking up by project id always fetches the latest uploaded project.
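The trade-off can be shown with a small in-memory sketch. All names here (`ProjectLookupSketch`, `Project`, `DependencyInstance`, `upload`, `resolve`) are hypothetical and a `Map` stands in for the project database; the point is only that an instance storing just the id resolves to whatever version was uploaded last.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the project database.
class ProjectLookupSketch {

  static class Project {
    final int id;
    final int version;

    Project(int id, int version) {
      this.id = id;
      this.version = version;
    }
  }

  // Persists only the project id, not the whole project object.
  static class DependencyInstance {
    final int projectId;

    DependencyInstance(int projectId) {
      this.projectId = projectId;
    }
  }

  private final Map<Integer, Project> latestById = new HashMap<>();

  // A new upload replaces the previously stored version for that id.
  void upload(Project project) {
    latestById.put(project.id, project);
  }

  // Looks the project up by id, so the result is always the latest upload.
  Project resolve(DependencyInstance instance) {
    return latestById.get(instance.projectId);
  }
}
```

If version 1 is uploaded, a dependency instance is created, and version 2 is then uploaded, `resolve` returns version 2 for that instance. The row stays small, at the cost of the consistency guarantee.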
|
3/21/2018 8:54:17 PM
generate job link URL.
* Address comments.
* Change 'Hadoop/Spark Job Log' button color to red if the job fails.
|