8/3/2018 2:53:47 PM
(#1885)
|
7/30/2018 3:37:14 PM
This PR adds the following information to the failure email template:
the number of failed executions among all executions in the last 72 hours (e.g. 3/25 failed)
a table of the executions from the past 72 hours (status, start/end time).
The motivation is to capture people's attention if the flow has been failing too many times recently.
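As a rough illustration of the "3/25 failed" figure, the count could be derived along the following lines. This is only a sketch: `ExecutionRecord`, `RecentFailureSummary`, the `"FAILED"` status string, and the choice to filter by start time are all assumptions made for the example, not the classes or rules the PR actually uses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

// Hypothetical record of one past execution; not Azkaban's actual class.
final class ExecutionRecord {
  final String status;     // assumed status names, e.g. "SUCCEEDED" or "FAILED"
  final Instant startTime;
  final Instant endTime;

  ExecutionRecord(String status, Instant startTime, Instant endTime) {
    this.status = status;
    this.startTime = startTime;
    this.endTime = endTime;
  }
}

final class RecentFailureSummary {
  static final Duration WINDOW = Duration.ofHours(72);

  // Builds a string such as "3/25 failed" from the executions that
  // started within the last 72 hours (assumption: filter on start time).
  static String summarize(List<ExecutionRecord> executions, Instant now) {
    Instant cutoff = now.minus(WINDOW);
    int total = 0;
    int failed = 0;
    for (ExecutionRecord e : executions) {
      if (e.startTime.isBefore(cutoff)) {
        continue; // older than the window, ignore
      }
      total++;
      if ("FAILED".equals(e.status)) {
        failed++;
      }
    }
    return failed + "/" + total + " failed";
  }

  public static void main(String[] args) {
    Instant now = Instant.now();
    List<ExecutionRecord> runs = Arrays.asList(
        new ExecutionRecord("FAILED", now.minus(Duration.ofHours(2)), now),
        new ExecutionRecord("SUCCEEDED", now.minus(Duration.ofHours(30)),
            now.minus(Duration.ofHours(29))));
    System.out.println(summarize(runs, now));
  }
}
```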
|
7/27/2018 4:13:20 PM
to banner when it's showing up. After the dismiss button is clicked, the banner will disappear until a new banner message is posted.
|
7/26/2018 5:25:19 PM
loading some old executions (at least in order to finalize them), it's possible that the serialized flow_data doesn't include the conditionOnJobStatus at all.
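One defensive way to read a field that older serialized data may lack is to fall back to a default when the key is absent. The sketch below assumes the flow data is available as a `Map<String, Object>` and that `"ALL_SUCCESS"` is an acceptable default; both are assumptions for illustration, and `FlowDataDefaultsSketch` is a hypothetical helper, not the change made in the PR.

```java
import java.util.HashMap;
import java.util.Map;

final class FlowDataDefaultsSketch {
  static final String KEY = "conditionOnJobStatus";
  // Assumed default for executions serialized before the field existed.
  static final String ASSUMED_DEFAULT = "ALL_SUCCESS";

  // Returns the stored value, or the assumed default when the key is missing.
  static String readConditionOnJobStatus(Map<String, Object> flowData) {
    Object value = flowData.get(KEY);
    return value == null ? ASSUMED_DEFAULT : value.toString();
  }

  public static void main(String[] args) {
    Map<String, Object> oldExecution = new HashMap<>(); // no such key
    System.out.println(readConditionOnJobStatus(oldExecution));
  }
}
```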
|
7/25/2018 8:25:28 PM
to solve is detailed in #1803
The drawback of the previous design is detailed in #1841.
New design:
1. Create a file in each project directory and write the size of the project to that file when the project is created. The project files are not supposed to change after creation. Touch this file each time the project is used. This way, we can have a more efficient LRU algorithm based on last access time rather than creation time.
2. Maintain the total size of the project cache in memory to avoid the overhead of recalculating it. The size shouldn't change too often, so we can afford to run the check more frequently. The project directory size check and the corresponding deletion will be performed when a new project is downloaded.
This PR implements part 2.
The next step is to shorten the execution directory retention period so that space is actually freed, given that there is always a hard link from the execution directory to the project directory.
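The two parts of the design can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the class `ProjectCacheSketch`, the size file name `project_size`, and the byte limit are hypothetical, and the size file's last-modified time stands in for "last access time". `touch` takes an explicit timestamp only to keep the example deterministic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

final class ProjectCacheSketch {
  static final String SIZE_FILE = "project_size"; // hypothetical file name

  private final Path cacheRoot;
  private final long maxBytes;
  private long totalBytes; // kept in memory so it is not recomputed per check

  ProjectCacheSketch(Path cacheRoot, long maxBytes) throws IOException {
    this.cacheRoot = cacheRoot;
    this.maxBytes = maxBytes;
    for (Path dir : projectDirs()) {
      totalBytes += readSize(dir); // computed once, from the size files
    }
  }

  // Part 1: record the project size when the project directory is created.
  static void writeSizeFile(Path projectDir, long sizeBytes) throws IOException {
    Files.write(projectDir.resolve(SIZE_FILE),
        Long.toString(sizeBytes).getBytes(StandardCharsets.UTF_8));
  }

  // Part 1: "touch" the size file whenever the project is used.
  static void touch(Path projectDir, long epochMillis) throws IOException {
    Files.setLastModifiedTime(projectDir.resolve(SIZE_FILE),
        FileTime.fromMillis(epochMillis));
  }

  // Part 2: on a new download, update the total and evict least recently
  // used projects until the cache fits again.
  void onProjectDownloaded(Path projectDir, long sizeBytes) throws IOException {
    writeSizeFile(projectDir, sizeBytes);
    totalBytes += sizeBytes;
    if (totalBytes <= maxBytes) {
      return;
    }
    Map<Path, Long> lastUsed = new HashMap<>();
    List<Path> candidates = new ArrayList<>();
    for (Path dir : projectDirs()) {
      if (dir.equals(projectDir)) {
        continue; // never evict the project that was just downloaded
      }
      lastUsed.put(dir, Files.getLastModifiedTime(dir.resolve(SIZE_FILE)).toMillis());
      candidates.add(dir);
    }
    candidates.sort((x, y) -> Long.compare(lastUsed.get(x), lastUsed.get(y)));
    for (Path dir : candidates) {
      if (totalBytes <= maxBytes) {
        break;
      }
      long size = readSize(dir);
      deleteRecursively(dir);
      totalBytes -= size;
    }
  }

  long totalBytes() {
    return totalBytes;
  }

  private List<Path> projectDirs() throws IOException {
    List<Path> dirs = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(cacheRoot)) {
      for (Path p : stream) {
        if (Files.isRegularFile(p.resolve(SIZE_FILE))) {
          dirs.add(p);
        }
      }
    }
    return dirs;
  }

  private static long readSize(Path projectDir) throws IOException {
    byte[] bytes = Files.readAllBytes(projectDir.resolve(SIZE_FILE));
    return Long.parseLong(new String(bytes, StandardCharsets.UTF_8).trim());
  }

  private static void deleteRecursively(Path dir) throws IOException {
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order deletes children before their parent directories.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    }
  }
}
```

Because the total is held in memory, the check on each download is a single comparison; the directory listing and sort only happen when the limit is actually exceeded.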
|
7/25/2018 7:05:58 PM
was calling log.error(Object message), so we didn't get a stack trace at all. For example, the following was logged, which is not very helpful: ERROR [ExecutorManager] [Azkaban] java.lang.NullPointerException (that is, just the value of e.toString()).
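The difference between the two styles of call can be illustrated without any logging library. The sketch below is only a stand-in: `StackTraceLoggingSketch` is a hypothetical helper showing what a message-only call has available (the exception's `toString()`) versus what a logger can render when the throwable is passed separately, as with log4j 1.x's `error(Object message, Throwable t)` overload.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceLoggingSketch {
  // A message-only call such as log.error(e) only has e.toString() to print.
  static String messageOnly(Throwable e) {
    return String.valueOf(e);
  }

  // Passing the throwable separately lets the full stack trace be rendered.
  static String withStackTrace(String message, Throwable e) {
    StringWriter out = new StringWriter();
    PrintWriter writer = new PrintWriter(out);
    writer.println(message);
    e.printStackTrace(writer);
    writer.flush();
    return out.toString();
  }

  public static void main(String[] args) {
    Exception e = new NullPointerException();
    System.out.println(messageOnly(e));
    System.out.println(withStackTrace("Failed to finalize execution", e));
  }
}
```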
|