6/20/2018 5:33:56 PM
3.48.0
change configurable key name
change default value for max cache size to 128GB which is a more standard number.
change default value for stop cleanup threshold to 60 to allow more space to be freed.
change data type of stop cleanup threshold from double to int.
rename the parameter (projectDirMaxSizeInMB -> projectDirMaxSizeInMb) to align with the coding standard.
|
6/19/2018 6:16:35 PM
This PR implements LRU project purging to prevent project files from eating up disk space on the executor.
We will encapsulate the following logic inside the background cleanup thread:
if (disk space consumed by projects >= predefined threshold)
  List all project files and sort them by creation time in ascending order.
  Iterate over the project file list and delete each file whose project is not running, until the disk size of projects drops below a predefined lower threshold or there are no more projects to delete.
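The purge loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the ProjectDir type, and the two threshold parameters are hypothetical names, sizes are tracked in MB in memory rather than read from disk, and the method only selects candidates instead of deleting anything.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of threshold-driven project purging; not the PR's actual classes. */
class ProjectPurgeSketch {

  /** Minimal stand-in for a cached project directory (assumed fields). */
  static final class ProjectDir {
    final String name;
    final long creationTimeMs;
    final long sizeMb;
    final boolean running;

    ProjectDir(String name, long creationTimeMs, long sizeMb, boolean running) {
      this.name = name;
      this.creationTimeMs = creationTimeMs;
      this.sizeMb = sizeMb;
      this.running = running;
    }
  }

  /**
   * Returns the names of the projects to delete, oldest first.
   * Purging starts only when total usage has reached startThresholdMb and
   * stops as soon as usage drops below stopThresholdMb.
   */
  static List<String> selectForDeletion(List<ProjectDir> projects,
      long startThresholdMb, long stopThresholdMb) {
    long usedMb = 0;
    for (ProjectDir p : projects) {
      usedMb += p.sizeMb;
    }
    List<String> toDelete = new ArrayList<>();
    if (usedMb < startThresholdMb) {
      return toDelete; // below the trigger threshold, nothing to do
    }

    // Sort a copy by creation time, ascending, so the oldest projects go first.
    List<ProjectDir> byAge = new ArrayList<>(projects);
    byAge.sort(Comparator.comparingLong((ProjectDir p) -> p.creationTimeMs));

    for (ProjectDir p : byAge) {
      if (usedMb < stopThresholdMb) {
        break; // enough space has been freed
      }
      if (p.running) {
        continue; // never delete a project that is currently running
      }
      toDelete.add(p.name);
      usedMb -= p.sizeMb;
    }
    return toDelete;
  }
}
```

Using two thresholds instead of one means a single cleanup pass frees a meaningful amount of space rather than triggering again as soon as one more project is downloaded.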
Why use creation time instead of last access time:
Directory last access time is not maintained by most file systems, so we cannot rely on the file system API to get it. An alternative considered is to modify the project dir every time the code reads it, so that last modification time would be equivalent to last access time. The associated cons are code complexity and disk IO overhead. We will use creation time as the indicator of how old a project is. Although creation time might not be as indicative as last access time of how hot a project is, we are still OK with using it, based on the assumption that the older the project is, the more likely it is not being used.
This PR also changes execution dir retention from 1 day to 2 hours. Because the execution dir is hard linked to the project dir, disk space is released only when no reference to the project dir exists. Shortening the execution dir retention time therefore lets disk pressure be alleviated sooner.
Project cleanup is executed by the active executor only, to ensure that only one thread performs deletion. It currently relies on FlowRunnerManager#isExecutorActive to determine whether the executor itself is active. But isExecutorActive is reliable only when the Azkaban admin uses the activate API to activate the executor. If the admin manually updates the executors table in the database and calls the reloadExecutors API, the flag won't be set, which can lead an inactive executor to perform deletion. So an improvement is needed here to make sure isExecutorActive is accurate.
|
6/19/2018 4:49:46 PM
(#1808)
This responds to a comment in #1759.
|
6/15/2018 6:07:47 PM
Change activeExecutors from HashSet to ImmutableSet to guarantee thread safety.
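The idea behind this change can be sketched as below: readers only ever see a complete, read-only snapshot, and a reload replaces the whole snapshot with one reference write. This is a sketch under assumptions, not the actual change: the class and method names are hypothetical, and JDK-only `Collections.unmodifiableSet` over a private copy stands in for Guava's ImmutableSet so the example stays self-contained.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical holder for the set of active executors (names are assumptions). */
class ActiveExecutorsSketch {

  // volatile so readers always see the most recently published snapshot
  private volatile Set<String> activeExecutors = Collections.emptySet();

  /** Builds a fresh read-only copy and publishes it with a single reference write. */
  void reload(Collection<String> latest) {
    this.activeExecutors = Collections.unmodifiableSet(new HashSet<>(latest));
  }

  /** Returns the current snapshot; callers cannot modify it. */
  Set<String> getAll() {
    return this.activeExecutors;
  }
}
```

A thread that is iterating over an older snapshot is unaffected by a concurrent reload, which is the property a mutable HashSet shared across threads cannot offer.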
|
|
6/7/2018 4:07:48 PM
a user to view HDFS as their own authenticated user or as any other proxy user. Looking at the validation logic, 3 branches exist for obtaining the username of the current user which the plugin proxies as:
Current user from session
Proxy user via session attribute which validates the user has permissions
A "proxyname" parameter when "action" is set to "goHomeDir"
The final option is implemented as follows:
plugins/hdfsviewer/src/azkaban/viewer/hdfs/HdfsBrowserServlet.java
if(hasParam(req, "action") && getParam(req, "action").equals("goHomeDir")) {
username = getParam(req, "proxyname");
}
This means a user can "proxy" as any other valid user by simply appending "?action=goHomeDir&proxyname=$username" to the URL.
This PR removes goHomeDir action.
|
6/7/2018 3:53:17 PM
error message was a hard-coded static string. It is now configurable.
|
6/6/2018 9:12:54 PM
(#1798)
Since processSucceed is an async call, the test needs to wait for updateAssociatedFlowExecId to be called before verifying its invocation. When updateAssociatedFlowExecId is called, the associated CountDownLatch is decremented to zero; the unit test waits on the latch before verifying the invocation of updateAssociatedFlowExecId.
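The waiting pattern described above might look roughly like this. It is a minimal sketch, not the actual test: RecordingUpdater is a hand-written hypothetical stub rather than a mocking-library object, processSucceed here is a simplified stand-in for the real async method, and the 5-second timeout is an arbitrary assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of verifying an async call with a CountDownLatch. */
class AsyncVerifySketch {

  /** Stand-in for the collaborator whose invocation the test verifies. */
  static final class RecordingUpdater {
    final CountDownLatch called = new CountDownLatch(1);
    final AtomicInteger invocations = new AtomicInteger();

    void updateAssociatedFlowExecId(int execId) {
      invocations.incrementAndGet();
      called.countDown(); // releases the waiting test thread
    }
  }

  /** Simplified async method under test: hands the work to another thread and returns. */
  static void processSucceed(ExecutorService pool, RecordingUpdater updater, int execId) {
    pool.execute(() -> updater.updateAssociatedFlowExecId(execId));
  }

  /** Returns true if the async call was observed exactly once within the timeout. */
  static boolean runAndAwait(int execId) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    RecordingUpdater updater = new RecordingUpdater();
    try {
      processSucceed(pool, updater, execId);
      // Block until the latch reaches zero, then verify the invocation count.
      boolean done = updater.called.await(5, TimeUnit.SECONDS);
      return done && updater.invocations.get() == 1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      pool.shutdown();
    }
  }
}
```

Awaiting with a timeout keeps the test from hanging forever if the async call never happens; the test simply fails instead.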
|
6/5/2018 8:46:11 PM
3.47.2
(#1790)
This change moves VM files to resources. Tested in staging cluster.
|
6/5/2018 5:22:38 PM
'.' to separate namespaces and '_' to separate words in the same namespace, e.g. azkaban.job.some_key
|