
APL-Caching Approaches Comparison

This is a framework developed by the Prosoft Research Group at the Federal University of Rio Grande do Sul. Its purpose is to extract traces from web applications and to compare their throughput and caching performance.

  • All resulting plots are available in PDF under the folder /analysis
    • The resulting tables are generated by running the R scripts in this folder with Rscript.
    • To re-execute these R scripts, the compressed files under /applications/output must be extracted.
    • Typing bash configure extracts them automatically, although it also downloads and installs the whole experiment structure.
  • To download, extract, configure and compile the whole experiment without affecting your environment, there are Docker machines under /docker with the proper versions of the software used.
    • The Docker machine cannot be used to reproduce the experiment.
    • Any file under /caching-approaches-comparison is available both inside and outside the Docker container.
    • To execute, type the following (a consolidated snippet is shown after this list):
      1. docker-compose -f docker/docker-compose.yml up --build -d
      2. docker-compose -f docker/docker-compose.yml exec caching-approaches-comparison /bin/bash
        • ssh root@localhost -p 5001 will also work, with the password caching-approaches-comparison
      3. cd /caching-approaches-comparison
      4. bash configure docker to download and extract the needed files
        • Two more arguments can be given to provide a username and password for git cloning
      5. bash compile.sh to compile adapters and approaches with Maven
      6. cd analysis && bash plot.sh to generate all plots from the R scripts
      7. Hit ctrl+d to leave the interactive mode inside the container
      8. Do not forget to shut down the Docker container by typing docker-compose -f docker/docker-compose.yml down -v
        • The -v argument also deletes the associated volumes
  • Data under /adapters, /approaches and /applications, which are hosted in different repositories, are downloaded automatically during configuration.
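
For convenience, the Docker workflow above can be run as a single shell session. This is only a consolidated sketch of the commands already listed; the git username and password arguments are optional placeholders:

# Bring up the container and open a shell inside it
docker-compose -f docker/docker-compose.yml up --build -d
docker-compose -f docker/docker-compose.yml exec caching-approaches-comparison /bin/bash

# Inside the container: download, extract, compile and plot
cd /caching-approaches-comparison
bash configure docker            # optionally: bash configure docker <git-username> <git-password>
bash compile.sh
cd analysis && bash plot.sh

# Leave the container with ctrl+d, then shut it down and remove its volumes
docker-compose -f docker/docker-compose.yml down -v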

Reproducing

  • The experiment can be executed either in a single-host configuration or, ideally, with two hosts: one dedicated to making the requests, while the other hosts the application to be logged/measured.
  • To execute the experiment with two hosts, it is only required that both machines are properly configured following the steps below and that the RemoteExecuter is running on the application-machine, while the requester-machine commands it through its IP address.

Configuring

  1. git clone --depth=3 https://github.com/rmeloca/caching-approaches-comparison.git caching-approaches-comparison
  2. cd caching-approaches-comparison
  3. bash configure
  4. bash compile.sh

Executing

  1. bash trace.sh [<host> ["<application-list>"]]
  2. bash run.sh [<host> ["<version-list>" ["<application-list>"]]]
  3. bash reduce.sh [<host> ["<version-list>" ["<application-list>" [<reduce> [<overwrite>]]]]]
  • <host> ::= localhost or the <ip> of the application-machine
    • If not provided, localhost is assumed
  • <application-list> ::= * | <application-name> | <application-name> <application-list>
    • Each <application-name> stands for some folder under /applications/uncached
    • * means that the default value shall be assumed
    • If not provided, * is assumed
    • For run.sh the default value * means $(echo applications/uncached/*/)
    • For reduce.sh the default value * will check for applications under each version within the provided <version-list>
  • <version-list> ::= <version-name> | <version-name> <version-list>
    • Each <version-name> stands for some folder under /applications that holds the applications to be measured
    • If not provided, uncached developers aplcache memoizeit is assumed
  • <reduce> ::= * | requests | cache
    • If not provided, * is assumed
    • Determines whether only requests-handled.csv will be generated on the requester-machine, or whether the application-machine will also be commanded to generate hits-distribution.csv and uncached-parameters.csv
    • Every CSV output is generated under the /applications/output folder on its respective machine
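
For instance, assuming a hypothetical application folder named myapp under /applications/uncached and an application-machine reachable at 192.168.0.20 (both placeholders), the scripts could be invoked as follows:

# extract traces of the uncached version of a single application on a remote application-machine
bash trace.sh 192.168.0.20 "myapp"

# run the uncached and APLCache versions of that application
bash run.sh 192.168.0.20 "uncached aplcache" "myapp"

# reduce the collected samples, generating only requests-handled.csv on the requester-machine
bash reduce.sh 192.168.0.20 "uncached aplcache" "myapp" requests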

Adding or changing applications

  1. Remove all caching statements from the desired application and put it into /applications/uncached
  2. [For MemoizeIt] Generate callgraphs with java-cg
    1. Compile the desired application to <compiled-application> folder
    2. zip -r <compiled-application>.zip <compiled-application>
    3. java -jar adapters/java-callgraph/target/javacg-0.1-SNAPSHOT-static.jar <compiled-application>.zip > applications/callgraphs/<application>
  3. Create a database dump, if needed, and put it into /applications/dumps
  4. Create a workload file into /applications/workloads according to the rules described below
  5. Include ApplicationTracer as a Maven or Gradle dependency of the application
     <dependency>
         <groupId>br.ufrgs.inf.prosoft.applicationtracer</groupId>
         <artifactId>ApplicationTracer</artifactId>
         <version>1.0</version>
     </dependency>
    
  6. Trace with bash trace.sh
    • Pay attention to the environment variables under docker-compose.yml as described below
  7. Generate recommendations, analyse them and apply the caching (a worked example of steps 2 and 7 is given after this list)
    • APLCache
      • java -jar approaches/APLCache/target/APLCache-1.0.jar --trace=applications/traces/<application-name> --output=applications/output/aplcache-<application-name>-parameters.json
      • Do not forget to copy APLCache's output to the <application-machine> in the equivalent folder
    • MemoizeIt
      • java -jar approaches/MemoizeIt/target/MemoizeIt-1.0.jar --callgraph=applications/callgraphs/<application-name> --trace=applications/traces/<application-name> [--kernel=<iterative|exhaustive>]
    • Take a look at the caching examples section below
  8. Each cached version of the application should be placed into its respective folder /applications/<aplcache|memoizeit|developers>
    • APLCache
        <dependency>
            <groupId>br.ufrgs.inf.prosoft.aplcache</groupId>
            <artifactId>APLCache</artifactId>
            <version>1.0</version>
        </dependency>
      
    • MemoizeIt and Developers
        <dependency>
            <groupId>br.ufrgs.inf.prosoft.cache</groupId>
            <artifactId>Cache</artifactId>
            <version>1.0</version>
        </dependency>
      
  9. Execute the application by running bash run.sh
  10. Reduce the samples into CSV outputs by running bash reduce.sh
    • Each result will be available on its respective machine: requests-handled.csv on the requester-machine, and hits-distribution.csv and uncached-parameters.csv on the application-machine
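
As a concrete illustration of steps 2 and 7, assuming a hypothetical application named myapp whose compiled classes live in a build folder (both names are placeholders), the commands could look like this:

# [MemoizeIt only] package the compiled classes and generate the callgraph
zip -r build.zip build
java -jar adapters/java-callgraph/target/javacg-0.1-SNAPSHOT-static.jar build.zip > applications/callgraphs/myapp

# generate APLCache recommendations from the collected trace
java -jar approaches/APLCache/target/APLCache-1.0.jar --trace=applications/traces/myapp --output=applications/output/aplcache-myapp-parameters.json

# generate MemoizeIt recommendations using the iterative kernel
java -jar approaches/MemoizeIt/target/MemoizeIt-1.0.jar --callgraph=applications/callgraphs/myapp --trace=applications/traces/myapp --kernel=iterative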

Caching examples

import br.ufrgs.inf.prosoft.cache.*;

// In each example below, () -> {} stands for the original (uncached) computation of the method.

// GetterCache: no key
public static GetterCache<Vet> findAllCache = new GetterCache<>("findAllCache");
return findAllCache.computeIfAbsent(() -> {}, 60000);

// SingleCache: keyed by the method parameters wrapped in a Parameters object
public static SingleCache<Parameters, PetType> singleCache = new SingleCache<>("singleCache");
return singleCache.computeIfAbsent(new Parameters(text, locale), () -> {}, 60000);

// MultiCache: keyed by the method parameters wrapped in a Parameters object
public static MultiCache<Parameters, PetType> parseCache = new MultiCache<>("parseCache");
return parseCache.computeIfAbsent(new Parameters(text, locale), () -> {}, 60000);

import br.ufrgs.inf.prosoft.aplcache.caching.APLCache;

// APLCache: keyed by the current thread and the array of method parameters
public static APLCache<Type> methodCache = new APLCache<>("methodCache");
return methodCache.computeIfAbsent(Thread.currentThread(), new Object[]{parameter}, () -> {}, 60000);

Environment variables

  • JAVA_OPTS
    • For requester-machine: JAVA_OPTS="-Xms4096m -Xmx6124m"
    • For application-machine: JAVA_SERVER_OPTS=${JAVA_SERVER_OPTS:-"-Xmx30000m"}
  • Tracing
    • TRACER_ENABLE=${TRACER_ENABLE:-true}
      • Enables or disables tracing while running an uncached version
    • TRACER_MINIMUM_EXECUTION_TIME=${TRACER_MINIMUM_EXECUTION_TIME:-1}
      • Sets how many milliseconds a method execution must last in order to be logged
    • TRACER_SERIALISE_INTERNALS=false
      • Sets whether classes within the Java core should be serialised
    • TRACER_VERBOSE=true
      • If enabled, every logged method that lasts longer than 5ms will be echoed
    • TRACER_BLACKLIST="$(pwd)/blacklist"
      • Points to the folder of methods that shall be ignored
    • TRACER_TRACES="$(pwd)/traces"
      • Points to the file where traces shall be logged
    • TRACER_IGNORED_PACKAGES="$(pwd)/ignored"
      • Points to the file that lists the packages that shall be ignored
    • TRACER_WHITELIST="$(pwd)/whitelist"
      • Points to the file that lists packages that will not be echoed, but will still be serialised
      • Useful only for development purposes
    • TRACER_LOG="$(pwd)/tracer.log"
      • Prints logged methods in a file
      • Useful for development purposes
  • Measuring
    • CACHE_EVENTS=${CACHE_EVENTS:-"$(pwd)/cache"}
      • Output file where caching events are logged
    • CACHE_REGISTER_SIZE=false
      • Chooses whether to log the size of cached objects
    • APLCache
      • APLCACHE_CACHEABLE_PARAMETERS="$(pwd)/aplcache-parameters.json"
        • Points to the file where the recommended inputs were written by APLCache
      • APLCACHE_LOG="$(pwd)/aplcache-parameters.log"
        • Output file where APLCache logs the uncached inputs
      • TRACER_SERIALISE_INTERNALS and TRACER_IGNORED_PACKAGES should be provided with the same values used in the tracing phase so that APLCache behaves properly
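
As an illustration, the snippet below shows one possible tracing and measuring configuration; the values simply mirror the defaults listed above, and whether these variables are set in docker-compose.yml or exported in the shell before running the scripts is up to the experimenter:

# tracing configuration for the application-machine (values mirror the defaults above)
export JAVA_SERVER_OPTS="-Xmx30000m"
export TRACER_ENABLE=true
export TRACER_MINIMUM_EXECUTION_TIME=1
export TRACER_SERIALISE_INTERNALS=false
export TRACER_BLACKLIST="$(pwd)/blacklist"
export TRACER_TRACES="$(pwd)/traces"
export TRACER_IGNORED_PACKAGES="$(pwd)/ignored"

# measuring configuration for an APLCache run
export CACHE_EVENTS="$(pwd)/cache"
export CACHE_REGISTER_SIZE=false
export APLCACHE_CACHEABLE_PARAMETERS="$(pwd)/aplcache-parameters.json"
export APLCACHE_LOG="$(pwd)/aplcache-parameters.log"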

Requests Graph Syntax

<reference>                  ::= <string>
<method>                     ::= POST
                               | GET
                               | PUT
                               | DELETE
<url>                        ::= http://<string>/<url-definition>
<url-definition>             ::= <string>
                               | <variable>
                               | <random>
                               | <optional>
                               | <url-definition><url-definition>
<header>                     ::= Cookie: <string>=<variable>; <string>: <variable> <optional>
<form>                       ::= <string>=<variable>&<string>=<variable> <optional>
<data>                       ::= <string>
                               | <variable>
                               | <optional>
                               | <data><data>
<random>                     ::= $
                               | $[<number>]
                               | $[<number>-<number>]
<variable>                   ::= #{<variable-definition>}
                               | #{<string>@<variable-definition>}
<store-field>                ::= <store-variable-definition>
                               | <string>@<store-variable-definition>
<variable-definition>        ::= <object>
                               | <optional>
                               | <array>
                               | <variable-definition><variable-definition>
<store-variable-definition>  ::= <object>
                               | <optional>
                               | <store-array>
                               | <store-variable-definition><store-variable-definition>
<array>                      ::= [$]
                               | [<number>]
<store-array>                ::= [<random>]
                               | [<number>]
<object>                     ::= <string>
                               | #<string>
<optional>                   ::= <<optional-definition>>
<optional-definition>        ::= <string>|
                               | <optional-definition><optional-definition>
<link-reference>             ::= <reference>
                               | <copy-reference>
                               | <ignore>
<copy-reference>             ::= *<reference>
<ignore>                     ::= *<copy-reference>
{
	"<reference>": {
		"method": "<method>",
		"URL": "<url>",
		"headers": "<header>*",
		"forms": "<form>*",
		"data": "<data>",
		"storeFields": [
			"<store-field>"
		],
		"requirementsReferences": [
			"<reference>"
		],
		"linksReferences": [
			"<link-reference>"
		]
	}
}
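
For example, a minimal workload file (step 4 of Adding or changing applications) instantiating the template above could be written as sketched below; the file name, reference name, endpoint and the use of empty fields are purely illustrative assumptions, not part of the specification:

cat > applications/workloads/myapp <<'EOF'
{
	"list-items": {
		"method": "GET",
		"URL": "http://localhost:8080/items/$[1-100]",
		"headers": "",
		"forms": "",
		"data": "",
		"storeFields": [],
		"requirementsReferences": [],
		"linksReferences": []
	}
}
EOF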

RequestExecuter Lifecycle

Generating

read profile
parallel foreach user
	create session
	while not timeout
		if probability leave
			break
		choose request
			generate probability
			generate random
			choose optionals
		load variables
		fire
		store variables
		log

Executing

read profile
read logs
parallel foreach thread
	create session
	foreach request
		load variables
		fire
			store variables