Processing results
Assuming that you successfully run experiments during previous stage, let's analyze gathered results. Simplest way to do so is to run report subcommand of tsexperiment tool:
test1 STEP COUNT ONTIME ONSTEP DISCARD WAIT TIME (ms) SERVICE TIME (ms) MEAN STDDEV MEAN STDDEV 0 8000 3492 8000 2428 -0.054 0.107 0.141 0.082 1 8000 2972 8000 3143 -0.056 0.130 0.145 0.089 2 6302 2372 6302 2459 -0.071 0.219 0.153 0.145 3 7952 3121 6254 2849 -0.059 0.176 0.142 0.114 4 7966 3274 6220 2657 -0.058 0.140 0.137 0.089 5 8033 3047 6253 3098 -0.067 0.258 0.145 0.133 6 9747 3442 8000 4109 -0.064 0.184 0.147 0.119 7 8000 3170 8000 2952 -0.061 0.122 0.140 0.088 8 6352 2491 6352 2363 -0.058 0.111 0.142 0.078 9 9648 3622 8000 3739 -0.071 0.318 0.141 0.141 10 6303 2580 6303 2170 -0.060 0.115 0.138 0.080 11 8058 2972 6361 3162 -0.060 0.176 0.140 0.101 12 8015 2919 6376 3201 -0.057 0.116 0.141 0.083 13 8057 2682 6433 3542 -0.061 0.188 0.145 0.108 14 9567 3097 8000 4449 -0.063 0.223 0.146 0.134 15 6490 2138 6490 2904 -0.058 0.133 0.150 0.094 16 7960 2622 6450 3573 -0.068 0.208 0.149 0.130 17 9550 3039 8000 4491 -0.057 0.141 0.144 0.099 18 7233 2463 7233 3138 -0.056 0.119 0.143 0.082 19 8767 2694 8000 4272 -0.053 0.110 0.138 0.080 total 160000 58209 141027 64699 -0.061 0.177 0.143 0.106
It provides very basic statistics about number of requests that was run during experiments, service and wait times. Let's explain some anomalies that could be seen here.
First of all, count of requests and count of requests that was run during their step is much different than requested (8000). That's because the nature of think-time request scheduler that was chosen for workload 'test1'. It advances arrival times of requests by service time of previous request, so request simply misses their step. There is also a lot of discarded requests because deadline that we set for workload 'test1' (100us) is very small.
Also, there is very strange negative mean wait time of requests. That's because it is very hard to create simulate request arrival because of so-called OS jitter and overheads brought by TSLoad itself. So TSLoad tries to mitigate them by making arrivals earlier than they should be. It's behavior could be controlled by tp_worker_min_sleep and tp_worker_overhead tunables.
However, report
subcommand provides not much information, so you may want to use some external software, like R. To do so you need to export experiment data from binary tsf format to csv or json:
$ ./bin/tsexperiment -e var/tsload/sample/ export -F csv -d /tmp/ 11 test1 Exported workload 'test1' to file /tmp/panther.mylocal-run-8k-11-test1.csv
Let's look at file that was generated by tsexperiment tool:
$ head /tmp/panther.mylocal-run-8k-11-test1.csv rq_step,rq_request,rq_chain_request,rq_thread,rq_user,rq_sched_time,rq_start_time,rq_end_time,rq_queue_length,rq_flags,num_cycles 0,0,-1,1,1,124984,4256561,0,16,1,30000 0,1,-1,0,0,249968,4256559,0,18,1,40000 0,2,-1,0,3,374952,4257247,0,17,1,40000 0,3,-1,0,2,499936,4257620,0,16,1,40000 0,4,-1,0,1,624920,4257976,0,15,1,40000 0,5,-1,1,0,749904,4257143,0,15,1,100000 0,6,-1,0,3,874888,4258344,0,14,1,40000 0,7,-1,1,2,999872,4257528,0,14,1,40000 0,8,-1,1,1,1124856,4257863,0,13,1,40000 ... 0,40,-1,1,1,5124344,5414334,0,1,1,30000 0,41,-1,0,0,5425968,5266827,5384069,0,15,40000 0,42,-1,0,3,5560785,5513423,5759908,0,15,100000 0,43,-1,0,2,5599871,5912120,0,2,1,100000 0,44,-1,0,1,5624280,5912332,0,1,1,30000 0,45,-1,0,0,6043146,5912551,5985735,0,15,30000 0,46,-1,1,3,6307206,6090695,6168348,0,15,30000 0,47,-1,0,2,6099807,6141445,6218931,1,11,30000 0,48,-1,1,1,6124216,6287430,0,1,1,100000 ...
As you can see, a lot of requests was thrown away due to deadline policy and interstep effects. The file itself contains several mandatory fields:
-
rq_step - id of step to which request belongs
-
rq_request - id of request. It is unique for workload during step.
-
rq_chain_request - id of request from chained workload. In our case test1 doesn't have chained workloads so it is set to -1
-
rq_thread - id of worker that executed that request
-
rq_user - id of user (only for think-time request scheduler)
-
rq_sched_time - request arrival time
-
rq_start_time - start of request service
-
rq_end_time - end of requests service
-
rq_flags - request flags (bitmask)
Other fields are per-request parameters and may vary. In our case it is parameter num_cycles, which is specific to busy_wait workload that we used in our demonstration.
Processing results with R
First of all let's run R (which is obvious) and load experiment results to it:
$ R Copyright (C) 2012 The R Foundation for Statistical Computing ... (some output is omitted) ... > rq <- read.table('/tmp/panther.mylocal-run-8k-11-test1.csv', sep=',', header=TRUE) > dim(rq) [1] 160000 11
Now we have data frame that consists of 11 columns and 160000 rows. Let's get rid of requests which was not finished successfully:
> erq <- rq[rq$rq_flags == 15 | rq$rq_flags == 11, ] > dim(erq) [1] 95301 11
Now we could do some statistical computations on these data: let's create vector of wait and service times, and calculate mean values of them:
> wait_time <- erq$rq_start_time - erq$rq_sched_time > svc_time <- erq$rq_end_time - erq$rq_start_time > mean(wait_time) ; mean(svc_time) [1] -60613.16 [1] 143021.6
If you want to work with data on per-step basis, you have to create new data frame by using aggregate function:
> aggregate(erq$rq_start_time - erq$rq_sched_time, by=list(erq$rq_step), FUN=mean) Group.1 x 1 0 -54113.36 2 1 -55544.54 3 2 -71758.97 4 3 -54888.67 5 4 -58391.60 6 5 -69939.51 ... (some output is omitted)
Finally let's find correlation between num_cycles
value and service time of request (however it is a bit useless because for such study we should use variator-based parameter, not a probability map):
> cor(erq$rq_end_time - erq$rq_start_time, erq$num_cycles) [1] 0.6880422