Iñaki Ucar & modified Stefanka Chukova
2018-05-31
library(simmer)library(simmer.plot)library(parallel)set.seed(1234)
The M/M/1 system
In Kendall’s notation, an M/M/1 system has exponential arrivals (M/M/1), a single server (M/M/1) with exponential service time (M/M/1) and an infinite queue (implicit M/M/1/\(\infty\)). For instance, people arriving at an ATM at rate \(\lambda\), waiting their turn in the street and withdrawing money at rate \(\mu\).
Let us recall the basic parameters/performance measures of this system:
\[\begin{aligned}\rho &= \frac{\lambda}{\mu} &&\equiv \mbox{Server} \hspace{0.5ex} \mbox{utilization,}\hspace{0.5ex} \mbox{traffic}\hspace{0.5ex} \mbox{intensity} \\L &= \frac{\rho}{1-\rho} &&\equiv \mbox{Average} \hspace{0.5ex}\mbox{number} \hspace{0.5ex}\mbox{of} \hspace{0.5ex}\mbox{customers} \hspace{0.5ex}\mbox{in} \hspace{0.5ex}\mbox{the} \hspace{0.5ex}\mbox{system} \hspace{0.5ex}\mbox{(queue}\hspace{0.5ex} \mbox{+} \hspace{0.5ex} \mbox{server)} \\W &= \frac{L}{\lambda} &&\equiv \mbox{Average} \hspace{0.5ex}\mbox{time} \hspace{0.5ex}\mbox{in} \hspace{0.5ex}\mbox{the}\hspace{0.5ex} \mbox{system}\hspace{0.5ex} \mbox{(queue}\hspace{0.5ex} \mbox{+ server)} \hspace{0.5ex}\mbox{[Little's} \hspace{0.5ex}\mbox{law]} \\\end{aligned}\]
whenever \(\rho < 1\), so the system is in steady state. Little“s Law is valid also for
\[W_q = \frac{L_q}{\lambda} \hspace{0.9ex} \mbox{and }\hspace{0.9ex} W_s= \frac{L_s}{\lambda}.\]
If \(\rho < 1\) is not true, it means that the system is unstable: there are more arrivals than the server is capable of handling, and the queue will grow indefinitely.
The simulation of an M/M/1 system is quite simple using simmer
. In the code below, the running time is specified by run(until=100000/lambda),
so we will be simulating around 100000 arrivals. This is because the expected value of the interarrival time is \(\frac{1}{\lambda}\), and to simulate around 100000 arrivals it will take around \(100000. \frac{1}{\lambda} = \frac{100000}{\lambda}\) time units.
library(simmer)lambda <- 2mu <- 4rho <- lambda/mu # = 2/4 <1mm1.trajectory <- trajectory() %>% seize("resource", amount=1) %>% timeout(function() rexp(1, mu)) %>% release("resource", amount=1)mm1.env <- simmer() %>% add_resource("resource", capacity=1, queue_size=Inf) %>% add_generator("arrival", mm1.trajectory, function() rexp(1, lambda)) %>% run(until=100000/lambda) #the simulation will run for 50000 time units# number of arrivalsnrow(get_mon_arrivals(mm1.env))#> [1] 99756
COMMANDS in package “simmer.plot”
plot(x, what = c(“resources”, “arrivals”, “attributes”), metric = NULL, …)
Arguments
x - a single simmer environment or a list of environments representing several replications.
what - type of plot, one of c(“resources”, “arrivals”, “attributes”).
metric - specific metric for each type of plot.
what = "resources" one of c("usage", "utilization"). what = "arrivals" one of c("activity_time", "waiting_time", "flow_time").
… further arguments for each kind of plot.
what = "resources" all metrics names the name of the resource(s) (a single string or a character vector) to show. metric = "usage" items the components of the resource to be plotted, one or more of c("system", "queue", "server"). steps if TRUE, shows the instantaneous usage instead of the cumulative average.
The simmer.plot
package provides convenience plotting functions to quickly visualise the usage of a resource over time, for instance. Below, we can see how the simulation converges to the theoretical average number of customers in the system.
library(simmer)library(simmer.plot)# Theoretical value of Lmm1.L <- rho/(1-rho)# Evolution of the average number of customers in the system (L)resources <- get_mon_resources(mm1.env)plot(resources, "usage", "resource", items="system") + geom_hline(yintercept=mm1.L)
Also, we can see how the simulation converges to the theoretical average number of customers in the queue. The simulated value of L is getting reasonably close to the theoretical value of L when the time is approximately (visually estimated) 12500 units of time, which corresponds roughly to 25000 arrivals. The time, usually at the start of the simulation, when the performance measures are unstable is called warm up period. We will get back to the idea of warm up period later in this note.
# Theoretical value of L_qmm1.Lq <- (rho**2)/(1-rho)# Evolution of the average number of customers in the queueplot(resources, "usage", "resource", items="queue") + geom_hline(yintercept=mm1.Lq)
The utilization of the server (the proportion of time the server is busy) is depicted below
# Theoretical value of the utilization U = rhoU <- rho# The resource utilization for M/M/1 (the proportion of the time the resource is busy)plot(resources, "utilization", "resource") + geom_hline(yintercept=mm1.Lq)
Also, it is possible to visualise, for instance, the instantaneous usage of individual elements by playing with the parameters items
and steps
.
plot(resources, "usage", "resource", items=c("system"), steps=TRUE) + xlim(0, 10) + ylim(0, 3)#> Warning: Removed 199481 rows containing missing values (geom_path).#> Warning: Removed 199481 rows containing missing values (geom_path).
Experimentally, we obtain the time spent by each customer in the system and we compare the average with the theoretical expression.
mm1.arrivals <- get_mon_arrivals(mm1.env)mm1.t_system <- mm1.arrivals$end_time - mm1.arrivals$start_time #the waiting timeslambda_eff<- lambdamm1.W <- 1/(mu-lambda) #theoretical value of Wmm1.W ; mean(mm1.t_system)#> [1] 0.5#> [1] 0.4952966
It seems that it matches the theoretical value pretty well. Let’s take a closer look, by looking at the 95% confidence intervals.
Note (see Wikipedia) In statistics, a confidence interval (CI) is a type of interval estimate (of a population parameter) that is computed from the observed data. The confidence level is the frequency (i.e., the proportion) of confidence intervals that contain the true value of their corresponding parameter. In other words, if confidence intervals are constructed using a given confidence level in an infinite number of independent experiments, the proportion of those intervals that contain the true value of the parameter will match the confidence level. Confidence intervals consist of a range of values (interval) that act as good estimates of the unknown population parameter. However, the interval computed from a particular sample does not necessarily include the true value of the parameter. Since the observed data are random samples from the true population, the confidence interval obtained from the data is also random.
The replications needed for the construction of the CI can be done with standard R tools:
envs <- mclapply(1:50, function(i) { simmer() %>% add_resource("resource", capacity=1, queue_size=Inf) %>% add_generator("arrival", mm1.trajectory, function() rexp(1, lambda)) %>% run(100000/lambda) %>% wrap()}, mc.set.seed=FALSE)t_system <- get_mon_arrivals(envs) %>% dplyr::mutate(t_system = end_time - start_time) %>% dplyr::group_by(replication) %>% dplyr::summarise(mean = mean(t_system))n_system <-get_mon_resources(envs)%>% dplyr::group_by(replication) %>% dplyr::summarise(mean = sum(head(system, -1) * diff(time)) / tail(time, 1))t.test(t_system$mean)#> #> One Sample t-test#> #> data: t_system$mean#> t = 819.86, df = 49, p-value < 2.2e-16#> alternative hypothesis: true mean is not equal to 0#> 95 percent confidence interval:#> 0.4974445 0.4998891#> sample estimates:#> mean of x #> 0.4986668t.test(n_system$mean)#> #> One Sample t-test#> #> data: n_system$mean#> t = 706.84, df = 49, p-value < 2.2e-16#> alternative hypothesis: true mean is not equal to 0#> 95 percent confidence interval:#> 0.9929844 0.9986467#> sample estimates:#> mean of x #> 0.9958155
Another approach in computing the confidence intervals is given below. Also, I have shown the difference in the performance measures \(W\) and \(L\) with and without taking into account the warm up period. The warm up period is the time that the simulation will run before starting to collect results. This allows the queues to get into conditions that are typical of normal running conditions in the system we are simulating, i.e., we say, it allows the queue to reach steady state and, therefore, to mimic the functioning of the real system.
library(simmer)library(parallel)library(dplyr)#> #> Attaching package: 'dplyr'#> The following object is masked from 'package:simmer':#> #> select#> The following objects are masked from 'package:stats':#> #> filter, lag#> The following objects are masked from 'package:base':#> #> intersect, setdiff, setequal, unionlibrary(stats)lambda<-2mu<-4rho<-lambda/muW_theo <- 1/(mu-lambda)L_theo <-rho/(1-rho)customer <- trajectory("Customer's path") %>% seize("counter") %>% timeout(function() {rexp(1, mu)}) %>% release("counter")envs <- mclapply(1:30, function(i) { simmer() %>% add_resource("counter", capacity=1) %>% add_generator("customer", customer, function() (rexp(1, lambda))) %>% run(100000/lambda) %>% wrap()}, mc.set.seed=TRUE)# WITH WARM UP PERIOD ( not collecting results for about 12500 time units)k = 75000t_with_warmup<-get_mon_arrivals(envs)%>% dplyr::mutate(wait = end_time - start_time) %>% dplyr::group_by(replication)%>% dplyr::filter(between(row_number(), n()-k, n()))%>% dplyr::summarise(mean = mean(wait[finished]))n_with_warmup<-get_mon_resources(envs)%>% dplyr::group_by(replication) %>% dplyr::filter(between(row_number(), n()-k, n()))%>% dplyr::summarise(mean = sum(head(system, -1)*diff(time))/(tail(time,1)-head(time,1)))# WITHOUT A WARM UP PERIOD (collecting results right from the beginning of the simulation)t_system <- get_mon_arrivals(envs) %>% dplyr::mutate(wait = end_time - start_time) %>% dplyr::group_by(replication) %>% dplyr::summarise(mean = mean(wait[finished]))n_system <- get_mon_resources(envs)%>% dplyr::group_by(replication) %>% dplyr::summarise(mean = sum(head(system, -1) * diff(time)) / tail(time, 1))get.conf.int = function(x) t.test(x)$conf.intconf.intW_nw = get.conf.int(t_with_warmup$mean)conf.intW = get.conf.int(t_system$mean)conf.intL_nw = get.conf.int(n_with_warmup$mean)conf.intL = get.conf.int(n_system$mean)
Perf_Measure<-c("W_theo", "W_all", "W_with_wp","L_theo", "L_all", "L_with_wp" )val<-c(W_theo, mean(t_system$mean),mean(t_with_warmup$mean),L_theo,mean(n_system$mean), mean(n_with_warmup$mean))CI<-c( " ", paste("(",conf.intW[[1]],",",conf.intW[[2]],")"), paste("(",conf.intW_nw[[1]],",",conf.intW_nw[[2]],")"), "",paste("(",conf.intL[[1]],",",conf.intL[[2]],")"),paste("(",conf.intL_nw[[1]],",",conf.intL_nw[[2]],")"))df<-data.frame(Perf_Measure,val,CI)print(df)#> Perf_Measure val CI#> 1 W_theo 0.5000000 #> 2 W_all 0.4993104 ( 0.497604938540511 , 0.501015814712118 )#> 3 W_with_wp 0.5000115 ( 0.498020997695854 , 0.502002062502213 )#> 4 L_theo 1.0000000 #> 5 L_all 0.9987333 ( 0.994515568595136 , 1.00295107416491 )#> 6 L_with_wp 1.0012091 ( 0.993837291021297 , 1.00858082355815 )
The results for the values of (W, W_all, W_with_wp) and (L, L_all, L_with_wp) and the 95% confidence intervals (for W and L, respectively), under the two scenarious
with warm up period and
without warm up period
are reasonably similar. That is why, onwards, while computing 95% confidence intervals, we are going to ignore the warm up period, i.e., we will start to collect simulation results from the beginning of the simulation time.
Finally, the inverse of the mean difference between arrivals is the effective rate, which matches (approx.) the real lambda because there are no rejections.
lambda; 1/mean(diff(subset(mm1.arrivals, finished==TRUE)$start_time))#> [1] 2#> [1] 1.995162
The M/M/c system
In Kendall’s notation, an M/M/c system has exponential arrivals (M/M/c), c servers (M/M/c) with exponential service time (M/M/c) and an infinite queue (implicit M/M/c/\(\infty\)). For instance, people arriving at rate \(\lambda\) at a bank with c tellers, waiting their turn and being served at rate \(\mu\).
Below are the basic parameters/performance measures of this system:
\[\begin{aligned}\mbox{Denote by} \hspace{2ex} \rho &= \frac{\lambda}{\mu} &&\equiv \mbox{the traffic intensity} \\\frac{\rho}{c} &= \frac{\lambda}{c \mu} &&\equiv \mbox{Server utilization}\\\pi_0 &= \left(\sum_{j=0}^{c-1} \frac{\rho^j}{j!} + \frac{\rho^c}{c!}\frac{1}{(1 - \frac{\rho}{c})}\right)^{-1} &&\equiv \mbox{probability system (queue + servers) is empty} \\L_q &= \frac{\rho^c}{c!}\pi_0 \frac{\frac{\rho}{c}}{(1 - \frac{\rho}{c})^2} &&\equiv \mbox{Average number of customers in the queue} \\L_s &= \rho &&\equiv \mbox{Average number of customers in service} \\W_q &= \frac{L_q}{\lambda} &&\equiv \mbox{Average time in the system (queue + servers) [Little's law]} \\\end{aligned}\]
whenever \(\frac{\rho}{c} < 1\), so the system is in steady state. If \(\frac{\rho}{c} < 1\) is not true, it means that the system is unstable: there are more arrivals than the servers are capable of handling, and the queue will grow indefinitely.
The simulation of an M/M/c system is simple using simmer
:
lambda <- 3mu <- 4c<-2rho <- lambda/murho_c <- lambda/(c*mu) # = 5/8 <1mmc.trajectory <- trajectory() %>% seize("resource", amount=1) %>% timeout(function() rexp(1, mu)) %>% release("resource", amount=1)mmc.env <- simmer() %>% add_resource("resource", capacity=2, queue_size=Inf) %>% add_generator("arrival", mmc.trajectory, function() rexp(1, lambda)) %>% run(until=100000/lambda)
Our simmer.plot
package provides convenience plotting functions to quickly visualise the usage of a resource over time, for instance. Down below, we can see how the simulation converges to the theoretical average number of customers in the system.
# Theoretical valuesind <- 0:(c-1)pi_0 <-1/(sum(rho**ind/factorial(ind)) + (rho**c/factorial(c))*(1/(1-rho/c)))mmc.L <- rho + ((rho**c/factorial(c))*pi_0)*((rho/c)/((1-rho/c)**2))mmc.Lq <- mmc.L - rho# Evolution of the average number of customers in the systemresources<-get_mon_resources(mmc.env)plot(resources, "usage", "resource", items="system") + geom_hline(yintercept=mmc.L) + ylim(0, 1.5)
# Evolution of the average number of customers in the queueplot(resources, "usage", "resource", items="queue") + geom_hline(yintercept=mmc.Lq) + ylim(0, 0.2)
It is possible also to visualise, for instance, the instantaneous usage of individual elements by playing with the parameters items
and steps
.
plot(resources, "usage", "resource", items=c("queue", "server"), steps=TRUE) + xlim(0, 20) + ylim(0, 5)#> Warning: Removed 399604 rows containing missing values (geom_path).#> Warning: Removed 399604 rows containing missing values (geom_path).
Experimentally, we obtain the time spent by each customer in the system and we compare the average with the theoretical expression.
mmc.arrivals <- get_mon_arrivals(mmc.env)mmc.t_system <- mmc.arrivals$end_time - mmc.arrivals$start_timelambda_eff<-lambdammc.W <- mmc.L / lambda_effmmc.W ; mean(mmc.t_system)#> [1] 0.2909091#> [1] 0.290454
It seems that it matches the theoretical value pretty well. Again, looking at the confidence intervals, ignoring the warm up period. Replication can be done with standard R tools:
library(parallel)envs <- mclapply(1:50, function(i) { simmer() %>% add_resource("resource", capacity=c, queue_size=Inf) %>% add_generator("arrival", mmc.trajectory, function() rexp(1, lambda)) %>% run(100000/lambda) %>% wrap()}, mc.set.seed=FALSE)t_system <- get_mon_arrivals(envs) %>% dplyr::mutate(t_system = end_time - start_time) %>% dplyr::group_by(replication) %>% dplyr::summarise(mean = mean(t_system))n_system <-get_mon_resources(envs)%>% dplyr::group_by(replication) %>% dplyr::summarise(mean = sum(head(system, -1) * diff(time)) / tail(time, 1))get.conf.int = function(x) t.test(x)$conf.intconf.intW = get.conf.int(t_system$mean)conf.intL = get.conf.int(n_system$mean)Perf_Measure<-c("W_theo", "W_all","L_theo", "L_all" )val<-c(mmc.W, mean(t_system$mean),mmc.L,mean(n_system$mean))CI<-c("", paste("(",conf.intW[[1]],",",conf.intW[[2]],")"), "",paste("(",conf.intL[[1]],",",conf.intL[[2]],")"))df<-data.frame(Perf_Measure,val,CI)print(df)#> Perf_Measure val CI#> 1 W_theo 0.2909091 #> 2 W_all 0.2908508 ( 0.290494852660408 , 0.29120666707597 )#> 3 L_theo 0.8727273 #> 4 L_all 0.8713228 ( 0.869786773106989 , 0.872858785898384 )
Finally, the inverse of the mean difference between arrivals is the effective rate, which matches (approx.) the real lambda because there are no rejections.
lambda; 1/mean(diff(subset(mmc.arrivals, finished==TRUE)$start_time))#> [1] 3#> [1] 2.998789
Exercise: Write an R code to create a 95% CI for Wq - the average waiting time in the queue.
lambda <- 3mu <- 4c<-2rho <- lambda/murho_c <- lambda/(c*mu) # = 5/8 <1mmc.trajectory <- trajectory() %>% seize("resource", amount=1) %>% timeout(function() rexp(1, mu)) %>% release("resource", amount=1)envs <- mclapply(1:50, function(i) { simmer() %>% add_resource("resource", capacity=c, queue_size=Inf) %>% add_generator("arrival", mmc.trajectory, function() rexp(1, lambda)) %>% run(100000/lambda) %>% wrap()}, mc.set.seed=FALSE)t_queue <- get_mon_arrivals(envs) %>% dplyr::mutate(t_queue = end_time - start_time -activity_time) %>% dplyr::group_by(replication) %>% dplyr::summarise(mean = mean(t_queue))get.conf.int = function(x) t.test(x)$conf.intconf.intWq = get.conf.int(t_queue$mean)mmc.Wq<- mmc.Lq/lambdaPerf_Measure<-c("W_theo", "W_all")val<-c(mmc.Wq, mean(t_queue$mean))CI<-c("", paste("(",conf.intWq[[1]],",",conf.intWq[[2]],")"))df<-data.frame(Perf_Measure,val,CI)print(df)#> Perf_Measure val CI#> 1 W_theo 0.04090909 #> 2 W_all 0.04079194 ( 0.0405972407551499 , 0.0409866460119913 )
Exercise: Write R code to create a 95% CI for Lq - the average length of the queue.
lambda <- 3mu <- 4c<-2rho <- lambda/murho_c <- lambda/(c*mu) # = 5/8 <1mmc.trajectory <- trajectory() %>% seize("resource", amount=1) %>% timeout(function() rexp(1, mu)) %>% release("resource", amount=1)envs <- mclapply(1:50, function(i) { simmer() %>% add_resource("resource", capacity=c, queue_size=Inf) %>% add_generator("arrival", mmc.trajectory, function() rexp(1, lambda)) %>% run(100000/lambda) %>% wrap()}, mc.set.seed=FALSE)n_queue <-get_mon_resources(envs)%>% dplyr::group_by(replication) %>% dplyr::summarise(mean = sum(head(queue, -1) * diff(time)) / tail(time, 1))get.conf.int = function(x) t.test(x)$conf.intconf.intLq = get.conf.int(n_queue$mean)Perf_Measure<-c("L_theo", "L_all")val<-c(mmc.Lq, mean(n_queue$mean))CI<-c("", paste("(",conf.intLq[[1]],",",conf.intLq[[2]],")"))df<-data.frame(Perf_Measure,val,CI)print(df)#> Perf_Measure val CI#> 1 L_theo 0.1227273 #> 2 L_all 0.1222089 ( 0.12155916138195 , 0.12285853908079 )
Exercise: Write R code to create a 95% CI for Ls - the average number of busy servers.