src.http_balancer

It is recommended to check PI Controller documentation before going through this document.

Summary

This simulation investigates that a proportional-integral (PI) controller structure can be used to balance the work load of two HTTP servers which accepts requests from a single source.

The auto-generated PT net diagram is given below.

_images/graph.png

In the diagram,

  • \(p_0\) is producing the requests which are represented by tokens with ‘◆’ labels. The tokens are generated at a pre-determined rate.

  • \(e_1\) and \(e_2\) are consumers.

  • At \(p_1\), the graph is branching and \(p_1\) is redirecting the request to the first available branch found.

  • Transition \(t_{11}\) fires when \(k_1, p_1\) both have tokens. After it fires, the request at \(p_1\) is transfered to the input buffer \(p_{12}\) of consumer \(e_1\) through \(p_{11}\).

  • Transition \(t_{21}\) fires when \(k_2, p_1\) both have tokens. After it fires, the request at \(p_1\) is transfered to the input buffer \(p_{22}\) of consumer \(e_2\) through \(p_{21}\).

  • Finally, the requests are processed at consumers \(e_1\) and \(e_2\). And, each processing takes a random amount of time while the producer can generate tokens at a nearly constant rate which can be greater than average processing rate.

Control rule:

The places \(k_1\) and \(k_2\) are considered as controllers that aims to balance the work load of consumers. Because, \(t_{11}\) and \(t_{21}\) can not fire unless \(k_i\) allows. So, \(p_1\) is forced to redirect request to the enabled branch. After \(t_{i1}\) is fired, the token labeled by ‘○’ will loop back to \(k_i\) at the next step.

System description

The simulation consists of

  • Producer

  • TCP clients at consumers

  • HTTP servers

  • Conrollers

The whole implementation can be found at https://github.com/dmrokan/soyutnet-simulations/blob/main/src/http_balancer/main.py

Compared to the PI Controller simulation the only difference is the implementation of the producer and consumers.

Producer

In this case, the main asyncio loop starts a Uvicorn HTTP server.

642
643    soyutnet.run(reg, extra_routines=[canceller(), uvicorn_main()])
644    """Start simulation"""
645

A new token is generated when the HTTP server receives a request. The request data is binded to the token.

284
285    treg = net.TokenRegistry()
286    req_queue = asyncio.Queue()
287
288    def new_http_request_token(scope, receive, send, cond):
289        token = net.Token(label=L, binding=(scope, receive, send, cond))
290        treg.register(token)
291
292        return (token._label, token._id)
293
294    async def uvicorn_app(scope, receive, send):
295        if scope["type"] != "http":
296            return
297        cond = asyncio.Condition()
298        token = new_http_request_token(scope, receive, send, cond)
299        await req_queue.put(token)
300        async with cond:
301            await cond.wait()
302        """Wait until endpoint fullfills HTTP request"""
303

Then, the token is injected to the PT net. However, only the label and ID of token travels through the net. The binded object is registered in the soyutnet.SoyutNet.TokenRegistry.

321
322    async def producer(place):
323        token = await req_queue.get()
324        return [token]
325
326    """Inject token"""
327

Consumers

Similar to the PI Controller simulation, consumers \(e_1\) and \(e_2\) receive tokens as a label and ID. Then they convert it to the actual token as given below.

391
392        t0 = time.time()
393
394        def dt():
395            return time.time() - t0
396
397        nonlocal consumer_stats
398        t0 = time.time()
399        ident = place.ident()
400        index = int(place._name[1:]) - 1
401        """Get branch index (0 or 1)"""
402        sensor = sensors[index]
403        if ident not in consumer_stats:
404            """Initialize stats at first call of the producer."""
405            consumer_stats[ident] = {"started_at": time.time(), "count": 0}
406            """Store initial time and number of requests processed to calculate requests per second."""
407            sensor.put_nowait((True, dt()))
408            """Initial push to the controllers, otherwise they will stuck at waiting the sensor."""
409
410        label = L
411        token = place.get_token(label)
412        T = time.time()
413        if not token:
414            consumer_stats[ident]["last_at"] = time.time()
415            sensor.put_nowait((False, dt()))
416            """If there is no new token in the buffer, inform the controller."""
417            return
418
419        actual_token = treg.pop_entry(*token)
420        """Get actual SoyutNet.Token object from SoyutNet.TokenRegistry"""
421        if actual_token is None:
422            consumer_stats[ident]["last_at"] = time.time()
423            sensor.put_nowait((False, dt()))
424            """If there is no actual token in the register, inform the controller."""
425            return
426

Then consumers redirect the HTTP request defined by the token to the actual HTTP servers running in children processes.

430
431        uvicorn_scope, uvicorn_receive, uvicorn_send, cond = actual_token.get_binding()
432        """Get object binded to the actual token"""
433        await http_proxy(uvicorn_scope, uvicorn_receive, uvicorn_send)
434        """Fulfill the request."""
435        async with cond:
436            cond.notify_all()
437        """Inform uvicorn_app that request is replied"""
438

Finally, the HTTP response is sent to original source by await uvicorn_send(...) lines.

HTTP servers

The simulation runs two additional instances of Uvicorn HTTP servers in children processes which echo back the body of POST requests.

 64
 65    async def read_body(receive):
 66        """
 67        Read and return the entire body from an incoming ASGI message.
 68        """
 69        body = b""
 70        more_body = True
 71
 72        while more_body:
 73            message = await receive()
 74            body += message.get("body", b"")
 75            more_body = message.get("more_body", False)
 76
 77        return body
 78
 79    async def uvicorn_app(scope, receive, send):
 80        """
 81        Echo the request body back in an HTTP response.
 82        """
 83        body = await read_body(receive)
 84        delay_amount = rand()
 85        await asyncio.sleep(delay_amount)
 86        """Imitate a time consuming process by delay."""
 87        await send(
 88            {
 89                "type": "http.response.start",
 90                "status": 200,
 91                "headers": [
 92                    (b"content-type", b"text/plain"),
 93                    (b"content-length", str(len(body)).encode()),
 94                ],
 95            }
 96        )
 97        await send(
 98            {
 99                "type": "http.response.body",
100                "body": body,
101            }
102        )
103

It imitates doing a time consuming work by sleeping.

Controllers

This simulations uses the same controller schemes in PI Controller simulation. Additionally, a new control scheme is implemented which is labeled by ‘C3’.

C3

The controller scheme C2 aims to make number of requests processed by two consumer equal by applying a PI control rule to the difference between the number of requests processed. It purposedly delays a branch if it is processing faster than the other.

‘C3’ considers the time consumed while processing requests instead of the number of requests. It tries to minimize the processing time for both consumers and also the difference between their total processing times.

447
448    ci = [0.0, 0.0]
449    """Integrator states"""
450    Kp = 1e-2 if not K_PI else K_PI[0]
451    """Propotional gain"""
452    Ki = 1e-4 if not K_PI else K_PI[1]
453    """Integrator gain"""
454    Zi = 1e-2
455    """Integrator damping"""
456    count = [0, 0]
457    """Total number of times the transitions t13 and t23 fire."""
458    total_delay = [0.0, 0.0]
459    """Total amount of time spent by consumers for completing HTTP requests."""
460
461    async def controller(place):
462        nonlocal ci
463        if not CONTROLLER_ENABLED:
464            """This happens when controller is chosen 'none'"""
465            return True
466        index = int(place._name[1:]) - 1
467        """Get branch index."""
468        sensor = sensors[index]
469        value: tuple[bool, float] = await sensor.get()
470        """Receive a notification from the consumer."""
471        if CONTROLLER_TYPE == "C2":
472            """This happens when controller is chosen 'C2'"""
473            count[index] += 1
474            err = count[index] - count[1 - index]
475            """Calculate the difference between branches"""
476            sleep_amount = Kp * err + ci[index]
477            ci[index] = (1.0 - Zi) * ci[index] + Ki * err
478            """PI controller"""
479            if abs(sleep_amount) > 1e4:
480                """This should never happen."""
481                print("!!!", sleep_amount, "!!!")
482                ci[index] = 0.0
483            await net.sleep(sleep_amount)
484            """Give a push to the other branch when it is slower."""
485            return True
486        elif CONTROLLER_TYPE == "C3":
487            """This happens when controller is chosen 'C3'"""
488            count[index] += 1
489            total_delay[index] += value[1]
490
491            # [[err-defs-start]]
492
493            err = total_delay[index] - total_delay[1 - index]
494            """Calculate the difference between branches"""
495            err += 0.0 - total_delay[index]
496            """Try to minimize the total time consumed."""
497
498            # [[err-defs-end]]
499
500            sleep_amount = 1e2 * Kp * err + ci[index]
501            ci[index] = (1.0 - Zi) * ci[index] + 1e2 * Ki * err
502            """PI controller"""
503            if abs(sleep_amount) > 1e4:
504                """This should never happen."""
505                print("!!!", sleep_amount, "!!!")
506                ci[index] = 0.0
507            await net.sleep(sleep_amount)
508            """Give a push to the other branch when it is slower."""
509            return True
510
511        return value[0]  # This is the case when controller is 'C1'.
512

Results

It is assumed that, the processing time of servers are modeled by an exponential random variable with an average processing delay of 0.01 seconds (100Hz).

Each simulation starts an ab (server benchmarking tool) process which sends 1000 POST requests with 1024 byte request body size and varying number of concurrent requests.

ab tool can save the results in CSV file with the structure below.

Percentage served,Time in ms
0,5.614
1,6.456
2,6.487
3,6.529
4,6.612
5,6.646
6,6.664

For example 5th line show that, 4% of requests replied in less than 6.65 milliseconds.

In summary, the same simulation run for three different controllers and several different number of concurrent requests and CSV files are obtained.

The figure below plots CSV files for one of the concurrency levels. The x axis shows the data in the second column of the CSV format given above. The y axis shows the first column divided by 100.

_images/result_1.png

The plot resembles a cumulative normal distrbution. When the numerical derivate of y axis data is taken with respect to x axis data, the plots below is obtained for different number of concurrent requests.

_images/result_2.png

The x axis is time and the y axis is time distrbution for different number of concurrent requests. The integer values on the left of plots show the number of concurrent requests. As the number of concurrent requests increases, the average serving time increases.

_images/result_0.png

The difference between consumed requests of two HTTP servers for different control schemes. The x axis is the number of concurrent requests and the y axis shows the difference between the number of requests consumed by two HTTP servers and the total number of requests consumed.

Comments

  • Plots resemble a normal distrbution with varying mean and standard deviations.

  • Mean value of the serving time is smallest for controller type ‘C3’ and ‘C1’.

  • Mean is larger but deviation from mean is smaller for ‘C2’.

  • Controller ‘C3’ has a smaller mean serving time than ‘C1’ for larger number of requesters. Its response can be fine tuned by adjusting the contributions of error terms below

492
493            err = total_delay[index] - total_delay[1 - index]
494            """Calculate the difference between branches"""
495            err += 0.0 - total_delay[index]
496            """Try to minimize the total time consumed."""
497
  • Controller ‘C2’ performs better if closer to “deterministic” service time is required.

  • The second plot is very similar to the results of PI Controller
    • The number of consumed requests are equal for ‘C2’.

Reproduce

sudo apt install python3-venv apache2-utils
python3 -m venv venv
source venv/bin/activate

make build
make build=http_balancer
make clean=http_balancer
make run=http_balancer
make results=http_balancer
make graph=http_balancer
make docs

Usage

Submodules

src.http_balancer.main module

src.http_balancer.main.USAGE()

Arguments:

-T <time (sec)>

total simulation time in seconds (\(T\))

Default: 10

-c <none|C1|C2|C3>

controller type

Default: C1

-o <filename>

output file name to write results. If empty, prints to stdout.

-L <option>
change in server load by time

e.g. 0.2,2;0.8,0.33;

Means load on 2nd consumer will be two times more than 1st consumer after time instant \(0.2T\) until time instant \(0.8T\). Then, its load will be 1/3 of 1st consumer’s load until the end of simulation.

Default: “0,1;”

-p <rate (Hz)>

new token output rate of the producer at each second

Default: 10

-H hostname

Default: 127.0.0.1

-P ports

port1,port2

Default: 8888,8889

-r <option>
random number generator params

e.g. exponential,0.1

Meaning consumation time of a token takes a random amount of time with an exponential distrbution having an average of 0.1 seconds.

Default: exponential,0.5

-G

if provided, the script generates PT net graph and exits

-K
comman separated PI controller gain values

e.g. 1e-1,1e-2

Default: 1e-2,1e-4

-A ab command’s PID

-C number of concurrent requests expected

Example

python src/http_balancer/main.py -T 8.5 -r exponential,0.05 -p 100 -c none

src.http_balancer.main.server_main(args, cond)

src.http_balancer.results module

src.http_balancer.results.load_results()
src.http_balancer.results.main(argv)
src.http_balancer.results.plot_results(results)

Module contents