src.http_balancer¶

It is recommended to read the PI Controller documentation before going through this document.

Summary¶

This simulation investigates whether a proportional-integral (PI) controller structure can balance the workload of two HTTP servers that accept requests from a single source.

The auto-generated PT net diagram is given below.

In the diagram,

\(p_0\) produces the requests, which are represented by tokens labeled ‘◆’. The tokens are generated at a predetermined rate.
\(e_1\) and \(e_2\) are consumers.
The graph branches at \(p_1\), which redirects each request to the first available branch.
Transition \(t_{11}\) fires when both \(k_1\) and \(p_1\) have tokens. After it fires, the request at \(p_1\) is transferred to the input buffer \(p_{12}\) of consumer \(e_1\) through \(p_{11}\).
Transition \(t_{21}\) fires when both \(k_2\) and \(p_1\) have tokens. After it fires, the request at \(p_1\) is transferred to the input buffer \(p_{22}\) of consumer \(e_2\) through \(p_{21}\).
Finally, the requests are processed at consumers \(e_1\) and \(e_2\). Each request takes a random amount of time to process, while the producer generates tokens at a nearly constant rate that can exceed the average processing rate.

Control rule:

The places \(k_1\) and \(k_2\) act as controllers that aim to balance the workload of the consumers: \(t_{11}\) and \(t_{21}\) cannot fire unless \(k_i\) allows it, so \(p_1\) is forced to redirect each request to the enabled branch. After \(t_{i1}\) fires, the token labeled ‘○’ loops back to \(k_i\) at the next step.
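
The control rule can be illustrated with a toy marking that does not use SoyutNet at all. The helper names `enabled` and `fire` are hypothetical, and the return of the control token is collapsed into the same step for brevity. In the actual simulation the controller decides when that token becomes available again.

```python
def enabled(marking, branch):
    """t_{i1} is enabled only when both p1 and k_i hold a token."""
    return marking["p1"] > 0 and marking[f"k{branch}"] > 0


def fire(marking, branch):
    """Move one request from p1 to the branch buffer p_{i2}."""
    if not enabled(marking, branch):
        return False
    marking["p1"] -= 1
    marking[f"k{branch}"] -= 1   # the control token is consumed
    marking[f"p{branch}2"] += 1  # the request reaches the input buffer
    marking[f"k{branch}"] += 1   # the control token loops back (simplified)
    return True


marking = {"p1": 2, "k1": 0, "k2": 1, "p12": 0, "p22": 0}
fired_1 = fire(marking, 1)  # k1 is empty, so branch 1 is blocked
fired_2 = fire(marking, 2)  # k2 holds a token, so branch 2 takes the request
```

With `k1` empty, the only way for a request to leave `p1` is through branch 2, which is how withholding a control token steers the load.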

System description¶

The simulation consists of

Producer
TCP clients at consumers
HTTP servers
Controllers

The whole implementation can be found at https://github.com/dmrokan/soyutnet-simulations/blob/main/src/http_balancer/main.py

Compared to the PI Controller simulation, the only difference is the implementation of the producer and the consumers.

Producer¶

In this case, the main asyncio loop starts a Uvicorn HTTP server.

    soyutnet.run(reg, extra_routines=[canceller(), uvicorn_main()])
    """Start simulation"""

A new token is generated when the HTTP server receives a request. The request data is bound to the token.

    treg = net.TokenRegistry()
    req_queue = asyncio.Queue()

    def new_http_request_token(scope, receive, send, cond):
        token = net.Token(label=L, binding=(scope, receive, send, cond))
        treg.register(token)

        return (token._label, token._id)

    async def uvicorn_app(scope, receive, send):
        if scope["type"] != "http":
            return
        cond = asyncio.Condition()
        token = new_http_request_token(scope, receive, send, cond)
        await req_queue.put(token)
        async with cond:
            await cond.wait()
        """Wait until the endpoint fulfills the HTTP request"""

Then, the token is injected into the PT net. However, only the label and ID of the token travel through the net. The bound object is registered in the soyutnet.SoyutNet.TokenRegistry.

    async def producer(place):
        token = await req_queue.get()
        return [token]

    """Inject token"""
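
The idea of sending only a lightweight key through the net while the payload waits in a registry can be sketched as follows. `ToyRegistry` is a hypothetical stand-in written for illustration. It is not the SoyutNet `TokenRegistry` API, and its method signatures are assumptions.

```python
import itertools


class ToyRegistry:
    """Hypothetical stand-in for a token registry; not the SoyutNet API."""

    def __init__(self):
        self._entries = {}
        self._ids = itertools.count(1)

    def register(self, label, binding):
        """Store the payload and return the lightweight (label, id) pair."""
        token_id = next(self._ids)
        self._entries[(label, token_id)] = binding
        return (label, token_id)

    def pop_entry(self, label, token_id):
        """Return and remove the payload, or None if the key is unknown."""
        return self._entries.pop((label, token_id), None)


registry = ToyRegistry()
key = registry.register(1, {"path": "/echo"})  # only this pair would travel
first = registry.pop_entry(*key)               # payload recovered at the consumer
second = registry.pop_entry(*key)              # a second lookup finds nothing
```

Popping the entry on lookup means a payload can be claimed by one consumer only, which matches the `None` check in the consumer code.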

Consumers¶

Similar to the PI Controller simulation, consumers \(e_1\) and \(e_2\) receive each token as a label and an ID. They then look up the actual token as shown below.

        t0 = time.time()

        def dt():
            return time.time() - t0

        nonlocal consumer_stats
        t0 = time.time()
        ident = place.ident()
        index = int(place._name[1:]) - 1
        """Get branch index (0 or 1)"""
        sensor = sensors[index]
        if ident not in consumer_stats:
            """Initialize stats at the first call of the consumer."""
            consumer_stats[ident] = {"started_at": time.time(), "count": 0}
            """Store the initial time and the number of requests processed to calculate requests per second."""
            sensor.put_nowait((True, dt()))
            """Initial push to the controllers; otherwise they get stuck waiting for the sensor."""

        label = L
        token = place.get_token(label)
        T = time.time()
        if not token:
            consumer_stats[ident]["last_at"] = time.time()
            sensor.put_nowait((False, dt()))
            """If there is no new token in the buffer, inform the controller."""
            return

        actual_token = treg.pop_entry(*token)
        """Get actual SoyutNet.Token object from SoyutNet.TokenRegistry"""
        if actual_token is None:
            consumer_stats[ident]["last_at"] = time.time()
            sensor.put_nowait((False, dt()))
            """If there is no actual token in the register, inform the controller."""
            return

The consumers then redirect the HTTP request defined by the token to the actual HTTP servers running in child processes.

        uvicorn_scope, uvicorn_receive, uvicorn_send, cond = actual_token.get_binding()
        """Get the object bound to the actual token"""
        await http_proxy(uvicorn_scope, uvicorn_receive, uvicorn_send)
        """Fulfill the request."""
        async with cond:
            cond.notify_all()
        """Inform uvicorn_app that request is replied"""

Finally, the HTTP response is sent to the original source by the await uvicorn_send(...) calls.
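
The handshake between the request handler and the consumer can be reproduced in isolation. The sketch below is a simplified variation, not the simulation code: `handler` and `worker` are hypothetical names, and a `done` flag with `wait_for` is added so that a notification sent before the handler starts waiting would not be lost.

```python
import asyncio


async def handler(queue, log):
    """Plays the role of uvicorn_app: enqueue the request, then wait."""
    cond = asyncio.Condition()
    state = {"done": False}
    await queue.put((cond, state))
    async with cond:
        # wait_for re-checks the flag, so an early notification is not missed.
        await cond.wait_for(lambda: state["done"])
    log.append("handler resumed")


async def worker(queue, log):
    """Plays the role of the consumer: serve the request, then notify."""
    cond, state = await queue.get()
    await asyncio.sleep(0.01)  # stand-in for proxying the request
    log.append("request served")
    async with cond:
        state["done"] = True
        cond.notify_all()


async def main():
    queue, log = asyncio.Queue(), []
    await asyncio.gather(handler(queue, log), worker(queue, log))
    return log


log = asyncio.run(main())
```

The handler coroutine stays suspended for the whole time the request spends inside the net, so the client connection remains open until the consumer replies.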

HTTP servers¶

The simulation runs two additional Uvicorn HTTP server instances in child processes, which echo back the body of POST requests.

    async def read_body(receive):
        """
        Read and return the entire body from an incoming ASGI message.
        """
        body = b""
        more_body = True

        while more_body:
            message = await receive()
            body += message.get("body", b"")
            more_body = message.get("more_body", False)

        return body

    async def uvicorn_app(scope, receive, send):
        """
        Echo the request body back in an HTTP response.
        """
        body = await read_body(receive)
        delay_amount = rand()
        await asyncio.sleep(delay_amount)
        """Imitate a time consuming process by delay."""
        await send(
            {
                "type": "http.response.start",
                "status": 200,
                "headers": [
                    (b"content-type", b"text/plain"),
                    (b"content-length", str(len(body)).encode()),
                ],
            }
        )
        await send(
            {
                "type": "http.response.body",
                "body": body,
            }
        )

Each server imitates time-consuming work by sleeping for a random amount of time.
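
An ASGI application of this kind can be exercised without a network by passing in-memory `receive` and `send` callables. The sketch below repeats the echo logic without the artificial delay. `echo_app` and `call_app` are hypothetical names introduced for the example.

```python
import asyncio


async def read_body(receive):
    body, more_body = b"", True
    while more_body:
        message = await receive()
        body += message.get("body", b"")
        more_body = message.get("more_body", False)
    return body


async def echo_app(scope, receive, send):
    """Same echo logic as above, without the artificial delay."""
    body = await read_body(receive)
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-length", str(len(body)).encode())],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app(chunks):
    """Feed the request body in chunks and collect the messages sent back."""
    pending = [
        {"type": "http.request", "body": chunk, "more_body": i < len(chunks) - 1}
        for i, chunk in enumerate(chunks)
    ]
    sent = []

    async def receive():
        return pending.pop(0)

    async def send(message):
        sent.append(message)

    await echo_app({"type": "http", "method": "POST"}, receive, send)
    return sent


sent = asyncio.run(call_app([b"hello ", b"world"]))
```

Splitting the body into two chunks exercises the `more_body` loop in `read_body`.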

Controllers¶

This simulation uses the same controller schemes as the PI Controller simulation. Additionally, a new control scheme labeled ‘C3’ is implemented.

C3¶

The controller scheme C2 aims to equalize the number of requests processed by the two consumers by applying a PI control rule to the difference between their processed-request counts. It purposely delays a branch if it is processing faster than the other.

‘C3’ considers the time consumed while processing requests instead of the number of requests. It tries to minimize the processing time for both consumers and also the difference between their total processing times.

    ci = [0.0, 0.0]
    """Integrator states"""
    Kp = 1e-2 if not K_PI else K_PI[0]
    """Proportional gain"""
    Ki = 1e-4 if not K_PI else K_PI[1]
    """Integrator gain"""
    Zi = 1e-2
    """Integrator damping"""
    count = [0, 0]
    """Total number of times the transitions t13 and t23 fire."""
    total_delay = [0.0, 0.0]
    """Total amount of time spent by consumers for completing HTTP requests."""

    async def controller(place):
        nonlocal ci
        if not CONTROLLER_ENABLED:
            """This happens when controller is chosen 'none'"""
            return True
        index = int(place._name[1:]) - 1
        """Get branch index."""
        sensor = sensors[index]
        value: tuple[bool, float] = await sensor.get()
        """Receive a notification from the consumer."""
        if CONTROLLER_TYPE == "C2":
            """This happens when controller is chosen 'C2'"""
            count[index] += 1
            err = count[index] - count[1 - index]
            """Calculate the difference between branches"""
            sleep_amount = Kp * err + ci[index]
            ci[index] = (1.0 - Zi) * ci[index] + Ki * err
            """PI controller"""
            if abs(sleep_amount) > 1e4:
                """This should never happen."""
                print("!!!", sleep_amount, "!!!")
                ci[index] = 0.0
            await net.sleep(sleep_amount)
            """Give a push to the other branch when it is slower."""
            return True
        elif CONTROLLER_TYPE == "C3":
            """This happens when controller is chosen 'C3'"""
            count[index] += 1
            total_delay[index] += value[1]

            # [[err-defs-start]]

            err = total_delay[index] - total_delay[1 - index]
            """Calculate the difference between branches"""
            err += 0.0 - total_delay[index]
            """Try to minimize the total time consumed."""

            # [[err-defs-end]]

            sleep_amount = 1e2 * Kp * err + ci[index]
            ci[index] = (1.0 - Zi) * ci[index] + 1e2 * Ki * err
            """PI controller"""
            if abs(sleep_amount) > 1e4:
                """This should never happen."""
                print("!!!", sleep_amount, "!!!")
                ci[index] = 0.0
            await net.sleep(sleep_amount)
            """Give a push to the other branch when it is slower."""
            return True

        return value[0]  # This is the case when controller is 'C1'.
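
The arithmetic of the damped PI update can be followed with a few numbers. The function `pi_step` below is a hypothetical helper that repeats the two update lines of the ‘C2’ branch with its default gains. The ‘C3’ branch uses the same form with both gains scaled by 1e2.

```python
def pi_step(err, integ, kp=1e-2, ki=1e-4, zi=1e-2):
    """One update of the damped PI rule.

    Returns the output (used as a sleep duration) and the new integrator state.
    """
    output = kp * err + integ
    new_integ = (1.0 - zi) * integ + ki * err
    return output, new_integ


integ = 0.0
outputs = []
for err in (3, 3, 3):  # the branch stays three requests ahead for three steps
    out, integ = pi_step(err, integ)
    outputs.append(out)
```

Because of the damping term `Zi`, a constant error does not make the integrator grow without bound: its fixed point is `Ki * err / Zi`, which is 0.03 for the values used here.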

Results¶

The processing time of each server is modeled as an exponential random variable with an average processing delay of 0.01 seconds (a rate of 100 Hz).
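
The definition of `rand()` is not shown in this document. As an assumption, an exponential delay with the stated mean could be drawn with the standard library as sketched below, where `sample_delay` and `MEAN_DELAY` are hypothetical names.

```python
import random

MEAN_DELAY = 0.01  # seconds, i.e. a rate of 100 requests per second


def sample_delay(rng):
    """Draw one processing delay; expovariate takes the rate (1 / mean)."""
    return rng.expovariate(1.0 / MEAN_DELAY)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_delay(rng) for _ in range(100_000)]
empirical_mean = sum(samples) / len(samples)
```

The empirical mean of a large sample should be close to 0.01 seconds, while individual delays vary widely, which is what makes balancing by request count and balancing by processing time behave differently.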

Each simulation starts an ab (ApacheBench, the Apache HTTP server benchmarking tool) process, which sends 1000 POST requests with a 1024-byte request body at a varying number of concurrent requests.
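
The exact command line used by the simulation is not given here. As an illustration only, an ab invocation matching the description could be assembled as below. The URL, file names, content type and the helper name `build_ab_command` are placeholders, and the list is built but not executed.

```python
def build_ab_command(url, concurrency, body_file, csv_file, requests=1000):
    """Return an argv list for ApacheBench; nothing is executed here."""
    return [
        "ab",
        "-n", str(requests),     # total number of requests
        "-c", str(concurrency),  # number of concurrent requests
        "-p", body_file,         # file containing the POST body
        "-T", "text/plain",      # Content-Type header for the POST body
        "-e", csv_file,          # write the percentage-served CSV file
        url,
    ]


# The URL, file names and concurrency level below are placeholders.
command = build_ab_command("http://127.0.0.1:8000/", 8, "body.bin", "c8.csv")
```

Such a list could be passed to `subprocess.run` once a 1024-byte body file exists and the server is listening.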

The ab tool can save the results in a CSV file with the structure below.

Percentage served,Time in ms
0,5.614
1,6.456
2,6.487
3,6.529
4,6.612
5,6.646
6,6.664

For example, the row 4,6.612 shows that 4% of the requests were served within 6.612 milliseconds.
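
A file with this structure can be read with the standard `csv` module. The sketch below parses the sample rows shown above. `load_percentiles` is a hypothetical helper name.

```python
import csv
import io

SAMPLE = """Percentage served,Time in ms
0,5.614
1,6.456
2,6.487
3,6.529
4,6.612
5,6.646
6,6.664
"""


def load_percentiles(text):
    """Map each percentage to the time (ms) within which it was served."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        int(row["Percentage served"]): float(row["Time in ms"])
        for row in reader
    }


percentiles = load_percentiles(SAMPLE)
```

For a file on disk, the same reader can be fed an open file object instead of `io.StringIO`.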

In summary, the same simulation is run for three different controllers and several concurrency levels, and a CSV file is obtained for each run.

The figure below plots CSV files for one of the concurrency levels. The x axis shows the data in the second column of the CSV format given above. The y axis shows the first column divided by 100.

The plot resembles a cumulative normal distribution. Taking the numerical derivative of the y axis data with respect to the x axis data gives the plots below for different numbers of concurrent requests.

_images/result_2.png — The x axis is time and the y axis is the serving time distribution for different numbers of concurrent requests. The integer values on the left of the plots show the number of concurrent requests. As the number of concurrent requests increases, the average serving time increases.¶
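
The numerical derivative of the cumulative data can be approximated with finite differences. The sketch below is one possible way to do it and is not taken from the plotting script. `density_from_percentiles` is a hypothetical name, and the input is the sample CSV data shown earlier.

```python
def density_from_percentiles(times_ms, fractions):
    """Forward-difference estimate of d(fraction)/d(time) between rows."""
    mids, density = [], []
    for i in range(len(times_ms) - 1):
        dt = times_ms[i + 1] - times_ms[i]
        if dt <= 0:
            continue  # skip repeated time values to avoid dividing by zero
        mids.append(0.5 * (times_ms[i] + times_ms[i + 1]))
        density.append((fractions[i + 1] - fractions[i]) / dt)
    return mids, density


times_ms = [5.614, 6.456, 6.487, 6.529, 6.612, 6.646, 6.664]
fractions = [p / 100 for p in range(7)]  # first CSV column divided by 100
mids, density = density_from_percentiles(times_ms, fractions)
```

Each density value belongs to the midpoint of two neighbouring times. Closely spaced times produce large values, which is where the peak of the distribution appears.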

_images/result_0.png — The difference between consumed requests of two HTTP servers for different control schemes. The x axis is the number of concurrent requests and the y axis shows the difference between the number of requests consumed by two HTTP servers and the total number of requests consumed.¶

Comments¶

The plots resemble normal distributions with varying means and standard deviations.
The mean serving time is smallest for controller types ‘C3’ and ‘C1’.
For ‘C2’, the mean is larger but the deviation from the mean is smaller.
Controller ‘C3’ has a smaller mean serving time than ‘C1’ for larger numbers of requesters. Its response can be fine-tuned by adjusting the contributions of the error terms below.

            err = total_delay[index] - total_delay[1 - index]
            """Calculate the difference between branches"""
            err += 0.0 - total_delay[index]
            """Try to minimize the total time consumed."""

Controller ‘C2’ is the better choice when a service time closer to “deterministic” is required.
The second plot is very similar to the results of the PI Controller simulation:
- The numbers of consumed requests are equal for ‘C2’.

Reproduce¶

sudo apt install python3-venv apache2-utils
python3 -m venv venv
source venv/bin/activate

make build
make build=http_balancer
make clean=http_balancer
make run=http_balancer
make results=http_balancer
make graph=http_balancer
make docs

Usage¶

Submodules¶