src.pi_controller¶

It is recommended to read the SoyutNet documentation before going through this document.

Summary¶

This simulation investigates whether a proportional-integral (PI) controller structure can be used to balance the workload of two TCP servers that accept requests from a single source.

The auto-generated PT net diagram is given below.

In the diagram,

\(p_0\) produces the requests, which are represented by tokens with ‘◆’ labels. The tokens are generated at a predetermined rate.
\(e_1\) and \(e_2\) are consumers.
At \(p_1\), the graph branches and \(p_1\) redirects each request to the first available branch.
Transition \(t_{11}\) fires when both \(k_1\) and \(p_1\) have tokens. After it fires, the request at \(p_1\) is transferred to the input buffer \(p_{12}\) of consumer \(e_1\) through \(p_{11}\).
Transition \(t_{21}\) fires when both \(k_2\) and \(p_1\) have tokens. After it fires, the request at \(p_1\) is transferred to the input buffer \(p_{22}\) of consumer \(e_2\) through \(p_{21}\).
Finally, the requests are processed at consumers \(e_1\) and \(e_2\). Each request takes a random amount of time to process, while the producer generates tokens at a nearly constant rate that can exceed the average processing rate.

Control rule:

The places \(k_1\) and \(k_2\) act as controllers that aim to balance the workload of the consumers, because \(t_{11}\) and \(t_{21}\) cannot fire unless \(k_i\) allows it. So, \(p_1\) is forced to redirect each request to the enabled branch. After \(t_{i1}\) fires, the token labeled ‘○’ loops back to \(k_i\) at the next step.
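
The enabling rule above can be sketched in plain Python, without SoyutNet. The helper names `enabled_branches` and `route` are hypothetical and token counts are plain integers, so this only illustrates the rule and is not the library's firing logic.

```python
def enabled_branches(p1_tokens, k_tokens):
    """Return indices of branches whose transition t_{i1} may fire.

    t_{i1} is enabled only when both p_1 and k_i hold at least one token.
    """
    if p1_tokens < 1:
        return []
    return [i for i, k in enumerate(k_tokens) if k >= 1]


def route(p1_tokens, k_tokens):
    """Pick the first enabled branch, or None when no branch is enabled."""
    branches = enabled_branches(p1_tokens, k_tokens)
    return branches[0] if branches else None


print(route(1, [1, 1]))  # both controllers allow: index 0 (first branch) wins
print(route(1, [0, 1]))  # k_1 withholds its token: index 1 (second branch)
print(route(1, [0, 0]))  # no controller allows: None, the request waits at p_1
```

A controller steers traffic simply by holding back its ‘○’ token, which removes its branch from the list of enabled ones.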

Goal¶

Can you design a control law that makes the number of requests processed by both consumers equal?

System description¶

The simulation consists of

Producer
TCP clients at consumers
TCP servers
Controllers

The whole implementation can be found at https://github.com/dmrokan/soyutnet-simulations/blob/main/src/pi_controller/main.py

Producer¶

The producer (\(p_0\)) is assumed to generate requests (tokens) at a constant rate and transfer them to \(p_1\), which redirects each one to the first available branch.

In the simulation, the latency on the paths is negligible compared to the production period. However, it can be adjusted with the LOOP_DELAY parameter below.

    net = SoyutNet()
    net.SLOW_MOTION = True
    net.LOOP_DELAY = 0

The producer logic is defined as below.

    token_id = 0

    async def producer(place):
        nonlocal token_id
        await net.sleep(PRODUCE_DELAY)
        token_id += 1
        return [(L, token_id)]

The async function producer is called in a dedicated asyncio task loop with the period given by PRODUCE_DELAY. The produced tokens are labeled with the integer value L (shown as ‘◆’).
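
The same pattern can be tried without SoyutNet. In the sketch below, `asyncio.sleep` stands in for `net.sleep`, the values of `PRODUCE_DELAY` and `L` are assumed, and `make_producer` and `run` are hypothetical helpers that replace the library's task loop.

```python
import asyncio

PRODUCE_DELAY = 0.001  # seconds between tokens (assumed value)
L = 1  # integer token label (assumed value)


def make_producer():
    """Build a producer that keeps its own token counter in a closure."""
    token_id = 0

    async def producer(place=None):
        nonlocal token_id
        await asyncio.sleep(PRODUCE_DELAY)  # stands in for net.sleep
        token_id += 1
        return [(L, token_id)]

    return producer


async def run(n):
    """Call the producer n times, as a task loop would, and collect the tokens."""
    producer = make_producer()
    tokens = []
    for _ in range(n):
        tokens.extend(await producer())
    return tokens


tokens = asyncio.run(run(3))
print(tokens)  # [(1, 1), (1, 2), (1, 3)]
```

Each call returns a list with one `(label, id)` pair, so the ids increase by one per period.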

Consumers¶

The consumer is more complex because it communicates with a TCP server and also constantly notifies the controllers \(k_1\) and \(k_2\).

    sensors = [asyncio.Queue() for i in range(PROC_COUNT)]
    consumer_stats = {}

    async def consumer(place):
        async def echo_client():
            """Simple TCP echo client"""
            reader, writer = await asyncio.open_connection(HOST, PORTS[index])
            writer.write(MESSAGE)
            await writer.drain()
            data = await reader.read(MESSAGE_SIZE)
            writer.close()
            await writer.wait_closed()

        nonlocal consumer_stats
        start_time = 0
        ident = place.ident()
        index = int(place._name[1:]) - 1
        """Get branch index (0 or 1)"""
        sensor = sensors[index]
        if ident not in consumer_stats:
            """Initialize stats at first call of the consumer."""
            consumer_stats[ident] = {"started_at": time.time(), "count": 0}
            """Store initial time and number of requests processed to calculate requests per second."""
            sensor.put_nowait(1)
            """Initial push to the controllers; otherwise they would get stuck waiting for the sensor."""

        label = L
        token = place.get_token(label)
        T = time.time()
        if not token:
            consumer_stats[ident]["last_at"] = time.time()
            sensor.put_nowait(0)
            """If there is no new token in the buffer, inform the controller."""
            return

        await echo_client()
        """Fulfill the request."""

        sensor.put_nowait(1)
        """Inform the controller."""
        consumer_stats[ident]["count"] += 1
        consumer_stats[ident]["last_at"] = time.time()
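
The fields stored in `consumer_stats` are enough to compute a requests-per-second figure. How the repository derives that number is not shown here, so the helper below (`requests_per_second`, a hypothetical name) is only one plausible way to do it, applied to made-up values.

```python
def requests_per_second(stats):
    """Average throughput of one consumer from its stats record."""
    elapsed = stats["last_at"] - stats["started_at"]
    if elapsed <= 0:
        return 0.0  # no measurable interval yet
    return stats["count"] / elapsed


# Made-up record: 270 requests served over 3 seconds.
example = {"started_at": 100.0, "last_at": 103.0, "count": 270}
print(requests_per_second(example))  # 90.0
```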

TCP servers¶

The simulation starts two TCP servers which run in separate child processes and each is assigned to one of the consumers. The important part is given below.

    async def handle_echo(reader, writer):
        data = await reader.read(MESSAGE_SIZE)
        delay_amount = rand()
        await asyncio.sleep(delay_amount)
        """Imitate a time consuming process by delay."""
        writer.write(data)
        await writer.drain()
        writer.close()
        await writer.wait_closed()

It imitates time-consuming work by sleeping. The sleep duration is a random number with an adjustable mean value. The TCP servers can be made imbalanced by purposely increasing the average delay of one of them.
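
A delay source with an adjustable mean could look like the sketch below. It assumes exponentially distributed delays, as stated in the Results section. The factory `make_rand` and the 0.02 s mean for the slower server are illustrative choices, not values taken from the repository.

```python
import random


def make_rand(mean_delay, seed=None):
    """Return a function that samples exponential delays with the given mean."""
    rng = random.Random(seed)
    return lambda: rng.expovariate(1.0 / mean_delay)


rand_fast = make_rand(0.01, seed=1)  # 0.01 s on average
rand_slow = make_rand(0.02, seed=2)  # deliberately slower server (assumed value)

n = 20000
mean_fast = sum(rand_fast() for _ in range(n)) / n
mean_slow = sum(rand_slow() for _ in range(n)) / n
print(round(mean_fast, 4), round(mean_slow, 4))
```

Passing a different `mean_delay` to each server is all it takes to make the two branches imbalanced.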

Controllers¶

    ci = [0.0, 0.0]
    """Integrator states"""
    Kp = 1e-2 if not K_PI else K_PI[0]
    """Proportional gain"""
    Ki = 1e-4 if not K_PI else K_PI[1]
    """Integrator gain"""
    Zi = 1e-2
    """Integrator damping"""
    count = [0, 0]
    """Total number of times the transitions t13 and t23 fire."""

    async def controller(place):
        nonlocal ci
        if not CONTROLLER_ENABLED:
            """This happens when controller is chosen 'none'"""
            return True
        index = int(place._name[1:]) - 1
        """Get branch index."""
        sensor = sensors[index]
        value = await sensor.get()
        """Receive a notification from the consumer."""
        if CONTROLLER_TYPE == "C2":
            """This happens when controller is chosen 'C2'"""
            count[index] += 1
            err = count[index] - count[1 - index]
            """Calculate the difference between branches"""
            sleep_amount = Kp * err + ci[index]
            ci[index] = (1.0 - Zi) * ci[index] + Ki * err
            """PI controller"""
            if abs(sleep_amount) > 1e4:
                """This should never happen."""
                print("!!!", sleep_amount, "!!!")
                ci[index] = 0.0
            await net.sleep(sleep_amount)
            """Give a push to the other branch when it is slower."""
            return True

        return value > 0  # This is the case when controller is 'C1'.

The important lines are

    value = await sensor.get()
    """Receive a notification from the consumer."""

The controller places \(k_i\) receive updates from consumers and operate accordingly.

none¶

When no control rule is active, the token labeled ‘○’ loops through \(p_{i1}\) and \(k_i\) without any delay. The task loop of \(p_1\) receives a token from \(t_0\) at each iteration and checks its output arcs for availability. It always redirects the tokens to \(t_{11}\) because that transition is always enabled when a new token arrives. As a result, all requests are redirected to the consumer \(e_1\).

C1¶

This control rule helps balance the TCP servers because it waits for a notification from the consumer that the last request sent to the TCP server has been answered. So, it disables its own branch while waiting and lets \(p_1\) redirect the new token to the other branch.
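
The waiting behaviour can be illustrated with a plain `asyncio.Queue`. The sketch below is a stripped-down stand-in for the C1 path of `controller`. The name `c1_controller` and the 0.01 s pause are assumptions made for the demo. It only shows that the controller does not return until the consumer posts a notification.

```python
import asyncio


async def c1_controller(sensor):
    """Block until the consumer notifies, then report whether the value is positive."""
    value = await sensor.get()
    return value > 0


async def demo():
    sensor = asyncio.Queue()
    task = asyncio.create_task(c1_controller(sensor))
    await asyncio.sleep(0.01)  # give the controller time to start waiting
    was_waiting = not task.done()  # still blocked: no notification yet
    sensor.put_nowait(1)  # the consumer reports a served request
    result = await task
    return was_waiting, result


print(asyncio.run(demo()))  # (True, True)
```

While the controller is blocked on `sensor.get()`, its branch offers no token, which is what lets the other branch take the next request.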

C2¶

This one implements a PI-control-based approach. It aims to make the number of requests processed by both TCP servers nearly equal. It keeps track of the number of firings of transitions \(t_{13}\) and \(t_{23}\) and tries to make their difference as small as possible.

The measured metric is the number of firings and the control input is the sleep function. It delays the branch’s operation if it is going faster than the other branch.

The advantage of using the number of firings of \(t_{i3}\) as the measure, instead of the number of tokens processed by the consumers, is that the control rule is not affected by the random processing time of the TCP servers. It measures a partially deterministic metric. On the other hand, using the sleep function as the control input introduces an additional delay, which limits the total number of processed requests.
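
The arithmetic of a single C2 update can be isolated as a pure function. The sketch below copies the two update lines from `controller` and uses the default gains listed earlier. The function name `pi_step` and the error sequence are made up for illustration.

```python
def pi_step(err, ci, Kp=1e-2, Ki=1e-4, Zi=1e-2):
    """One C2 update: return (sleep_amount, new_integrator_state)."""
    sleep_amount = Kp * err + ci  # uses the integrator state from before this step
    new_ci = (1.0 - Zi) * ci + Ki * err  # damped (leaky) integration of the error
    return sleep_amount, new_ci


ci = 0.0
for err in (5, 5, -2):  # made-up differences in firing counts
    sleep_amount, ci = pi_step(err, ci)
    print(err, sleep_amount, ci)
```

A positive `err` means this branch is ahead and yields a positive delay. A negative result means the branch is behind, and how `net.sleep` treats a negative argument is not covered here.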

Results¶

The processing time of each server is modeled as an exponential random variable with an average processing delay of 0.01 seconds (100 Hz). Each simulation takes nearly 3 seconds.

The simulations are run for all controller types and several token production rates from 25 Hz to 750 Hz.

_images/result_12.png — The x axis shows the token producing frequency. The top plot shows the difference between number of tokens consumed by \(e_1\) and \(e_2\) at each simulation. The bottom plot shows the total number of tokens consumed at each simulation.¶

Comments:

At a 25 Hz producer rate, the consumers process requests faster than they are produced, so the first branch is always available for new requests under all control rules. There is a big difference between the number of requests processed by the two consumers.
As the producer rate gets close to the average processing rate of the consumers (100 Hz), the gap between consumed requests goes to zero.
C2 balances the number of requests processed by the consumers more successfully, at the expense of an additional delay in the control loop, as explained in Section C2.
C1 achieves the highest total number of processed requests and can balance the consumer workloads.
When no control rule is present, all requests pile up in the input buffer (\(p_{12}\)) of the first consumer. This is the worst case of the three.

P-controller¶

In this case, the integrator gain Ki is set to zero instead of the default value 1e-4, meaning that only the proportional gain is active. The plot below shows the results for this case.

_images/result_22.png — The results when integrator is disabled.¶

Comments:

This shows the effectiveness of the PI-controller scheme: the controller of a branch (\(k_i\)) tracks the operation of the opposite branch much more closely when the integrator action is active.
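
One way to see what the integrator contributes is to hold the error constant and iterate the update equations from the Controllers section. With damping `Zi`, the integrator state settles at `Ki * err / Zi`, so the PI delay approaches `Kp * err + Ki * err / Zi`, while the P-only delay stays at `Kp * err`. The sketch below checks this numerically. The function name `steady_state_sleep` and the constant error of 10 are assumptions, and the calculation says nothing about the plotted results themselves.

```python
def steady_state_sleep(err, Kp=1e-2, Ki=1e-4, Zi=1e-2, steps=5000):
    """Iterate the C2 update with a constant error and return the final sleep."""
    ci = 0.0
    sleep_amount = 0.0
    for _ in range(steps):
        sleep_amount = Kp * err + ci
        ci = (1.0 - Zi) * ci + Ki * err
    return sleep_amount


err = 10  # made-up constant difference in firing counts
pi_sleep = steady_state_sleep(err)  # approaches Kp*err + Ki*err/Zi
p_sleep = steady_state_sleep(err, Ki=0.0)  # stays at Kp*err
print(pi_sleep, p_sleep)
```

With the default gains, a persistent error therefore produces about twice the delay under PI control as under P-only control.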

Reproduce¶

sudo apt install python3-venv
python3 -m venv venv
source venv/bin/activate

make build
make build=pi_controller
make clean=pi_controller
make run=pi_controller
make results=pi_controller
make docs

Usage¶

Submodules¶