
Load Balancing Strategy for Grid of Instances


#1

Simulation
In our simulation, we have a 5x5 grid of potential ‘world’ positions, giving us a maximum of 25 possible worlds at one time. In my current design, worlds are 2x2km, and there is a margin of 2km between the edges of adjacent worlds. This way, even when a player is on the edge of one world, looking in the direction of another, the adjacent world won’t be checked out (as long as ‘entity interest range’ is lower than 2km, which it is).
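For anyone who wants to picture the layout, here is a minimal sketch of the slot arithmetic. The constants come from the description above; the function names, the row-major slot numbering, and placing slot 0's corner at the origin are assumptions for illustration only, not how the project actually lays things out.

```python
WORLD_SIZE_M = 2_000   # each world is 2x2km
MARGIN_M = 2_000       # gap between the edges of adjacent worlds
GRID_DIM = 5           # 5x5 grid, so 25 slots


def world_origin(slot):
    """Return the (x, z) minimum corner of the world in a slot (0..24).

    Assumes row-major numbering with slot 0's corner at (0, 0).
    """
    if not 0 <= slot < GRID_DIM * GRID_DIM:
        raise ValueError(f"slot must be in 0..{GRID_DIM * GRID_DIM - 1}")
    row, col = divmod(slot, GRID_DIM)
    pitch = WORLD_SIZE_M + MARGIN_M  # distance between neighbouring corners
    return (col * pitch, row * pitch)


def is_isolated(interest_range_m):
    """True if a player on a world's edge cannot reach into a neighbour."""
    return interest_range_m < MARGIN_M
```

With these numbers, neighbouring worlds start 4km apart, and any interest range below 2km keeps a player standing on one world's edge from pulling in entities from the next.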

Instancing
To manage the instances, I have a ServerManager entity that sits at (0,0,0) and has a hard-coded EntityId of 1. Players begin their session by searching for lobbies via Steam. If a lobby is found, Player sends a join_server command to ServerManager, which sends the player to the respective world’s position. Otherwise, if no lobby is found, one is created, and the Player sends a create_server command to ServerManager, which spawns the world and sends Player to it.
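The join/create flow can be sketched roughly as below. This is plain Python with no SpatialOS SDK calls; the class layout, the `lobby_id` argument, and the first-free-slot policy are assumptions made for illustration, and only the `join_server` and `create_server` command names come from the description above.

```python
class ServerManager:
    """Toy model of the entity that maps Steam lobbies to world slots."""

    def __init__(self, max_worlds=25):
        self.max_worlds = max_worlds
        self.lobby_to_slot = {}  # lobby id -> world slot index

    def create_server(self, lobby_id):
        """Reserve the first free slot for a new lobby and return it."""
        if lobby_id in self.lobby_to_slot:
            return self.lobby_to_slot[lobby_id]
        used = set(self.lobby_to_slot.values())
        for slot in range(self.max_worlds):
            if slot not in used:
                self.lobby_to_slot[lobby_id] = slot
                return slot
        raise RuntimeError("no free world slots")

    def join_server(self, lobby_id):
        """Return the slot of an existing lobby (KeyError if unknown)."""
        return self.lobby_to_slot[lobby_id]
```

In the real simulation, the returned slot would be turned into a world position and the player entity moved there. Since every connecting player goes through this single entity, it is also where the worry about high connection volumes would show up first.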

With respect to the ServerManager, it’s working so far, but it doesn’t feel very ‘SpatialOS-y’ in design, and I could see problems with high volumes of connecting users, so my ears are very open to suggestions!

World-instancing works pretty well when it’s run on one worker (singleton mode), as long as I don’t spawn too many entities inside them (each instance normally houses hundreds of entities), which assures me that the bulk of the logic is complete, and brings me to optimization:

When I began working on the load balancing settings, I started having trouble finding a load balancing strategy that results in reliably having multiple workers per instance.

With Auto Hex Grid:

-num_workers = 50: Lucky instances get 2 workers, sadly sometimes at the expense of an adjacent instance. Not enough workers to handle load-per-instance.

-num_workers = 100: Simulation will only spawn about 60 workers, and it’s in a bottom-up direction, so the top third of my simulation-space is left worker-less.

Similar results with Static Hex Grid

Dynamic Load Balancing

With dynamic load balancing, the workers seem to gravitate towards the center of the simulation, even when there are overloaded workers elsewhere. With the hex grid params, instance #1 will have a worker or two, but many under-loaded workers never come to help, and instead converge on the middle of the simulation. With random_params, some instances sometimes end up with no coverage at all.

Points of Interest

Adding each instance spawn point to the list of POIs resulted in each instance having only one worker. After re-reading the docs, I realized this was expected behavior, so this probably isn’t what I need.

Anything I’m missing?

I realize SpatialOS isn’t really built for instance-based games, but it’s immensely useful for the game we’re making.

Something that would be ideal, which I’ve heard of before, is a system that starts new deployments instead of instancing, and routes players between them from a game launcher. If this is possible, it may be the right way to utilize SpatialOS’s power without compromising on a complicated instancing structure.

Sorry for the long read, hopefully it gave a good idea of where I’m at! :sweat_smile:


#2

Hi,

I have a similar entity to your ServerManager. Would spawning more of them solve the problem?

Also, you didn’t talk about what kind of worker you are using, which makes me think you are using only one kind?

Splitting your logic into multiple workers may be required to get the performance you want.

My 0.02$


#3

Basil, my man - the questions you are asking are absolutely brilliant.
What I’d love to do - is get on a call with you, discuss it in person - and then write up the response here for everyone else.
that cool?
I’m slammed next week - but I’ll have someone reach out and we’ll get something organised ASAP.


#4

Ah thanks man!
That sounds perfect.
Because of deadlines, it would be amazing if you could squeeze me in sometime next week but if not I totally understand. Looking forward to it!


#5

To unblock you as fast as possible: if you want to guarantee that each “instance” gets at least two workers, the only way to do it right now is to use the static hex grid and deliberately spawn each instance so that it overlaps several workers.
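A rough way to sanity-check a placement before deploying is to count how many hex cells a world's footprint touches. The sketch below uses a generic pointy-top hexagonal tiling with one cell centered on the origin; the actual geometry and parameters of the static hex grid are not given in this thread, so the cell radius, the tiling orientation, and both function names are assumptions. Sampling can miss very thin slivers, so treat the count as a lower bound.

```python
import math


def hex_centers(radius, extent):
    """Centers of pointy-top hexagons covering at least [-extent, extent]^2."""
    n = int(extent / radius) + 2
    centers = []
    for r in range(-n, n + 1):
        # q spans a wider range because rows are sheared by r / 2.
        for q in range(-2 * n, 2 * n + 1):
            x = math.sqrt(3) * radius * (q + r / 2)
            y = 1.5 * radius * r
            centers.append((x, y))
    return centers


def cells_overlapped(origin, size, radius, samples=21):
    """Count the hex cells hit by sample points of a square world.

    A point belongs to the hexagon whose center is nearest to it.
    """
    ox, oy = origin
    extent = max(abs(ox), abs(oy)) + size
    centers = hex_centers(radius, extent)
    hit = set()
    for i in range(samples):
        for j in range(samples):
            px = ox + size * i / (samples - 1)
            py = oy + size * j / (samples - 1)
            hit.add(min(centers,
                        key=lambda c: (c[0] - px) ** 2 + (c[1] - py) ** 2))
    return len(hit)
```

If the count for a candidate spawn position comes back as 1, the world sits inside a single cell and would need to be shifted until it straddles a cell boundary.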
Chat to you soon!
Cal :support: