Flock

Multi-agent RL environment inspired by Reynolds' popular boids model of animal flocks and swarms.

Agents attempt to remain close to other members of the flock whilst avoiding collisions with them.

See floxs.flock.Flock for details of the environment API.

Dynamics

The state of each agent consists of its position in the space and its velocity (represented in polar co-ordinates as a heading and a speed). Each step, agent positions are updated from their current velocities, and new rewards and observations are then generated from the updated state.

The space is wrapped at the edges (i.e. it forms a torus).
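
The position update and edge wrapping described above can be sketched for a single agent as follows. This is a minimal illustration rather than the library's implementation: the function name step_position, the unit-sized space, and the convention that heading is measured in radians from the positive x-axis are all assumptions.

```python
import math


def step_position(x, y, heading, speed, width=1.0, height=1.0):
    """Advance one agent by one step and wrap it onto the torus.

    Hypothetical helper: the space dimensions and heading convention
    (radians from the positive x-axis) are assumptions of this sketch.
    """
    # Convert the polar velocity (heading, speed) into Cartesian components.
    dx = speed * math.cos(heading)
    dy = speed * math.sin(heading)
    # The modulo brings a position that leaves one edge back in at the opposite edge.
    return (x + dx) % width, (y + dy) % height


# An agent near the right-hand edge, moving along +x, re-enters on the left.
new_x, new_y = step_position(0.95, 0.5, heading=0.0, speed=0.1)
```

Here the agent ends up at roughly x = 0.05 rather than x = 1.05, which is the wrap-around behaviour implied by a toroidal space.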

Actions

Each agent can individually update its velocity each step. Each agent's action is an array of two continuous values in the range [-1, 1], representing [rotation, acceleration]. The action values are then scaled by the maximum rotation and maximum acceleration parameters. In total, the actions for the flock are given by an array of shape [n_agents, 2], representing the velocity update for each individual agent.
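
As a rough illustration of the action scaling, the sketch below applies an [n_agents, 2] list of actions to per-agent headings and speeds. The constants MAX_ROTATION and MAX_ACCELERATION, their values, and the function names are hypothetical. Clamping the inputs and wrapping the heading into [0, 2π) are assumptions of this sketch, and any limits the environment places on speed are not modelled.

```python
import math

MAX_ROTATION = math.pi / 10  # hypothetical value: radians per step
MAX_ACCELERATION = 0.01      # hypothetical value: speed change per step


def apply_action(heading, speed, action):
    """Apply one agent's [rotation, acceleration] action to its velocity."""
    rotation, acceleration = action
    # Clamp defensively so out-of-range inputs cannot exceed the maximum update
    # (an assumption of this sketch).
    rotation = max(-1.0, min(1.0, rotation))
    acceleration = max(-1.0, min(1.0, acceleration))
    # Scale the normalised action values by the maximum rotation and acceleration.
    new_heading = (heading + rotation * MAX_ROTATION) % (2 * math.pi)
    new_speed = speed + acceleration * MAX_ACCELERATION
    return new_heading, new_speed


def apply_actions(headings, speeds, actions):
    """Apply a list of per-agent actions, one [rotation, acceleration] row each."""
    return [apply_action(h, s, a) for h, s, a in zip(headings, speeds, actions)]


# Two agents: the first turns as far as allowed, the second turns back
# by half the maximum and accelerates fully.
actions = [[1.0, 0.0], [-0.5, 1.0]]
updated = apply_actions([0.0, math.pi], [0.02, 0.02], actions)
```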

Rewards

Agents are individually rewarded based on their proximity to other agents in the flock:

  • A positive reward when a neighbour is within a fixed neighbourhood, summed over contributing neighbours

  • A fixed negative penalty when the agent collides with another agent

By default, the reward contributed by each in-range neighbour decreases exponentially with distance.
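
One way to picture this reward structure is the sketch below, which computes the reward for a single agent from its distances to the other agents. It is an illustration under stated assumptions, not the default reward function: the name agent_reward, the parameter values, the decay form exp(-d / neighbour_range), and applying the penalty once per colliding neighbour are all assumptions.

```python
import math


def agent_reward(distances, neighbour_range=0.1, collision_radius=0.01,
                 collision_penalty=0.1):
    """Reward for one agent, given its distances to every other agent.

    All parameter names and values are hypothetical.
    """
    reward = 0.0
    for d in distances:
        if d < collision_radius:
            # Fixed penalty for a collision (assumed here to apply per neighbour).
            reward -= collision_penalty
        elif d < neighbour_range:
            # In-range neighbours contribute a reward that decays with distance;
            # the exact decay form is an assumption of this sketch.
            reward += math.exp(-d / neighbour_range)
        # Neighbours beyond the range contribute nothing.
    return reward


# One colliding neighbour, one in-range neighbour, one out of range.
example_reward = agent_reward([0.005, 0.05, 0.5])
```

In this example the agent receives the in-range contribution exp(-0.5) minus the collision penalty of 0.1, and the distant agent has no effect.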

Rewards can be customised by implementing the floxs.flock.rewards.RewardFn interface.

Observations

By default, each agent observes its local neighbourhood of the environment as a segmented view. The view cone of each agent is divided into segments, and the value for each segment is the distance to the closest neighbour along a ray cast from the agent. If no agent lies within range, the segment takes the default value of -1.
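
The segmented view can be approximated by binning neighbours by their bearing relative to the agent's heading, as in the sketch below. This is a simplified stand-in for ray casting: the function name segmented_view, the number of segments, the view angle, and the view range are hypothetical, and the sketch ignores the wrapped edges of the space for brevity.

```python
import math


def segmented_view(position, heading, neighbours, n_segments=4,
                   view_angle=math.pi, view_range=0.2):
    """Return one value per segment: distance to the closest neighbour, or -1.0.

    Hypothetical helper; segments are ordered by increasing bearing
    (counter-clockwise) across the view cone.
    """
    view = [-1.0] * n_segments  # -1 marks a segment with no agent in range
    segment_width = view_angle / n_segments
    for nx, ny in neighbours:
        dx, dy = nx - position[0], ny - position[1]
        distance = math.hypot(dx, dy)
        if distance == 0.0 or distance > view_range:
            continue
        # Bearing of the neighbour relative to the heading, wrapped to [-pi, pi).
        bearing = (math.atan2(dy, dx) - heading + math.pi) % (2 * math.pi) - math.pi
        # Shift so that 0 is the clockwise-most edge of the view cone.
        offset = bearing + view_angle / 2
        if not 0.0 <= offset < view_angle:
            continue  # neighbour lies outside the view cone
        index = min(int(offset // segment_width), n_segments - 1)
        # Keep only the closest neighbour seen in each segment.
        if view[index] < 0.0 or distance < view[index]:
            view[index] = distance
    return view


# Two neighbours straight ahead (the nearer one wins), one too far away,
# and one directly behind the agent.
view = segmented_view(
    (0.5, 0.5), 0.0, [(0.6, 0.5), (0.55, 0.5), (0.5, 0.9), (0.4, 0.5)]
)
```

Only the segment containing the two neighbours ahead holds a distance (that of the nearer neighbour); the other segments keep the default value of -1.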

Observations can be customised by extending the default floxs.flock.observations.ObservationFn observation class.