esquilax.ml.evo.BasicStrategy
- class esquilax.ml.evo.BasicStrategy(network_params: flax.typing.FrozenVariableDict | Dict[str, Any], strategy: evosax.Strategy, pop_size: int, centered_rank_fitness: bool = True, z_score_fitness: bool = False, maximize_fitness: bool = True, **strategy_kwargs)
Bases: Strategy
Basic strategy derived from a Flax neural network
Wrapper around an Evosax strategy, with parameters initialised from a Flax neural network.
- Parameters:
network_params – Flax network parameters.
strategy – Evosax strategy class.
pop_size – Strategy population size.
centered_rank_fitness – Use centered_rank_fitness for fitness-shaping, default True.
z_score_fitness – Use z_score_fitness for fitness-shaping, default False.
maximize_fitness – Use maximize_fitness for fitness-shaping, default True.
Warning
Evosax expects that fitness should be minimised, so this should be True if the environment returns rewards to be maximised.
**strategy_kwargs – Keyword arguments to pass to the strategy constructor.
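The three fitness-shaping flags can be illustrated with a small standard-library sketch. It assumes one common definition of each transform (sign flip for maximisation, ranks scaled to [-0.5, 0.5], and z-scoring); the function names are hypothetical, and the exact transforms and their order inside Evosax may differ.

```python
from statistics import mean, pstdev


def centered_rank(values):
    """Map values to evenly spaced ranks in [-0.5, 0.5]; the smallest gets -0.5."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, index in enumerate(order):
        ranks[index] = rank
    return [r / (n - 1) - 0.5 for r in ranks]


def z_score(values, eps=1e-10):
    """Standardise values to zero mean and (approximately) unit variance."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / (sd + eps) for v in values]


def shape_fitness(rewards, maximize=True, use_centered_rank=True, use_z_score=False):
    """Hypothetical helper: turn raw rewards into values for a minimising optimiser."""
    # Assumption: maximisation is handled by negating rewards first.
    fitness = [-r for r in rewards] if maximize else list(rewards)
    if use_centered_rank:
        fitness = centered_rank(fitness)
    if use_z_score:
        fitness = z_score(fitness)
    return fitness


rewards = [3.0, 10.0, -1.0, 4.0]
shaped = shape_fitness(rewards)
# With maximize=True the member with the highest reward (index 1)
# receives the lowest shaped fitness, which a minimiser prefers.
```

Under these assumptions, setting maximize=False would instead give the lowest shaped value to the member with the lowest raw reward, which is why the flag needs to match the environment's reward convention.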
- ask(key: chex.PRNGKey, evo_state: evosax.EvoState, evo_params: evosax.EvoParams) → Tuple[chex.ArrayTree, evosax.EvoState]
Sample parameters from the current strategy state
- Parameters:
key – JAX random key
evo_state – Strategy state
evo_params – Strategy parameters
- Returns:
Population array and updated state of the strategy
- Return type:
tuple[jax.numpy.array, evosax.EvoState]
- default_params() → evosax.EvoParams
Get default strategy parameters
- Returns:
Strategy parameters
- Return type:
evosax.EvoParams
- initialize(key: chex.PRNGKey, evo_params: evosax.EvoParams) → evosax.EvoState
Initialise strategy state
- Parameters:
key – JAX random key
evo_params – Strategy parameters
- Returns:
Strategy state
- Return type:
evosax.EvoState
- reshape_params(population) → chex.ArrayTree
Reshape parameters
- Parameters:
population – Array of parameters sampled from the strategy.
- Returns:
Reshaped parameters
- Return type:
jax.numpy.array
- shape_rewards(population, fitness: chex.ArrayTree) → chex.ArrayTree
Shape rewards
Rescale/reshape rewards generated during a simulation.
- Parameters:
population – Strategy population or collection of populations
fitness – Fitness/rewards (or collection of)
- Returns:
Rescaled fitness
- Return type:
chex.ArrayTree
- tell(population: chex.ArrayTree, fitness: chex.ArrayTree, evo_state: evosax.EvoState, evo_params: evosax.EvoParams) → evosax.EvoState
Update strategy state
- Parameters:
population – Population array
fitness – Fitness array
evo_state – Strategy state
evo_params – Strategy params
- Returns:
Updated state
- Return type:
evosax.EvoState
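The initialize, ask and tell methods are typically combined into a generational loop: sample a population, evaluate it, convert rewards to a fitness that is minimised, and update the state. The sketch below shows that pattern with a hypothetical ToyStrategy written in plain Python. It is a stand-in for illustration only, not the esquilax or Evosax implementation; random.Random takes the place of a JAX key, a dict takes the place of the strategy state, and the strategy parameters argument is omitted.

```python
import random

TARGET = [1.0, -2.0]  # hypothetical optimum used only for this demo


class ToyStrategy:
    """Hypothetical stand-in that mirrors the initialize/ask/tell method names."""

    def __init__(self, num_dims, pop_size, sigma=0.3):
        self.num_dims = num_dims
        self.pop_size = pop_size
        self.sigma = sigma

    def initialize(self, rng):
        # The state holds only the current search mean.
        return {"mean": [0.0] * self.num_dims}

    def ask(self, rng, state):
        # Sample pop_size candidates as Gaussian perturbations of the mean.
        population = [
            [m + self.sigma * rng.gauss(0.0, 1.0) for m in state["mean"]]
            for _ in range(self.pop_size)
        ]
        return population, state

    def tell(self, population, fitness, state):
        # Minimisation convention: average the half with the lowest fitness.
        ranked = sorted(zip(fitness, population), key=lambda pair: pair[0])
        elite = [member for _, member in ranked[: max(1, self.pop_size // 2)]]
        new_mean = [sum(column) / len(elite) for column in zip(*elite)]
        return {"mean": new_mean}


def reward(member):
    # Higher is better: the reward peaks at 0 when member equals TARGET.
    return -sum((x - t) ** 2 for x, t in zip(member, TARGET))


def run(generations=50, seed=0):
    rng = random.Random(seed)
    strategy = ToyStrategy(num_dims=2, pop_size=16)
    state = strategy.initialize(rng)
    for _ in range(generations):
        population, state = strategy.ask(rng, state)
        rewards = [reward(member) for member in population]
        # Rewards are maximised, the optimiser minimises, so negate them.
        fitness = [-r for r in rewards]
        state = strategy.tell(population, fitness, state)
    return state["mean"]


final_mean = run()
```

The negation step plays the role that the maximize_fitness flag is described as playing above: without it, a minimising update would move the search away from high-reward candidates.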