Optimization: server_on_change_hosts_pstate and server_on_turn_onoff_hosts early drops machines which already are at the target pstate. #76

Open: marcodamico wants to merge 1 commit into oar-team:main from marcodamico:main

@marcodamico
Contributor

Ported #75 to main branch.

…hosts early drops machines which already are at the target pstate.
@bleuse
Collaborator

bleuse commented Apr 15, 2026

Hi @marcodamico, thanks for rebasing your PR to the main branch.

I am not sure this is only an optimization: by not forwarding the seemingly useless pstate change, it seems to me that we are changing what is simulated, and therefore the visible behavior.

Could you detail why you think this change should be made here rather than checking in the EDC if an actual pstate change is required?

@marcodamico
Contributor Author

marcodamico commented Apr 16, 2026

The idea is to skip the add_pstate_change calls for nodes that don't need a recalculation. This sped up my simulations by a factor of 2.5 on 10000 nodes. I am not sure what you mean by EDC.
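The early-drop idea described above can be illustrated with a minimal sketch. Batsim itself is written in C++, so this Python version only shows the logic, not the PR's actual code; the function name `split_by_target_pstate` and the `current_pstates` mapping are hypothetical and introduced purely for illustration.

```python
def split_by_target_pstate(current_pstates, machine_ids, target_pstate):
    """Partition machines into those needing a pstate change and those to skip.

    current_pstates: dict mapping machine id -> current pstate (assumed input).
    machine_ids: iterable of machine ids named in the request.
    target_pstate: the pstate requested for all of these machines.
    """
    to_change = []
    already_at_target = []
    for machine_id in machine_ids:
        if current_pstates[machine_id] == target_pstate:
            # Nothing to do for this machine: it is dropped early.
            already_at_target.append(machine_id)
        else:
            to_change.append(machine_id)
    return to_change, already_at_target


# Hypothetical platform state: machines 1 and 3 are already in pstate 1.
current = {0: 0, 1: 1, 2: 0, 3: 1}
to_change, skipped = split_by_target_pstate(current, [0, 1, 2, 3], 1)
```

Only the machines in `to_change` would then go through the per-machine pstate-change handling; whether dropping the others alters the simulated behavior is exactly the question raised in this thread.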

@Mommessc
Collaborator

Mommessc commented May 6, 2026

EDC is the External Decision Component; usually it is the scheduler process.

@marcodamico
Contributor Author

Even if we filter the pstates in the EDC, batsim will keep calling add_pstate_change on all the machines.

@bleuse
Collaborator

bleuse commented May 26, 2026

Looking at the code of add_pstate_change, this should not be the culprit for your performance issues, as its cost is roughly constant with respect to the number of machines.
Do you have any data related to this part, e.g., a flame graph?

Furthermore, my intuition is that the speedup this PR achieves comes from the fact that you are requesting fewer operations (e.g., set_pstate) and triggering less logging from SimGrid.

As far as I understand what you are trying to do, your EDC/scheduler is requesting too many changes.
You should filter out irrelevant machines in the sent message.
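A minimal sketch of what such EDC-side filtering could look like is given here, assuming the scheduler remembers the last pstate it requested for each host. The class `PstateRequestFilter` and its method are hypothetical and not part of Batsim or any EDC library; a real EDC would also have to account for transitions that are still pending or that failed.

```python
class PstateRequestFilter:
    """Tracks the last pstate requested per host (hypothetical helper)."""

    def __init__(self):
        # host id -> last pstate this EDC asked for; unknown hosts are absent.
        self._last_requested = {}

    def filter(self, host_ids, target_pstate):
        """Return only the hosts whose last requested pstate differs."""
        needed = [
            host_id
            for host_id in host_ids
            if self._last_requested.get(host_id) != target_pstate
        ]
        # Record the new request so repeated calls do not resend it.
        for host_id in needed:
            self._last_requested[host_id] = target_pstate
        return needed


request_filter = PstateRequestFilter()
first = request_filter.filter([0, 1, 2], 1)   # nothing known yet: all hosts kept
second = request_filter.filter([1, 2, 3], 1)  # only host 3 is new
```

The EDC would then put only the returned hosts in the pstate-change message it sends, so the simulator never sees the redundant requests.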

Are you able to share the relevant parts of your EDC/scheduler code to help us understand what you are trying to achieve?


3 participants