I have found in this example that minimize_nested_blockmodel_dl can return a state in which the same partition is repeated across several levels (in this case levels 2-7 all have the same two groups).
Removing one (or more) of these levels reduces (marginally) the model’s negative joint log-likelihood.

Can you please explain exactly what you mean by “removing one (or more) of these layers” and by how much the log-likelihood changes? Do you keep the total number of layers fixed, or is this reduced as well?

As a rule it’s important to show a minimal and complete working example that shows the problem. A vague description is not very helpful.

Thank you for the reply.
Firstly, I apologize as there was some confusion in my original post between the terms “layer” and “level” (I have now edited it).

What I mean by “remove the levels” is shown in the example in my original post: I copy the original state and pass as a bs parameter the hierarchical partition of the original state without the upper level (l=7 in the example). Maybe this is not the correct way to proceed.

The change in the log likelihood is also shown in the example. The variation is minimal (~1.4 nats).

Oh I had missed the second part of your example, I apologize!

The removal of the last level in your example only reduces the negative log-likelihood because the total number of levels has changed, and this also contributes to the likelihood.

That is, if you instead replaced the last level with one containing only a single group, the negative log-likelihood would increase.

So, it’s a correct but somewhat counterintuitive artefact of this model/data that it will fill up all the available levels, but if the total number is reduced, this improves the fit. The current code is based on the idea that the total number of levels is fixed, so the proper way to proceed is to constrain this a posteriori if repeated levels are found.
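
As a rough illustration of constraining this a posteriori, the sketch below drops every upper level of a hierarchical partition that merges nothing, that is, a level that gives each group of the level below its own group. The helper name drop_redundant_levels is hypothetical and not part of graph-tool, and the sketch assumes that bs is a list of integer sequences (as returned by get_bs()) with group labels running contiguously from 0 at every level.

```python
def drop_redundant_levels(bs):
    """Return a copy of the hierarchical partition `bs` without the upper
    levels that merge nothing (hypothetical helper, not a graph-tool API).

    Assumes bs[0] maps nodes to groups, bs[l] maps the groups of level
    l - 1 to groups of level l, and labels are contiguous from 0.
    """
    kept = [list(bs[0])]
    pending = None  # maps groups of the last kept level through skipped levels
    for level in bs[1:]:
        n_groups = max(kept[-1]) + 1
        if pending is None:
            pending = list(range(n_groups))
        # Label each group of the last kept level gets at this level.
        composed = [level[pending[r]] for r in range(n_groups)]
        if len(set(composed)) == n_groups:
            # No two groups are merged: the level repeats the partition
            # below it, so fold its relabelling into the next level.
            pending = composed
        else:
            kept.append(composed)
            pending = None
    return kept


# Toy hierarchy: the last three levels all repeat the same two groups.
bs = [[0, 0, 1, 1, 2, 2], [0, 0, 1], [0, 1], [0, 1], [1, 0]]
reduced = drop_redundant_levels(bs)
```

The reduced list could then be passed as the bs parameter when copying the state, as done in the original post, to compare the resulting log-likelihood with that of the full hierarchy. Note that this sketch also drops trailing levels that map a single group to a single group.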