Simplify quantifier with preconditions for QP injectivity check #914

Open · wants to merge 3 commits into base: master

Conversation

superaxander
Contributor

Hi,
I've been looking into improving Silicon's performance on some of the VerCors case studies that I am trying out. Most of the bottlenecks that I run into have to do with the quantified permission support. I'm hoping to contribute more ideas/PRs in the future. (I do have some things in mind, but I haven't quite made them concrete yet)

The problem

When investigating a slow verification using SMTScope, I noticed that the quantifier that is inhaled before the injectivity check for quantified permissions was instantiated unusually often. I had noticed this before on other files, but given the name fieldname-rcvrInj I had assumed it was a quantifier expressing injectivity. In fact, it is a quantifier stating that the preconditions of the functions used in the expression hold.

The improvement

It seemed to me that this quantifier would only be needed for the injectivity check itself. So I added a push/pop around it, which worked well for the Viper test suite, but I found some examples in the VerCors test suite that broke due to the now-missing knowledge. Here is a minimized example:

Example

domain Array  {
  function array_loc(a: Array, i: Int): Ref

  function alen(a: Array): Int

  function loc_inv_1(loc: Ref): Array

  function loc_inv_2(loc: Ref): Int

  axiom {
    (forall a: Array, i: Int ::
      { array_loc(a, i) }
      loc_inv_1(array_loc(a, i)) == a && loc_inv_2(array_loc(a, i)) == i)
  }

  axiom {
    (forall a: Array :: { alen(a) } alen(a) >= 0)
  }
}


field int: Int

function aloc(a: Array, i: Int): Ref
  requires 0 <= i
  requires i < alen(a)
  ensures loc_inv_1(result) == a
  ensures loc_inv_2(result) == i
  decreases
{
  array_loc(a, i)
}

function scale(amount: Perm): Perm
  requires amount >= 0 * write
  ensures result >= 0 * write
  decreases
{
  amount
}

predicate main(arr: Array) {
    (forall i: Int :: { aloc(arr, i) } 0 <= i && i < alen(arr) ==> acc(aloc(arr, i).int, scale(write) * write)) &&
    (forall i: Int :: { aloc(arr, i) } 0 <= i && i < alen(arr) ==> 0 <= aloc(arr, i).int)
}

With the added push/pop this breaks: when accessing aloc(arr, i).int in the last line, the verifier can no longer determine that scale(write) * write is greater than zero.

Of course, by extension, this example also fails with --assumeInjectivityOnInhale even without the changes from my PR.
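To make the scoping behavior concrete, here is a toy Python model of an SMT assertion stack (hypothetical names; this is not Silicon's decider API or Z3's): an assumption added between push and pop is discarded by the pop, which is exactly why the precondition quantifier was no longer available when the permission amount was checked later in the method.

```python
# Toy model of an SMT solver's assertion stack. Assumptions are recorded
# per frame; pop() discards everything assumed since the matching push().
class AssertionStack:
    def __init__(self):
        self._frames = [[]]  # one initial (global) frame

    def push(self):
        self._frames.append([])

    def pop(self):
        self._frames.pop()

    def assume(self, fact):
        self._frames[-1].append(fact)

    def assumptions(self):
        return [f for frame in self._frames for f in frame]

s = AssertionStack()
s.assume("alen(a) >= 0")            # global axiom, always in scope
s.push()
s.assume("precondition quantifier") # scoped to the injectivity check
# ... perform the injectivity check here ...
s.pop()
print(s.assumptions())  # ['alen(a) >= 0'] -- the scoped assumption is gone
```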

Since I still wanted to speed up my case study I tried a few more options:

  • Generating triggers for this precondition quantifier with quantifierSupporter.autoTrigger
  • Simplifying the precondition quantifier by using only one set of quantified variables (since it was derived from the injectivity check, it previously quantified over two copies of the variables)
  • Simplifying the precondition quantifier and generating triggers for it

Results

I ran my file 10 times to get an idea of the performance difference between the different options. Note that the user/system times only count Silicon's execution time, not that of its child processes, i.e., Z3.

Normal without triggers

This is the baseline measurement

  Time (mean ± σ):     21.274 s ±  0.851 s    [User: 50.287 s, System: 4.702 s]
  Range (min … max):   20.205 s … 22.824 s    10 runs

With push/pop

This was the fastest option for this example, but as noted above, it is less complete on some examples.

  Time (mean ± σ):     17.693 s ±  0.421 s    [User: 41.324 s, System: 4.049 s]
  Range (min … max):   16.998 s … 18.239 s    10 runs

Normal with triggers

This did not succeed in verifying the example; I assume the generated triggers are either too restrictive or otherwise worse than those chosen by Z3 itself.

Simplified without triggers

This made a pretty decent difference; I suppose it is mostly about introducing fewer terms through the instantiation of the quantifier. There is one VerCors example that runs out of memory in Z3 with this option, but that is likely just because the example is overly complicated / needs improvement.

  Time (mean ± σ):     18.127 s ±  0.672 s    [User: 43.937 s, System: 4.371 s]
  Range (min … max):   16.991 s … 19.503 s    10 runs

Simplified with triggers

Interestingly, generating triggers again performs worse than letting Z3 choose the triggers. For the overly complicated VerCors example mentioned previously, Silicon runs out of memory while generating a trigger, since it finds 120 candidate expressions from which it attempts to form triggers covering 6 variables. (I guess that's somewhere between 2^120 and 6^120 possible trigger sets to explore, since the candidate expressions generally contain only one or two quantified variables.) But again, I don't think that is a significant problem, because the example is just overly complicated, although maybe Silicon should have some sort of maximum iteration count.

  Time (mean ± σ):     20.523 s ±  1.343 s    [User: 49.857 s, System: 4.572 s]
  Range (min … max):   18.897 s … 22.150 s    10 runs
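As a back-of-envelope check on the search-space estimate above (my arithmetic, not from the PR): even ignoring the variable-coverage requirement, 120 candidate expressions already yield 2^120 - 1 non-empty candidate trigger sets, far too many to enumerate.

```python
# Number of non-empty candidate trigger sets that can be formed from n
# candidate expressions, ignoring variable-coverage constraints: every
# non-empty subset of candidates is a potential trigger set.
n = 120
subsets = 2**n - 1
print(subsets)  # 1329227995784915872903807060280344575 (~1.3e36)
```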

Summary

This PR currently implements the simplified quantifier without triggers, which I would be happy to see merged since it shows a marginal improvement on my case studies. I don't think the issue presented here is, in general, what makes QPs slow, but it is some low-hanging fruit that was easy to tackle. However, I am also curious to hear the Viper developers' opinions on this change, on the other options I discussed, and on slow verification with quantified permissions in general. In particular, I am wondering whether there is more low-hanging fruit like this and whether there are plans to improve things like the chunk ordering heuristics in the future.

@superaxander changed the title from "Restrict injectivity scope" to "Simplify quantifier with preconditions for QP injectivity check" on Apr 1, 2025
@marcoeilers
Contributor

Hey, thanks for trying that out!
I really wasn't aware that that assumption leads to any performance issues; this is interesting.

I'm not entirely sure I understand the push/pop approach. So you pushed, assumed the injectivity check's precondition, did the injectivity check, and popped again? And that led to some incompleteness? That seems really weird to me: why would that assumption be needed for anything else? It might just have uncovered some other incompleteness.

I'll have a look at your other data as soon as I can, but regarding your last question: Markus Limbeck (@mlimbeck) just finished his MSc thesis (https://ethz.ch/content/dam/ethz/special-interest/infk/chair-program-method/pm/documents/Education/Theses/Markus_Limbeck_MS_Description.pdf, the thesis isn't public yet but will be soon) and he explored various optimizations of Silicon's QP handling. In particular, he implemented an alternative QP algorithm that is in a sense more greedy, tries to use only heap chunks that are provably useful or required, and leaves the other ones alone. Overall, that tends to significantly reduce the reliance on chunk ordering heuristics (so it leads to much better performance when the heuristics don't do well), so hopefully that should also help for your use cases. I assume we'll merge that as a new command line option relatively soonish.

@superaxander
Contributor Author

> I really wasn't aware that that assumption leads to any performance issues; this is interesting.

Time-wise the difference seems relatively minor (at least for small examples), but on the example I ran these benchmarks with, the spread was between 80k and 140k quantifier instantiations. I think this is mostly due to the trigger chosen by Z3 having a lot of matching terms elsewhere in the method (and the assumption remains in scope throughout the entire method).

> I'm not entirely sure I understand the push/pop approach. So you pushed, assumed the injectivity check's precondition, did the injectivity check, and popped again? And that led to some incompleteness? That seems really weird to me: why would that assumption be needed for anything else? It might just have uncovered some other incompleteness.

Yes, that is exactly what I did. I was also surprised by that, hence why I included the example. It seems that the preconditions of functions in a permission expression are not assumed before checking that the permission amount is non-negative; instead, the check relies on this assumption having been added when the permission amount was originally inhaled.

> I'll have a look at your other data as soon as I can, but regarding your last question: Markus Limbeck (@mlimbeck) just finished his MSc thesis (https://ethz.ch/content/dam/ethz/special-interest/infk/chair-program-method/pm/documents/Education/Theses/Markus_Limbeck_MS_Description.pdf, the thesis isn't public yet but will be soon) and he explored various optimizations of Silicon's QP handling. In particular, he implemented an alternative QP algorithm that is in a sense more greedy, tries to use only heap chunks that are provably useful or required, and leaves the other ones alone. Overall, that tends to significantly reduce the reliance on chunk ordering heuristics (so it leads to much better performance when the heuristics don't do well), so hopefully that should also help for your use cases. I assume we'll merge that as a new command line option relatively soonish.

Yeah, I had seen the project description, so I was curious how far along that was. I was also thinking that a more greedy approach might work well. In my experience, whenever my examples stop verifying, it is because the context has gotten big enough that the easy "check" assertions no longer suffice to show that enough chunks have been added together, so at that point we always continue all the way to the big assertion at the end. The consequence is that all the chunks in the current heap then get an additional pTaken term, and this adds up over time until even the big assertion takes too long for our timeout. Even when the heuristics work correctly and give the optimal order, this issue still occurs.
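A toy cost model of the accumulation effect described above (my sketch, not Silicon's actual implementation): if every exhale that falls through to the big assertion adds one pTaken term to each heap chunk, the total number of permission terms grows with the product of the chunk count and the number of such exhales.

```python
# Toy model: each chunk starts with one permission term and gains one
# pTaken term per exhale that reaches the big assertion, so total term
# count scales with chunks * exhales.
def total_permission_terms(num_chunks: int, num_exhales: int) -> int:
    return num_chunks * (1 + num_exhales)

print(total_permission_terms(10, 0))   # 10  (no exhales yet)
print(total_permission_terms(10, 50))  # 510 (terms pile up on every chunk)
```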

I was thinking that a greedy algorithm where the big assertion is already done after all the "good" chunks (determined by the heuristic) have been added together could work well, since that might mean unrelated chunks are no longer polluted with as many extra terms. I'm curious how that compares to the greedy algorithm you describe.
