17:00:15 <marcofleon> #startmeeting
17:00:15 <corebot`> marcofleon: Meeting started at 2025-06-11T17:00+0000
17:00:16 <corebot`> marcofleon: Current chairs: marcofleon
17:00:17 <corebot`> marcofleon: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting
17:00:18 <corebot`> marcofleon: See also: https://hcoop-meetbot.readthedocs.io/en/stable/
17:00:19 <corebot`> marcofleon: Participants should now identify themselves with '#here' or with an alias like '#here FirstLast'
17:00:44 <sliv3r__> Hi!
17:01:04 <stickies-v> hi
17:01:11 <monlovesmango> heyy
17:01:27 <sipa> hi
17:01:47 <marcofleon> welcome! we'll be going over https://bitcoincore.reviews/30605
17:02:01 <marcofleon> little re-run of last week, we can actually go over the questions this time
17:02:26 <marcofleon> Did you review the PR/notes? Concept ACK, approach ACK, tested ACK, or NACK?
17:02:55 <stickies-v> didn't review, just lurking today
17:03:06 <marcofleon> lurkers are welcome
17:03:14 <sliv3r__> tried to :) was not an easy one
17:03:15 <monlovesmango> yes a bit. tested too.
17:03:53 <marcofleon> agreed, the cluster linearization code can be tough
17:04:02 <sipa> i hope the testing code for it is easier!
17:04:38 <marcofleon> monlovesmango: ran the fuzz tests? nice! I guess we can go over that a bit with question 4
17:04:48 <marcofleon> okay, let's dive in
17:04:58 <marcofleon> What are the two new fuzz targets introduced and what are they testing? In terms of what they're testing, how are they different from the other targets?
17:05:34 <sliv3r__> clusterlin_simple_finder and clusterlin_simple_linearize
17:06:14 <sipa> ✔️
17:06:24 <sliv3r__> here I have a question bc I thought some were testing the functionality and the others were testing the tests, but when I checked the functions I saw that they were just simpler re-implementations of the ones found in cluster_linearize.h
17:06:35 <sliv3r__> why is that?
17:07:14 <monlovesmango> clusterlin_simple_finder and clusterlin_simple_linearize. they are different than other targets bc they are fuzz testing functions implemented for the fuzz tests themselves
17:07:30 <sipa> sliv3r__: let me explain
17:07:40 <marcofleon> sliv3r__: That could be related to the next question actually
17:07:50 <marcofleon> in terms of what are the benefits of doing that
17:08:21 <sliv3r__> marcofleon: that could be the reason why I wrote IDK as an answer to the next one in my notes :)
17:08:36 <sipa> sliv3r__: the short answer is that SimpleCandidateFinder, ExhaustiveCandidateFinder, SimpleLinearize are part of the test suite, not of production code, even though they mimic the functionality of production code functions/classes
17:09:09 <sipa> the point is that the production code is complicated, which means it may be hard for reviewers to gain confidence in its correctness
17:09:54 <sliv3r__> and how can we be sure that the behaviour is equal or at least equivalent?
17:10:01 <marcofleon> and so the simpler implementations provide an oracle to check against
17:10:07 <sipa> one tool that we (or i, as author, in this case) have available to make it easier on reviewers, is writing a simpler, more obviously correct, but potentially slower/less feature-rich reimplementation that does the same thing, plus tests that they are identical
17:10:37 <sipa> so then if you as a reviewer have confidence in the reimplementation + the equivalence test, you also gain confidence in the original implementation
17:11:15 <sipa> sliv3r__: well that is what the fuzz tests do: run a scenario (controlled by the fuzzer) of operations through both the production version and the simpler version, and see that they produce the same result
17:11:32 <sipa> that's not a proof of course, but it is one tool that can increase confidence if no such differences are found
17:11:40 <sliv3r__> makes sense
17:11:52 <sipa> if you want to get an idea of how good it is, you can try inserting deliberate bugs and see how long it takes for the fuzzer to find them
17:11:54 <abubakarsadiq> hi
17:12:18 <marcofleon> abubakarsadiq: welcome
17:12:46 <sliv3r__> nice thanks, much clearer now
17:12:59 <sipa> and in this case, even the "simpler" version is fairly nontrivial, so in addition, there is a third, *even* simpler and *even* slower version implemented
17:13:07 <marcofleon> i'll move on to question 4, unless there's more benefits other than increasing confidence of correctness? for question 3 I mean
17:13:10 <sipa> where the current fuzz tests compare all 3 with each other
17:13:24 <marcofleon> That's the exhaustive one yes?
17:13:25 <sipa> marcofleon: yeah
17:13:33 <monlovesmango> I think that makes sense for clusterlin_simple_finder bc it doesn't seem to compare with other prod code, but clusterlin_simple_linearize is comparing to ChunkLinearization, which doesn't necessarily feel simpler
17:13:35 <sipa> marcofleon: there are 2 dimensions i think
17:14:10 <sipa> marcofleon: there is the candidate-finding logic, where we have SearchCandidateFinder (prod), SimpleCandidateFinder (simpler), ExhaustiveCandidateFinder (simplest)
17:14:48 <sipa> and there is the linearization logic on top, where there are also 3 versions: Linearize (prod), SimpleLinearize (simpler), and the std::next_permutation based exhaustive search (simplest, but not a separate function)
17:15:23 <abubakarsadiq> sipa: haven't looked yet, but after SFL, SimpleCandidateFinder would be changed to match the new algorithm, yeah?
17:16:04 <sipa> abubakarsadiq: there is no SearchCandidateFinder anymore post-SFL; SFL is a replacement for Linearize directly, which doesn't involve candidate finding anymore
17:16:14 <marcofleon> So building layers of trust essentially, makes sense
17:16:40 <sipa> SimpleCandidateFinder and ExhaustiveCandidateFinder stay, but they're just used in/for SimpleLinearize, which the SFL-based Linearize is then compared against
17:16:55 <sipa> kind of vestigial
17:17:04 <abubakarsadiq> So we will use exhaustive search to validate correctness of SFL?
17:17:12 <sipa> that would be a possibility!
17:17:22 <marcofleon> monlovesmango: if you ran the fuzz tests a bit, question 4: Were you able to run the clusterlin_linearize target? How many iterations (the first number libFuzzer shows at each line) did it take to produce a crash after making the s/chunking/simple_chunking/ change?
17:17:30 <monlovesmango> sipa: ok std::next_permutation was what I was missing
17:17:54 <sipa> abubakarsadiq: but i don't think it gains us very much, as it's pretty non-trivial anyway, so the "added confidence" isn't as much as what the much simpler reimplementations give us
17:17:55 <abubakarsadiq> Or the permutations of the graph linearizations, I think, would be a more valid candidate
17:17:59 <monlovesmango> marcofleon: I actually had a question about that. I did get the failure, but I was using the python test runner and it doesn't show any output with the counts, just the error
17:18:08 <sliv3r__> marcofleon: it took 566593 iterations for me
17:18:24 <monlovesmango> is there another way I should be running individual tests?
17:18:32 <sipa> i always run the tests individually
17:18:42 <sliv3r__> I did it running: FUZZ=clusterlin_linearize build_fuzz_nosan/bin/fuzz
17:18:46 <sipa> that ^
17:19:01 <marcofleon> exactly, yeah
17:19:11 <monlovesmango> sliv3r__: ok that helps thank you :)
17:19:12 <sipa> i usually add -use_value_profile=1 -workers=30 -jobs=30, to get 30 concurrent processes, with more tendency to explore more paths
17:19:16 <marcofleon> and yeah ~500k iterations was where i was at too. It crashes pretty quickly
17:19:49 <sliv3r__> monlovesmango: in the fuzzing docs you have an example at the very beginning https://github.com/bitcoin/bitcoin/blob/master/doc/fuzzing.md
17:20:06 <marcofleon> And then running it like how sliv3r__ said, you can add a directory at the end to save the inputs if you'd like. Keep a corpus for later
17:20:47 <sliv3r__> marcofleon: Then it says something like: Test unit written to ./crash-1cea6b51b877d277ba4d1ba7f522b7d3ac182349
17:20:53 <sliv3r__> what's that? bc it's impossible for a human to read it :)
17:20:59 <sipa> marcofleon: maybe it makes sense to move the std::next_permutation based logic to an ExhaustiveLinearize function
17:21:26 <marcofleon> sipa: it could make it clearer, agreed
17:21:42 <marcofleon> sliv3r__: that's the input that caused the crash
17:21:46 <sipa> sliv3r__: it is the input to the fuzz harness, which interprets bytes from it (each of the provider.ConsumeIntegral... functions decodes some bytes from it)
17:22:03 <marcofleon> if you run the fuzz command with that input after, it should reproduce, hopefully
17:22:04 <sipa> you can re-run it, using FUZZ=clusterlin_linearize build_fuzz_nosan/bin/fuzz ./crash-1cea6b51b877d277ba4d1ba7f522b7d3ac182349
17:22:26 <sipa> which will not do any fuzzing, just run that one fuzz input through the harness
17:22:41 <monlovesmango> sipa: I think ExhaustiveLinearize would be good and consistent
17:22:43 <sliv3r__> yep, runs one iteration and crashes instantly
17:23:06 <sipa> but now you can also modify the code, attempt to fix the bug, and retry with just that case
17:23:06 <marcofleon> that's when the debugging starts
17:23:20 <marcofleon> question 5: In commit 2763b75, why was --iterations_left moved?
17:23:33 <monlovesmango> sliv3r__: thank you I see that example now... I think I didn't really know what that param did bc I didn't realize 'process_message' was a test
17:24:14 <marcofleon> monlovesmango: sorry if I moved on too quickly! Yeah, the thing after FUZZ= is the fuzz target
17:24:26 <monlovesmango> marcofleon: no you're good!
17:24:31 <sliv3r__> sipa: right, setting it back to chunking works :P
17:24:49 <sliv3r__> are we at q 5?
17:25:16 <marcofleon> we have a bunch of them. all looking something like "FUZZ_TARGET(clusterlin_depgraph_sim)" for example and then the test itself
17:25:44 <marcofleon> yeah, i'll paste again
17:25:45 <marcofleon> In commit 2763b75, why was --iterations_left moved?
17:25:57 <monlovesmango> it was moved bc otherwise it decrements for splits that don't actually need to be considered (which I imagine might also mess with the "result is optimal" assumption)
17:26:58 <sipa> i think the answer here is kind of nuanced
17:27:05 <marcofleon> I'm actually not sure if it was a bug fix per se... I think just an optimization?
17:27:05 <sipa> but it also doesn't matter much
17:27:23 <sliv3r__> If I didn't understand wrong, we want to only reduce the counter when we add a new "valid" subset. So the number of iterations is proportional to the number of different connected subsets that can be created, instead of being related to the total number of elements in the queue
17:27:56 <sipa> so the idea is that we want some kind of "cost metric" to control how much time is spent inside SimpleCandidateFinder, just so that it does not run forever, but can stop after "too much" work has been performed
17:28:16 <sipa> it doesn't really matter what that metric is, as long as it's roughly proportional to runtime
17:28:34 <sipa> the metric that was being used so far was "queue elements processed"
17:28:57 <sipa> moving the --iterations_left is really changing the metric to something else: the number of candidate sets considered
17:29:08 <monlovesmango> interesting
17:29:26 <sipa> they're very similar, but each candidate set being considered can give rise to two queue elements
17:29:37 <sipa> and there is an initial queue element on the stack that's not a candidate set
17:29:51 <sipa> so the old metric is at most 2*N+1, where N is the new metric
17:30:30 <marcofleon> In clusterlin_simple_finder, when finding a topologically valid subset to remove from the graph, why is it important to set non_empty to true? What could happen if we allowed empty sets?
17:30:40 <sliv3r__> infinite loop
17:30:50 <sipa> so the reason for the change is: the new metric is easier to reason about, because it's easy to count how many candidate sets a cluster can have (2^(N-1))
17:31:10 <sipa> so that's something that can be concretely tested for: if the limit is set to at least 2^(N-1), the optimal must be found
17:31:53 <sipa> makes sense?
17:32:00 <monlovesmango> seems like a better metric. yes.
17:32:07 <sipa> it's not a big change, just making things slightly easier to reason about
17:32:16 <sliv3r__> it makes more sense
17:32:17 <sipa> but the most important thing is that there is some metric, and it's being hit
17:33:35 <marcofleon> sure, I do think it makes sense to have it be only when a valid transaction is found
17:33:53 <marcofleon> a better metric, then just next in the queue
17:34:01 <marcofleon> *than
17:34:14 <sipa> *candidate set, not transaction
17:34:26 <marcofleon> valid set, sorry yeah
17:35:15 <abubakarsadiq> We have to find something to remove; in the prior commit we evict the found transactions when it's empty
17:35:47 <abubakarsadiq> infinite loop if we don't remove anything
17:35:52 <marcofleon> for q6, nice yeah
17:36:22 <sliv3r__> yeah so basically if we have an empty one, the next iteration is equal to the current one, so loop loop loop :)
17:36:28 <marcofleon> it would just be the same set if nothing is removed
17:36:32 <sipa> abubakarsadiq: (bonus) how likely would it be that we *keep* removing nothing, given that just occasionally not removing anything wouldn't be an infinite loop?
17:38:05 <abubakarsadiq> Yeah highly unlikely
17:38:13 <sipa> actually no, extremely likely :p
17:38:22 <sliv3r__> I'm lost haha
17:38:25 <sipa> it would happen anytime you hit the end of the fuzz input
17:38:42 <sipa> because once you hit the end, ConsumeIntegral and friends will keep forever returning 0
17:38:47 <sipa> which would be interpreted as the empty set
17:39:13 <sipa> fuzz input is **not** random
17:39:21 <marcofleon> oh yes of course, there'd be no more data from the provider at some point
17:39:28 <sipa> it's hopefully better than random, but it can also be a lot worse
17:40:09 <sipa> abubakarsadiq: sorry, that was a trick question :)
17:40:26 <marcofleon> In clusterlin_simple_linearize, why does the code only verify against all possible permutations when depgraph.TxCount() <= 8? What would happen if we tried to do this for larger transaction counts?
17:40:46 <monlovesmango> very thought provoking trick question
17:41:11 <monlovesmango> the exhaustive search would start taking too long
17:41:22 <sliv3r__> I think I can answer for TxCount <= 7, bc I don't understand the optimization included in cce29ef63ecf114003b529093fdfd2574b830afc
17:41:27 <abubakarsadiq> was wondering the reason you choose to evict randomly. For testing weird and all kinds of behaviors, yeah. Because the found txs can also be the read transaction
17:41:52 <abubakarsadiq> sipa: it's okay I learned something :P
17:42:04 <marcofleon> We can throw in the next question, it's related. In commit 1e4f345, when a non-topological permutation is found, the code now fast forwards by reversing part of the permutation. Why does this optimization work, and how many permutations can it skip in the best case?
17:42:13 <sliv3r__> So basically the cost would be huge, as the number of possible permutations is N!. For 7 it would be 5040, and higher numbers increase the count heavily
17:42:21 <marcofleon> i tried with 9 and the execs/sec were about 4x slower
17:43:30 <sliv3r__> But with 1e4f345 the worst case is still 8!, which is huge right? Why is this acceptable if it wasn't before? (bc we had 7)
17:43:48 <sipa> yeah, that's the answer... the number of iterations can grow factorially
17:44:04 <sipa> sliv3r__: just 8! = 40320, that's not enormous
17:44:22 <sipa> fuzz tests work best when they stay above 100-1000 attempts per second
17:44:39 <sliv3r__> why was it 7 before if 8 is ok?
17:44:47 <sipa> hmm?
17:45:15 <sipa> oh, with the optimization there are just fewer permutations searched through
17:45:18 <marcofleon> The optimization of skipping all invalid permutations makes it possible to bump up
17:45:31 <marcofleon> sorry, i should be more specific
17:45:45 <sipa> it's not actually possible for all 8! permutations to be topological, so many/most of them will be skipped by the optimization
17:46:07 <sipa> and the numbers 7 and 8 are just set based on benchmarks... which ones make the test too slow
17:46:29 <marcofleon> all the permutations, after having found an invalid prefix yes?
17:46:51 <sipa> yes, it'll skip all the permutations with the *same* invalid prefix
17:46:53 <sliv3r__> oh ok so there's no "human logic" to that :) I was getting too picky with the numbers
17:47:21 <abubakarsadiq> It's human logic :p
17:47:31 <sipa> so the number of permutations tried is equal to the number of topologically-valid ones, plus the number of topologically-invalid minimal prefixes
17:47:43 <sipa> i don't have numbers for what those are, it just works in practice (TM)
17:47:52 <marcofleon> and how many permutations can it skip in the best case then, just to complete the question
17:48:16 <marcofleon> also sipa: did you ever figure out how many permutations can be tried in total knowing that there are K valid ones?
17:48:24 <sipa> marcofleon: i have not
17:48:35 <abubakarsadiq> Depends on the cluster
17:48:44 <marcofleon> ^ that's what i was thinking too
17:48:45 <sipa> i thought it'd be easy to find a formula, but it does not seem simple at all
17:48:57 <sipa> after some experimentation
17:49:09 <sipa> but it skips a lot in practice, which is enough :p
17:49:16 <monlovesmango> 7! ?
17:49:26 <sliv3r__> sipa: why is it not possible for all 8! permutations to be topological?
17:49:41 <sipa> sliv3r__: that would imply there are no dependencies in the cluster, so it isn't a cluster
17:49:57 <marcofleon> monlovesmango: basically yeah, although I think you would subtract one from that
17:49:59 <sliv3r__> oh ok that was a stupid question :P
17:50:30 <sipa> sliv3r__: every dependency is a "parent must come before child in topological orderings" constraint, so each additional dependency reduces the number of valid ones
17:50:34 <monlovesmango> whats the minus 1 for?
17:50:37 <sipa> by how much depends on the topology
17:50:46 <marcofleon> (n-1)! - 1 for n txs
17:51:04 <sipa> you skip from the first topologically-invalid order with a given prefix to the last one with the same prefix
17:51:05 <marcofleon> the -1 is the one you did just try
17:51:16 <sipa> you don't immediately skip to the next prefix
17:51:53 <monlovesmango> marcofleon: makes sense thanks!
17:52:08 <sipa> i think we never answered question 3?
17:52:35 <marcofleon> What are the benefits of separating out the fuzzing of the internal test helpers?
17:52:51 <marcofleon> I thought we touched on it, but that's true, never explicitly answered
17:53:02 <monlovesmango> oh also I was able to test the fuzz crash and got 340233
17:53:17 <marcofleon> nice. I love fuzz crashes
17:53:33 <sliv3r__> 😂
17:53:46 <sipa> marcofleon: fwiw, i suspect the vast majority of the fuzz crashes in these PRs happen before the PR is opened
17:53:47 <monlovesmango> I think it helps instill confidence in what is actually being tested
17:53:51 <sipa> as in, i use them to help development
17:54:40 <sipa> monlovesmango: maybe, but nothing really changes there... take Search/Simple/ExhaustiveCandidateFinder... before, we had one fuzz test that compared all three (subject to cluster-not-too-large constraints etc)
17:54:53 <sipa> now we have two tests, one comparing Search with Simple, one comparing Simple with Exhaustive
17:55:01 <sipa> what's the benefit of the split?
17:55:07 <sliv3r__> Avoid redundancy
17:55:12 <sipa> yes!
17:55:22 <monlovesmango> yes, for me that's much easier to reason about, and so I can be more confident in comparing Search with Simple
17:55:23 <marcofleon> sipa: makes sense, it's a fairly easy and quick way to help initial bugs surface
17:55:39 <sliv3r__> and also easier to review the code
17:55:49 <sipa> if you're already confident in SimpleCandidateFinder, you have no "need" to waste computation effort on running the Simple/Exhaustive comparison too
17:56:15 <sipa> if you're not, you should be focusing on just establishing that confidence first, without putting effort into comparing with the production code
17:56:44 <sipa> (whether that means more review, running fuzz tests, attending a review club, ...)
17:57:14 <sipa> so i feel the goals of what you obtain from these tests are different, and it makes sense to separate them for that reason
17:57:16 <marcofleon> Yeah, I think it's good to have the separate targets here
17:57:19 <sipa> but yes, clearer code too :)
17:58:06 <sipa> i will need to drop off now, but thanks for hosting marcofleon, and thanks all for spending time looking at my PR!
17:58:22 <sliv3r__> thanks for the explanations :)
17:58:23 <monlovesmango> thanks sipa!
17:58:28 <marcofleon> thanks for all your help sipa
17:58:41 <abubakarsadiq> thanks for hosting, thanks for your work sipa
17:58:47 <marcofleon> okay let's jump into SFL then?
17:59:01 <marcofleon> :P
17:59:12 <abubakarsadiq> We have txgraph reorg waiting though
17:59:16 <marcofleon> another day. Thanks for attending everybody
17:59:18 <abubakarsadiq> It's also juicy
17:59:43 <marcofleon> oh yeah I kinda forgot about that one, good call
17:59:56 <sliv3r__> Could you guys share the answer for the second-to-last question? Didn't get a clear answer on that
18:00:14 <monlovesmango> I think for the last question, inc and und are not complementary
18:01:01 <sliv3r__> My idea checking the algorithm was that, for inclusion, if you don't have inc you may try to add a tx without an ancestor being in the anc set
18:01:06 <marcofleon> sure, maybe Abubakar has a clearer way to look at it, but I sort of initially thought that it was possible to get inc with only having und
18:02:49 <marcofleon> but then I realized that in order to keep track of feerate, you would need inc I believe
18:03:07 <sipa> it's really a tristate: undecided, included (definitely in the candidate set), excluded (definitely not in the candidate set)
18:03:09 <monlovesmango> one of the sets that you add back to the queue has und - descendants(split), so I don't know how you would figure out inc from that queue set
18:03:20 <sipa> excluded is just implicitly everything not included and not undecided
18:04:03 <monlovesmango> tristate is a good way of phrasing it
18:05:07 <marcofleon> could you implicitly have something like included = m_todo - undecided - excluded?
18:07:27 <monlovesmango> would that be better? I'm not sure
18:08:53 <marcofleon> No, probably not, that's just what prompted the question for me. Better to keep track of included explicitly
18:09:59 <sliv3r__> I think keeping inc explicitly is needed bc txs have dependencies between them; not explicitly knowing what is inc may cause you to try to add a tx without some ancestor in inc
18:11:21 <marcofleon> Right that makes sense
18:11:38 <marcofleon> still learning a lot about the clusterlin stuff, it's not easy
18:12:00 <sliv3r__> no it's not, this was the first time I read about it and also about fuzz, too many new things in one pr :)
18:12:00 <marcofleon> sliv3r__ i saw your question on the PR and I believe smallest set breaks the tie
18:12:36 <marcofleon> And that's good to know actually, I'll keep that in mind. I did realize this review club was trying to do two things at once, which didn't help
18:13:27 <sliv3r__> marcofleon: if that's the case would it make sense to assert the size?
18:13:38 <monlovesmango> this is my second fuzz pr review and still learning... haha
18:13:57 <sliv3r__> marcofleon: nono it was great :) double knowledge
18:14:14 <monlovesmango> sliv3r__: yes agree
18:15:59 <marcofleon> fuzzing is awesome, it's a great tool
18:16:35 <sliv3r__> Last question answer was: just release the code and let users complain and say that Core introduces bugs, right (as that's the new trend)? :P
18:16:46 <marcofleon> sliv3r__ I at least know in cluster_linearize.h FindCandidateSet says smallest set is the tiebreaker
18:17:31 <marcofleon> oh yeah haha, q10: What is, no doubt, the coolest way to test code?
18:18:01 <sliv3r__> marcofleon: Yeah I saw the comment in the code. Just added a line in my prev comment so sipa can see it and say
18:19:07 <marcofleon> glad you two were able to fuzz it a bit. But yeah monlovesmango I recommend doing the individual targets like how we said earlier. The test runner is fine for running through a corpus of inputs once
18:19:55 <monlovesmango> yeah I agree it's much better. I learned a lot more about fuzz best practices
18:20:12 <monlovesmango> thank you all for walking me through that :)
18:20:16 <sliv3r__> me too, this was actually really useful
18:20:35 <marcofleon> Glad to hear it. Okay that's it for me. Thanks again for coming, see you both soon!
18:20:50 <sliv3r__> Thanks for hosting!
18:21:29 <monlovesmango> thanks for hosting marcofleon !! was a good one
18:26:15 <sliv3r__> marcofleon: I think you forgot to end the meeting ;)
18:27:30 <marcofleon> #endmeeting