17:00:09 <stickies-v> #startmeeting 17:00:09 <corebot> stickies-v: Meeting started at 2025-05-14T17:00+0000 17:00:10 <corebot> stickies-v: Current chairs: stickies-v 17:00:11 <corebot> stickies-v: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting 17:00:12 <corebot> stickies-v: See also: https://hcoop-meetbot.readthedocs.io/en/stable/ 17:00:13 <corebot> stickies-v: Participants should now identify themselves with '#here' or with an alias like '#here FirstLast' 17:00:29 <stickies-v> hey everyone, welcome to the review club! 17:00:36 <stringintech> Hi! 17:00:43 <brunoerg> hi 17:00:43 <Davidson> Hi 17:00:48 <monlovesmango> hey 17:00:50 <marcofleon> yo 17:00:53 <JoaoLeal> Hi 17:00:55 <stickies-v> today we'll be covering https://bitcoincore.reviews/32317, titled "Separate UTXO set access from validation functions" 17:01:03 <enochazariah> hello 17:01:03 <marcofleon> woo! 17:01:17 <stickies-v> is anyone here for the first time? feel free to say hi, even if you're just lurking 17:01:21 <TheCharlatan> hello! 17:01:48 <Davidson> stickies-v: I haven't participated in a while. Ho hi :) 17:01:57 <Davidson> so* 17:02:04 <lightlike> hi 17:02:18 <pablomartin_> hello 17:02:36 <stickies-v> well, welcome back Davidson! are you the floresta davidson, by any chance? 17:02:51 <Davidson> Yeap, it's me 17:03:27 <stickies-v> oh nice, we've got kernel users, contributors, reviewers and enthusiasts all in one meeting! 17:03:32 <stickies-v> (reviewing IS contributing, of course) 17:03:53 <stickies-v> did anyone get the chance to review the notes and/or PR (y/n)? 17:04:23 <Davidson> y 17:04:27 <stringintech> y 17:04:29 <marcofleon> y, checked out the notes and an initial code review of the PR 17:04:30 <brunoerg> yes 17:04:35 <monlovesmango> yes 17:04:37 <enochazariah> yes, checked the code out 17:04:44 <TheCharlatan> y :D 17:05:06 <stickies-v> oh my that's a lot of prep, excellent 17:05:59 <stickies-v> let's dive right into the questions, we'll start off with the more conceptual ones and then progress into code questions. as always, review club is async so feel free to continue conversation on previous questions when we move on, or raise any other questions you have! 17:06:25 <stickies-v> 2. Why is carving out the new SpendBlock() function from ConnectBlock() helpful for this PR? How would you compare the purpose of the two functions? 17:08:25 <monlovesmango> it will remove all UTXO set interactions from ConnectBlock which makes ConnectBlock much more modular. SpendBlock encapsulates all the UTXO interactions that are needed when connecting new block 17:09:15 <marcofleon> carving it out is helpful because using ConnectBlock no longer requires the utxo set. So you can do a lot of block validation without a utxo set 17:09:22 <Davidson> Before this PR, ConnectBlock also looked utxos up, and therefore needed access to the utxo set. Now, the feching of utxos is defered to the new function. Making ConnectBlock work without access to the utxo set 17:10:39 <stickies-v> right, not every full node implementation has a UTXO set, so only being able to connect a block by passing UTXO set as a parameter seems rather opinionated for how Core works 17:11:26 <stickies-v> follow-up question: does the new ConnectBlock allow for stateless validation? 17:12:11 <marcofleon> do you count blockundo as non-state? if that makes sense 17:12:19 <monlovesmango> yes..? what does stateless validation mean? not needing the statefulness of utxo set? 17:12:49 <marcofleon> i guess the new Connectblock would still requite some state then? 17:13:03 <marcofleon> but it's prevalidated by SpendBlock 17:13:22 <stickies-v> blockundo is a ConnectBlock parameter, so that does not count as state 17:14:07 <stringintech> I guess using the fJustCheck as true could do this for us? 17:14:11 <Davidson> I would say no. Because it still uses a ref to the block tree index. 17:14:27 <monlovesmango> ah ok, so stateless validation would be any state that is not passed in by parameters? 17:15:01 <stickies-v> what I'm trying to get at is: if someone has a serialized block, and they want to check if it's valid, can they validate it with a function (similar to) ConnectBlock? 17:15:15 <stickies-v> so state in this case would be from the Chainstate instance of which ConnectBlock is a method 17:17:14 <TheCharlatan> there is still extra state require, for example they still need the deployment bits 17:18:09 <marcofleon> those pesky deployment bits 17:18:52 <stickies-v> Davidson: yeah ConnectBlock has side-effects (as clearly implied by the name too) in that it e.g. updates the chainstate, and the block index 17:19:58 <marcofleon> i thought SpendBlock updates the chainstate? 17:20:05 <marcofleon> unless i'm mistaking what the chainstate is here... 17:20:40 <stickies-v> oh sorry yes that's confusing naming 🙈 17:22:02 <TheCharlatan> I guess you meant members of the Chainstate class? 17:23:53 <stickies-v> i was thinking about connecting the block to the chaintip (i.e. state of the Chain) but now i'm not actually sure if that's ahppening in ConnectBlock 17:24:32 <stickies-v> you're right marcofleon that ConnectBlock does not update the UTXO set, commonly called the chainstate 17:25:26 <TheCharlatan> ah, yes, that is moved out of ConnectBlock in the commit " validation: Move SetBestBlock out of ConnectBlocky 17:25:57 <stickies-v> ah right! 17:26:00 <stickies-v> 3. 3. Do you see another benefit of this decoupling, besides allowing kernel usage without a UTXO set? 17:27:18 <marcofleon> maybe easier testing 17:27:24 <Davidson> I would say it's cleaner. Since you have a clear separation of concerns 17:27:27 <enochazariah> i see modularity as another benefit of decoupling 17:27:44 <marcofleon> by splitting up parts of validation into separate functions 17:28:26 <stickies-v> yeah improved testability was the main benefit i was thinking of, reusability/modularity might be a win too even though the potential there is probably a bit more limited since there's only so many places these functions can be used 17:28:34 <monlovesmango> it also helps with code maintainability as there is separation of concerns 17:28:56 <stickies-v> anything else? 17:29:28 <Davidson> Makes it easier to validate blocks in parallel (e.g. swift-sync) 17:30:29 <stickies-v> yes! but isn't that a direct effect of the "allowing kernel usage without a UTXO set" bit? 17:30:43 <Davidson> yeah, I think so 17:31:38 <stickies-v> TheCharlatan observed a few other minor improvements: 17:32:11 <stickies-v> reducing the amount of UTXO set lookups (by just doing it once at the beginning) can have minimal performance improvements 17:33:18 <stickies-v> and then from a maintainability / code clarity perspective: explicitly passing objects (coins) through a callstack is easier to reason about and has less thread safety issues etc than having each frame do its own map lookup 17:34:12 <stickies-v> 4. Especially during IBD, transaction validation must be fast. Are there any changes where you have concerns about performance? 17:38:07 <marcofleon> maybe iterating over the txs in the block twice now? doesn't seem like a big deal though 17:38:09 <stickies-v> Or if you don't have any concerns: Which measures can you identify this PR has adopted to minimize the performance impact of this refactor? 17:38:34 <TheCharlatan> there are a few additional vector allocations, where instead of retrieving elements one-by-one, references to elements are filled into a vector and then passed by spans. 17:38:39 <Davidson> I didn't see anything extraordinary. Maybe use a little bit more memory? 17:39:06 <Davidson> TheCharlatan: oh you were faster than me :D 17:39:25 <TheCharlatan> heh :D 17:40:02 <stickies-v> yeah vector allocations was the main thing I could see too. This PR relies quite heavily on passing references rather than copying, so the overhead there should be quite minimal 17:40:44 <stickies-v> of course, these references can introduce lifetime risks, so that might be something to consider in your review 17:41:14 <lightlike> not so important for the question, but I don't agree with the "especially" in the question : In my opinion, validation / block propagation being fast is much more important at the tip than during IBD, which is a one-time thing. 17:42:14 <Davidson> stickies-v: I assume buildinga spam<T> from a vector<T> is almost free? (Rust bro here :D ) 17:42:54 <stickies-v> lightlike: oh, interesting! yeah, i agree that at the tip performance is also crucial, and it's maybe a bit meaningless to compare which is more important - could have phrased that better, sorry! 17:42:54 <monlovesmango> I did have a question about why we pass block_hash as a parameter to ConnectBlock when block.GetHash() is still called when asserting Assume(block.GetHash() == block_hash) 17:44:18 <stickies-v> Davidson: yes, I'm not enough of a C++ expert to give you the sound answer, but my understanding is that a span allows to iterate over the container with almost no overhead, like a view 17:44:25 <Davidson> monlovesmango: I believe this is the topic for question number 5, no? 17:45:04 <stickies-v> it is indeed! 17:45:33 <Davidson> stickies-v: nice, so it's basically one vec allocation per block with this block's coins. And then we pass refs to that vec 17:45:34 <marcofleon> is block_hash just sort of used as sanity check in those Spendblock assertions? 17:46:02 <TheCharlatan> marcofleon: yes :) 17:46:12 <stickies-v> 5. `SpendBlock()` takes a `CBlock block`, `CBlockIndex pindex` and `uint256 block_hash` parameter, all referencing the block being spent. Why do we need 3 parameters to do that? 17:46:19 <monlovesmango> Davidson: haha oops 17:47:42 <marcofleon> the CBlock contains the actual txs that are used to check against and update the utxo set so gotta have that 17:47:49 <Davidson> So, mostly sanity check...? I've seen that the pindex arg also gets used to figure out if our ChainState represents the previous block's state, so we need this one 17:48:11 <stickies-v> monlovesmango: `Assume` statements are only compiled into debug builds, so this check should not incur any overhead on non-debug builds 17:48:27 <stickies-v> otherwise, indeed it would be a bit silly to pass `block_hash` as a performance optimization. good spot! 17:48:27 <monlovesmango> I assumed the block_hash parameter was for performance reasons, so that we don't have to re-hash 17:48:44 <monlovesmango> stickies-v: ok!! that makes a lot more sense :) 17:49:16 <monlovesmango> also good to know that about `Assume` statements 17:49:38 <marcofleon> pindex also used for the height to cehck against bip30 stuff 17:50:14 <marcofleon> and for UpdateCoins 17:51:37 <monlovesmango> isn't bip30 stuff now in SpendBlock? 17:52:10 <stickies-v> monlovesmango: yes, the question is about `SpendBlock` 17:52:29 <monlovesmango> omg ignore sorry.. 17:53:07 <marcofleon> "bip30 stuff" is my best summarizing of that whole chunk of code/text 17:55:07 <stickies-v> but why do we need to pass `pindex`? Can't it be retrieved from `block`? 17:56:25 <stickies-v> (and if it can, should it?) 17:57:22 <Davidson> You mean the prevblock? I think it's better to get the prevblock straight from pindex to make sure we actually building on tip? 17:57:27 <stickies-v> We're almost at time, so gonna launch the next/last question already 17:57:42 <stickies-v> 7. The first commits in this PR refactor `CCoinsViewCache` out of the function signature of a couple of validation functions. Does `CCoinsViewCache` hold the entire UTXO set? Why is that (not) a problem? Does this PR change that behaviour? 17:58:07 <stringintech> I guess we need another disk access for that which is not good to do? (previous question) 17:59:11 <stickies-v> Davidson: no, I mean dropping the `pindex` argument from the `SpendBlock` fn signature, and just letting `SpendBlock` get a `CBlockIndex` from `block` 18:00:13 <stickies-v> stringintech: I don't think we need disk access for that, the entire block index (`BlockManager::m_block_index`) is kept in-memory, it is relatively small 18:00:59 <TheCharlatan> Davidson, that is a good reason, but we could also assert that beforehand. Other than that, looking up the block index with a block's hash does incur bit of a performance penalty. 18:01:15 <stringintech> stickies-v: oh!! thanks. 18:02:52 <stickies-v> TheCharlatan: sure, it's a lookup, but it's a map - do you think that's going to be measurable? 18:04:42 <TheCharlatan> I doubt it would be. I think I implemented it that way more because we already have it available at its call sites. 18:04:46 <Davidson> stickies-v: Oh, I didn't know you could do that :') 18:04:57 <stickies-v> anyway, we're past end time for today, so I'm going to wrap it up here, but as always feel free to share thoughts or follow-up questions - i'll be around for a while longer! 18:05:31 <stickies-v> thanks for joining the discussion today everyone, and thanks TheCharlatan for authoring the PR! 18:05:34 <marcofleon> yeah wait how are we getting a CBlockindex from a CBlock? 18:05:38 <stickies-v> #endmeeting