17:00:09 <stickies-v> #startmeeting 
17:00:09 <corebot> stickies-v: Meeting started at 2025-05-14T17:00+0000
17:00:10 <corebot> stickies-v: Current chairs: stickies-v
17:00:11 <corebot> stickies-v: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting
17:00:12 <corebot> stickies-v: See also: https://hcoop-meetbot.readthedocs.io/en/stable/
17:00:13 <corebot> stickies-v: Participants should now identify themselves with '#here' or with an alias like '#here FirstLast'
17:00:29 <stickies-v> hey everyone, welcome to the review club!
17:00:36 <stringintech> Hi!
17:00:43 <brunoerg> hi
17:00:43 <Davidson> Hi
17:00:48 <monlovesmango> hey
17:00:50 <marcofleon> yo
17:00:53 <JoaoLeal> Hi
17:00:55 <stickies-v> today we'll be covering https://bitcoincore.reviews/32317, titled "Separate UTXO set access from validation functions"
17:01:03 <enochazariah> hello
17:01:03 <marcofleon> woo!
17:01:17 <stickies-v> is anyone here for the first time? feel free to say hi, even if you're just lurking
17:01:21 <TheCharlatan> hello!
17:01:48 <Davidson> stickies-v: I haven't participated in a while. Ho hi :)
17:01:57 <Davidson> so*
17:02:04 <lightlike> hi
17:02:18 <pablomartin_> hello
17:02:36 <stickies-v> well, welcome back Davidson! are you the floresta davidson, by any chance?
17:02:51 <Davidson> Yeap, it's me
17:03:27 <stickies-v> oh nice, we've got kernel users, contributors, reviewers and enthusiasts all in one meeting!
17:03:32 <stickies-v> (reviewing IS contributing, of course)
17:03:53 <stickies-v> did anyone get the chance to review the notes and/or PR (y/n)?
17:04:23 <Davidson> y
17:04:27 <stringintech> y
17:04:29 <marcofleon> y, checked out the notes and an initial code review of the PR
17:04:30 <brunoerg> yes
17:04:35 <monlovesmango> yes
17:04:37 <enochazariah> yes, checked the code out
17:04:44 <TheCharlatan> y :D
17:05:06 <stickies-v> oh my that's a lot of prep, excellent
17:05:59 <stickies-v> let's dive right into the questions, we'll start off with the more conceptual ones and then progress into code questions. as always, review club is async so feel free to continue conversation on previous questions when we move on, or raise any other questions you have!
17:06:25 <stickies-v> 2. Why is carving out the new SpendBlock() function from ConnectBlock() helpful for this PR? How would you compare the purpose of the two functions?
17:08:25 <monlovesmango> it will remove all UTXO set interactions from ConnectBlock which makes ConnectBlock much more modular. SpendBlock encapsulates all the UTXO interactions that are needed when connecting new block
17:09:15 <marcofleon> carving it out is helpful because using ConnectBlock no longer requires the utxo set. So you can do a lot of block validation without a utxo set
17:09:22 <Davidson> Before this PR, ConnectBlock also looked utxos up, and therefore needed access to the utxo set. Now, the feching of utxos is defered to the new function. Making ConnectBlock work without access to the utxo set
17:10:39 <stickies-v> right, not every full node implementation has a UTXO set, so only being able to connect a block by passing UTXO set as a parameter seems rather opinionated for how Core works
17:11:26 <stickies-v> follow-up question: does the new ConnectBlock allow for stateless validation?
17:12:11 <marcofleon> do you count blockundo as non-state? if that makes sense
17:12:19 <monlovesmango> yes..? what does stateless validation mean? not needing the statefulness of utxo set?
17:12:49 <marcofleon> i guess the new Connectblock would still requite some state then?
17:13:03 <marcofleon> but it's prevalidated by SpendBlock
17:13:22 <stickies-v> blockundo is a ConnectBlock parameter, so that does not count as state
17:14:07 <stringintech> I guess using the fJustCheck as true could do this for us?
17:14:11 <Davidson> I would say no. Because it still uses a ref to the block tree index.
17:14:27 <monlovesmango> ah ok, so stateless validation would be any state that is not passed in by parameters?
17:15:01 <stickies-v> what I'm trying to get at is: if someone has a serialized block, and they want to check if it's valid, can they validate it with a function (similar to) ConnectBlock?
17:15:15 <stickies-v> so state in this case would be from the Chainstate instance of which ConnectBlock is a method
17:17:14 <TheCharlatan> there is still extra state require, for example they still need the deployment bits
17:18:09 <marcofleon> those pesky deployment bits
17:18:52 <stickies-v> Davidson: yeah ConnectBlock has side-effects (as clearly implied by the name too) in that it e.g. updates the chainstate, and the block index
17:19:58 <marcofleon> i thought SpendBlock updates the chainstate?
17:20:05 <marcofleon> unless i'm mistaking what the chainstate is here...
17:20:40 <stickies-v> oh sorry yes that's confusing naming 🙈
17:22:02 <TheCharlatan> I guess you meant members of the Chainstate class?
17:23:53 <stickies-v> i was thinking about connecting the block to the chaintip (i.e. state of the Chain) but now i'm not actually sure if that's ahppening in ConnectBlock
17:24:32 <stickies-v> you're right marcofleon that ConnectBlock does not update the UTXO set, commonly called the chainstate
17:25:26 <TheCharlatan> ah, yes, that is moved out of ConnectBlock in the commit "    validation: Move SetBestBlock out of ConnectBlocky
17:25:57 <stickies-v> ah right!
17:26:00 <stickies-v> 3. 3. Do you see another benefit of this decoupling, besides allowing kernel usage without a UTXO set?
17:27:18 <marcofleon> maybe easier testing
17:27:24 <Davidson> I would say it's cleaner. Since you have a clear separation of concerns
17:27:27 <enochazariah> i see modularity as another benefit of decoupling
17:27:44 <marcofleon> by splitting up parts of validation into separate functions
17:28:26 <stickies-v> yeah improved testability was the main benefit i was thinking of, reusability/modularity might be a win too even though the potential there is probably a bit more limited since there's only so many places these functions can be used
17:28:34 <monlovesmango> it also helps with code maintainability as there is separation of concerns
17:28:56 <stickies-v> anything else?
17:29:28 <Davidson> Makes it easier to validate blocks in parallel (e.g. swift-sync)
17:30:29 <stickies-v> yes! but isn't that a direct effect of the "allowing kernel usage without a UTXO set" bit?
17:30:43 <Davidson> yeah, I think so
17:31:38 <stickies-v> TheCharlatan observed a few other minor improvements:
17:32:11 <stickies-v> reducing the amount of UTXO set lookups (by just doing it once at the beginning) can have minimal performance improvements
17:33:18 <stickies-v> and then from a maintainability / code clarity perspective: explicitly passing objects (coins) through a callstack is easier to reason about and has less thread safety issues etc than having each frame do its own map lookup
17:34:12 <stickies-v> 4. Especially during IBD, transaction validation must be fast. Are there any changes where you have concerns about performance?
17:38:07 <marcofleon> maybe iterating over the txs in the block twice now? doesn't seem like a big deal though
17:38:09 <stickies-v> Or if you don't have any concerns: Which measures can you identify this PR has adopted to minimize the performance impact of this refactor?
17:38:34 <TheCharlatan> there are a few additional vector allocations, where instead of retrieving elements one-by-one, references to elements are filled into a vector and then passed by spans.
17:38:39 <Davidson> I didn't see anything extraordinary. Maybe use a little bit more memory?
17:39:06 <Davidson> TheCharlatan: oh you were faster than me :D
17:39:25 <TheCharlatan> heh :D
17:40:02 <stickies-v> yeah vector allocations was the main thing I could see too. This PR relies quite heavily on passing references rather than copying, so the overhead there should be quite minimal
17:40:44 <stickies-v> of course, these references can introduce lifetime risks, so that might be something to consider in your review
17:41:14 <lightlike> not so important for the question, but I don't agree with the "especially" in the question : In my opinion, validation / block propagation being fast is much more important at the tip than during IBD, which is a one-time thing.
17:42:14 <Davidson> stickies-v: I assume buildinga spam<T> from a vector<T> is almost free? (Rust bro here :D )
17:42:54 <stickies-v> lightlike: oh, interesting! yeah, i agree that at the tip performance is also crucial, and it's maybe a bit meaningless to compare which is more important - could have phrased that better, sorry!
17:42:54 <monlovesmango> I did have  a question about why we pass block_hash as a parameter to ConnectBlock when block.GetHash() is still called when asserting Assume(block.GetHash() == block_hash)
17:44:18 <stickies-v> Davidson: yes, I'm not enough of a C++ expert to give you the sound answer, but my understanding is that a span allows to iterate over the container with almost no overhead, like a view
17:44:25 <Davidson> monlovesmango: I believe this is the topic for question number 5, no?
17:45:04 <stickies-v> it is indeed!
17:45:33 <Davidson> stickies-v: nice, so it's basically one vec allocation per block with this block's coins. And then we pass refs to that vec
17:45:34 <marcofleon> is block_hash just sort of used as sanity check in those Spendblock assertions?
17:46:02 <TheCharlatan> marcofleon: yes :)
17:46:12 <stickies-v> 5. `SpendBlock()` takes a `CBlock block`, `CBlockIndex pindex` and `uint256 block_hash` parameter, all referencing the block being spent. Why do we need 3 parameters to do that?
17:46:19 <monlovesmango> Davidson: haha oops
17:47:42 <marcofleon> the CBlock contains the actual txs that are used to check against and update the utxo set so gotta have that
17:47:49 <Davidson> So, mostly sanity check...? I've seen that the pindex arg also gets used to figure out if our ChainState represents the previous block's state, so we need this one
17:48:11 <stickies-v> monlovesmango: `Assume` statements are only compiled into debug builds, so this check should not incur any overhead on non-debug builds
17:48:27 <stickies-v> otherwise, indeed it would be a bit silly to pass `block_hash` as a performance optimization. good spot!
17:48:27 <monlovesmango> I assumed the block_hash parameter was for performance reasons, so that we don't have to re-hash
17:48:44 <monlovesmango> stickies-v: ok!! that makes a lot more sense :)
17:49:16 <monlovesmango> also good to know that about `Assume` statements
17:49:38 <marcofleon> pindex also used for the height to cehck against bip30 stuff
17:50:14 <marcofleon> and for UpdateCoins
17:51:37 <monlovesmango> isn't bip30 stuff now in SpendBlock?
17:52:10 <stickies-v> monlovesmango: yes, the question is about `SpendBlock`
17:52:29 <monlovesmango> omg ignore sorry..
17:53:07 <marcofleon> "bip30 stuff" is my best summarizing of that whole chunk of code/text
17:55:07 <stickies-v> but why do we need to pass `pindex`? Can't it be retrieved from `block`?
17:56:25 <stickies-v> (and if it can, should it?)
17:57:22 <Davidson> You mean the prevblock? I think it's better to get the prevblock straight from pindex to make sure we actually building on tip?
17:57:27 <stickies-v> We're almost at time, so gonna launch the next/last question already
17:57:42 <stickies-v> 7. The first commits in this PR refactor `CCoinsViewCache` out of the function signature of a couple of validation functions. Does `CCoinsViewCache` hold the entire UTXO set? Why is that (not) a problem? Does this PR change that behaviour?
17:58:07 <stringintech> I guess we need another disk access for that which is not good to do? (previous question)
17:59:11 <stickies-v> Davidson: no, I mean dropping the `pindex` argument from the `SpendBlock` fn signature, and just letting `SpendBlock` get a `CBlockIndex` from `block`
18:00:13 <stickies-v> stringintech: I don't think we need disk access for that, the entire block index (`BlockManager::m_block_index`) is kept in-memory, it is relatively small
18:00:59 <TheCharlatan> Davidson, that is a good reason, but we could also assert that beforehand. Other than that, looking up the block index with a block's hash does incur bit of a performance penalty.
18:01:15 <stringintech> stickies-v: oh!! thanks.
18:02:52 <stickies-v> TheCharlatan: sure, it's a lookup, but it's a map - do you think that's going to be measurable?
18:04:42 <TheCharlatan> I doubt it would be. I think I implemented it that way more because we already have it available at its call sites.
18:04:46 <Davidson> stickies-v: Oh, I didn't know you could do that :')
18:04:57 <stickies-v> anyway, we're past end time for today, so I'm going to wrap it up here, but as always feel free to share thoughts or follow-up questions - i'll be around for a while longer!
18:05:31 <stickies-v> thanks for joining the discussion today everyone, and thanks TheCharlatan for authoring the PR!
18:05:34 <marcofleon> yeah wait how are we getting a CBlockindex from a CBlock?
18:05:38 <stickies-v> #endmeeting