17:00:19 <jonatack> #startmeeting 17:00:19 <corebot> jonatack: Meeting started at 2025-05-28T17:00+0000 17:00:20 <corebot> jonatack: Current chairs: jonatack 17:00:21 <corebot> jonatack: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting 17:00:22 <corebot> jonatack: See also: https://hcoop-meetbot.readthedocs.io/en/stable/ 17:00:23 <corebot> jonatack: Participants should now identify themselves with '#here' or with an alias like '#here FirstLast' 17:00:33 <jonatack> Hi and welcome to this week's review club! 17:00:40 <stringintech> Hi! 17:00:42 <monlovesmango> hello :) 17:00:43 <yuvic> hi 17:00:52 <jonatack> Today we'll be discussing "p2p: protect addnode peers during IBD" 17:00:54 <enochazariah> hello 17:00:57 <jonatack> Review club url: 17:01:00 <jonatack> https://bitcoincore.reviews/32051 17:01:04 <jonatack> Bitcoin Core PR url: 17:01:06 <jonatack> https://github.com/bitcoin/bitcoin/pull/32051 17:01:12 <jonatack> Anyone here for the first time? Feel free to say hi, even if you're only observing. 17:01:43 <jonatack> This discussion is ad hoc and asynchronous, so feel free to continue conversation on previous questions when we move on, or raise any other questions or thoughts you have. 17:02:30 <jonatack> To get the convo warmed up: Anyone know who originally introduced the addnode code into Bitcoin Core? 17:03:18 <jonatack> stringintech: monlovesmango: yuvic: enochazariah: welcome! 17:04:09 <monlovesmango> no idea.. 17:04:30 <yuvic> maybe laanwj? 17:04:32 <enochazariah> i do not know that 17:04:49 <stringintech> satoshi? seeing him in addnode commits in git log :)) 17:05:00 <jonatack> :D if my memory serves, it was gregory maxwell ("gmax" in the poll at https://x.com/jonatack/status/1927768398630387973) 17:05:29 <jonatack> So 17:05:31 <jonatack> an interesting observation, I have found, from out in the field in Central America, 17:05:39 <jonatack> is that bitcoind in general is quite robust in dealing with a hostile environment of poor or intermittent internet connection 17:05:51 <jonatack> e.g. where browsing the internet might be painfully slow or no longer really viable, but your bitcoind node survives/thrives quite well 17:06:10 <jonatack> in contrast 17:06:21 <jonatack> the stalling/timeout logic seems less tolerant and adapted, to a slow hostile environment 17:06:26 <jonatack> as it lacks the ability to monitor and adapt accordingly 17:06:48 <jonatack> which, as noted in the notes, mzumsande was looking at improving 17:07:02 <jonatack> still, I noticed the disconnections were heavily affecting some of my addnode peers that were being targeted for no fault of their own, some of which were low latency (like cjdns peers) or medium/higher latency (like tor and i2p peers) 17:07:04 <jonatack> which motivated this pull request. 17:07:19 <jonatack> Did anyone get the chance to review the notes and/or PR (y/n)? 17:07:34 <stringintech> y 17:07:36 <monlovesmango> yes 17:07:36 <yuvic> y 17:07:38 <enochazariah> yes 17:07:47 <jonatack> excellent 17:08:20 <jonatack> I'll add some bonus questions about the code, but let's begin with the more general ones 17:08:33 <jonatack> 1. What is an addnode peer? How can you specify them to your node? 17:09:27 <jonatack> (this is a practical question for anyone who runs a node) 17:09:29 <yuvic> peer which we manually add or connect, using addnode rpc or -addnode/-connect config. 17:10:19 <jonatack> yes! and why would you want to do that, as a node runner 17:10:20 <enochazariah> peer that a node can connect to, it can be speified to the node by using the command line 17:11:01 <jonatack> right, either via CLI/RPC addnode (onetry for a one-shot attempt, or "add" to add it to the addnode list) 17:11:03 <monlovesmango> so you can connect to a peer that you know to be honest 17:11:21 <yuvic> as I trust that peer or also to sync up faster during IBD 17:11:22 <stringintech> to maintain connections to nodes I trust in case for example the core peer selection fails for some reason 17:11:31 <enochazariah> I think a reason why someone would want to add node to add more trust in the network, verify don't trust 17:11:33 <jonatack> or in your bitcoin.conf file with addnode=<peer> ... one per line 17:12:13 <jonatack> So: to ensure a connection to a *trusted* peer 17:12:30 <jonatack> What is a fundamental protection that a trusted peer can provide to your node? 17:12:50 <jonatack> assuming they are an honest peer becaue you know/trust them 17:13:23 <stringintech> it can help prevent the node from being isolated from the rest of the network (not having the knowledge of the best chain anymore) by malicious peers 17:13:48 <yuvic> yes to sync with the best chain 17:13:49 <monlovesmango> would probably provide protection against a sybil attack 17:13:56 <enochazariah> protection from isolation 17:14:01 <jonatack> right! 17:14:03 <jonatack> https://river.com/learn/terms/e/eclipse-attack/ 17:14:16 <jonatack> "An eclipse attack targets particular nodes in a network by surrounding them and obscuring their view of the entire network. For example, if a Bitcoin node has eight connections to other nodes, and an attacker controls all eight of those nodes, the attacker can refuse to relay any new blocks that miners produce. Although the rest of the network continues to process new blocks, the 17:14:17 <jonatack> victim node will be unaware that blocks are coming in." 17:14:47 <jonatack> It only takes one single honest peer connection to break out of an eclipe attack 17:14:51 <monlovesmango> ah so there is a term for it hah 17:15:12 <jonatack> yes 17:15:21 <yuvic> yes eclipse attack 17:15:43 <jonatack> So, by adding a trusted peer, your node cannot be successfully eclipsed unless your addnode peer connections are also eclipsed 17:17:01 <stringintech> then if we are choosing more than one addnode peer, they should be also geographically diverse so that for example if one region is compromised, our node can still maintain connections to rest of the network... 17:17:04 <jonatack> Therefore, it's a good idea to add some trusted peers using the addnode config option or rpc/cli 17:17:22 <jonatack> stringintech: sgtm 17:17:39 <jonatack> You can have up to 8 addnode peer connections simultaneously 17:18:00 <jonatack> In addition to the limit of 8 autamatic full outbound conns and 2 block-relay-only ones 17:18:40 <jonatack> (along with transient connections, like feelers or extra block-relay-only conns 17:19:11 <jonatack> or addr_fetch ones) 17:19:31 <jonatack> see src/node/connection_types.h for details 17:19:50 <stringintech> Thanks for the details 17:19:57 <jonatack> Now, onto the code 17:20:02 <jonatack> What function contains the Bitcoin Core stalling and timeout logic? 17:20:33 <stringintech> PeerManagerImpl::SendMessages() 17:20:34 <jonatack> this wasn't a question to prepare in advance, but you can search the codebase right now if you like 17:20:35 <enochazariah> SendMessages 17:20:47 <enochazariah> in PeerManagerImpl 17:20:47 <jonatack> excellent 17:20:49 <yuvic> SendMessages 17:21:01 <jonatack> From where is SendMessages() called? 17:21:06 <jonatack> (who it its caller) 17:21:26 <stringintech> connection manager 17:22:02 <jonatack> stringintech: can you elaborate? 17:22:58 <jonatack> it is called from a thread, currently inside src/net.cpp 17:23:15 <stringintech> I should go back for source for detail 17:23:15 <stringintech> but I guess we would loop over peers 17:23:16 <stringintech> can call this after ProcessMessages 17:23:24 <enochazariah> Not sure, but i think THe ThreadMessageHandler is the method that calls the SendMessages 17:23:32 <jonatack> correct for both 17:23:35 <enochazariah> *SendMessage 17:23:41 <jonatack> void CConnman::ThreadMessageHandler() 17:24:46 <jonatack> that calls ProcessMessages() and then SendMessages() for each peer, if not flagged for disconnection 17:25:01 <jonatack> Now, on to the PR 17:25:53 <jonatack> I was pleasantly surprised by pinheadz's review here https://github.com/bitcoin/bitcoin/pull/32051#pullrequestreview-2864676242 17:26:16 <jonatack> "I believe this also closes an 11-year-old issue: https://github.com/bitcoin/bitcoin/issues/5097" 17:27:10 <yuvic> yes that was interesting 17:27:21 <jonatack> and also: 17:27:23 <jonatack> https://github.com/bitcoin/bitcoin/issues/9213 17:27:35 <jonatack> that I need to read and look into TBH 17:27:54 <stringintech> Nice! 17:28:40 <jonatack> (as well as the review comments in https://github.com/bitcoin/bitcoin/pull/25880) 17:29:14 <jonatack> Back to this PR 17:29:33 <jonatack> The PR as it is currently, is actually not well-named 17:29:37 <jonatack> because only the first commit affects the IBD issues that I observed 17:29:42 <jonatack> that first commit gives addnode peers more time. apart from that, it doesn't protect them from disconnection. 17:30:37 <stringintech> I had a question: Is the PR intentionally focusing on “-addnode” peers and not “-connect” peers? 17:30:53 <jonatack> I'd be curious to hear if you have any thoughts or suggestions on the changes 17:31:22 <jonatack> stringintech: good question. Yes, I was focusing on addnode peers. 17:31:44 <monlovesmango> have you been able to test disconnection frequency after increasing timeout allowed? 17:31:46 <jonatack> Making a note to verify the effect on a -connect peer. 17:32:39 <jonatack> monlovesmango: yes, I saw much fewer disconnections with the ping times of my peers with my internet speed 17:33:05 <jonatack> from 100 or more per minute to a few an hour 17:33:14 <monlovesmango> nice 17:33:40 <jonatack> it still took more than a month for that node to sync... 17:34:12 <yuvic> I had a similar question as mzumsande's comment on the pr -> https://github.com/bitcoin/bitcoin/pull/32051/commits/3463a7f4813c3eece5ba9a260670a76e3f8d38ab#r1999313868 17:34:28 <jonatack> like 5 weeks, but all those addnode disconnections were not helping, as those peers were being re-connected right away again afterward 17:34:42 <monlovesmango> if it isn't actually protect them from disconnection then I would say I concept ACK and approach ACK. but if it is protecting from disconnection then I think there needs to be more thought into what desired IBD behavior would be 17:34:55 <monlovesmango> jonatack: omg hahah 17:36:52 <jonatack> yuvic: yes, the high frequency of disconnections I was seeing during IBD were not of that logic 17:37:21 <jonatack> those are disconnections that occur infrequently (for me) after IBD is completed 17:38:05 <yuvic> got it! 17:38:28 <jonatack> I need to consider whether to potentially drop the second (and maybe third) commit to keep it focused on IBD only, when the high number of disconnections I was seeing took place 17:39:39 <jonatack> or, alternatively, try to implement his review suggestion to clear the block requests 17:39:43 <enochazariah> I've got a bit of a question 17:39:44 <enochazariah> does this not raise up a silent stall? i mean, if the system does not have the inhererent mechanism to re-request that block from another peer, then the IBD could effectively stall, preventing a disconnection, but introducing a silent stall 17:41:32 <jonatack> enochazariah: I need to look at that (test coverage could be useful, as well, but maybe non-trivial to do, and only if it is reliable) 17:42:07 <enochazariah> okay 17:42:18 <jonatack> The review comments turned up valuable history on this that I need to review. 17:43:18 <jonatack> I didn't necessarily expect the PR to be merged quickly as-is, but hoped to gain insight as to what would be best. 17:44:16 <jonatack> A long-term goal since years that comes up now and then in developer discussions 17:44:47 <jonatack> is how to score peers based on the resources they consume 17:45:34 <jonatack> see, for instance: https://github.com/bitcoin/bitcoin/pull/31672 17:45:39 <yuvic> there was an issue for this 17:45:46 <yuvic> yes by vasild 17:46:08 <jonatack> yuvic: yes. how to measure this an open question. 17:47:03 <jonatack> i'm not sure how useful the cpu load is, as I have been testing it, and the load seems to often be higher when the peer is first connected, and then go down to normal levels 17:47:40 <jonatack> and not necessarily indicate a bad peer 17:47:48 <jonatack> (to be continued) 17:48:32 <jonatack> i haven't seen it yet as useful -- on its own -- to qualify a peer to be disconnected 17:49:46 <jonatack> this perhaps connects to aj towns' review suggestion about scoring peers and using that to scale the number of blocks we request from them 17:50:54 <jonatack> If anyone comes up with test coverage for the stalling or timeout logic here, I'd be very happy to look at it and bring it into the PR 17:51:00 <enochazariah> scoring them and using as an order, so a much higher score would mean higher chances of being requested 17:51:08 <jonatack> right now, the changes in this PR do not break any tests... 17:52:04 <stringintech> Regarding the timeout logics this PR touches, I could only find p2p_ibd_stalling.py, which covers the IBD block stalling timeout (and should possibly be adapted to reflect the addnode changes). I didn't find any integration tests for the initial header sync timeout and regular block download timeout. Am I right?? 17:52:07 <yuvic> yes, test coverage would be interesting 17:53:31 <jonatack> stringintech: neat, maybe addnode connections could be added to that functional test file 17:54:44 <jonatack> stringintech: as for the header sync and block download, I have not yet looked 17:55:12 <stringintech> Hmm... I'd be happy to work on the missing ones (regardless of the PR changes) and open a PR (in case it is out of the scope for this PR of course). If it merges first, the addnode PR could adapt them accordingly too. 17:55:40 <stringintech> Have to double check to see if they are actually missing 17:55:47 <jonatack> stringintech: that would be great! 17:56:36 <jonatack> please ping me (on the github PR, or via IRC or DM on twitter/x) if you come up with coverage 17:56:41 <enochazariah> stringintech that would be nice 17:57:43 <jonatack> any final thoughts or questions? 17:57:49 <jonatack> 2 minutes left 17:58:16 <yuvic> thanks, nothing from my side! 17:58:28 <enochazariah> Nothing from my end 17:58:42 <jonatack> Appreciate you all participating! 17:59:16 <jonatack> Don't hesitate to leave a review comment or feedback on that PR or propose test coverage or improvements to it there 17:59:27 <enochazariah> Thank you jonatack 17:59:33 <stringintech> Thank you jonatack! 17:59:45 <jonatack> Thank you! 17:59:47 <jonatack> #endmeeting