17:00:19 <jonatack> #startmeeting 
17:00:19 <corebot> jonatack: Meeting started at 2025-05-28T17:00+0000
17:00:20 <corebot> jonatack: Current chairs: jonatack
17:00:21 <corebot> jonatack: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting
17:00:22 <corebot> jonatack: See also: https://hcoop-meetbot.readthedocs.io/en/stable/
17:00:23 <corebot> jonatack: Participants should now identify themselves with '#here' or with an alias like '#here FirstLast'
17:00:33 <jonatack> Hi and welcome to this week's review club!
17:00:40 <stringintech> Hi!
17:00:42 <monlovesmango> hello :)
17:00:43 <yuvic> hi
17:00:52 <jonatack> Today we'll be discussing "p2p: protect addnode peers during IBD"
17:00:54 <enochazariah> hello
17:00:57 <jonatack> Review club url:
17:01:00 <jonatack> https://bitcoincore.reviews/32051
17:01:04 <jonatack> Bitcoin Core PR url:
17:01:06 <jonatack> https://github.com/bitcoin/bitcoin/pull/32051
17:01:12 <jonatack> Anyone here for the first time? Feel free to say hi, even if you're only observing.
17:01:43 <jonatack> This discussion is ad hoc and asynchronous, so feel free to continue conversation on previous questions when we move on, or raise any other questions or thoughts you have.
17:02:30 <jonatack> To get the convo warmed up: Anyone know who originally introduced the addnode code into Bitcoin Core?
17:03:18 <jonatack> stringintech: monlovesmango: yuvic: enochazariah: welcome!
17:04:09 <monlovesmango> no idea..
17:04:30 <yuvic> maybe laanwj?
17:04:32 <enochazariah> i do not know that
17:04:49 <stringintech> satoshi? seeing him in addnode commits in git log :))
17:05:00 <jonatack> :D if my memory serves, it was gregory maxwell ("gmax" in the poll at https://x.com/jonatack/status/1927768398630387973)
17:05:29 <jonatack> So
17:05:31 <jonatack> an interesting observation, I have found, from out in the field in Central America,
17:05:39 <jonatack> is that bitcoind in general is quite robust in dealing with a hostile environment of poor or intermittent internet connection
17:05:51 <jonatack> e.g. where browsing the internet might be painfully slow or no longer really viable, but your bitcoind node survives/thrives quite well
17:06:10 <jonatack> in contrast
17:06:21 <jonatack> the stalling/timeout logic seems less tolerant and adapted, to a slow hostile environment
17:06:26 <jonatack> as it lacks the ability to monitor and adapt accordingly
17:06:48 <jonatack> which, as noted in the notes, mzumsande was looking at improving
17:07:02 <jonatack> still, I noticed the disconnections were heavily affecting some of my addnode peers that were being targeted for no fault of their own, some of which were low latency (like cjdns peers) or medium/higher latency (like tor and i2p peers)
17:07:04 <jonatack> which motivated this pull request.
17:07:19 <jonatack> Did anyone get the chance to review the notes and/or PR (y/n)?
17:07:34 <stringintech> y
17:07:36 <monlovesmango> yes
17:07:36 <yuvic> y
17:07:38 <enochazariah> yes
17:07:47 <jonatack> excellent
17:08:20 <jonatack> I'll add some bonus questions about the code, but let's begin with the more general ones
17:08:33 <jonatack> 1. What is an addnode peer? How can you specify them to your node?
17:09:27 <jonatack> (this is a practical question for anyone who runs a node)
17:09:29 <yuvic> peer which we manually add or connect, using addnode rpc or -addnode/-connect config.
17:10:19 <jonatack> yes! and why would you want to do that, as a node runner
17:10:20 <enochazariah> peer that a node can connect to, it can be speified to the node by using the command line
17:11:01 <jonatack> right, either via CLI/RPC addnode (onetry for a one-shot attempt, or "add" to add it to the addnode list)
17:11:03 <monlovesmango> so you can connect to a peer that you know to be honest
17:11:21 <yuvic> as I trust that peer or also to sync up faster during IBD
17:11:22 <stringintech> to maintain connections to nodes I trust in case for example the core peer selection fails for some reason
17:11:31 <enochazariah> I think a reason why someone would want to add node to add more trust in the network, verify don't trust
17:11:33 <jonatack> or in your bitcoin.conf file with addnode=<peer> ... one per line
17:12:13 <jonatack> So: to ensure a connection to a *trusted* peer
17:12:30 <jonatack> What is a fundamental protection that a trusted peer can provide to your node?
17:12:50 <jonatack> assuming they are an honest peer becaue you know/trust them
17:13:23 <stringintech> it can help prevent the node from being isolated from the rest of the network (not having the knowledge of the best chain anymore) by malicious peers
17:13:48 <yuvic> yes to sync with the best chain
17:13:49 <monlovesmango> would probably provide protection against a sybil attack
17:13:56 <enochazariah> protection from isolation
17:14:01 <jonatack> right!
17:14:03 <jonatack> https://river.com/learn/terms/e/eclipse-attack/
17:14:16 <jonatack> "An eclipse attack targets particular nodes in a network by surrounding them and obscuring their view of the entire network. For example, if a Bitcoin node has eight connections to other nodes, and an attacker controls all eight of those nodes, the attacker can refuse to relay any new blocks that miners produce. Although the rest of the network continues to process new blocks, the
17:14:17 <jonatack> victim node will be unaware that blocks are coming in."
17:14:47 <jonatack> It only takes one single honest peer connection to break out of an eclipe attack
17:14:51 <monlovesmango> ah so there is a term for it hah
17:15:12 <jonatack> yes
17:15:21 <yuvic> yes eclipse attack
17:15:43 <jonatack> So, by adding a trusted peer, your node cannot be successfully eclipsed unless your addnode peer connections are also eclipsed
17:17:01 <stringintech> then if we are choosing more than one addnode peer, they should be also geographically diverse so that for example if one region is compromised, our node can still maintain connections to rest of the network...
17:17:04 <jonatack> Therefore, it's a good idea to add some trusted peers using the addnode config option or rpc/cli
17:17:22 <jonatack> stringintech: sgtm
17:17:39 <jonatack> You can have up to 8 addnode peer connections simultaneously
17:18:00 <jonatack> In addition to the limit of 8 autamatic full outbound conns and 2 block-relay-only ones
17:18:40 <jonatack> (along with transient connections, like feelers or extra block-relay-only conns
17:19:11 <jonatack> or addr_fetch ones)
17:19:31 <jonatack> see src/node/connection_types.h for details
17:19:50 <stringintech> Thanks for the details
17:19:57 <jonatack> Now, onto the code
17:20:02 <jonatack> What function contains the Bitcoin Core stalling and timeout logic?
17:20:33 <stringintech> PeerManagerImpl::SendMessages()
17:20:34 <jonatack> this wasn't a question to prepare in advance, but you can search the codebase right now if you like
17:20:35 <enochazariah> SendMessages
17:20:47 <enochazariah> in PeerManagerImpl
17:20:47 <jonatack> excellent
17:20:49 <yuvic> SendMessages
17:21:01 <jonatack> From where is SendMessages() called?
17:21:06 <jonatack> (who it its caller)
17:21:26 <stringintech> connection manager
17:22:02 <jonatack> stringintech: can you elaborate?
17:22:58 <jonatack> it is called from a thread, currently inside src/net.cpp
17:23:15 <stringintech> I should go back for source for detail
17:23:15 <stringintech> but I guess we would loop over peers
17:23:16 <stringintech> can call this after ProcessMessages
17:23:24 <enochazariah> Not sure, but i think THe ThreadMessageHandler is the method that calls the SendMessages
17:23:32 <jonatack> correct for both
17:23:35 <enochazariah> *SendMessage
17:23:41 <jonatack> void CConnman::ThreadMessageHandler()
17:24:46 <jonatack> that calls ProcessMessages() and then SendMessages() for each peer, if not flagged for disconnection
17:25:01 <jonatack> Now, on to the PR
17:25:53 <jonatack> I was pleasantly surprised by pinheadz's review here https://github.com/bitcoin/bitcoin/pull/32051#pullrequestreview-2864676242
17:26:16 <jonatack> "I believe this also closes an 11-year-old issue: https://github.com/bitcoin/bitcoin/issues/5097"
17:27:10 <yuvic> yes that was interesting
17:27:21 <jonatack> and also:
17:27:23 <jonatack> https://github.com/bitcoin/bitcoin/issues/9213
17:27:35 <jonatack> that I need to read and look into TBH
17:27:54 <stringintech> Nice!
17:28:40 <jonatack> (as well as the review comments in https://github.com/bitcoin/bitcoin/pull/25880)
17:29:14 <jonatack> Back to this PR
17:29:33 <jonatack> The PR as it is currently, is actually not well-named
17:29:37 <jonatack> because only the first commit affects the IBD issues that I observed
17:29:42 <jonatack> that first commit gives addnode peers more time. apart from that, it doesn't protect them from disconnection.
17:30:37 <stringintech> I had a question: Is the PR intentionally focusing on “-addnode” peers and not “-connect” peers?
17:30:53 <jonatack> I'd be curious to hear if you have any thoughts or suggestions on the changes
17:31:22 <jonatack> stringintech: good question. Yes, I was focusing on addnode peers.
17:31:44 <monlovesmango> have you been able to test disconnection frequency after increasing timeout allowed?
17:31:46 <jonatack> Making a note to verify the effect on a -connect peer.
17:32:39 <jonatack> monlovesmango: yes, I saw much fewer disconnections with the ping times of my peers with my internet speed
17:33:05 <jonatack> from 100 or more per minute to a few an hour
17:33:14 <monlovesmango> nice
17:33:40 <jonatack> it still took more than a month for that node to sync...
17:34:12 <yuvic> I had a similar question as mzumsande's comment on the pr -> https://github.com/bitcoin/bitcoin/pull/32051/commits/3463a7f4813c3eece5ba9a260670a76e3f8d38ab#r1999313868
17:34:28 <jonatack> like 5 weeks, but all those addnode disconnections were not helping, as those peers were being re-connected right away again afterward
17:34:42 <monlovesmango> if it isn't actually protect them from disconnection then I would say I concept ACK and approach ACK. but if it is protecting from disconnection then I think there needs to be more thought into what desired IBD behavior would be
17:34:55 <monlovesmango> jonatack: omg hahah
17:36:52 <jonatack> yuvic: yes, the high frequency of disconnections I was seeing during IBD were not of that logic
17:37:21 <jonatack> those are disconnections that occur infrequently (for me) after IBD is completed
17:38:05 <yuvic> got it!
17:38:28 <jonatack> I need to consider whether to potentially drop the second (and maybe third) commit to keep it focused on IBD only, when the high number of disconnections I was seeing took place
17:39:39 <jonatack> or, alternatively, try to implement his review suggestion to clear the block requests
17:39:43 <enochazariah> I've got a bit of a question
17:39:44 <enochazariah> does this not raise up a silent stall? i mean, if the system does not have the inhererent mechanism to re-request that block from another peer, then the IBD could effectively stall, preventing a disconnection, but introducing a silent stall
17:41:32 <jonatack> enochazariah: I need to look at that (test coverage could be useful, as well, but maybe non-trivial to do, and only if it is reliable)
17:42:07 <enochazariah> okay
17:42:18 <jonatack> The review comments turned up valuable history on this that I need to review.
17:43:18 <jonatack> I didn't necessarily expect the PR to be merged quickly as-is, but hoped to gain insight as to what would be best.
17:44:16 <jonatack> A long-term goal since years that comes up now and then in developer discussions
17:44:47 <jonatack> is how to score peers based on the resources they consume
17:45:34 <jonatack> see, for instance: https://github.com/bitcoin/bitcoin/pull/31672
17:45:39 <yuvic> there was an issue for this
17:45:46 <yuvic> yes by vasild
17:46:08 <jonatack> yuvic: yes. how to measure this an open question.
17:47:03 <jonatack> i'm not sure how useful the cpu load is, as I have been testing it, and the load seems to often be higher when the peer is first connected, and then go down to normal levels
17:47:40 <jonatack> and not necessarily indicate a bad peer
17:47:48 <jonatack> (to be continued)
17:48:32 <jonatack> i haven't seen it yet as useful -- on its own -- to qualify a peer to be disconnected
17:49:46 <jonatack> this perhaps connects to aj towns' review suggestion about scoring peers and using that to scale the number of blocks we request from them
17:50:54 <jonatack> If anyone comes up with test coverage for the stalling or timeout logic here, I'd be very happy to look at it and bring it into the PR
17:51:00 <enochazariah> scoring them and using as an order, so a much higher score would mean higher chances of being requested
17:51:08 <jonatack> right now, the changes in this PR do not break any tests...
17:52:04 <stringintech> Regarding the timeout logics this PR touches, I could only find p2p_ibd_stalling.py, which covers the IBD block stalling timeout (and should possibly be adapted to reflect the addnode changes). I didn't find any integration tests for the initial header sync timeout and regular block download timeout. Am I right??
17:52:07 <yuvic> yes, test coverage would be interesting
17:53:31 <jonatack> stringintech: neat, maybe addnode connections could be added to that functional test file
17:54:44 <jonatack> stringintech: as for the header sync and block download, I have not yet looked
17:55:12 <stringintech> Hmm... I'd be happy to work on the missing ones (regardless of the PR changes) and open a PR (in case it is out of the scope for this PR of course). If it merges first, the addnode PR could adapt them accordingly too.
17:55:40 <stringintech> Have to double check to see if they are actually missing
17:55:47 <jonatack> stringintech: that would be great!
17:56:36 <jonatack> please ping me (on the github PR, or via IRC or DM on twitter/x) if you come up with coverage
17:56:41 <enochazariah> stringintech that would be nice
17:57:43 <jonatack> any final thoughts or questions?
17:57:49 <jonatack> 2 minutes left
17:58:16 <yuvic> thanks, nothing from my side!
17:58:28 <enochazariah> Nothing from my end
17:58:42 <jonatack> Appreciate you all participating!
17:59:16 <jonatack> Don't hesitate to leave a review comment or feedback on that PR or propose test coverage or improvements to it there
17:59:27 <enochazariah> Thank you jonatack
17:59:33 <stringintech> Thank you jonatack!
17:59:45 <jonatack> Thank you!
17:59:47 <jonatack> #endmeeting