resolv.conf on macOS sometimes has an interface suffix which breaks resolv.rb. #35

Open
ioquatix opened this issue Apr 25, 2023 · 13 comments

@ioquatix
Member

ioquatix commented Apr 25, 2023

@ioquatix I just ran into this the other day and tried running Resolv.getaddress in bin/rails c within the same dev environment as @trevorturk.

Loading development environment (Rails 7.0.4.3)
irb(main):001:0> Resolv.getaddress "google.com"
=> "74.125.136.100"

I can also help with repros if needed.

I put a binding.irb in the resolv.rb exception site and got this:

irb(#<Resolv::DNS::Requester::UnconnectedUDP:0x000000010685e970>):004:0> Addrinfo.ip(host).ip_address
/Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:4:in `ip': getaddrinfo: nodename nor servname provided, or not known (SocketError)
	from /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:4:in `sender'
	from <internal:prelude>:5:in `irb'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:770:in `sender'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:527:in `block in fetch_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1126:in `block (3 levels) in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1124:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1124:in `block (2 levels) in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1123:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1123:in `block in resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1121:in `each'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:1121:in `resolv'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:521:in `fetch_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:507:in `each_resource'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:402:in `each_address'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:116:in `block in each_address'
	from 3.2.2/lib/ruby/3.2.0/resolv.rb:115:in `each'
	... 30 levels...
irb(#<Resolv::DNS::Requester::UnconnectedUDP:0x000000010685e970>):005:0> host
=> "fe80::887:c7ff:fe62:d64%en0"

I have a feeling that the %en0 suffix (a network interface?!) on the IPv6 nameserver address is what breaks the lookup. Removing it works fine.

Addrinfo.ip("fe80::887:c7ff:fe62:d64").ip_address
=> "fe80::887:c7ff:fe62:d64"

Edit: I can also confirm that this %en0 seemingly gets added when I'm tethering to an AT&T device from macOS since it is listed as such in resolv.conf:

cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
nameserver fe80::887:c7ff:fe62:d64%en0
nameserver 172.20.10.1
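
To make the suffix concrete, here is a minimal sketch of pulling the nameserver entries out of resolv.conf-style text and separating the address from the zone (the part after the %). The helper name parse_nameservers is hypothetical and is not part of resolv.rb; it only illustrates the shape of the data being discussed.

```ruby
# Hypothetical helper, not part of resolv.rb: extract "nameserver" entries
# from resolv.conf-style text and split off any "%zone" suffix.
def parse_nameservers(text)
  text.each_line.filter_map do |line|
    match = line.match(/\Anameserver\s+(\S+)/)
    next unless match # skip comments, "search" lines, and blanks

    address, _separator, zone = match[1].partition("%")
    { address: address, zone: zone.empty? ? nil : zone }
  end
end

sample = <<~CONF
  # This file is automatically generated.
  nameserver fe80::887:c7ff:fe62:d64%en0
  nameserver 172.20.10.1
CONF

# Prints one hash per nameserver; the first carries zone "en0",
# the second has no zone.
parse_nameservers(sample).each { |entry| p entry }
```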

But this isn't shown in the macOS networking settings if you look at DNS servers:

[image: screenshot of the DNS servers list in the macOS network settings]

AFAIK en0 in macOS parlance is the identifier for the Wi-Fi network interface, as this shows:

$ networksetup -listallhardwareports | grep -C 2 en0

Hardware Port: Wi-Fi
Device: en0
Ethernet Address: f0:2f:5b:01:23:b8

It seems like Resolv is choking on this identifier when it likely should be entirely ignored. Might have to file a Ruby bug report for this.

Originally posted by @olivierlacan in socketry/async-http#107 (comment)

@ioquatix
Member Author

I suspect there are two solutions possible:

  1. resolv.rb should ignore %interface suffix from resolv.conf
  2. Addrinfo.ip("...%interface") should ignore the suffix.

(2) feels more general.
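
As a rough illustration of what "ignoring the suffix" could look like at a call site, here is a minimal sketch that tries the address as given and retries without the %zone part only if the lookup raises. The method name ip_address_with_zone_fallback and its lookup: parameter are hypothetical; neither resolv.rb nor Addrinfo provides them. One caveat: a link-local address without its zone can be ambiguous on a host with several interfaces, so this is a workaround sketch rather than a proposed fix.

```ruby
require "socket"

# Hypothetical helper, not part of resolv.rb or Addrinfo.
# Tries the address as given; on SocketError, retries once without "%zone".
# The lookup is injectable so the fallback logic can be exercised in isolation.
def ip_address_with_zone_fallback(host, lookup: ->(h) { Addrinfo.ip(h).ip_address })
  lookup.call(host)
rescue SocketError
  bare, separator, _zone = host.partition("%")
  raise if separator.empty? # nothing to strip, so re-raise the original error

  lookup.call(bare)
end

# Example call using the default Addrinfo-based lookup (result depends on the OS):
#   ip_address_with_zone_fallback("fe80::887:c7ff:fe62:d64%en0")
```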

@ioquatix ioquatix self-assigned this Apr 25, 2023
@ioquatix
Member Author

@trevorturk @olivierlacan what versions of Ruby are you using?

@ioquatix
Member Author

I tested this on my Linux desktop.

My valid network interface:

2: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 58:11:22:be:55:02 brd ff:ff:ff:ff:ff:ff
    altname enp10s0
    inet 192.168.1.41/24 metric 1024 brd 192.168.1.255 scope global dynamic eno2
       valid_lft 563sec preferred_lft 563sec
    inet6 2406:e000:6833:a800:5a11:22ff:febe:5502/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 2591931sec preferred_lft 604731sec
    inet6 fe80::5a11:22ff:febe:5502/64 scope link 
       valid_lft forever preferred_lft forever

The results:

samuel@aiko ~/P/k/protocol-quic (main)> ruby -v -rsocket -e 'p Addrinfo.ip("fe80::887:c7ff:fe62:d64%eno2").ip_address'
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
"fe80::887:c7ff:fe62:d64%eno2"
samuel@aiko ~/P/k/protocol-quic (main)> ruby -v -rsocket -e 'p Addrinfo.ip("fe80::887:c7ff:fe62:d64%eno123").ip_address'
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
-e:1:in `ip': getaddrinfo: Name or service not known (SocketError)
	from -e:1:in `<main>'

It looks like the interface name, at least on Linux, must be correct.
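
One way to check this from Ruby, rather than by reading the interface listing, is to compare the zone against the names reported by Socket.getifaddrs. The sketch below assumes the zone is an interface name (a numeric zone index such as %14 is not handled), and the method name zone_interface_known? is hypothetical.

```ruby
require "socket"

# Hypothetical helper: reports whether the "%zone" suffix of an address names
# an interface currently known to the OS. Addresses without a zone pass.
# Assumption: the zone is an interface name, not a numeric index.
def zone_interface_known?(address)
  _bare, separator, zone = address.partition("%")
  return true if separator.empty?

  Socket.getifaddrs.any? { |ifaddr| ifaddr.name == zone }
end

# The answer depends on the machine this runs on.
puts zone_interface_known?("fe80::887:c7ff:fe62:d64%eno2")
```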

@ioquatix
Member Author

ioquatix commented Apr 26, 2023

On a Darwin based system, you can get the list of interfaces using ipconfig -a.

On my system, en0 is a valid interface.

samuel@sakura ~/D/k/protocol-quic (main)> ruby -v -rsocket -e 'p Addrinfo.ip("fe80::887:c7ff:fe62:d64%en0").ip_address'
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
"fe80::887:c7ff:fe62:d64%en0"
samuel@sakura ~/D/k/protocol-quic (main)> ruby -v -rsocket -e 'p Addrinfo.ip("fe80::887:c7ff:fe62:d64%en12345").ip_address'
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
"fe80::887:c7ff:fe62:d64"

So it does seem to work correctly. But unlike Linux, it also works even if the interface is invalid. The question is, why is this not working in resolv.rb?

@olivierlacan

olivierlacan commented Apr 26, 2023

@ioquatix Reproduced this issue on 3.2.2.

@ioquatix
Member Author

@olivierlacan it seems like the problem is not just with Ruby, but something to do with the OS.

At the time you do it, is the interface listed in ipconfig -a?

@trevorturk

trevorturk commented Apr 26, 2023

I'm sorry to say that since switching from AT&T to Verizon the issue hasn't happened to me again, and I can't seem to reproduce now! (I guess that's good news in a way, but I'm sorry I can't reproduce...)

Here's the output from my terminal:

$ networksetup -listallhardwareports | grep -C 2 en0

Hardware Port: Wi-Fi
Device: en0
Ethernet Address: f0:2f:4b:06:b0:1e

...but the ipconfig -a command seems different for me and I'm not sure what I should be running:

ipconfig -a
usage: ipconfig <command> <args>
where <command> is one of waitall, getifaddr, ifcount, getoption, getiflist, getsummary, getpacket, getv6packet, getra, getdhcpduid, getdhcpiaid, set, setverbose

@olivierlacan

olivierlacan commented Apr 26, 2023

@trevorturk Looks like ifconfig is the command on macOS, at least that's the one I remember using in the past:

At the time you do it, is the interface listed in ipconfig -a?

Tethering from macOS to iOS connected to AT&T, whose name servers are: fe80::887:c7ff:fe62:d64%en0 and 172.20.10.1 (in /etc/resolv.conf):

$ ifconfig -a | grep -C 1 en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6463<RXCSUM,TXCSUM,TSO4,TSO6,CHANNEL_IO,PARTIAL_CSUM,ZEROINVERT_CSUM>
	ether f0:2f:4b:01:23:b6
	inet6 fe80::1090:9f9d:2ebe:2d34%en0 prefixlen 64 secured scopeid 0xe
	inet 172.20.10.12 netmask 0xfffffff0 broadcast 172.20.10.15

For comparison's sake this is what a regular coffee shop Wi-Fi connection yields:

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6463<RXCSUM,TXCSUM,TSO4,TSO6,CHANNEL_IO,PARTIAL_CSUM,ZEROINVERT_CSUM>
	ether f0:2f:4b:01:23:b6
	inet6 fe80::1090:9f9d:2ebe:2d34%en0 prefixlen 64 secured scopeid 0xe
	inet 10.255.206.97 netmask 0xfffffc00 broadcast 10.255.207.255

Interestingly, the inet6 entry (IPv6 name server?) from the AT&T tethered connection persists in this output despite being wholly gone from /etc/resolv.conf, so there's likely some caching happening in ifconfig.

$ cat /etc/resolv.conf | grep -C 1 nameserver
search lan
nameserver 10.255.204.1

The ifconfig output persists across nameserver cache refreshes at the OS level with:

sudo dscacheutil -flushcache
sudo killall -HUP mDNSResponder

Those typically do the trick with sticky DNS configs on macOS.

@olivierlacan

olivierlacan commented Apr 26, 2023

Relevant Ruby issues:

I added assert_match(Resolv::IPv6::Regex, "fe80::1090:9f9d:2ebe:2d34%en0", bug17112) to the regression test @jeremyevans added at the time and it passes as expected.
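
For anyone who wants to repeat that check outside the test suite, here is a minimal sketch. It only shows that the textual form is accepted by the regex; it says nothing about whether en0 exists or whether getaddrinfo will accept the address. The method name ipv6_literal? is hypothetical, and the result for the suffixed address may differ on Ruby versions that predate the regression fix mentioned above.

```ruby
require "resolv"

# Hypothetical helper: true when the string is accepted by Resolv's IPv6 regex.
# This is a purely textual check; the interface after "%" is not validated.
def ipv6_literal?(string)
  Resolv::IPv6::Regex.match?(string)
end

puts ipv6_literal?("fe80::1090:9f9d:2ebe:2d34%en0")
puts ipv6_literal?("not-an-address")
```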

I think the issue involves Resolv's use of Addrinfo.ip(host).ip_address to figure out the request sender info in fetch_resource:

fetch_resource gets called by each_resource, and theoretically there's a different path that uses AAAA records for IPv6, but it seems like that breaks down when the name server has a suffix (like the one AT&T is sending me).

In my case I'm definitely using that IPv6 branch:

Socket.ip_address_list.any? {|a| a.ipv6? && !a.ipv6_loopback? && !a.ipv6_linklocal? }
=> true

Crucially, this also returns true within an Async block.

Even more interestingly, if I do this manually using the documentation example in Resolv, which specifically passes an IPv6 (AAAA) resource to getresources, it works fine outside of an Async block but breaks within one:

irb(main):027:1* Resolv::DNS.open do |dns|
irb(main):028:1*    ress = dns.getresources "google.com", Resolv::DNS::Resource::IN::AAAA
irb(main):029:1*    p ress.map(&:address)
irb(main):030:0>  end
[#<Resolv::IPv6 2607:f8b0:4006:821::200e>]
=> [#<Resolv::IPv6 2607:f8b0:4006:821::200e>]
irb(main):031:1* Async {
irb(main):032:2*   Resolv::DNS.open do |dns|
irb(main):033:2*     ress = dns.getresources "google.com", Resolv::DNS::Resource::IN::AAAA
irb(main):034:2*     p ress.map(&:address)
irb(main):035:1*   end
irb(main):036:0> }
   16m     warn: Async::Task [oid=0xa198c] [ec=0xa19a0] [pid=9331] [2023-04-26 14:58:11 -0400]
               | Task may have ended with unhandled exception.
               |   SocketError: getaddrinfo: nodename nor servname provided, or not known
               |   → /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:771 in `ip'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:771 in `sender'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:527 in `block in fetch_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1128 in `block (3 levels) in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1126 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1126 in `block (2 levels) in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1125 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1125 in `block in resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1123 in `each'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:1123 in `resolv'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:521 in `fetch_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:507 in `each_resource'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:498 in `getresources'
               |     (irb):33 in `block (2 levels) in <top (required)>'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/3.2.0/resolv.rb:298 in `open'
               |     (irb):32 in `block in <top (required)>'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:158 in `block in run'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:310 in `block in schedule'
=> #<Async::Task:0x00000000000a198c>

@olivierlacan

It looks to me like this commit introduced the bug I'm encountering, at least in conjunction with Async: 5c16180

Addrinfo does not handle suffixed IPv6 addresses within an Async block:

Async { Addrinfo.ip("fe80::887:c7ff:fe62:d64%en0").ip_address }
               | Task may have ended with unhandled exception.
               |   SocketError: getaddrinfo: nodename nor servname provided, or not known
               |   → (irb):40 in `ip'
               |     (irb):40 in `block in <top (required)>'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:158 in `block in run'
               |     /Users/olivierlacan/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.5.0/lib/async/task.rb:310 in `block in schedule'

Outside of Async, everything is fine:

Addrinfo.ip("fe80::887:c7ff:fe62:d64%en0").ip_address
=> "fe80::887:c7ff:fe62:d64%en0"

@olivierlacan

Helpful context from Mastodon thread:

@olivierlacan oh! that's your link-local IPv6 configuration for SEcure Neighbor Discovery (SEND) on the LAN. That nameserver is just a reference to the same interface (peep the fe80 local prefix and the en0 suffix). As long as inet6 is enabled, it'll generate that, and starting with 10.12 they switched from stable addresses (with ff:fe and your MAC address) to Cryptographically Generated Addresses. https://binblog.de/2017/09/21/ipv6-privacy-stable-addressing-roundup/ has a great summary
blargh, /s/nameserver/address/g

@ioquatix
Member Author

Maybe it's the async resolver mechanism that has problems understanding the address.

cc @bruno-

@bruno-

bruno- commented May 2, 2023

Hi,
While working on the Addrinfo.getaddrinfo scheduler hook, I did not address the interface suffix scenario.

I'm still catching up on this thread.
