TL;DR: My host, https://fly.io, is no longer offering dedicated IPv4 for free. Setting up IPv6 is challenging. I should have read the error message more carefully and, in the end, just used something up-to-date! BTW, did you know about domain fronting?
Introduction
My host, fly.io, is no longer giving dedicated IPv4 for free. I thought, why not make my blog IPv6 only? It’s my way of advocating for an IPv6 internet. There are some known issues between GitHub and IPv6. I’m therefore aware that it might not be easy, but I’m not GitHub, right? It’s 2024, how hard could it be to serve my website over IPv6?
The plan
In my DNS, I have both IPv4 and IPv6 records (A for IPv4 and AAAA for IPv6). I hope I set up everything correctly when I started blogging. I’ll simply delete the A record, then I’ll disable the IPv4 interface on my VM and release the IP. Simply checking the connection after deleting the A record is a good start!
Unfortunately, it doesn’t go as expected. Let’s recap what we know about the web, and we’ll try to debug that.
A recap of the web
Okay, we know how to connect to a server. We use a domain name, ask a DNS to resolve the domain name to an actual IP address and then we connect to the IP address and retrieve our content, as depicted below.
Therefore, if we want to serve the content over IPv6, we just have to delete the A record pointing to my VM and then disable the IPv4 interface on my VM. Like we did… but it didn’t work.
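To make the lookup step concrete, here is a minimal Python sketch, not something the blog itself runs. It asks the system resolver for one address family at a time, which is roughly what a client gets back for A and AAAA records. The helper name resolve and the port 443 are assumptions made for the example.

```python
import socket


def resolve(host: str, family: int) -> list[str]:
    """Return the addresses the system resolver reports for one address family.

    socket.AF_INET roughly corresponds to A records, socket.AF_INET6 to AAAA
    records. An empty list means nothing was found for that family.
    """
    try:
        infos = socket.getaddrinfo(host, 443, family=family, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve with a domain and socket.AF_INET after deleting the A record should eventually return an empty list, while the socket.AF_INET6 call keeps returning the AAAA value. Resolver caches may delay that.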
Gathering information
Let’s quickly grab the IPv6 address again and try to connect over IPv6, then we can disconnect IPv4.
|
|
Mmm… wrong command.
|
|
As a reminder, -a stands for app. Okay, we’ve got the IP; now we can try to connect to it. But wait, I might need horizontal scaling for when this blog becomes popular. You know what? I’ll quickly set up horizontal scaling. It’s only one command.
|
|
And now, we should have two machines: one located in Paris (aka cdg, for Charles de Gaulle) and the other in Frankfurt. Yes, I know it’s a bit dumb to have two machines so close to each other, but I’m just testing things out. Anyway, let’s look at the info again.
|
|
Okay… wait a minute, there are two IPs!
It makes sense because I have two machines. How am I supposed to update the DNS record now? Should I create two AAAA records?
Fuck, those don’t even correspond to the ones I set up in my DNS’s AAAA record. The value stored in my DNS is 2a09:8280:1::24:ff79. Where did it come from?
Enough CLI for now, what does the web interface say?
The displayed IPv6 address matches the one in my DNS, leading to an interesting situation: I have three IPs for two machines. Hold on a moment, let’s take a closer look: fdaa... It seems unusual to have two IPs sharing the same prefix. Is it a private IP? Private IPv4 addresses are easy to recognize, they look like 192.168.x.x or 10.x.x.x, but IPv6! I’ve long forgotten what I learned in the network course. Luckily, the Fly documentation is good!
|
|
Yup, it’s a private IP, not routed on the internet. Fly might want to say private IP and not just IP. You really need to have worked with IPv6 to recognize them easily. 2a09:8280:1::24:ff79 is the anycast address. By connecting to it, the client will reach the closest node (either the machine in Paris or the one in Frankfurt). Here, closest is defined by the routing protocol, in this case BGP.
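For anyone who, like me, cannot spot a private IPv6 address by eye, Python’s standard ipaddress module can do the classification. This is a minimal sketch: the helper name classify is made up, and fdaa:0:1::3 in the usage note is a hypothetical address that only stands in for the truncated fdaa... ones.

```python
import ipaddress

# Unique local addresses (RFC 4193): the IPv6 counterpart of 10.0.0.0/8 and friends.
ULA = ipaddress.ip_network("fc00::/7")


def classify(ip: str) -> str:
    """Say whether an address is private or publicly routable."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6 and addr in ULA:
        return "private (IPv6 unique local, fc00::/7)"
    if addr.is_private:
        return "private"
    return "public" if addr.is_global else "special-purpose"
```

With this helper, classify("2a09:8280:1::24:ff79") reports the anycast address as public, while an address starting with fdaa, such as the hypothetical fdaa:0:1::3, falls inside fc00::/7 and is reported as private.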
Connecting through IPv6
OK, we have the IPv6 address: 2a09:8280:1::24:ff79. Let’s attempt to connect to it, and then we can simply forget about the private IPs, the DNS and the IPv4 stack. We can force curl to use IPv6 by using the --ipv6 flag, which is more readable than its shorthand -6. We also need to enclose the IP in brackets and wrap the URL in single quotes ('') to avoid shell expansion.
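The brackets are part of URL syntax, not a curl quirk: without them, the colons in the address would be ambiguous with the port separator. A minimal Python sketch of the rule follows. The helper name url_for_ip is an assumption, and 192.0.2.10 in the usage note is a documentation-only IPv4 address.

```python
import ipaddress
from urllib.parse import urlsplit


def url_for_ip(ip: str, scheme: str = "https", path: str = "/") -> str:
    """Build a URL for a literal IP address, bracketing it when it is IPv6."""
    addr = ipaddress.ip_address(ip)
    host = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"{scheme}://{host}{path}"


def host_of(url: str) -> str:
    """Return the host part of a URL, with any IPv6 brackets removed."""
    return urlsplit(url).hostname or ""
```

url_for_ip("2a09:8280:1::24:ff79") gives https://[2a09:8280:1::24:ff79]/, and host_of on that URL returns the bare address again. An IPv4 address such as 192.0.2.10 is left without brackets.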
|
|
Mmm, bad news! Let’s try from a VM in the Oracle cloud located in Zürich.
|
|
A different error message is an improvement, right? The good news is that I recognize this error; the bad news is that network unreachable is painful to solve. Let’s go step by step and start with the basics by trying to ping it from my computer. (I’m using ping6 to force the use of IPv6.)
|
|
It works!
|
|
Until it doesn’t.
Recap
My computer is able to ping it but is unable to retrieve the content, whereas the VM in the cloud can’t even ping it. The VM is located in Zürich, Switzerland, while I’m in Geneva, Switzerland.
Hypothesis
- The static server I’m using does not listen on IPv6.
- The IPv6 address is not reachable from the internet, as suggested by the error codes, even though it’s not a private IP. It’s odd that my PC is able to ping it, while the VM cannot.
Try #1
I’ll test hypothesis number 1 by setting up everything on my cloud’s VM and using curl to check if the server is listening on IPv6.
|
|
As you can see, the Dockerfile is short, serving the files using a static binary built with Go. Now, let’s fire up Docker on the VM.
|
|
Yeah, yeah, I know, I should not use docker as root, but I’m in a hurry and don’t remember any podman commands. Everything seems to be running fine. Let’s check if it’s listening on IPv6.
|
|
For completeness’ sake, I also tried to connect using IPv4.
Just to be sure, let’s check the source code of the static server.
|
|
According to the net documentation and a random tutorial on the internet, the http.ListenAndServe function listens on both IPv4 and IPv6 when it is given an address without a host part, here :8043.
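The same behaviour can be reproduced outside Go. The sketch below is a Python analogue, not goStatic’s code: one listening socket that accepts both IPv4 and IPv6 clients when the platform supports dual-stack sockets, with an IPv4-only fallback otherwise. The function name open_listener is an assumption; the default port 8043 is the one used in this post.

```python
import socket


def open_listener(port: int = 8043) -> socket.socket:
    """Listen on all interfaces, over IPv4 and IPv6 when the platform allows it."""
    if socket.has_dualstack_ipv6():
        # One AF_INET6 socket with IPV6_V6ONLY disabled also accepts IPv4 clients,
        # which then show up as IPv4-mapped addresses (::ffff:a.b.c.d).
        return socket.create_server(
            ("", port), family=socket.AF_INET6, dualstack_ipv6=True
        )
    # Fallback for hosts without dual-stack support: IPv4 only.
    return socket.create_server(("", port))
```

A server built this way answers on both stacks, so a failure over IPv6 only would have to come from somewhere else, such as the network path or a layer above TCP.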
I would have preferred if hypothesis one were true, so I could simply change my web server.
You thought it couldn’t get any worse?
Let’s try connecting to the Fly machine over SSH and listing the interfaces. I’m not sure what I’m doing, but hey, at this point I’m just trying things out.
|
|
I want to cry. What are the logs saying?
|
|
This is useful! A quick DuckDuckGo search tells me that for SSH to work, I need a working setup, i.e. a valid filesystem (/etc/passwd must exist), and the user I’m connecting as must also exist on the system. However, the Dockerfile uses the scratch image as a base, and scratch has an empty filesystem. Therefore, I can’t SSH into the Fly machine.
Try #2
My plan is simple: I’ll ask a friend for help, because I’m lost! I’m stuck and clueless about how to verify hypothesis 2. So I reached out to @ODAncona to be my rubber duck. I explained what I had done, and as we went through the steps, lo and behold! He can ping the IPv6 address. Our computers can ping the server, but the Oracle VM cannot. This leads me to suspect that some cloud shenanigans are at play, and that I should not trust the output from the Oracle VM. In the end, it’s still not working, but at least I’m now confident that the Fly machine is reachable from the internet via its IPv6 address. The next step is to attempt retrieving content over HTTP from ODAncona’s PC.
|
|
He is also unable to retrieve the content. But… I might have messed up while interpreting the error messages:
He encounters a TLS error, whereas I face an SSL_ERROR_SYSCALL error. The no route to host is only happening on the Zürich VM. I overlooked the VM’s failure to connect to the server! I’ve been going in the wrong direction!
Let’s ignore the VM for now.
We’re both dealing with the same problem: curl: (35) ... It’s on the encryption layer!
A new hope hypothesis
The goStatic README (goStatic is the web server I’m using) states that they’ve stopped supporting TLS and recommend switching to Caddy instead. Given that our errors are tied to TLS/SSL, I’ve decided to give Caddy a shot. I know that I’m currently only using HTTP and that Fly manages the TLS connection, but you know, at this point, any potential solution is worth exploring.
Changing the docker image
I will try to solve every problem at once. It sounds risky, I’m aware. But hear me out. By switching from goStatic to Caddy, I can use an image based on Alpine Linux instead of scratch, which could solve both my problems: the SSH connection and the TLS problem (I hope).
|
|
Now that the Dockerfile uses Caddy based on Alpine, SSH should work. I need Caddy to listen on the same port as goStatic: 8043. I can set this in the Caddyfile. That way, I’m not changing the Fly app settings: they worked, and I don’t want to introduce new problems.
|
|
This simply tells Caddy to listen on port 8043 and to serve the files located in /usr/share/caddy. BUT, if a request comes in for /health_check, it should respond with OK.
Let’s try it:
|
|
First I’m building an image based on the current Dockerfile and naming this image jsch-caddy, then I’m running a container based on the jsch-caddy image and mapping port 8043 of the container to port 8043 of the host.
I think I don’t need to enable TLS on Caddy, as TLS is terminated by fly.io, meaning the connection between the client and the Fly edge is encrypted, but the connection between the Fly edge and my Fly machine is not. The path of the request is illustrated below.
The Fly app is left untouched; we can confirm that it already handles TLS and forces HTTPS.
|
|
And now, if I try to deploy, it should work.
|
|
Ok, it’s deployed and the website is up! Let’s try to connect using SSH.
|
|
Ok, one problem solved, now let’s inspect if it is still accessible!
The page also shows in my browser! It’s a great start, but I have no way to check if I’m connecting over IPv6 or IPv4 using the browser. I’ll use curl to check if I can connect to the website over IPv6.
|
|
Ok, same error as before. What about IPv4?
|
|
What… same error, but it works in my browser! Did I ever try to connect to IPv4 using curl? Am I using curl the wrong way? The website showed in my browser, was it a cache? I’m dumb, IPv4 addresses are shared! Obviously, I cannot connect to a specific website using the IP address. Fly doesn’t know which server it’s supposed to reach! I need to specify the domain name.
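To illustrate why a shared IP address is not enough, here is a small self-contained Python sketch of name-based routing. It is a toy that runs on localhost, not Fly’s actual proxy. The site names blog.example and shop.example and the helpers start_server and fetch are all made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical sites sharing one IP address (names are made up for the example).
SITES = {
    "blog.example": b"my blog",
    "shop.example": b"another site",
}


class VirtualHostHandler(BaseHTTPRequestHandler):
    """Pick the site to serve from the Host header, like a shared front end does."""

    def do_GET(self):
        # Drop an optional ":port" suffix from the Host header.
        host = (self.headers.get("Host") or "").split(":")[0]
        body = SITES.get(host)
        if body is None:
            # A bare IP address in the Host header matches no site.
            self.send_error(404, "unknown host")
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_server() -> ThreadingHTTPServer:
    """Start the demo server on a free local port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), VirtualHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port: int, host_header=None):
    """GET / from 127.0.0.1, optionally overriding the Host header."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    headers = {"Host": host_header} if host_header else {}
    conn.request("GET", "/", headers=headers)
    response = conn.getresponse()
    result = (response.status, response.read())
    conn.close()
    return result
```

Requesting the server by its IP alone yields a 404, because the Host header then carries the IP and matches no site. The same request with a Host header of blog.example returns the blog.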
|
|
It’s always in the TLS session. Does it work over HTTP?
|
|
Ahhah, HTTP over IPv4 is working, we got a redirect! What about IPv6?
|
|
HTTP over IPv6 is working too! For both IPv4 and IPv6 I got a 301 Moved Permanently, which is expected as the Fly app forces HTTPS.
The problem is only with TLS over IPv6.
Years of reading blogs come to the rescue
I remember reading an article about domain fronting years ago. The concept is easy: an HTTPS connection is made of multiple layers.
|
|
The important thing is: there are two indicators of the server you want to contact:
- The Host header in the HTTP request
- The SNI (Server Name Indication) field in the TLS handshake
The HTTP request is passed through the TLS tunnel, therefore it’s encrypted and Fly cannot use that to route the request to the correct machine. The SNI is not encrypted and can be used to route the request to the correct VM.
Even if I’m setting the Host header, Fly cannot see it, it’s encrypted!
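That the server name travels in clear text can be checked without any network access. The Python sketch below starts a TLS handshake in memory and captures the first bytes a client would send (the ClientHello). The function name client_hello_bytes and the hostname blog.example are assumptions for the example.

```python
import ssl


def client_hello_bytes(server_name: str) -> bytes:
    """Return the raw ClientHello a TLS client would send for server_name."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes from the server (none in this sketch)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits
        # for a server reply that never comes.
        pass
    return outgoing.read()


hello = client_hello_bytes("blog.example")
# The requested name is readable as-is inside the unencrypted ClientHello.
print(b"blog.example" in hello)
```

The Host header never shows up in these bytes: it is only sent after the handshake, inside the encrypted stream, which is why a front end that has not yet terminated TLS can only rely on the SNI.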
I need to set the SNI to my domain name. We do it using --connect-to: the URL keeps the domain name, so curl uses it for the SNI, while the connection itself goes to the IPv6 address. This is not intuitive as IPv6 addresses are not shared (unlike IPv4), therefore Fly should not need the SNI to route the request. The IPv6 address is already a unique server indicator. But let’s try it, you know, for completeness’ sake.
|
|
It’s working! Now, for the real test. I’ll release the IPv4 address and delete the A record in my DNS.
|
|
I deleted the A record and everything works fine! Yeaa, let’s hope it’s not due to a cache somewhere.
Confirming the problem
I messed up with curl, I should have used --connect-to from the beginning. I want to confirm that the old image was indeed the problem. Let’s quickly revert the change to the Dockerfile and launch the old version of the app. Then I can use curl with the correct --connect-to option and see if it’s working.
|
|
Now we redeploy the previous image that uses goStatic. There is still no IPv4 address associated with the VM and no A record in the DNS.
For completeness’ sake, I’ll use curl with the --connect-to option to confirm that the Docker image was the problem.
|
|
We clearly get the same error. This means that the configuration was correct and that the problem was coming from the goStatic binary.
The remaining problem
We haven’t addressed a huge problem. Someone with only IPv4 connectivity is unable to access our website. They will just get a generic error saying “the domain is not reachable”. We’d like to serve IPv4 users a page telling them to change ISP, but blocking users from reading my blog would be counterproductive. Hopefully, by reading this blog post, they’ll realise that they need to change ISP.
Conclusion
It was harder than I thought, but we made it! Furthermore, we set up horizontal scaling, which is cool! I’m still not sure why the goStatic binary is not able to work properly over IPv6 and TLS. Nonetheless, the new Dockerfile and Caddy should stay up to date for a few years. That’s the advantage of using a well-known project.
This little experiment let me refresh my knowledge of private IPs, Docker, and obviously domain fronting, which is a really cool subject. I hope you learned something too.