I’m serving my blog over IPv6

TL;DR: My host, https://fly.io, no longer offers dedicated IPv4 addresses for free. Setting up IPv6 is challenging. I should have read the error messages more carefully and, in the end, just used something up-to-date! BTW, did you know about domain fronting?

⚠️

This is a published draft, meaning that the article is not finished. I’m just writing it in the open.

Introduction

My host, fly.io, is no longer giving out dedicated IPv4 addresses for free. I thought, why not make my blog IPv6-only? It’s my way of advocating for an IPv6 internet.
There are some known issues between GitHub and IPv6, so I’m aware that it might not be easy. But I’m not GitHub, right? It’s 2024, how hard could it be to serve my website over IPv6?

The plan

In my DNS, I have both an IPv4 record (A) and an IPv6 record (AAAA). I hope I set up everything correctly when I started blogging. I’ll simply delete the A record, then disable the IPv4 interface on my VM and release the IP. Checking the connection after deleting the A record is a good start!

This doesn’t seem to work…


Unfortunately, it doesn’t go as expected. Let’s recap what we know about the web, and we’ll try to debug that.

A recap of the web

Okay, we know how to connect to a server. We use a domain name, ask a DNS server to resolve it to an actual IP address, then connect to that IP address and retrieve our content, as depicted below.

This is the workflow in 4 steps to get a webpage: 1. Ask a DNS server for the IP of the domain, 2. Get the response, 3. Contact the web server at that address, 4. Receive the content

Usual request flow from querying the DNS to getting the webpage

Therefore, if we want to serve the content over IPv6, we just have to delete the A record pointing to my VM and then disable the IPv4 interface on my VM. Like we did… but it didn’t work.

Gathering information

Let’s quickly grab the IPv6 address again and try to connect over IPv6, then we can disconnect IPv4.

zsh
fly status
#Machines
#PROCESS	ID            	VERSION	REGION	STATE  		LAST UPDATED
#app    	5683993b511048	2      	cdg   	started	    2024-02-07T08:14:49Z

Mmm… wrong command.

zsh
fly machine list -a jsch-website
#jsch-website
#ID            	STATE  	REGION	IP ADDRESS                      
#5683993b511048	started	cdg   	fdaa:0:89a7:a7b:aec1:1e72:f6df:2  

As a reminder, -a stands for app. Okay, we’ve got the IP; now we can try to connect to it. But wait, I might need horizontal scaling for when this blog becomes popular. You know what? I’ll quickly set up horizontal scaling. It’s only one command.

zsh
 fly scale count 1 --region fra

And now, we should have two machines: one located in Paris (aka cdg: Charles de Gaulle) and the other in Frankfurt. Yes, I know it’s a bit dumb to have two machines so close to each other, but I’m just testing things out. Anyway, let’s look at the info again.

zsh
fly machine list -a jsch-website
#jsch-website
#ID            	STATE  	REGION	IP ADDRESS                      
#32873d3f641d58	stopped	fra   	fdaa:0:89a7:a7b:cbb7:99d8:6a51:2
#5683993b511048	stopped	cdg   	fdaa:0:89a7:a7b:aec1:1e72:f6df:2

A meme from Star Wars where the text says 'This is getting out of hand. Now there are two of them!' In Star Wars this refers to the two Siths (the enemy), while here it refers to the IPs, making a funny remark

Okay… wait a minute, there are two IPs!

It makes sense because I have two machines. How am I supposed to update the DNS record now? Should I create two AAAA records? Fuck, those don’t even correspond to the one I set up in my DNS’s AAAA record. The value stored in my DNS is 2a09:8280:1::24:ff79. Where did it come from? Enough CLI for now, what does the web interface say?

Weird, we now only have 1 IPv6 address!


The displayed IPv6 address matches the one in my DNS, leading to an interesting situation: I have three IPs for two machines. Hold on a moment, let’s take a closer look: fdaa... It seems unusual to have two IPs sharing the same prefix. Is it a private IP? Private IPv4 addresses are easy to recognize, they start with 192.168. or 10., but IPv6? I’ve long forgotten what I learned in the network course. Thankfully, Fly’s documentation is good!

zsh
fly ips private list
# ID            	REGION	IP
# 32873d3f641d58	fra   	fdaa:0:89a7:a7b:cbb7:99d8:6a51:2
# 5683993b511048	cdg   	fdaa:0:89a7:a7b:aec1:1e72:f6df:2

Yup, it’s a private IP, not routed on the internet. Fly might want to say private IP and not just IP in the machine list. You really need to have worked with IPv6 to recognize them easily: addresses starting with fd belong to the unique local range fc00::/7, the IPv6 counterpart of the private IPv4 ranges. 2a09:8280:1::24:ff79 is the anycast address. By connecting to it, the client will reach the closest node (either the machine in Paris or the one in Frankfurt). Here, closest is defined by the routing protocol, in this case BGP.
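For those who, like me, cannot spot a private IPv6 address by eye, here is a minimal Go sketch that asks the standard library instead. It assumes Go 1.18 or later for the net/netip package; the function name isPrivate is mine and has nothing to do with Fly’s tooling.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPrivate reports whether addr is a private address: RFC 1918 ranges for
// IPv4, or the unique local range fc00::/7 (RFC 4193) for IPv6.
func isPrivate(addr string) (bool, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	return ip.IsPrivate(), nil
}

func main() {
	addrs := []string{
		"fdaa:0:89a7:a7b:aec1:1e72:f6df:2", // machine address from `fly machine list`
		"2a09:8280:1::24:ff79",             // address stored in my AAAA record
		"192.168.1.10",                     // a classic private IPv4, for comparison
	}
	for _, a := range addrs {
		private, err := isPrivate(a)
		if err != nil {
			fmt.Println(a, "-> invalid address:", err)
			continue
		}
		fmt.Println(a, "-> private:", private)
	}
}
```

Note that "not private" is not the same as "reachable from the internet": loopback and link-local addresses are not private in this sense either, so this only answers the question I was asking above.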

Connecting through IPv6

OK, we have the IPv6 address: 2a09:8280:1::24:ff79. Let’s attempt to connect to it, and then we can simply forget about the private IPs, the DNS and the IPv4 stack. We can force curl to use IPv6 with the --ipv6 flag, which is more readable than its shorthand -6. We also need to enclose the IP in brackets and wrap the URL in single quotes so that the shell does not try to expand the brackets.

zsh
curl --verbose --ipv6 'https://[2a09:8280:1::24:ff79]'
#*   Trying [2a09:8280:1::24:ff79]:443...
#* Connected to 2a09:8280:1::24:ff79 (2a09:8280:1::24:ff79) port 443
#* ALPN: curl offers h2,http/1.1
#* (304) (OUT), TLS handshake, Client hello (1):
#*  CAfile: /etc/ssl/cert.pem
#*  CApath: none
#* LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 2a09:8280:1::24:ff79:443
#* Closing connection
#curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 2a09:8280:1::24:ff79:443

Mmm, bad news! Let’s try from a VM in the Oracle cloud located in Zürich.

zsh
ssh oracleCloud
ubuntu@oracleCloud:~$ curl --ipv6 'https://[2a09:8280:1::24:ff79]'
#*   Trying 2a09:8280:1::24:ff79:443...
#*   Immediate connect fail for 2a09:8280:1::24:ff79: Network is unreachable
#*   Closing connection 0
#curl: (7) Couldn't connect to server

A different error message is an improvement, right? The good news is that I recognize this error; the bad news is that Network is unreachable is painful to solve. Let’s go step by step and start with the basics by trying to ping it from my computer. (I’m using ping6 to force the use of IPv6.)

zsh
#From my home computer
ping6 2a09:8280:1::24:ff79

#PING6(56=40+8+8 bytes) 2a02:1210:54c8:2200:35fb:c930:10f1:37af --> 2a09:8280:1::24:ff79
#16 bytes from 2a09:8280:1::24:ff79, icmp_seq=0 hlim=55 time=12.009 ms
#16 bytes from 2a09:8280:1::24:ff79, icmp_seq=1 hlim=55 time=11.397 ms
#^C
#--- 2a09:8280:1::24:ff79 ping6 statistics ---
#2 packets transmitted, 2 packets received, 0.0% packet loss
#round-trip min/avg/max/std-dev = 11.397/11.703/12.009/0.306 ms

It works!

zsh
# From my VM in oracle cloud in Zürich
ping6 2a09:8280:1::24:ff79
ping6: connect: Network is unreachable

Until it doesn’t.

Recap

My computer is able to ping it but is unable to retrieve the content, whereas the VM in the cloud can’t even ping it. The VM is located in Zürich, Switzerland, while I’m in Geneva, Switzerland.

Hypotheses

  1. The static server I’m using does not listen on IPv6.
  2. The IPv6 address is not reachable from the internet, as suggested by the error codes, even though it’s not a private IP. It’s odd that my PC is able to ping it, while the VM cannot.

Try #1

I’ll test hypothesis number 1 by setting up everything on my cloud’s VM and using curl to check if the server is listening on IPv6.

zsh
#On my computer in $JSCH. public/ is the folder storing the html files
rsync -r public/ oracleCloud:~/jsch/public
rsync  Dockerfile oracleCloud:~/jsch/
cat Dockerfile
#FROM pierrezemb/gostatic
#COPY ./public/ /srv/http/

As you can see, the Dockerfile is short, serving the files using a static binary built with Go. Now, let’s fire up Docker on the VM.

zsh
#On the vm in ~/jsch
sudo docker build -t jsch-static .
sudo docker run -d -p 8043:8043 --name jsch-website jsch-static
sudo docker ps
#CONTAINER ID   IMAGE         COMMAND       CREATED         STATUS         PORTS                                       NAMES
#66ec6842fe27   jsch-static   "/goStatic"   5 minutes ago   Up 5 minutes   0.0.0.0:8043->8043/tcp, :::8043->8043/tcp   jsch-website

Yeah, yeah, I know, I should not use docker as root, but I’m in a hurry and don’t remember any podman commands. Everything seems to be running fine. Let’s check if it’s listening on IPv6.

zsh
curl --ipv6 'http://[::1]:8043'
# Tons of HTML, it's working!

curl http://localhost:8043
# Tons of HTML, it's working!

For completeness’ sake, I also tried to connect using IPv4.

Just to be sure, let’s check the source code of the static server:

go
var (
	// Def of flags
	portPtr                  = flag.Int("port", 8043, "The listening port")
    //...
)
//...
port := ":" + strconv.FormatInt(int64(*portPtr), 10)
//...
if err := http.ListenAndServe(port, nil); err != nil && err != http.ErrServerClosed {
    log.Fatal().Err(err).Msg("Server startup failed")
}

According to the net documentation and a random tutorial on the internet, the address :8043 has no host part, so http.ListenAndServe listens on port 8043 on all interfaces, over both IPv4 and IPv6.

😔

I would have preferred hypothesis 1 to be true, so I could simply change my web server.
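To convince myself about that empty host part, here is a small Go sketch. It is not goStatic’s code: listenAddr and hostPart are hypothetical helpers of mine that only mirror how the excerpt above builds its address. The last part opens a throwaway listener on port 0 (any free port) so it can run next to a real server.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listenAddr builds the address the same way the goStatic excerpt does:
// a colon followed by the port, with no host in front.
func listenAddr(port int) string {
	return ":" + strconv.Itoa(port)
}

// hostPart returns the host section of a listen address. It is empty when
// the address names only a port.
func hostPart(addr string) (string, error) {
	host, _, err := net.SplitHostPort(addr)
	return host, err
}

func main() {
	addr := listenAddr(8043)
	host, err := hostPart(addr)
	if err != nil {
		fmt.Println("unexpected address:", err)
		return
	}
	fmt.Printf("address %q has host part %q\n", addr, host)

	// With an empty host, the net package documents that Listen binds all
	// available unicast and anycast addresses of the local system.
	ln, err := net.Listen("tcp", ":0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", ln.Addr())
}
```

According to the same documentation, passing "tcp4" or "tcp6" instead of "tcp" as the network is the way to restrict a listener to a single address family, which is not what goStatic does here.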

You thought it couldn’t get any worse?

Let’s try connecting to the Fly machine over SSH and listing the interfaces. I’m not sure what I’m doing, but hey, at this point I’m just trying things out.

zsh
fly ssh console
#Connecting to fdaa:0:89a7:a7b:aec1:1e72:f6df:2... complete
#Error: error connecting to SSH server: SSH: handshake failed: SSH: unable to authenticate, attempted methods [none publickey], no supported methods remain

I want to cry. What are the logs saying?

zsh
fly logs
#[cdg] [info] user: 'root' does not exist
#[cdg] [info] unexpected error: [SSH: no auth passed yet, user: 'root' does not exist]

This is useful! A quick DuckDuckGo search tells me that for SSH to work, I need a working setup, i.e. a valid filesystem (/etc/passwd must exist), and the user I’m connecting as must also exist on the system. However, the image my Dockerfile builds on uses scratch as its base, and scratch has an empty filesystem. Therefore, I can’t SSH into the Fly machine.

Try #2

My plan is simple: I’ll ask for help from a friend, because I’m lost!
I’m stuck and clueless about how to verify hypothesis 2. So I reached out to @ODAncona to be my rubber duck.
I explained what I had done, and as we went through the steps, lo and behold! He can ping the IPv6 address.
Our computers can ping the server, but the Oracle VM cannot. This leads me to suspect some cloud shenanigans are at play, and that I should not trust the output from the Oracle VM. In the end, it’s still not working, but at least I’m now confident that the Fly machine is reachable from the internet via its IPv6 address. The next step is to attempt retrieving content over HTTPS from ODAncona’s PC.

zsh
ODAncona@RocketPC:$ curl --ipv6 'https://[2a09:8280:1::24:ff79]' --verbose
Trying 2a09:8280:1::24:ff79:443...
Connected to 2a09:8280:1::24:ff79 (2a09:8280:1::24:ff79) port 443 (#0)
ALPN, offering h2
ALPN, offering http/1.1
CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
TLSv1.0 (OUT), TLS header, Certificate Status (22):
TLSv1.3 (OUT), TLS handshake, Client hello (1):
TLSv1.0 (OUT), TLS header, Unknown (21):
TLSv1.3 (OUT), TLS alert, decode error (562):
error:0A000126:SSL routines::unexpected eof while reading
Closing connection 0
curl: (35) error:0A000126:SSL routines::unexpected eof while reading

He is also unable to retrieve the content. But… I might have messed up while interpreting the error messages: he encounters a TLS error, whereas I face an SSL_ERROR_SYSCALL error. The Network is unreachable error only happens on the Zürich VM. I overlooked that the VM fails to even connect to the server! I’ve been going in the wrong direction! Let’s ignore the VM for now.
We’re both dealing with the same problem: curl: (35) ... It’s on the encryption layer!

A new hope hypothesis

The goStatic readme (the web server I’m using) states that they’ve stopped supporting TLS and recommend switching to Caddy instead. Given that our errors are tied to TLS/SSL, I’ve decided to give Caddy a shot. I know that I’m currently only using HTTP and Fly manages the TLS connection, but you know, at this point, any potential solution is worth exploring.

Changing the docker image

I will try to solve every problem at once. It sounds risky, I’m aware. But hear me out. By switching from goStatic to Caddy, I can use an image based on Alpine Linux instead of scratch, which could solve both my problems: the SSH connection and the TLS problem (I hope).

diff
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,3 +1,3 @@
-FROM pierrezemb/gostatic
-COPY ./public/ /srv/http/
-
+FROM caddy:2.7.6-alpine
+COPY Caddyfile /etc/caddy/Caddyfile
+COPY ./public/ /usr/share/caddy

Now that the Dockerfile uses Caddy based on Alpine, SSH should work. I need Caddy to listen on the same port as goStatic: 8043. I can set this in the Caddyfile. That way, I’m not changing the Fly app settings: they worked, and I don’t want to introduce new problems.

caddy
:8043 {
	root * /usr/share/caddy
	file_server
	handle /health_check {
		respond "OK"
	}
}

This simply tells Caddy to listen on port 8043 and to serve the files located in /usr/share/caddy. BUT, if a request comes in for /health_check, it should respond with OK.
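If you read Go more easily than Caddyfile syntax, the same routing can be sketched with the standard library. This is only an illustration of what the config asks for, not how Caddy implements it; newHandler and probe are names I made up, and probe uses an in-memory recorder so nothing listens on a real port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newHandler mirrors the Caddyfile: /health_check answers "OK", every other
// path is served from the files under root.
func newHandler(root string) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health_check", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "OK")
	})
	mux.Handle("/", http.FileServer(http.Dir(root)))
	return mux
}

// probe sends one in-memory GET request to h and returns the status code
// and the body, without opening any socket.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := newHandler("/usr/share/caddy")
	code, body := probe(h, "/health_check")
	fmt.Println(code, body)
	// To serve for real on the port used in the Caddyfile:
	// log.Fatal(http.ListenAndServe(":8043", h))
}
```

The health check matters because the Fly config polls that path, so it must answer even if the site itself has no such file.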

Let’s try it:

zsh
sudo docker build -t jsch-caddy .
sudo docker run -d -p 8043:8043 jsch-caddy
curl --ipv6 'http://[::1]:8043' # Tons of HTML, it's working!
curl http://localhost:8043 # Tons of HTML, it's working!

First I’m building an image based on the current Dockerfile and naming it jsch-caddy, then I’m running a container from the jsch-caddy image and mapping port 8043 of the container to port 8043 of the host.

I think I don’t need to enable TLS on Caddy, as TLS is terminated by fly.io, meaning the connection between the client and the Fly edge is encrypted, but the connection between the Fly edge and my Fly machine is not. The workflow of the request is illustrated below.

The new workflow shows Fly’s reverse proxy; the rest is the same as before. A TLS tunnel exists from my computer to Fly’s reverse proxy

A more accurate workflow. It does not include the anycast mechanism, for simplicity and correctness (the complex version would surely be wrong, because I don’t know Fly’s internals well enough).

The Fly app is left untouched; we can confirm that it already handles TLS and forces HTTPS.

toml
# Extract from fly.toml
[[services.ports]]
    port = 80
    handlers = ['http']
    force_https = true

[[services.ports]]
    port = 443
    handlers = ['tls', 'http']

[[http_service.checks]]
    interval = '30s'
    timeout = '5s'
    grace_period = '10s'
    method = 'GET'
    path = '/health_check'

And now, if I try to deploy, it should work.

zsh
fly deploy
# ==> Verifying app config
# Validating jsch/fly.toml
# Platform: machines
# ✓ Configuration is valid
# --> Verified app config
# ==> Building image...
# --> Building image done
# ==> Pushing image to fly
# --> Pushing image done
# -------
#  ✔ [1/2] Machine 32873d3f641d58 [app] update succeeded
#  ✔ [2/2] Machine 5683993b511048 [app] update succeeded
# -------

Ok, it’s deployed and the website is up! Let’s try to connect using SSH.

zsh
fly ssh console
#Connecting to fdaa:0:89a7:a7b:cbb7:99d8:6a51:2... complete
#32873d3f641d58:/srv# %

Ok, one problem solved. Now let’s check whether the website is still accessible!

This picture shows that I can access the home page of my website from my browser.

The homepage is accessible

The page also shows in my browser! It’s a great start, but I have no way to check whether I’m connecting over IPv6 or IPv4 using the browser. I’ll use curl to check if I can connect to the website over IPv6.

zsh
curl --verbose --ipv6 'https://[2a09:8280:1::24:ff79]'
# *   Trying [2a09:8280:1::24:ff79]:443...
# * Connected to 2a09:8280:1::24:ff79 (2a09:8280:1::24:ff79) port 443
# * ALPN: curl offers h2,http/1.1
# * (304) (OUT), TLS handshake, Client hello (1):
# *  CAfile: /etc/ssl/cert.pem
# *  CApath: none
# * LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 2a09:8280:1::24:ff79:443
# * Closing connection
# curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 2a09:8280:1::24:ff79:443

Ok, same error as before. What about IPv4?

zsh
 curl --verbose https://66.241.125.239

What… same error, but it works in my browser! Did I ever try to connect to IPv4 using curl? Am I using curl the wrong way? The website showed in my browser, was it a cache?
I’m dumb, IPv4 addresses are shared! Obviously, I cannot connect to a specific website using the IP address. Fly doesn’t know which server it’s supposed to reach! I need to specify the domain name.

zsh
curl --verbose -H "Host: jsch.ch" https://66.241.125.239
#curl: (35) Recv failure: Connection reset by peer

It always fails in the TLS session. Does it work over HTTP?

zsh
curl --verbose -H "Host: jsch.ch" http://66.241.125.239
# *   Trying 66.241.125.239:80...
# * Connected to 66.241.125.239 (66.241.125.239) port 80
# > GET / HTTP/1.1
# > Host: jsch.ch
# #...
# 
# < HTTP/1.1 301 Moved Permanently
# < location: https://jsch.ch/
# #...

Ahhah, HTTP over IPv4 is working, we got a redirect! What about IPv6?

zsh
curl -6 --verbose -H "Host: jsch.ch" 'http://[2a09:8280:1::24:ff79]'
# *   Trying [2a09:8280:1::24:ff79]:80...
# * Connected to 2a09:8280:1::24:ff79 (2a09:8280:1::24:ff79) port 80
# > GET / HTTP/1.1
# > Host: jsch.ch
# #...
# < HTTP/1.1 301 Moved Permanently
# < location: https://jsch.ch/
# #...

HTTP over IPv6 is working too! For both IPv4 and IPv6 I got a 301 Moved Permanently, which is expected as the Fly app forces HTTPS. The problem is only with TLS, and it shows up over both IPv4 and IPv6.
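What does -H "Host: jsch.ch" actually change? Here is a minimal Go sketch that prints the bytes of a plain HTTP/1.1 request built for each of the two addresses. It uses Go’s net/http rather than curl, so the User-Agent line differs, and wireRequest is a name I made up; nothing is sent on the network.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// wireRequest returns the bytes an HTTP/1.1 client would send for a GET to
// rawURL when the Host header is overridden, as curl -H "Host: ..." does.
func wireRequest(rawURL, host string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	// The URL decides where the connection goes; Host decides which site
	// the server is asked for.
	req.Host = host
	var buf bytes.Buffer
	if err := req.Write(&buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	targets := []string{
		"http://66.241.125.239/",
		"http://[2a09:8280:1::24:ff79]/",
	}
	for _, target := range targets {
		raw, err := wireRequest(target, "jsch.ch")
		if err != nil {
			fmt.Println(target, "->", err)
			continue
		}
		fmt.Printf("%s\n%s", target, raw)
	}
}
```

The request is the same for both addresses: the IP only appears at the connection level, and the Host line is the one place in plain HTTP where the server learns which site I want.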

Years of reading blogs come to the rescue

I remember reading an article about domain fronting years ago. The concept is easy: there are multiple layers in an HTTPS connection.

txt
METADATA: ...
SNI: Server Name Indication
TLS TUNNEL:
    HTTP request:
        Host: jsch.ch...

The important thing is: there are two indicators of the server you want to contact:

  1. The Host header in the HTTP request
  2. The SNI extension in the TLS handshake

The HTTP request travels inside the TLS tunnel, so it is encrypted and only readable once the handshake has completed. The SNI is sent unencrypted at the very start of the handshake, so Fly can use it to route the request to the correct VM. Even if I’m setting the Host header, Fly cannot see it at that point: the tunnel was never established!
I need to set the SNI to my domain name. We do it using --connect-to, which keeps jsch.ch in the URL (and therefore in the SNI and the Host header) while telling curl which address to actually connect to. This is not intuitive, as IPv6 addresses are not shared (unlike IPv4), so Fly should not need the SNI to route the request: the IPv6 address is already a unique server indicator. But let’s try it, you know, for completeness’ sake.

zsh
curl -6 --connect-to 'jsch.ch:443:[2a09:8280:1::24:ff79]:443' https://jsch.ch
# Tons of HTML, it's working!

It’s working! Now, for the real test. I’ll release the IPv4 address and delete the A record in my DNS.

zsh
 fly ips release 66.241.125.239

I deleted the A record and everything works fine! Yeaa, let’s hope it’s not due to a cache somewhere.
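Looking back, this also explains why the bare-IP requests failed: with an IP address in the URL, there is no hostname to put in the SNI. Here is a minimal Go sketch of that, assuming Go’s crypto/tls client behaves like curl in this respect. The name sniFor is mine, nothing here talks to Fly, and the handshake is aborted on purpose right after the ClientHello, so no certificate is involved.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"net"
	"time"
)

// sniFor runs the first step of a TLS handshake over an in-memory pipe and
// returns the server name the client announced in its ClientHello.
func sniFor(serverName string) string {
	clientConn, serverConn := net.Pipe()
	defer clientConn.Close()
	defer serverConn.Close()

	// Safety net so the sketch can never hang.
	deadline := time.Now().Add(5 * time.Second)
	clientConn.SetDeadline(deadline)
	serverConn.SetDeadline(deadline)

	seen := ""
	serverCfg := &tls.Config{
		// Called as soon as the ClientHello is parsed, before any encryption.
		GetConfigForClient: func(hello *tls.ClientHelloInfo) (*tls.Config, error) {
			seen = hello.ServerName
			return nil, errors.New("stop after reading the ClientHello")
		},
	}

	done := make(chan struct{})
	go func() {
		defer close(done)
		cfg := &tls.Config{ServerName: serverName, InsecureSkipVerify: true} // local demo only
		_ = tls.Client(clientConn, cfg).Handshake()                         // fails by design
	}()

	_ = tls.Server(serverConn, serverCfg).Handshake() // fails by design
	<-done
	return seen
}

func main() {
	for _, name := range []string{"jsch.ch", "2a09:8280:1::24:ff79"} {
		fmt.Printf("connecting as %q -> SNI seen by the server: %q\n", name, sniFor(name))
	}
}
```

With a domain name the server side sees it before anything is encrypted; with an IP literal the SNI comes back empty, which leaves a shared edge like Fly’s with nothing to pick an app or a certificate from.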

Confirming the problem

I messed up with curl, I should have used --connect-to from the beginning. I want to confirm that the old image was indeed the problem. Let’s quickly revert the change to the Dockerfile and launch the old version of the app. Then I can use curl with the correct --connect-to option and see if it’s working.

diff
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,3 +1,2 @@
-FROM caddy:2.7.6-alpine
-COPY Caddyfile /etc/caddy/Caddyfile
-COPY ./public/ /usr/share/caddy
+FROM pierrezemb/gostatic
+COPY ./public/ /srv/http/

Now we redeploy the previous image, the one that uses goStatic. There is still no IPv4 address associated with the app and no A record in the DNS.

We are currently unable to connect to the website using a web browser

It doesn’t work in the browser

For completeness’ sake, I’ll use curl with the --connect-to option to confirm that the Docker image was the problem.

zsh
curl -6 --connect-to 'jsch.ch:443:[2a09:8280:1::24:ff79]:443' https://jsch.ch
# curl: (7) Failed to connect to jsch.ch port 443 after 3028 ms: Couldn't connect to server

It fails again, this time before TLS even starts (curl error 7 instead of 35). This means that the Fly configuration was correct; the problem was coming from the goStatic image.

The remaining problem

We haven’t addressed a huge problem. Someone with only IPv4 connectivity is unable to access the website. They will just have a generic error saying “the domain is not reachable”. We’d like to make a website accessible to IPv4 users telling them to change ISP.

!TODO try to instruct caddy to listen to both interfaces ([::1] and 127.0.0.1) and redirect to a page saying to change ISP if the request is coming from 127.0.0.1. Need to check how the fly proxy works

Conclusion

It was harder than I thought, but we made it! Furthermore, we set up horizontal scaling, which is cool!
I’m still not sure why the goStatic binary is not able to work properly over IPv6 and TLS. Nonetheless, the new Dockerfile and Caddy should stay up to date for a few years. That’s the advantage of using a well-known project.

This little experiment let me refresh my memory of private IPs, Docker, and obviously domain fronting, which is a really cool subject. I hope you learned something too.