Seeding a Google Compute Engine instance with SSH HostKeys

Josh Bielick
3 min read · Feb 23, 2021

Creating a Google Compute Engine machine image with pre-existing SSH host keys won’t work like you might expect—here’s why:

The Objective

A several-year-old SFTP server resides in AWS. Clients connect to it regularly by its hostname, an A record pointing at an Elastic IP address that is attached to the network interface of an EC2 instance running sshd. I'd like to replace this server with an instance running in Google Cloud Platform.

The Challenge

When connecting to an SFTP server, SFTP clients typically save a host key fingerprint alongside the host name and IP address used to connect to the host. The host key fingerprint is based on the host’s public key. This check is what provides the protection, and the warning message you have probably encountered from time to time, when one of those attributes (name, IP, or fingerprint) has changed since the last time you connected. The information about the hosts you’ve previously connected to is stored in your known_hosts file.
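For example, you can see what your client currently has on record for a host with ssh-keygen (the hostname below is a placeholder, not the actual server in this guide):

```sh
# Print the known_hosts entry (and its fingerprint) stored for this host
ssh-keygen -l -F sftp.example.com
```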

The warning your SSH client prints probably looks like this:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

My goal is to create a new SFTP server that clients can continue to connect to without receiving this warning. To do so, I need the new server to have the same host keys. In general you ought to rotate your host keys and give your customers plenty of notice and lead time before they change. For the purposes of this guide, assume rotation isn’t feasible at this time.

Host Keys

You can read more about HostKeys on ssh.com.
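For reference, on a stock OpenSSH install the host keys live in /etc/ssh as ssh_host_*_key pairs. Assuming that layout (the exact key types on your server may differ), the archive used later in this guide can be produced on the existing server along these lines:

```sh
# Bundle the existing host key pairs into the archive Packer will copy over
sudo tar -czf hostkeys.tar.gz -C /etc/ssh \
  ssh_host_rsa_key ssh_host_rsa_key.pub \
  ssh_host_ecdsa_key ssh_host_ecdsa_key.pub \
  ssh_host_ed25519_key ssh_host_ed25519_key.pub
```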

Google Compute Machine Images

In this guide I will be using Packer to build machine images. Once we have a machine image, we can use it to create the boot disk of a new Google Compute Engine instance, or copy it to a persistent disk and boot from that.
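The builder half of a Packer template for this looks roughly like the sketch below (HCL2; the project, zone, source image, and names are placeholders, not the exact values from my config):

```hcl
source "googlecompute" "sftp" {
  project_id          = "my-project"     # placeholder
  source_image_family = "debian-11"      # placeholder
  zone                = "us-central1-a"  # placeholder
  image_name          = "sftp-host-keys" # image names must be unique per project
  ssh_username        = "packer"
}
```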

The approach I’m taking is this:

  • Copy a tar archive of host keys to the Packer builder instance
  • Move to the correct directory and set ownership and permissions.

The tar archive in files/ is copied to /tmp/hostkeys.tar.gz. It is then extracted to /etc/ssh and its contents are chowned to root:root. The existing permissions on the private keys are already 0600, so no chmod is necessary.
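In terms of shell commands, the provisioner step amounts to something like this (a sketch, assuming the archive layout described above):

```sh
# Extract the host keys into place and make sure root owns them
sudo tar -xzf /tmp/hostkeys.tar.gz -C /etc/ssh
sudo chown root:root /etc/ssh/ssh_host_*
rm /tmp/hostkeys.tar.gz
```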

The machine image is saved with the host keys and now we can create an instance! However, as soon as the instance starts up, our host keys are overwritten. 🤔

The solution

When I created an instance from the image that included the host keys, the keys were overwritten as soon as the instance started (their modified times demonstrated this). Since the overwrite was highly correlated with instance creation, and the syslog included messages like “Started Google Compute Engine Startup Scripts”, I began to wonder whether a startup script was overwriting them.
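On a Debian-based image, the evidence looks roughly like this (commands shown for illustration):

```sh
# The keys' modification times line up with the most recent boot...
stat -c '%y %n' /etc/ssh/ssh_host_*_key

# ...and the system log shows the GCE startup machinery running at the same time
grep "Google Compute Engine" /var/log/syslog
```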

The startup script investigation led me to the Google Cloud Platform guest-agent, which helps set up the instance and enables things like OS Login and Cloud SSH users and keys. I was thrilled to find the source on GitHub, and the InstanceSetup -> set_host_keys configuration option looked like the likely culprit: the guest-agent’s instance setup is the Go code that overwrites the host keys.

Ultimately, I ended up with this in the Packer config:
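Sketched in HCL2, reusing the builder source from earlier (the files/ paths and inline commands are assumptions reconstructed from the steps above, not the verbatim template):

```hcl
build {
  sources = ["source.googlecompute.sftp"]

  # Host keys from the old server, staged in /tmp for the shell step
  provisioner "file" {
    source      = "files/hostkeys.tar.gz"
    destination = "/tmp/hostkeys.tar.gz"
  }

  # Guest-agent config that disables host key regeneration at boot
  provisioner "file" {
    source      = "files/instance_configs.cfg"
    destination = "/tmp/instance_configs.cfg"
  }

  provisioner "shell" {
    inline = [
      "sudo tar -xzf /tmp/hostkeys.tar.gz -C /etc/ssh",
      "sudo chown root:root /etc/ssh/ssh_host_*",
      "sudo mv /tmp/instance_configs.cfg /etc/default/instance_configs.cfg",
    ]
  }
}
```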

Note the addition of instance_configs.cfg, placed at /etc/default/instance_configs.cfg as recommended in the guest-agent README. The most important setting to change in instance_configs.cfg is set_host_keys = false.
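A minimal version of that file, assuming only this one setting is overridden, looks like:

```ini
# /etc/default/instance_configs.cfg
[InstanceSetup]
set_host_keys = false
```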

The full instance config can be found here, though there was only one change to the default.

That’s it! Your host keys will now persist after instance startup.
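Before pointing the DNS record at the new instance, it’s worth confirming that it presents the fingerprint clients already have saved (the IP below is a placeholder):

```sh
# Fingerprint presented by the new instance (placeholder IP)
NEW_HOST=203.0.113.10
ssh-keyscan -t ed25519 "$NEW_HOST" | ssh-keygen -lf -
```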
