Terraform – Removing Orphaned Session Hosts from an Azure Virtual Desktop Hostpool

When deploying session hosts for an Azure Virtual Desktop environment with Terraform, you will sooner or later run into leftover virtual machines (orphaned objects) inside a host pool. Session hosts are registered to a hostpool by an Azure custom script extension during virtual machine creation. The extension downloads the configuration package containing the scripts for the Desired State Configuration (DSC) and registers the machine to the corresponding pool.

In a perfect world, the session hosts would be removed from the hostpool when the virtual machines are destroyed or replaced in the Terraform code, because the custom script extension (responsible for the hostpool join) is removed along with them. Unfortunately that is not the case, and the previously existing machines are left behind inside the hostpool. Even re-creating or replacing the virtual machines (e.g. after changing the source image) does not help: you will be unable to log in to those session hosts via the Azure Virtual Desktop control plane. A machine that has been recreated by Terraform shows the “Unavailable” health state in the hostpool and will not allow users to log in because of its broken state.

This creates overhead when publishing a new image release to the hostpool, particularly in scenarios where you host non-persistent workloads that rely on a Golden Image and a rolling upgrade process. To get rid of the orphaned session hosts you can either remove them via the Azure Portal and re-register them (a manual step) or destroy and recreate the corresponding hostpool from scratch. A third option is to use new hostnames when provisioning the session hosts from the latest image version. That last method still requires some housekeeping, since the old entries remain as orphaned objects inside the hostpool.

There must be another way of handling this issue, right? With the help of the AzAPI provider it is possible to manage Azure resources which are not yet (or never will be) supported by the AzureRM Terraform provider. In other words, the AzAPI provider allows us to communicate with the Azure REST API using any available API version. See the azapi_resource_action documentation in the Terraform Registry.

Before we can write the Terraform code, we need to understand how the REST API works and how to express the desired outcome in HCL syntax. To start, let's explore how we can interact with the API to retrieve the necessary information.

Session Host – GET
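
As a rough illustration, the GET operation for a single session host targets the path /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DesktopVirtualization/hostPools/{hostPoolName}/sessionHosts/{sessionHostName}. A minimal sketch of the same read through the AzAPI provider could look like the block below. The subscription ID, resource group, session host name and the API version 2023-09-05 are placeholder assumptions and not values from the original setup. Depending on the AzAPI provider version, the output is either a JSON string or an HCL object.

```hcl
# Sketch only: reads one session host through the AzAPI provider.
# All IDs and names below are hypothetical placeholders.
data "azapi_resource" "session_host" {
  # Resource type and API version (assumption: 2023-09-05 is available to you).
  type = "Microsoft.DesktopVirtualization/hostPools/sessionHosts@2023-09-05"

  # Full ARM resource ID of the session host inside the hostpool.
  resource_id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-avd-example/providers/Microsoft.DesktopVirtualization/hostPools/pool01-test/sessionHosts/vm-avd-01.example.local"

  # Only export the properties we are interested in.
  response_export_values = ["properties.status", "properties.lastHeartBeat"]
}

# Exposes the exported properties, e.g. to check for the "Unavailable" state.
output "session_host_state" {
  value = data.azapi_resource.session_host.output
}
```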

Session Host – DELETE

Let me show you how to configure the “azapi_resource_action” block inside the Terraform code to remove existing machines when destroying a collection of machines. Simply add this block to your session host module and orphaned machines will be something you never have to worry about again. Depending on your Terraform module you may need to refactor the code a little. I always prefer a for_each loop over the count meta-argument because it gives more flexibility, but that's totally up to you 🙂
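
The following is a minimal sketch of such a block, not the exact code from the original module. It assumes an AzAPI provider release that supports the “when” argument, the API version 2023-09-05, and that the session host name in the hostpool equals the virtual machine name plus an optional domain suffix. The variables hostpool_id and session_host_domain_suffix are hypothetical names introduced for this example, and the object attributes of session_host_virtual_machines are reduced to two illustrative fields.

```hcl
terraform {
  required_providers {
    azapi = {
      # Assumption: use a release that supports the "when" argument.
      source = "Azure/azapi"
    }
  }
}

# Hypothetical input: ARM resource ID of the already existing hostpool.
variable "hostpool_id" {
  type = string
}

# Hypothetical input: suffix appended to the VM name to form the session host
# name, e.g. ".example.local" for domain-joined machines, "" otherwise.
variable "session_host_domain_suffix" {
  type    = string
  default = ""
}

# Map of session hosts, keyed by virtual machine name (attributes are illustrative).
variable "session_host_virtual_machines" {
  type = map(object({
    size = string
    zone = string
  }))
}

# Sends a DELETE request for each session host, but only at destroy time.
resource "azapi_resource_action" "remove_session_host" {
  for_each = var.session_host_virtual_machines

  type        = "Microsoft.DesktopVirtualization/hostPools/sessionHosts@2023-09-05"
  resource_id = "${var.hostpool_id}/sessionHosts/${each.key}${var.session_host_domain_suffix}"
  method      = "DELETE"

  # Run the action when this resource is destroyed instead of when it is created.
  when = "destroy"

  # Optionally add depends_on pointing at your virtual machine resource so the
  # session host is removed from the hostpool before the VM itself is deleted.
}
```

Because the action is keyed by the same map as the virtual machines, removing an entry from the map destroys both the machine and its registration in the hostpool.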

It is important to understand that the “azapi_resource_action” for the removal of the session hosts is only triggered when running a Terraform “destroy” action. If you are planning to update one of your hostpools (e.g. pool01-test), you need to destroy the virtual machines first and then re-create them.

Depending on your code structure, I would recommend splitting each hostpool into a separate Terraform Workspace and calling the session host module with the needed variables. The Azure Virtual Desktop service objects (Hostpools, Application Groups, Workspaces and Scaling Plans) are created in an earlier stage of the deployment, and the hostpool information is only referenced for the registration of the session hosts with the PowerShell DSC extension.
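
To illustrate the idea, a workspace for a single hostpool might look roughly like the sketch below. The module path, the resource group name, the VM size and the module input names are hypothetical and only mirror the example variables used in this post. The hostpool is looked up with a data source because it was created in an earlier deployment stage.

```hcl
# Sketch only: one workspace per hostpool, all names are hypothetical.

# Look up the hostpool that was created in an earlier deployment stage.
data "azurerm_virtual_desktop_host_pool" "this" {
  name                = "pool01-test"
  resource_group_name = "rg-avd-example"
}

# Call the session host module with the values for this hostpool.
module "session_hosts" {
  source = "./modules/session-host" # hypothetical local module path

  hostpool_id = data.azurerm_virtual_desktop_host_pool.this.id

  session_host_virtual_machines = {
    "vm-avd-01" = { size = "Standard_D4s_v5", zone = "1" }
    "vm-avd-02" = { size = "Standard_D4s_v5", zone = "2" }
  }
}
```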


Ideally, the PowerShell DSC extension would handle cleanup when a virtual machine is deleted, but currently, it doesn’t. This workaround might not be perfect, but it gets the job done. I hope you found this information helpful. If you have any questions or comments, feel free to share them below.

See you next time! 🧑‍💻

2 comments

  1. Greetings Julian,

    Great post, this is something I missed in my deployments.
    Can you give me some more context about the for_each var.session_host_virtual_machines variable?
    This would help us a lot.

    Kind regards,
    Andreas

    1. Hello Andreas, sorry for the late response. The mentioned variable (var.session_host_virtual_machines) is a map of objects which specifies all the information needed to deploy the session hosts. This could include, for example, the virtual machine name (key), the name of the network interface, managed disk information (name + size), subnet information and the availability zone. Hope that helps!
      Julian
