Terraform

Terraform with Azure Stack Hub - Creating a VM with multiple data disks

I've recently been working with Azure Stack Hub (ASH) and needed to create some VMs with a variable number of managed data disks. It's not as straightforward as it should be, so here's how I achieved it.

azurerm vs. azurestack Providers

Due to differences between the ARM management endpoints for Azure and Azure Stack Hub, HashiCorp provides separate providers for each system. Anyone who has used ASH will know that the resource providers available are a subset of Azure's, and typically an older version, hence the need for different providers.
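For reference, a minimal azurestack provider configuration might look like the sketch below; the metadata host is a hypothetical ASH ARM endpoint, and the version constraint assumes the 1.x provider:

terraform {
  required_providers {
    azurestack = {
      source  = "hashicorp/azurestack"
      version = "~> 1.0"
    }
  }
}

provider "azurestack" {
  features {}

  # Hypothetical ARM endpoint for your ASH instance
  metadata_host = "management.region.contoso.com"
}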

An interesting thing to check out is how often the providers are updated.

[Release history charts for the azurerm and azurestack providers]

As you can see, the azurerm provider is regularly maintained, whereas azurestack is not. Why is this relevant? Well, if we want to use Terraform as our infrastructure-as-code tool, we have to work within those limitations.

Deploying a VM with a variable number of managed data disks

With the azurerm provider, this is quite straightforward:

  1. Create the network interface
  2. Create the managed disk(s)
  3. Create the VM
  4. Attach the managed data disks to the VM
  5. (Optional) Run the CustomScript extension to configure the running VM (a sketch appears after the example code below)

The full example:
locals {
  data_disk_count = 4
}
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_network_interface" "example" {
  name                = "example-nic"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "azurerm_linux_virtual_machine" "example" {
  name                = "example-machine"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  size                = "Standard_F2"
  admin_username      = "adminuser"
  network_interface_ids = [
    azurerm_network_interface.example.id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = tls_private_key.ssh_key.public_key_openssh 
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

resource "azurerm_managed_disk" "example" {
  count                = local.data_disk_count
  name                 = "${azurerm_linux_virtual_machine.example.name}-data-${count.index}"
  resource_group_name  = azurerm_resource_group.example.name
  location             = azurerm_resource_group.example.location
  storage_account_type = "Premium_LRS"
  create_option        = "Empty"
  disk_size_gb         = 256
}

resource "azurerm_virtual_machine_data_disk_attachment" "example" {
  # depends_on is implicit via the managed_disk_id and virtual_machine_id references
  count              = local.data_disk_count
  managed_disk_id    = azurerm_managed_disk.example[count.index].id
  virtual_machine_id = azurerm_linux_virtual_machine.example.id
  lun                = count.index
  caching            = "ReadWrite"
}


resource "null_resource" "output_ssh_key" { 
  triggers = {
    always_run = "${timestamp()}"
  }
   provisioner "local-exec" {
    command = "echo '${tls_private_key.ssh_key.private_key_pem}' > ./${azurerm_linux_virtual_machine.example.name}.pem"
  }
}
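For step 5, the CustomScript extension can be layered on top of the VM once the disks are attached. A minimal sketch using the azurerm_virtual_machine_extension resource; the inline command is just a placeholder:

resource "azurerm_virtual_machine_extension" "example" {
  name                 = "example-customscript"
  virtual_machine_id   = azurerm_linux_virtual_machine.example.id
  publisher            = "Microsoft.Azure.Extensions"
  type                 = "CustomScript"
  type_handler_version = "2.0"

  # Run only once the data disks are attached
  depends_on = [azurerm_virtual_machine_data_disk_attachment.example]

  settings = jsonencode({
    commandToExecute = "echo 'provisioned' > /tmp/provisioned" # placeholder command
  })
}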

The code above uses the azurerm_virtual_machine_data_disk_attachment resource. When using azurerm_linux_virtual_machine, this is the only option available to us. The documentation notes:

⚠️ NOTE:

Data Disks can be attached either directly on the azurerm_virtual_machine resource, or using the azurerm_virtual_machine_data_disk_attachment resource - but the two cannot be used together. If both are used against the same Virtual Machine, spurious changes will occur.

With azurerm_linux_virtual_machine there is no way to attach data disks directly on the resource; the azurerm_virtual_machine_data_disk_attachment resource is the only method.

If we check the resources available with the azurestack provider, we'll see that we can't use the above technique, as an equivalent of azurerm_virtual_machine_data_disk_attachment does not exist.

[Screenshot: resources available in the azurestack provider]

That means the only option is to use the azurestack_virtual_machine resource and attach the disks directly when the VM is created.

Implementation for Azure Stack Hub

We could just create multiple storage_data_disk blocks within the azurestack_virtual_machine resource, but we want to account for a variable number of disks.
To do this, we need to use dynamic blocks to generate the nested blocks, as the count meta-argument does not work for nested blocks.

I first set up a list of objects containing the name and LUN of each data disk, as can be seen in the locals block in the code below.

This list can then be converted to a map and iterated over to generate the nested blocks, using the for_each meta-argument.

The code block in question:

dynamic "storage_data_disk" {
    for_each = {for count, value in local.disk_map :   count => value}
    content {
      name              = storage_data_disk.value.disk_name
      managed_disk_type = "Standard_LRS"
      create_option     = "Empty"
      disk_size_gb      = 256
      lun               = storage_data_disk.value.lun
    }
  }
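With data_disk_count = 4 and vm_name = "example-machine" (as in the locals block in the full example below), local.disk_map evaluates to:

[
  { disk_name = "example-machine_disk_01", lun = 0 },
  { disk_name = "example-machine_disk_02", lun = 1 },
  { disk_name = "example-machine_disk_03", lun = 2 },
  { disk_name = "example-machine_disk_04", lun = 3 },
]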

Example

locals {
  data_disk_count = 4
  vm_name         = "example-machine"
  disk_map = [
    for i in range(local.data_disk_count) : {
      disk_name = format("%s_disk_%02d", local.vm_name, i + 1)
      lun       = i
    }
  ]
}
resource "azurestack_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurestack_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurestack_resource_group.example.location
  resource_group_name = azurestack_resource_group.example.name
}

resource "azurestack_subnet" "example" {
  name                 = "internal"
  resource_group_name  = azurestack_resource_group.example.name
  virtual_network_name = azurestack_virtual_network.example.name
  address_prefix       = "10.0.2.0/24"
}

resource "azurestack_network_interface" "example" {
  name                = "example-nic"
  location            = azurestack_resource_group.example.location
  resource_group_name = azurestack_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurestack_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "azurestack_virtual_machine" "example" {
  name                  = local.vm_name
  resource_group_name   = azurestack_resource_group.example.name
  location              = azurestack_resource_group.example.location
  vm_size               = "Standard_F2"
  network_interface_ids = [
    azurestack_network_interface.example.id,
  ]
  
  os_profile {
    computer_name  = local.vm_name
    admin_username = "adminuser"
  }

  os_profile_linux_config {
    disable_password_authentication = true
    ssh_keys {
      path     = "/home/adminuser/.ssh/authorized_keys"
      key_data = tls_private_key.ssh_key.public_key_openssh
    }
  }

  storage_image_reference {
    publisher         = "Canonical"
    offer             = "0001-com-ubuntu-server-jammy"
    sku               = "22_04-lts"
    version           = "latest"
  }
  storage_os_disk {
    name              = "${local.vm_name}-osdisk"
    create_option     = "FromImage"
    caching           = "ReadWrite"
    managed_disk_type = "Standard_LRS"
    os_type           = "Linux"
    disk_size_gb      = 60
  }

  dynamic "storage_data_disk" {
    for_each = {for count, value in local.disk_map :   count => value}
    content {
      name              = storage_data_disk.value.disk_name
      managed_disk_type = "Standard_LRS"
      create_option     = "Empty"
      disk_size_gb      = 256
      lun               = storage_data_disk.value.lun
    }
  }
}

resource "null_resource" "output_ssh_key" {
  triggers = {
    always_run = timestamp()
  }

  # Write the generated private key to a local .pem file
  provisioner "local-exec" {
    command = "echo '${tls_private_key.ssh_key.private_key_pem}' > ./${azurestack_virtual_machine.example.name}.pem"
  }
}
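To make the disk count configurable per environment rather than hard-coded in a local, an input variable could drive disk_map. A sketch; the upper bound of 32 is an assumption, as the real limit depends on the VM size:

variable "data_disk_count" {
  description = "Number of managed data disks to attach to the VM"
  type        = number
  default     = 4

  validation {
    # Assumed ceiling; the actual maximum depends on the VM size
    condition     = var.data_disk_count >= 0 && var.data_disk_count <= 32
    error_message = "data_disk_count must be between 0 and 32."
  }
}

locals {
  disk_map = [
    for i in range(var.data_disk_count) : {
      disk_name = format("%s_disk_%02d", local.vm_name, i + 1)
      lun       = i
    }
  ]
}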

Terraform and WSL2 issue

Here’s a quick note on an issue that I encountered today (as, it seems, have many other people).

I went to run a Terraform workflow on my system via WSL2, but I came across a number of problems.

The first was that I couldn’t retrieve the state stored in an Azure Storage account container. Previously, I had used the following config:

backend "azurerm" {
    resource_group_name  = ""
    storage_account_name = ""
    container_name       = "terraform-backend"
    key                  = ""
 }

At runtime, I would specify the values like the example below.

export TF_CLI_ARGS_init="-backend-config=\"storage_account_name=${TERRAFORM_STATE_CONTAINER_NAME}\" -backend-config=\"resource_group_name=${RESOURCE_GROUP_NAME}\" -backend-config=\"access_key=${STG_KEY}\""
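An equivalent approach is a partial backend configuration file passed to terraform init with -backend-config=backend.hcl; a sketch with hypothetical values:

# backend.hcl - supplied at init time via: terraform init -backend-config=backend.hcl
resource_group_name  = "example-rg"             # hypothetical
storage_account_name = "examplestorage"         # hypothetical
access_key           = "<storage-access-key>"   # placeholder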

However, today that didn’t work; it just stalled while trying to connect to the storage container.

I thought something was wrong with my credentials, so for troubleshooting purposes I added the storage account key to the backend block to see if that made a difference:

backend "azurerm" {
    resource_group_name  = ""
    storage_account_name = ""
    container_name       = "terraform-backend"
    key                  = ""
    access_key           = ""
}

I added the primary storage key and, lo and behold, this time it worked.

Strange, as I hadn’t updated the Terraform CLI or any providers.

The next problem I saw was that when I tried to run

terraform plan

it would not complete, seemingly freezing. To troubleshoot this, I ran

export TF_LOG="TRACE"

before running the plan to tell me what was happening in the background.

This in turn produces very verbose output, but something did catch my eye: Terraform appeared to be unable to resolve the Azure endpoints.

Strange. I knew I had internet connectivity, and I could certainly connect to Azure using the az CLI, so I did some Google-fu and found the following: https://github.com/microsoft/WSL/issues/8022

It was exactly the same problem I had encountered.


Applying the fix from https://github.com/microsoft/WSL/issues/5420#issuecomment-646479747 worked for me and persisted beyond a reboot.

(run the code below in your WSL2 instance)

# Remove the auto-generated resolv.conf (it may be a dangling symlink)
sudo rm /etc/resolv.conf
# Point DNS at a public resolver
sudo bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
# Stop WSL from regenerating resolv.conf on startup
sudo bash -c 'echo "[network]" > /etc/wsl.conf'
sudo bash -c 'echo "generateResolvConf = false" >> /etc/wsl.conf'
# Make the file immutable so nothing overwrites it
sudo chattr +i /etc/resolv.conf


The problem appears to have been introduced by a recent Windows update and affects WSL2. As far as I can tell, it only affects Go-based programs such as Terraform.

Hopefully this will help anyone having a similar issue until the underlying Go / WSL2 issue is fixed.