Jose Enrique Hernandez
Security Researcher, Founder, and Diver
Oct 16, 2019 5 min read

Building a Windows Domain Controller with Terraform and Ansible

Recently, I blogged about building a Windows domain controller (DC) using Ansible and Vagrant, which is an easy way to bring up a reproducible environment to launch attacks against. Ansible is a great solution for orchestrating builds of attack environments. Unlike other orchestration systems (or writing your own bash/PowerShell scripts), Ansible has strong Windows support out of the box and a large community that provides many playbooks and roles for deploying common software, such as network capture services, standard applications (web servers, database systems, etc.), and log-collection software (similar to splunk-ansible).

Vagrant, on the other hand, does have some shortcomings. First, it limits a tester to their local system, which is not ideal when you are trying to replicate or study exploits across a larger environment. A modern laptop with 8 CPU cores and 16 GB of RAM will struggle to run a Splunk Enterprise server, a Windows 2016 DC, and a Windows 10 workstation at the same time. Also, it’s not easy to “share” this testing infrastructure with a colleague if you are working on a team. Moreover, Vagrant only lets us define operating systems as code, not cloud services. To solve this, I decided to use Terraform to bring up an attack testing environment on AWS.

We’ll dig into how this works, but I want to start by sharing that I reached some level of desperation, because there isn’t really a “blessed” way to run Ansible with Terraform. You do have a few options, which are explained in detail by Alex in this blog. Let’s explore them below:

  1. Use the terraform-ansible module
  2. Generate a static inventory from the Terraform state, then run Ansible against it: ansible-playbook -i inventory playbooks/windows_dc.yml
  3. Use local-exec: call Ansible from a local shell, dynamically passing in the IP address to connect to

After a few hours of attempting the first option (using the most recent terraform-ansible-module), I was still not able to get it to work with WinRM. This may be a good time to mention that Ansible does not manage Windows hosts via SSH :-). Instead, WinRM is the recommended way to communicate with them.

The second option is generating a static inventory using terraform-inventory. For our implementation this was a non-starter, because it requires a human or an additional automated step to run provisioning against a Terraform-built host. Since I needed CI/CD to be able to build the DC easily, I skipped this option.

This left me with the third option: executing Ansible using Terraform's local-exec provisioner. Let’s dive right into the Terraform code that actually brings up an aws_instance resource (for a Windows 2016 server). It looks something like this:

# standup windows 2016 domain controller
resource "aws_instance" "windows_2016_dc" {
  ami           = var.windows_2016_dc_ami
  instance_type = "t2.large"
  key_name = var.key_name
  subnet_id = "${aws_subnet.default.0.id}"
  vpc_security_group_ids = ["${aws_security_group.default.id}"]
  tags = {
    Name = "attack-range_windows_2016_dc"
  }
  user_data = <<EOF
<powershell>
$admin = [adsi]("WinNT://./${var.win_username}, user")
$admin.PSBase.Invoke("SetPassword", "${var.win_password}")
Invoke-Expression ((New-Object System.Net.Webclient).DownloadString('https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1'))
</powershell>
EOF

  provisioner "local-exec" {
    working_dir = "../ansible"
    command = "sleep 60;cp hosts.default hosts; sed -i '' 's/PUBLICIP/${aws_instance.windows_2016_dc.public_ip}/g' hosts;ansible-playbook -i hosts playbooks/windows_dc.yml"
  }
}

Let’s break this down into parts. Here’s the first portion:

resource "aws_instance" "windows_2016_dc" {
  ami           = var.windows_2016_dc_ami
  instance_type = "t2.large"
  key_name = var.key_name
  subnet_id = "${aws_subnet.default.0.id}"
  vpc_security_group_ids = ["${aws_security_group.default.id}"]
  tags = {
    Name = "attack-range_windows_2016_dc"
  }

This defines a resource of type aws_instance and names it windows_2016_dc. This is how we spin up an EC2 instance in Terraform. The resource has a few properties: the AMI ID, which comes from a variable (more on this later), the instance type, the subnet_id to use (which also determines the availability zone), and the security groups to apply to the instance. These are all common parameters from the EC2 launch wizard, so they should look very familiar.
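The var.* references in the resource need matching declarations. Here is a minimal sketch of what they could look like in variables.tf. The variable names come from the resource above, but the types, descriptions, and the default are my assumptions rather than the project's actual file:

```hcl
# variables.tf (sketch; types, descriptions, and defaults are assumptions)

variable "windows_2016_dc_ami" {
  description = "AMI ID of a Windows Server 2016 image in your region"
  type        = string
}

variable "key_name" {
  description = "Name of an existing EC2 key pair"
  type        = string
}

variable "win_username" {
  description = "Local Windows account whose password is set at boot"
  type        = string
  default     = "Administrator" # assumed; matches ansible_ssh_user in hosts.default
}

variable "win_password" {
  description = "Password applied to win_username through user_data"
  type        = string
}
```

Leaving win_password without a default forces every environment to supply its own value instead of silently inheriting a shared one.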

The second portion is a bit more unusual, mainly because here we define a PowerShell script that is passed through the user_data field in AWS:

user_data = <<EOF
<powershell>
$admin = [adsi]("WinNT://./${var.win_username}, user")
$admin.PSBase.Invoke("SetPassword", "${var.win_password}")
Invoke-Expression ((New-Object System.Net.Webclient).DownloadString('https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1'))
</powershell>
EOF

When the EC2 instance comes up, it executes this script, which sets the password for the ${var.win_username} account and then downloads and runs Ansible's ConfigureRemotingForAnsible.ps1 to enable and configure WinRM on the new instance. We run it at boot so that Ansible can later log in to the box via WinRM. Based on my reading and testing, this seems to be the best way to get Ansible to execute on a remote Windows machine without SSH.
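For Ansible to reach the instance, the security group attached to it has to allow inbound WinRM over HTTPS on port 5986, the port used in the Ansible inventory. This post does not show aws_security_group.default, so the following is only a sketch of what such a group could look like. The group name, the aws_vpc.default reference, and the allowed_cidr variable are hypothetical:

```hcl
# Hypothetical variable: the address range allowed to reach WinRM,
# for example your own public IP as a /32.
variable "allowed_cidr" {
  type = string
}

resource "aws_security_group" "default" {
  name   = "attack-range-default" # assumed name
  vpc_id = aws_vpc.default.id     # assumes a VPC resource named "default"

  # WinRM over HTTPS, which Ansible uses to connect
  ingress {
    from_port   = 5986
    to_port     = 5986
    protocol    = "tcp"
    cidr_blocks = [var.allowed_cidr]
  }

  # Allow all outbound traffic so user_data can download the setup script
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Restricting the ingress rule to a narrow CIDR matters here, because the example inventory uses basic authentication and skips certificate validation.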

The third and final portion is our provisioner local-exec call:

  provisioner "local-exec" {
    working_dir = "../ansible"
    command = "sleep 60;cp hosts.default hosts; sed -i '' 's/PUBLICIP/${aws_instance.windows_2016_dc.public_ip}/g' hosts;ansible-playbook -i hosts playbooks/windows_dc.yml"
  }
}

Here we set two parameters: working_dir, which switches to the directory where our local Ansible files live (e.g. /etc/ansible; here it is ../ansible), and command, which holds the shell command line that ends up running Ansible. You will notice that there is some funky stuff happening before our ansible-playbook command is executed. Our local Ansible directory looks something like this:

├── README.md
├── ansible.cfg
├── hosts.default
├── playbooks
│   └── windows_dc.yml
└── vars
    └── vars.yml

The “funky” stuff starts with the sleep we introduce to give our Windows machine time to come up and run the code specified under user_data. Next, we copy hosts.default to hosts and use sed to replace the PUBLICIP placeholder with the public IP address of the instance, ${aws_instance.windows_2016_dc.public_ip}. (Note that sed -i '' is the BSD/macOS form; GNU sed on Linux expects sed -i without the empty argument.) The hosts.default file sets the WinRM parameters Ansible needs to connect to the machine. Here is an example:

aws-win-host ansible_ssh_host=PUBLICIP

[win]
aws-win-host

[win:vars]
ansible_connection=winrm
ansible_ssh_port=5986
ansible_ssh_user=Administrator
ansible_ssh_pass=myTempPassword123
ansible_winrm_transport=basic
ansible_winrm_server_cert_validation=ignore

Make sure that ansible_ssh_pass=myTempPassword123 matches the value of the win_password variable (${var.win_password}) set in the terraform.tfvars file. Treat this as an example only: if you are going to deploy this in a production environment, certificate authentication for WinRM is highly recommended over a username and password.
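A fixed sleep is a guess at how long Windows needs to boot and finish user_data. One possible alternative, sketched below, polls the WinRM port until it answers. This is not what the code in this post does. It assumes nc (netcat) is installed on the machine running Terraform and supports the -z and -w flags, and it uses Terraform's self object to refer to the instance from inside its own provisioner. The retry count and interval are arbitrary. It also writes hosts with a plain sed redirect, which sidesteps the difference in how BSD/macOS and GNU sed handle -i:

```hcl
resource "aws_instance" "windows_2016_dc" {
  # ... same arguments and user_data as in the full listing above ...

  provisioner "local-exec" {
    working_dir = "../ansible"
    command     = <<-EOT
      set -e
      # Poll the WinRM HTTPS port; give up after 60 attempts (about 10 minutes).
      tries=0
      until nc -z -w 5 ${self.public_ip} 5986; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
          echo "WinRM did not come up in time" >&2
          exit 1
        fi
        sleep 10
      done
      # Render the inventory from the template, then run the playbook.
      sed 's/PUBLICIP/${self.public_ip}/g' hosts.default > hosts
      ansible-playbook -i hosts playbooks/windows_dc.yml
    EOT
  }
}
```

An open port only shows that the WinRM listener exists, not that every step of user_data has finished, so a short extra delay or an Ansible-side retry may still be needed.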

Let’s quickly recap the important bits of getting a Windows 2016 DC built with Terraform:

  • Set up your AWS provider credentials
  • Customize any environment parameters necessary under variables.tf
  • Set up the variables for your environment under terraform.tfvars, including your own password
  • Run terraform init and then terraform apply
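For the terraform.tfvars step, a file along these lines would do. All values are placeholders made up for illustration. Substitute your own AMI ID, key pair, and password, and keep win_password in sync with ansible_ssh_pass in hosts.default:

```hcl
# terraform.tfvars (placeholder values, not real ones)
windows_2016_dc_ami = "ami-xxxxxxxxxxxxxxxxx" # a Windows Server 2016 AMI in your region
key_name            = "my-keypair"            # hypothetical key pair name
win_username        = "Administrator"
win_password        = "myTempPassword123"     # must match ansible_ssh_pass
```

Since this file holds a password, it is worth keeping it out of version control.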

You can find a complete example on GitHub that contains all the Terraform and Ansible logic needed to build a Windows 2016 DC on AWS.