Building a Windows Domain Controller with Terraform and Ansible
Recently, I blogged about building a Windows domain controller (DC) using Ansible and Vagrant, which is an easy way to bring up a replicable environment to launch attacks against. Ansible is a great solution for orchestrating builds of attack environments. Unlike other orchestration systems (or writing your own bash/PowerShell scripts), Ansible sports great Windows support out of the box. It also has a large community that provides many playbooks and roles for deploying common software such as network capture services, standard applications (web servers, database systems, etc.), and log-collection software (similar to splunk-ansible).
Vagrant, on the other hand, does have some shortcomings. First, it limits a tester to their local system, which is not ideal when you are trying to replicate or study exploits across a larger environment. A modern 8-core, 16 GB laptop will struggle to emulate a Splunk Enterprise server, a Windows 2016 DC, and a Windows 10 workstation at the same time. Also, it’s not easy to “share” this testing infrastructure with a colleague if you are working on a team. Moreover, Vagrant only lets us define the operating systems of our attack environment as code, not cloud services. To solve this, I decided to implement in Terraform the ability to bring up an attack testing environment on AWS.
We’ll dig into how this works, but I want to start by sharing that I reached some level of desperation, because there isn’t really a “blessed” way to run Ansible with Terraform. You do have a few options, which are explained in detail by Alex in this blog. Let’s explore them below:
- Use the terraform-ansible module
- Generate a static inventory from the Terraform state, and then run Ansible referencing this static inventory:
  ansible-playbook -i inventory playbooks/windows_dc.yml
- Use local-exec: call Ansible via a local shell, dynamically passing the inventory IP to connect with
After a few hours of attempting the first option (using the most recent terraform-ansible module), I was still not able to get it to work with WinRM. This may be a good time to mention that Ansible does not manage Windows hosts via SSH :-) Instead, WinRM is the recommended communication channel.
The second option is generating a static inventory using terraform-inventory. For our implementation this was a non-starter, as it requires a human or an additional automated step to run provisioning against a Terraform-built host. Because I needed CI/CD to be able to build the DC easily, I skipped this option.
This meant I was left with the third option: executing Ansible using Terraform's local-exec provisioner.
Let’s dive right into the piece of Terraform code that actually brings up an aws_instance resource (for a Windows 2016 server). It looks something like this:
# standup windows 2016 domain controller
resource "aws_instance" "windows_2016_dc" {
  ami = var.windows_2016_dc_ami
  instance_type = "t2.large"
  key_name = var.key_name
  subnet_id = "${aws_subnet.default.0.id}"
  vpc_security_group_ids = ["${aws_security_group.default.id}"]
  tags = {
    Name = "attack-range_windows_2016_dc"
  }
  user_data = <<EOF
<powershell>
$admin = [adsi]("WinNT://./${var.win_username}, user")
$admin.PSBase.Invoke("SetPassword", "${var.win_password}")
Invoke-Expression ((New-Object System.Net.Webclient).DownloadString('https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1'))
</powershell>
EOF
  provisioner "local-exec" {
    working_dir = "../ansible"
    command = "sleep 60;cp hosts.default hosts; sed -i '' 's/PUBLICIP/${aws_instance.windows_2016_dc.public_ip}/g' hosts;ansible-playbook -i hosts playbooks/windows_dc.yml"
  }
}
Let’s break this down into parts. Here’s the first portion:
resource "aws_instance" "windows_2016_dc" {
  ami = var.windows_2016_dc_ami
  instance_type = "t2.large"
  key_name = var.key_name
  subnet_id = "${aws_subnet.default.0.id}"
  vpc_security_group_ids = ["${aws_security_group.default.id}"]
  tags = {
    Name = "attack-range_windows_2016_dc"
  }
This defines a resource of type aws_instance and names it windows_2016_dc. This is how we spin up an EC2 instance in Terraform. The resource has a few properties: the AMI ID, which comes from a variable (more on this later), the instance type, the subnet_id to use (which also determines the availability zone), and the security groups to apply to the instance. These are all common parameters you would configure in the EC2 launch wizard, so they should seem very familiar.
The second portion is a bit more unusual, mainly because here we define a PowerShell script to pass through the user_data field in AWS:
  user_data = <<EOF
<powershell>
$admin = [adsi]("WinNT://./${var.win_username}, user")
$admin.PSBase.Invoke("SetPassword", "${var.win_password}")
Invoke-Expression ((New-Object System.Net.Webclient).DownloadString('https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1'))
</powershell>
EOF
When the EC2 instance comes up, it executes this script. The script first sets the password of the local account named in var.win_username, then downloads and runs ConfigureRemotingForAnsible.ps1, which enables and configures WinRM on our newly created instance. The primary goal of running this script during boot is to allow Ansible to log in to the box via WinRM later. Based on my reading and testing, this seems to be the best way to get Ansible to execute on a remote Windows machine without SSH.
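Because user_data runs while the instance is still booting, anything that connects over WinRM has to wait until the listener is actually up. As a rough sketch (not part of the original build), a small retry helper can poll for readiness instead of waiting a fixed amount of time. The function name retry, the DC_PUBLIC_IP variable, and the use of nc are assumptions made for illustration; port 5986 is the WinRM HTTPS port used in the inventory later in this post.

```shell
# retry ATTEMPTS DELAY COMMAND...
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep if another attempt is still coming.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical usage: wait up to roughly five minutes for the WinRM HTTPS port.
# Assumes an nc that supports -z (probe only) and -w (timeout in seconds).
# retry 30 10 nc -z -w 5 "$DC_PUBLIC_IP" 5986
```

An open port only shows that the listener is reachable, not that authentication works, so a failed first playbook run is still possible with this approach.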
The third and final portion is our local-exec provisioner call:
  provisioner "local-exec" {
    working_dir = "../ansible"
    command = "sleep 120;cp hosts.default hosts; sed -i '' 's/PUBLICIP/${aws_instance.windows_2016_dc.public_ip}/g' hosts;ansible-playbook -i hosts playbooks/windows_dc.yml"
  }
}
Here we set two parameters: working_dir, which switches to the directory where the local Ansible files live (../ansible here; /etc/ansible is another common location), and command, which passes the Ansible command to execute in the local shell. You will notice that there is some funky stuff happening before our ansible-playbook command is executed. Our local Ansible directory looks something like this:
├── README.md
├── ansible.cfg
├── hosts.default
├── playbooks
│   └── windows_dc.yml
└── vars
    └── vars.yml
The “funky” stuff starts with a sleep, which gives our Windows machine time to come up and to run the code specified under user_data. Next, we copy hosts.default to hosts and use sed to replace the PUBLICIP placeholder in the copy with the public IP address of the instance, ${aws_instance.windows_2016_dc.public_ip}. (The sed -i '' form is the BSD/macOS syntax; GNU sed on Linux expects -i without the empty argument.) The hosts.default file sets the necessary WinRM parameters for Ansible to connect to the machine. Here is an example:
aws-win-host ansible_ssh_host=PUBLICIP
[win]
aws-win-host
[win:vars]
ansible_connection=winrm
ansible_ssh_port=5986
ansible_ssh_user=Administrator
ansible_ssh_pass=myTempPassword123
ansible_winrm_transport=basic
ansible_winrm_server_cert_validation=ignore
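The copy-and-substitute step that turns this template into a usable inventory could also be written so that it behaves the same on macOS and Linux. The following is a minimal sketch, assuming the PUBLICIP placeholder and the hosts.default and hosts file names from this post; the function name render_inventory is made up for illustration.

```shell
# render_inventory IP [TEMPLATE] [OUTPUT]
# Writes a copy of TEMPLATE to OUTPUT with every PUBLICIP placeholder
# replaced by IP. Redirecting sed's output avoids `sed -i`, whose syntax
# differs between BSD/macOS sed and GNU sed. TEMPLATE is left untouched.
render_inventory() {
  ip=$1
  template=${2:-hosts.default}
  output=${3:-hosts}
  if [ -z "$ip" ]; then
    echo "usage: render_inventory IP [TEMPLATE] [OUTPUT]" >&2
    return 2
  fi
  sed "s/PUBLICIP/$ip/g" "$template" > "$output"
}

# Hypothetical usage from the Ansible directory:
# render_inventory 203.0.113.10 && ansible-playbook -i hosts playbooks/windows_dc.yml
```

Writing to a separate output file also keeps the template reusable for the next build, which is what the cp in the original command achieves.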
Make sure that ansible_ssh_pass=myTempPassword123 matches the ${var.win_password} variable defined in the terraform.tfvars file. Consider this an example only: if you are going to deploy this in a production environment, it is highly recommended to use certificate authentication for WinRM instead of a username and password.
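Since the password lives in two places, a mismatch is an easy mistake to make. One way to catch it before running a build is a small comparison helper like the sketch below. It assumes that terraform.tfvars sets the password on a single line of the form win_password = "..." and that the inventory uses an ansible_ssh_pass= line as in the example above; the function name passwords_match is hypothetical.

```shell
# passwords_match TFVARS INVENTORY
# Succeeds only if win_password in TFVARS is non-empty and equals
# ansible_ssh_pass in INVENTORY.
passwords_match() {
  # Extract the double-quoted value from a line like: win_password = "secret"
  tf_pass=$(sed -n 's/^[[:space:]]*win_password[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$1")
  # Extract everything after ansible_ssh_pass= in the inventory.
  inv_pass=$(sed -n 's/^ansible_ssh_pass=//p' "$2")
  [ -n "$tf_pass" ] && [ "$tf_pass" = "$inv_pass" ]
}

# Hypothetical usage:
# passwords_match terraform.tfvars ../ansible/hosts.default || echo "password mismatch" >&2
```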
Let’s quickly recap the important bits of getting a Windows 2016 DC built with Terraform:
- Set up your AWS provider credentials
- Customize any environment parameters necessary under variables.tf
- Set up your variables for your environment under terraform.tfvars, and set your own password
- Run terraform init and then terraform apply
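Because the local-exec provisioner runs Ansible on the machine that runs Terraform, both tools need to be installed there. A pre-flight check along the following lines can fail early with a clear message. This is only a sketch, and the function name require_commands is an assumption made for illustration.

```shell
# require_commands NAME...
# Succeeds only if every named command is found on PATH;
# prints each missing command to stderr.
require_commands() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Hypothetical usage, run from the directory that holds the .tf files:
# require_commands terraform ansible-playbook && terraform init && terraform apply
```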
You can find a complete example on GitHub that contains all the Terraform and Ansible logic necessary to build a Windows 2016 DC on AWS.