Building a Windows 2016 Domain Controller with Vagrant and Ansible
TLDR; (“The Haiku Version”)
git clone https://github.com/splunk/building_a_windows_dc
edit ansible/var/vars.yml
cd splunk-server
vagrant up
cd ../windows_dc_2016
vagrant up
navigate to http://localhost:8000
Builds a Windows 2016 domain controller with the following instrumentation:
- Sysmon
- Splunk Stream app
- Splunk Sysmon Technology Add-on (TA)
It also builds a Splunk server with all the TAs and apps needed to populate the data models fed by the above sources. You can find the bits at: https://github.com/d1vious/building-a-windows-dc. Keep reading for the “why” and “how” of this project.
The Splunk Security Research Team has a dire and ongoing need to programmatically generate real data from attacks. By “real data,” I mean logs, network captures, endpoint events, and so on from attack tools, proof-of-concept (POC) exploit code, and malware, all of which get ingested into a Splunk environment.
Chris Long’s DetectionLab project gave us good footing for building replicable testing environments, and we reused much of the work it lays out. We developed environments that include the instrumentation needed to collect endpoint, network, system, and domain-controller events. Specifically, the window_dc.yml Ansible playbook installs the following:
- Windows DNS feature and management tools
- RSAT Active Directory Admin Center
- Active Directory Domain Services, including management tools
- Splunk Universal Forwarder
- Splunk Stream app
- Splunk Sysmon TA (TA-sysmon)
- Splunk Windows TA (TA-windows)
- Sysmon
You can stand up the environment automatically, complete with a Windows domain controller and all of the Splunk instrumentation necessary to analyze attacks in detail. In our case, we use this environment to produce content for the Splunk Enterprise Security Content Updates (ESCU) app. It lets us run POC exploit code and attack scenarios, and simulate real attacks, with replicable results. It also helps us perform a detailed analysis of the data generated during each phase of an active attack.
We use Vagrant to create a virtual machine (VM) from the Windows 2016 Server image. After Vagrant starts the VM, it runs a provisioner to configure it. Our provisioner is Ansible.
Here is the general logical diagram of the build flow:
We start by creating a Windows 2016 image using Packer and a build template that automatically does the following during pre- and post-build:
- Installs Windows updates
- Installs OpenSSH (for management)
- Installs VirtualBox guest tools
- Enables SSH for Vagrant
- Enables RDP
- Installs .NET framework
- Disables auto logon
If you follow the instructions provided, you can skip the Windows 2016 image creation, because we have already built a Windows 2016 Server image with Packer and uploaded it to Vagrant Cloud.
Now is a good opportunity to talk about the Vagrant configuration, which defines the hardware properties (RAM, disk, and so on) of the domain-controller VM and calls out a provisioner (in our case, Ansible) to configure it.
You can see the relevant sections here:
Our provisioner, Ansible (note that this is the ansible provisioner, which runs Ansible on the host, as distinct from ansible_local, which runs it inside the guest), executes the playbook called window_dc.yml. This playbook applies a set of roles to the VM, which together install and configure all the necessary components. A file called vars.yml defines the settings for each role. Things like the domain name and the domain-admin password can be set within this file.
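To make the shape of such a configuration concrete, here is a minimal Vagrantfile sketch. It is an illustration only, not the file from the repository: the box name, the memory and CPU values, and the playbook path are all placeholders I made up, so substitute the real ones from the project.

```ruby
# Vagrantfile (illustrative sketch; values below are assumptions, not the project's settings)
Vagrant.configure("2") do |config|
  # Hypothetical box name. Replace it with the Windows 2016 box published on Vagrant Cloud.
  config.vm.box = "example/windows-2016"

  # Hardware properties of the VM for the VirtualBox provider (assumed values).
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 4096
    vb.cpus = 2
  end

  # The "ansible" provisioner runs Ansible on the host against the guest.
  # "ansible_local" would instead install and run Ansible inside the guest.
  config.vm.provision "ansible" do |ansible|
    # Assumed relative path to the playbook named in this post.
    ansible.playbook = "../ansible/window_dc.yml"
  end
end
```

With a file like this in place, vagrant up creates the VM from the box and then hands it to the Ansible playbook for configuration.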
Splunk Security Research now routinely uses the above technique to programmatically stand up environments for staging attacks. This cuts a few days (or more!) off of our analysis and data-collection workflow for new threats. It also allows us to focus on the events produced by an example exploit/attack, instead of on building the instrumentation required to capture and analyze it.
If you try this technique, let me know: @d1vious