Professionally Now Otherwise Engaged - E_FEWTIME

Christopher J. Ruwe

Hashicorp's Vault and Provisioning

November 18, 2018. 1240 words.

More often than not, automation modules from third parties greatly enhance operator productivity, but at the same time keep the operator from gaining a proper understanding of the matter. For a deep dive, I regularly propose switching to a shell (even ksh or tcsh) and just milling through it.

Customers and colleagues alike have expressed interest in how to provide certificates for systems, which may be accomplished with some ansible or puppet module. However, greater understanding may be gained with just a few hundred lines of shell and puppet. Of course, one may wish to switch to something more maintainable afterwards. (On "may wish to", see Further Key Words for Use in RFCs to Indicate Requirement Levels, §6. RTFM Inc., April 2013.)

When talking about how to manage secrets when automating (Christopher J. Ruwe: Managing Secrets in Automated Environments. 2018. I need to stroke my narcissism as much as everybody else, so why are you grinning?), I have tried to explain the principles behind a specific secret store for such environments, Hashicorp’s Vault (Hashicorp’s Vault. Project Page.), in an understandable manner. Now, more practically, I will show how to provision machines so that they may consume certificates from Vault and how to enable said machines to re-provision certificates when expiration comes near.

Preparing Vault (Sketch only)

The following procedure assumes a working instance of Vault with persistent storage enabled. First, provision an authentication backend of the kind approle, construct a policy detailing the allowed operations on this path, and create a role on the authentication backend with the permissions of that policy. Additionally, users may wish to restrict operations to hosts from a network range by passing bound_cidr_list=<CIDR,...> and token_bound_cidrs=<CIDR,...>.

Then, configure the PKI secret engine, setting desired parameters such as domains, permissible SANs, TTLs and so on. The Hashicorp Vault documentation provides the details. (Vault Documentation, specifically approle and PKI engine.)
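
The preparation described above can be sketched with the vault command line client. This is a rough outline under stated assumptions, not the setup used for this article: the mount paths, role names, policy name, TTLs, CIDR range and domain are all placeholders, and setting up the CA itself (root or intermediate) on the PKI mount is left out.

```shell
#!/bin/sh
# Every value below is an illustrative assumption, not taken from the article.
AUTH_MOUNT="approle"
AUTH_ROLE="provisioner"
PKI_MOUNT="pki"
PKI_ROLE="hosts"

# Print a policy that only allows issuing certificates from one PKI role.
issue_policy() {
  cat <<EOF
path "$1/issue/$2" {
  capabilities = ["create", "update"]
}
EOF
}

# Run the preparation against a Vault the caller is already logged in to.
# Defined here, but deliberately not invoked.
prepare_vault() {
  vault auth enable -path="$AUTH_MOUNT" approle
  issue_policy "$PKI_MOUNT" "$PKI_ROLE" | vault policy write issue-certs -
  vault write "auth/$AUTH_MOUNT/role/$AUTH_ROLE" \
    token_policies="issue-certs" \
    token_ttl="24h" \
    token_bound_cidrs="10.0.0.0/24"
  vault secrets enable -path="$PKI_MOUNT" pki
  vault write "$PKI_MOUNT/roles/$PKI_ROLE" \
    allowed_domains="example.org" \
    allow_subdomains=true \
    max_ttl="720h"
}
```

Calling issue_policy pki hosts prints the policy text on its own, which is handy for review before anything is written to Vault.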

Machine Provisioning

Remember that Vault uses secret engines to derive tokens and passes these tokens to secret consumers instead of the actual “master key”.

So, when provisioning a machine from some automaton (I provision LXD container images with Ansible), the provisioning machine may authenticate to the vault to collect a token from the vault. That token may (should!) be shorter-lived than the actual secret, is unique for the provisioned machine and enables that machine to independently collect secrets for the lifetime of the token from now on.

In Ansible, the logic for my LXD containers is as crude as in the following task. (The role and corresponding secret map to the authentication backend, not to the PKI backend. An authentication role ties authentication policies to an auth endpoint; a secret role, such as the PKI role, ties policies to the generation of secrets, here certificates. The heavy escaping may look like perl on steroids, but is necessary for correct wrapping over multiple layers, first lxc exec, then /bin/sh -c and then curl -d. I do not apologize.)

# Log in to the approle auth backend from inside the container and
# store the response, including the client token, in login.json.
- name: provide vault secret for cert role
  shell:
    cmd: |
      lxc exec "{{ thetemplate.name }}-{{ ansible_date_time.date }}"  -- sh -c " \
        mkdir -p /etc/vault/{{ auth_role }} \
        && curl \
          --request POST \
          --data \"{
              \\\"role_id\\\": \\\"{{ role_id }}\\\",
              \\\"secret_id\\\": \\\"{{ role_secret }}\\\"
            }\" \
          https://{{ vaultendpoint }}/v1/auth/{{ auth_role }}/login \
        > /etc/vault/{{ auth_role }}/login.json
      "

The necessary variables and credentials are passed as simply as in the following playbook. (The directive vars_prompt makes ansible interactively prompt for the secret necessary to obtain the token. Alternatives might be Jenkins or Bamboo secrets.)

---
- hosts: lxdhosts
  tasks:
  - include_tasks: provision_lxdtemplate.yml
  vars:
    auth_role: <...>
    role_id: <...>
    vaultendpoint: <...>
  vars_prompt:
    - name: "role_secret"
      prompt: "Enter the secret to obtain vault tokens."
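
The escaping in the task above is easier to follow with the layers reduced to a toy. The sketch below is a stand-in, not the provisioning code: printf takes the place of curl --data, a plain sh -c takes the place of lxc exec ... -- sh -c, and the function name and the value abc are made up for illustration.

```shell
#!/bin/sh
# inner_payload ROLE_ID: hypothetical helper that prints the JSON the
# innermost command would receive after both shells have processed the quoting.
inner_payload() {
  # Outer shell, inside double quotes:  \\\"  becomes  \"  and  \"  becomes  "
  # Inner shell, inside double quotes:  \"    becomes  a literal "
  sh -c "printf '%s\n' \"{\\\"role_id\\\": \\\"$1\\\"}\""
}

inner_payload "abc"
```

Each shell layer strips one level of backslashes, which is why the quotes around JSON keys need three of them while the quotes delimiting the --data argument need only one.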

Consumption of Certificates

Since every machine provisioned this way has a token stored in login.json, any machine can now consume secrets with relatively basic logic, as in the function below. (Authentication to a Vault auth engine is performed by passing a web token as an HTTP header to the POST method. Use HTTPS to protect the secret on the network.)

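function getcert() {
  # The TTL is passed to Vault in seconds; the template supplies days.
  CERTTTLSECS=$(( <%= @certttldays %> * 86400 ))

  # Request a certificate from the PKI role, authenticating with the
  # client token stored at login time. tee keeps the JSON response in
  # $CERTFILE and also echoes it (private key included) to stdout.
  curl \
  --data "
    {
      \"common_name\": \"<%= @fqdn %>\",
      \"ttl\": \"${CERTTTLSECS}\"
    }
  " \
  --header "X-Vault-Token: $( jq -r '.auth.client_token' "${LOGINFILE}" )" \
  --request POST \
  "<%= @vaulthost %>:<%= @vaultport %>/v1/<%= @pkiname %>/issue/<%= @pkirole %>" \
  | tee "$CERTFILE"
  unwrapcert
}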

The helper unwrapcert is as simple as the following. (Take care to protect the files, at least the certificate JSON container and the certificate key, with restrictive permissions: chmod 0600, umask, etc.)

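function unwrapcert() {
  # Vault's PKI engine returns the key as .data.private_key;
  # the output file keeps the shorter .key suffix.
  jq -r '.data.certificate' "${CERTFILE}" > "${CERTPATH}/<%= @fqdn %>.certificate"
  jq -r '.data.private_key' "${CERTFILE}" > "${CERTPATH}/<%= @fqdn %>.key"
}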
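
The advice on restrictive permissions could be put into practice with a small wrapper like the one below. It is a sketch only; write_private is a hypothetical helper that does not appear in the original script.

```shell
#!/bin/sh
# write_private FILE: hypothetical helper that copies stdin into FILE
# so that only the owner can read or write it.
write_private() {
  (
    umask 077        # files created inside this subshell get mode 0600
    cat > "$1"
  )
  chmod 0600 "$1"    # also tighten a file that existed before
}
```

Any of the jq extractions in unwrapcert could be piped into write_private instead of being redirected directly, so the key never sits on disk with the default umask.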

Refreshing Certificates

Refreshing certificates is not unduly difficult either. (More elegant methods certainly exist.)

if [ -f $CERTFILE ]; then

  NOTAFTER_DATE=$(\
      jq -r '.data.certificate' ${CERTFILE}\
      | openssl x509 \
        -text \
      | grep "Not After" \
      | sed 's|Not After :||g' \
    )

  NOTAFTER_UNIX=$( date --date="$NOTAFTER_DATE" +%s )
  SECONDSLEFT=$( expr $NOTAFTER_UNIX   - $(date +%s) )
  DAYSLEFT=$(expr $SECONDSLEFT / 86400)
  if [ $DAYSLEFT -lt <%= @certrenewlt %> ]; then
    mv $CERTFILE $CERTFILE.bck
    getcert
  fi
else
  getcert
fi
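
The renewal decision in the script above boils down to a little arithmetic, which can be isolated and checked without a certificate at hand. The following is a sketch with hypothetical function names; the last function shows one of the more elegant methods, assuming the certificate is available as a PEM file and openssl is installed.

```shell
#!/bin/sh
# days_left NOTAFTER_EPOCH NOW_EPOCH: whole days until expiry,
# truncated toward zero as in the expr-based original.
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# needs_renewal NOTAFTER_EPOCH NOW_EPOCH THRESHOLD_DAYS:
# succeeds when fewer than THRESHOLD_DAYS whole days remain.
needs_renewal() {
  [ "$(days_left "$1" "$2")" -lt "$3" ]
}

# expires_within_days PEMFILE DAYS: let openssl do the comparison.
# -checkend N exits non-zero when the certificate expires within N seconds.
expires_within_days() {
  ! openssl x509 -noout -checkend $(( $2 * 86400 )) -in "$1" >/dev/null
}
```

With the first two functions, a certificate ending ten days from now and a threshold of 30 days leads to renewal, while one ending in 100 days is left alone.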

Provisioning the Automaton

Provisioning requires nothing more than some light systemd lifting, more specifically, a service unit and a timer. (With no interest to dive into the systemd divide, I consider systemd timers to be better suited when an activity should be performed at regular and irregular intervals.) The service unit comes first.

[Unit]
After=network-online.target
Description=Vault Certificate Getter Service
Requires=network-online.target

[Service]
ExecStart=/usr/local/bin/getcert.sh
TimeoutStopSec=60s
Type=oneshot

[Install]
WantedBy=multi-user.target

The timer follows. (When all machines are on the same few physical hosts, as is the case in my home lab, it may be a good idea to allow everybody to settle down, as it is common courtesy to allow a colleague to grab a coffee before starting the pestering.)

[Unit]
Description=Vault Certificate Getter Timer

[Timer]
OnBootSec=5min
OnUnitActiveSec=1h

[Install]
WantedBy=timers.target
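
Activating the pair might look like the sketch below. The file names getcert.service and getcert.timer are an assumption; what matters is that both share a base name, because a timer without an explicit Unit= setting triggers the service of the same name.

```shell
#!/bin/sh
# Assumption: the two units above were installed as
# /etc/systemd/system/getcert.service and /etc/systemd/system/getcert.timer.
enable_getcert_timer() {
  systemctl daemon-reload                 # pick up the new unit files
  systemctl enable --now getcert.timer    # start the timer now and on every boot
  systemctl list-timers getcert.timer     # show when the next run is due
}
```

The function is only defined here; it needs root and a running systemd to be called.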

To paraphrase Kode Vicious, I love navel-gazing exercises as much as everyone else, but there is a limit to just about everything. Provisioning that erb stuff with puppet is easier and quicker than talking about the complexity it introduces and finding reasons why it is not possible.
