by Tykling
19. nov 2016 13:35 UTC
Earlier this week I was pretty surprised to see some weird permissions on some nginx config files on my servers. The servers are managed by Ansible, so I suspected some changes I had made to my Ansible roles a few days prior. I only made syntax changes, so I didn't expect anything to change. But sometimes the rabbit hole goes deeper than you imagined :)
So I looked at the Ansible task that creates and maintains these files:
    - name: Create nginx extra include configs (acls and more)
      copy:
        owner: root
        group: wheel
        mode: 600
        content: "{{ item.value.content }}"
        dest: "/usr/local/etc/nginx/{{ item.value.filename }}"
      with_dict: "{{ nginx_extra_configs | default({}) }}"
      when: nginx_proxy | default(False)
This task is used together with a host_vars file containing a section something like this:
    nginx_extra_configs:
      bfipacl:
        filename: bornfiber-ip-acl.conf
        content: |
          allow 100.20.45.100/32; # management
          allow 10.20.64.0/24; # something else
      anotheracl:
        filename: another-acl.conf
        content: |
          allow 192.168.0.0/24; # management
And the task just loops through the nginx_extra_configs dict, creating or updating each file as needed.
All the Ansible modules that have to do with file management (the copy, template and file modules spring to mind, but many more exist) have the same mode property, which is used to control the permissions of the file. Ansible's documentation has this advice about modes being octal (this text is taken from the file module documentation):
For those used to /usr/bin/chmod remember that modes are actually octal numbers (like 0644). Leaving off the leading zero will likely have unexpected results. As of version 1.8, the mode may be specified as a symbolic mode (for example, u+rwx or u=rw,g=r,o=r).
While I understand the intent of this advice, the text as it stands is misleading at best. The advice will help in many cases - but only because setuid, setgid and the sticky bit are rarely used. And it completely fails to address subtle differences in parsing behaviour, which I had to learn the hard way.
Now, Ansible tasks are defined in YAML, and when defining properties for a task you have a couple of choices for syntax. The change I made earlier this week - which resulted in the wrong file permissions - was to switch from the key=value syntax to the structured key: value format in my tasks. Basically I changed all the tasks like so:
     - name: Install git
       become: yes
       pkgng:
    -    name=git
    -    state=present
    +    name: git
    +    state: present
It just feels more natural to me to use the same key: value syntax in the task arguments as I do in the rest of the task files, so I standardized on this syntax (thinking it would not change any actual functionality).
After the syntax change I ran my playbooks and discovered that some of my config files now had wrong permissions. I expected these files to be chmod 600 (so rw for owner, and no permissions for group and others) but what I got was this:
    [tsr@webproxy ~]$ ls -l /usr/local/etc/nginx/bornfiber-ip-acl.conf
    ---x-wx--T 1 root wheel 594 Nov 2 10:53 bornfiber-ip-acl.conf
I didn't actually change any tasks, just the syntax of them, so this was baffling. I basically changed mode=600 to mode: 600 and now the permissions were completely different.
Numeric unix file permissions are specified in octal (base 8) as three or four digits, where the three rightmost digits represent the permissions for owner, group and others, respectively. The optional fourth and leftmost digit represents the setuid, setgid and sticky bits. This digit is assumed to be 0 if it is left out, which is why chmod 600 file yields the same result as chmod 0600 file.
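To make the digit layout concrete, here is a small Python sketch (not part of my Ansible setup) that renders a numeric mode the way ls -l would. The helper name describe is made up for this illustration; stat.filemode comes from the Python standard library.

```python
import stat

def describe(mode: int) -> str:
    """Render permission bits like `ls -l` does for a regular file.

    `describe` is a hypothetical helper used only for this illustration.
    """
    return stat.filemode(stat.S_IFREG | mode)

print(describe(0o600))   # three digits: the special-bits digit is implicitly 0
print(describe(0o0600))  # same value with the leading digit written out
print(describe(0o1600))  # leftmost digit 1 sets the sticky bit
```

The first two lines print the same permission string, which mirrors chmod 600 and chmod 0600 behaving identically.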
I suspected that my 600 was being interpreted as decimal instead of octal after my syntax change. To test the theory I simply converted decimal 600 to octal, fed it to chmod, and checked the resulting permissions:
    user@privat:~$ echo "obase=8; 600" | bc
    1130
    user@privat:~$ touch test
    user@privat:~$ chmod 1130 test
    user@privat:~$ ls -l test
    ---x-wx--T 1 user user 0 Nov 19 15:09 test
    user@privat:~$
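The same check can be done without touching a file. This is a rough Python equivalent of the shell session above, using only the standard library; the variable names are my own.

```python
import stat

decimal_mode = 600                    # the integer an unquoted "mode: 600" yields
as_octal = oct(decimal_mode)          # the same number written in base 8
listing = stat.filemode(stat.S_IFREG | decimal_mode)
intended = int("600", 8)              # what I actually meant: octal 600

print(as_octal)   # 0o1130
print(listing)    # ---x-wx--T
print(intended)   # 384
```

Decimal 600 and octal 600 are simply two different numbers (600 versus 384), and only the second one means rw for the owner.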
Great! I've now confirmed that the problem is the mode being interpreted as decimal instead of octal after my syntax change. After reading the above snippet from the Ansible documentation about octal numbers, I changed my tasks to use 0600 instead of 600 and thought that was the end of that. Until I started thinking a bit more about it. The reason prefixing my permissions with a 0 worked is that the number is now interpreted as octal instead of decimal by PyYAML, which Ansible uses to parse the configuration files. But what if my permissions don't start with a 0?
Ansible is based on Python 2. Python 2 has two valid octal notations: 0600 and 0o600 both mean octal 600, which equals decimal 384. Note that Python 3 only supports the 0o600 notation (to avoid this kind of stuff, I suspect).
Anyway, using the key=value syntax it seems the permissions number is interpreted as octal no matter what, but with the key: value syntax I switched to, the Python notation for octal comes into play: an unquoted number is now considered decimal - unless it begins with a 0! So my 0600 is interpreted as octal now, my file gets the proper permissions, and all is well, it seems.
But what if I wanted to give my file the sticky bit, or setuid/setgid? This would make the first digit of the permissions non-0 instead of 0, and I am back with wrong permissions. Observe the difference between mode: 1600 and mode: "1600" and mode: 0o1600 below:
    [tsr@webproxy ~]$ ls -l /usr/local/etc/nginx/bornfiber-ip-acl.conf # mode: 1600
    ---x--S--T 1 root wheel 594 Nov 2 10:53 /usr/local/etc/nginx/bornfiber-ip-acl.conf
    [tsr@webproxy ~]$ ls -l /usr/local/etc/nginx/bornfiber-ip-acl.conf # mode: "1600"
    -rw------T 1 root wheel 594 Nov 2 10:53 /usr/local/etc/nginx/bornfiber-ip-acl.conf
    [tsr@webproxy ~]$ ls -l /usr/local/etc/nginx/bornfiber-ip-acl.conf # mode: 0o1600
    -rw------T 1 root wheel 594 Nov 2 10:53 /usr/local/etc/nginx/bornfiber-ip-acl.conf
    [tsr@webproxy ~]$

I've since changed all my tasks to use quoted octal values when using numeric permissions. Sticking to either quoted or 0o prefixed values should ensure I don't run into these problems again.
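The three results above fit a simple mental model, sketched here in Python. This is only a model inferred from the observed behaviour, not Ansible's actual code: resolve_mode and listing are hypothetical names, and the rule "integers are used as-is, strings are parsed as octal" is an assumption.

```python
import stat

def resolve_mode(value):
    """Hypothetical model: ints are taken as-is, strings are parsed as octal."""
    if isinstance(value, int):
        return value
    return int(value, 8)

def listing(value):
    """Show the resolved mode the way `ls -l` would for a regular file."""
    return stat.filemode(stat.S_IFREG | resolve_mode(value))

cases = [
    ("mode: 1600", 1600),      # unquoted: arrives as the decimal integer 1600
    ('mode: "1600"', "1600"),  # quoted: arrives as text, read as octal
    ("mode: 0o1600", 0o1600),  # assumed to end up as octal 1600
]
for label, value in cases:
    print(f"{label:<14} {listing(value)}")
```

Under these assumptions the sketch reproduces the three permission strings from the listing: the unquoted decimal 1600 turns into octal 3100 (setgid plus sticky), while the quoted and 0o variants both give the intended sticky-bit-plus-rw mode.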
The advice in the Ansible documentation is wrong, or misleading at best. The advice should be something like: Note that unix file permissions are octal, and should either be quoted or prefixed with "0o" to ensure correct interpretation. To set mode 644 use mode: "644" or mode: 0o644. This would have saved me some work, but on the other hand, it is always nice to refresh basic concepts like unix permissions and non-base10 numbers :).
After working on this I wanted to open an issue on GitHub, but I found this, which discusses the problem and will probably find a solution sooner or later.