The Story of PostgreSQL Choking on an Unreadable $HOME

by Tykling


15. apr 2025 14:05 UTC


Hello pirates!

Yesterday I stumbled upon an issue that was sticky enough that it felt appropriate to document the journey here. It is the story of the PostgreSQL client libpq acting weird when running in an environment where $HOME is set to an existing but unreadable path (which, admittedly, is a weird thing to have in an environment). Whether or not this is a bug is a matter of interpretation, but after getting to the bottom of it I wanted to take some notes.

Prequel

I was minding my own business and checking some PostgreSQL server logs when I noticed log messages like this one from some Django servers, for example 2a09:94c4:55d1:76a0::5:

Apr 15 11:48:15 postgres1 postgres[3988]: [7-1] 2025-04-15 11:48:15.129 UTC [3988] [unknown]@2a09:94c4:55d1:76a0::5(54996)@[unknown]/[unknown] LOG:  could not accept SSL connection: EOF detected

The error tells me that the client with IP 2a09:94c4:55d1:76a0::5, a Django server, is unable to establish TLS connections to Postgres. My PostgreSQL servers all have real hostnames and certificates from LetsEncrypt, so there is no reason why the TLS connection should fail here. The websites all work despite of this error because of two things:

  • Django uses the default sslmode in libpq which is prefer, which means first try an SSL connection; if that fails, try a non-SSL connection
  • My pg_hba.conf allowed both plaintext and encrypted connections (since this is all internal between FreeBSD jails on the same host). This means they were allowed to fallback to plaintext connections if TLS fails.

This means that the setting sslmode=prefer is in effect suppressing the TLS connection error, because Django just falls back to plaintext (and Postgres lets it). So the first thing I needed to do was tickle out an error message from the TLS connection attempt. The mission was clear:

  1. Force Django to always and only use TLS for Postgres connections. While at it also switch to hostssl in pg_hba.conf.
  2. Figure out why the TLS connections are failing
  3. Fix the issue

Since step 1 would likely break the website (forcing TLS when TLS doesn't work tends to do that) the plan was to start out playing with a staging site where downtime doesn't matter, before touching any of the production sites.

Forcing Djangos PostgreSQL Connections to Verified TLS

To force always TLS for Djangos DB connection I went to Djangos settings.py file in the DATABASES setting and added these two options:

'OPTIONS': {'sslmode': 'verify-full', 'sslrootcert': 'system'}

To see it in context see this commit. The two options mean:

sslmode=verify-full
Verify that the CA is trusted AND that the certificate hostname matches
sslrootcert=system
Use the system CA certificate store (usually something like /etc/ssl/certs). Failing to add this option results in an error like failed: root certificate file "/var/db/postgres/.postgresql/root.crt" does not exist Either provide the file, use the system's trusted roots with sslrootcert=system, or change sslmode to disable server certificate verification.

Adding these settings did break the websites database connection, but finally I got a python backtrace showing the error. The error message was somewhat confusing though:

could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied

This error message was confusing for a number of reasons. I don't run PostgreSQL or uwsgi as root, I don't have any certificates in /root, the server had no references to /root/.postgresql/postgresql.crt anywhere, and what the hell was going on. Where did this path come from, what is trying to read from it, and why? Something something Cotton Eyed Joe.

To add to the mystery the Django manage.py commands shell and dbshell still both worked great. They both use the configuration from Djangos settings.py proving that the sslmode and sslrootcert additions I made were valid and functional.

So to sum up, after adding 'OPTIONS': {'sslmode': 'verify-full', 'sslrootcert': 'system'} to force Django to use TLS for database connections the uwsgi website broke but manage.py shell still works well. Mystery? mystery!

Figuring Out Why TLS Connections Fail

The website runs under uwsgi as user www, and uwsgi is run by the process supervisor supervisord which runs as root. Since the TLS error only happens when using Django through the website, and not when using it through the shell, the error is likely somewhere in uwsgi or supervisord.

While playing with verify-full using psql on the command-line I saw an example of what the error looks like when libpq can't find the root CA certificate to verify the connection.

Observe the difference with and without sslrootcert=system (in this case testing locally on the postgres server):

[postgres@postgres1 ~]$ psql "host=postgres1.servers.bornhack.org sslmode=verify-full"
psql: error: connection to server at "postgres1.servers.bornhack.org" (2a09:94c4:55d1:76a0::4), port 5432 failed: root certificate file "/var/db/postgres/.postgresql/root.crt" does not exist
Either provide the file, use the system's trusted roots with sslrootcert=system, or change sslmode to disable server certificate verification.
[postgres@postgres1 ~]$ psql "host=postgres1.servers.bornhack.org sslmode=verify-full sslrootcert=system"
psql (16.8)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=# 
\q
[postgres@postgres1 ~]$ 

Note that the error message root certificate file "/var/db/postgres/.postgresql/root.crt" does not exist has different wording from the one I got , which was could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied. It is looking in a different homedir of course, but it is also looking for root.crt instead of postgresql.crt. This would later turn out to be significant.

I did a lot of digging around, verifying along the way that the connection parameters did make it all the way to psycopg2. After a LOT of print()-based debugging I slowly narrowed the differences between the working dbshell invocation and the non-functional uwsgi invocation of the same Django code and config.

At some point I started digging into the environment Django runs under when calling dbshell versus under uwsgi and figured out that if I set HOME in the supervisord environment for uwsgi the error went away! It didn't seem to matter what value HOME was set to, it didn't even have to be a valid path, any value (even an empty string!) seemed to fix it.

With a functional workaround in hand I could have stopped there, but I wanted to get to the bottom of it. Something about having HOME=/root in the environment was causing something to error out instead of just ignoring HOME and running as if there was no HOME (which was clearly possible since HOME= also worked).

Reproducing Outside of supervisord

After a long evening of narrowing it further I eventually had a way to reproduce it outside uwsgi and supervisord using psql directly:

[postgres@postgres1 ~]$ echo $HOME
/var/db/postgres
[postgres@postgres1 ~]$ psql "host=postgres1.servers.bornhack.org sslmode=verify-full sslrootcert=system"
psql (16.8)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=# 
\q
[postgres@postgres1 ~]$ HOME= psql "host=postgres1.servers.bornhack.org sslmode=verify-full sslrootcert=system"
psql (16.8)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=# 
\q
[postgres@postgres1 ~]$ HOME=/nonexistant psql "host=postgres1.servers.bornhack.org sslmode=verify-full sslrootcert=system"
psql (16.8)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=# 
\q
could not save history to file "/nonexistant/.psql_history": No such file or directory
[postgres@postgres1 ~]$ HOME=/root psql "host=postgres1.servers.bornhack.org sslmode=verify-full sslrootcert=system"
psql: error: connection to server at "postgres1.servers.bornhack.org" (2a09:94c4:55d1:76a0::4), port 5432 failed: could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied
[postgres@postgres1 ~]$ 

In other words: Setting HOME in the environment to any path which exists but is not readable fails with the above error.

Root Cause

I went digging in the libpq code and eventually found the codepath being taken which leads to this error.

  1. The trouble begins near the beginning of initialise_SSL() in this section which determines whether or not to use a home directory, setting the bool have_homedir.
  2. The code uses pqGetHomeDirectory() which is defined here which simply checks if HOME is defined in the environment. I would argue that this code should also check if the contents of HOME is a valid and readable path before returning true.
  3. Any TLS connection where have_homedir=true will use a default path for client certificate if one has not been defined in the connection settings, the default client certificate path is defined here and here resulting in the path /root/.postgresql/postgresql.crt being accessed, triggering the error.

This finally explained why I was seeing this issue only in uwsgi under supervisord. TLS connections never worked in this setup, this error has always been there. I just never noticed because plaintext fallback was enabled.

The error message was never about reading the CA certificate (even though I initially thought so, since that was the setting I changed!). The error message was about trying to read a client certificate which libpq will always attempt to do if run in an env with HOME set to a non-empty string. If the error returned from attempting to read from this path is anything other than ENOENT or ENOTDIR an error is raised. In other words, if the path exists but is not readable. This finally explains why my workaround of setting HOME worked:

  • The processes started by supervisord inherit the environment which supervisord itself runs under
  • Since supervisord runs as root everything it runs inherits an environment with HOME=/root - regardless of the user the process runs as.
  • Setting HOME in the supervisord config for uwsgi effectively overrides the HOME=/root so uwsgi no longer inherits it.
  • This is also the reason why running manage.py shell or manage.py dbshell worked fine, since those are run with HOME set to the homedir of the user running the command

Wrapping Up

Once I understood the issue the fix was trivial. This commit to my Ansible roles makes HOME= the new default for uwsgi stuff run under my supervisord instances. Arguably this is how the supervisord default should always have been. Or maybe processes should inherit HOME for the user the process runs as (www in this case). The last remaining jobs were:

  • Change all my Django settings.py DATABASES settings to include 'OPTIONS': {'sslmode': 'verify-full', 'sslrootcert': 'system'} so TLS is always used
  • Change all my pg_hba.conf files to use hostssl instead of host so plaintext connections are disallowed.

This was quite a tricky error to get to the bottom of. A couple of simple changes to libpq would have made this much less painful:

  • As mentioned above pqGetHomeDirectory() should IMO return false if HOME is not readable. This would have avoided the issue entirely, and arguably an unreadable HOME should be treated the same as a non-existing HOME
  • The error message returned when failing to read a TLS client certificate should contain the word client somewhere so it is clear what it is attempting to read: The message could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied would have been more clear if it read could not open client certificate file "/root/.postgresql/postgresql.crt": Permission denied

I hope this blogpost can help other people stuck with this issue. I wanted to send a patch to fix the two small things mentioned above, but since Postgres do not accept contributions via pull requests on Github and their contributing guidelines is TL;DR I will sadly leave it up to someone else to fix this.

Donating

I've somewhat recently signed up for Github Sponsors meaning it is now easy to sponsor me and my work. If this post or some of my other writing, software or services have helped you then you can consider becoming a sponsor.

Search this blog