Collation Surprises: Did Postgres Lose My Data? (2024)
Jeremy Schneider
Comparisons are fundamental to computing, and comparing strings is not nearly as straightforward as you might think. Collation, the ordering and comparison of strings, is continuously evolving along with all the nuances of natural language. But databases use such comparisons everywhere: for ORDER BY, the humble >, <, and = operators, btree indexes, GROUP BY, and range partitioning, and even for hashing, hash indexes, and hash partitioning. Fundamental decisions from the early days of PostgreSQL have led to surprising challenges. Learn how to navigate these challenges and what new options are available. First, we'll briefly cover the history, nuance, and surprises of "putting words in order" that you never knew existed in computer science. Next, we'll walk through a few actual scenarios and demonstrations using Postgres as a user and administrator, which you can re-run yourself later for further study, including one way you could easily corrupt your self-managed PostgreSQL database if you aren't prepared. Finally, we'll dive into an explanation of the surprising behaviors we saw in Postgres and learn more about the user and administrative features Postgres provides for localized string comparison, including significant new features in Postgres 17.
