Make WordPress Core

Opened 4 weeks ago

Closed 4 weeks ago

#64821 closed defect (bug) (fixed)

remove_accents(): Add support for capital Eszett (ẞ → SS) in German locale

Reported by: apermo
Owned by: dmsnell
Milestone: 7.0
Priority: normal
Severity: normal
Version: 4.3
Component: Formatting
Keywords: has-patch has-unit-tests
Focuses:
Cc:

Description

The uppercase ß (ẞ, Unicode U+1E9E, LATIN CAPITAL LETTER SHARP S) was officially standardized in German orthography in 2017 (DIN 5008 revision). The remove_accents() function already handles the lowercase variant (ß → ss) in the German locale block added in #3782, but does not handle the uppercase form.

This is admittedly an edge case — ẞ appears almost exclusively in all-caps contexts such as street names and official documents. It is included here for completeness and to keep the German locale block consistent: every other character in that block has both its upper and lowercase form mapped.

Currently, ẞ is not matched by the German locale block and falls through without a mapping, so it is returned unchanged and later URL-encoded in slugs.

Steps to reproduce:

echo remove_accents( 'STRAẞE', 'de_DE' );
Current: STRAẞE (ẞ passed through unchanged)
Expected: STRASSE
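
For illustration only, the current and expected behaviour can be reproduced outside WordPress with a plain strtr() table. This is a minimal sketch under assumptions, not the core implementation: sketch_remove_accents_de() and its $map_capital_eszett flag are hypothetical names, and the table lists only a reduced set of German-locale characters.

```php
<?php
// Hypothetical helper, NOT WordPress core: a reduced German-locale table
// applied with strtr() to show the before/after behaviour from this ticket.
function sketch_remove_accents_de( string $text, bool $map_capital_eszett ): string {
    $chars = array(
        'Ä' => 'Ae',
        'ä' => 'ae',
        'Ö' => 'Oe',
        'ö' => 'oe',
        'Ü' => 'Ue',
        'ü' => 'ue',
        'ß' => 'ss',
    );

    // The proposed addition: map the capital Eszett (U+1E9E) to "SS".
    if ( $map_capital_eszett ) {
        $chars['ẞ'] = 'SS';
    }

    // strtr() leaves any character without a table entry untouched.
    return strtr( $text, $chars );
}

echo sketch_remove_accents_de( 'STRAẞE', false ), "\n"; // STRAẞE
echo sketch_remove_accents_de( 'STRAẞE', true ), "\n";  // STRASSE
```

The flag exists only so that both states can be compared side by side; the file is assumed to be saved as UTF-8.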

PR is already prepared and just waiting for the ticket ID.

Change History (3)


This ticket was mentioned in PR #11188 on WordPress/wordpress-develop by @apermo.


4 weeks ago
#1

  • Keywords has-patch has-unit-tests added

Adds ẞ (U+1E9E, LATIN CAPITAL LETTER SHARP S) → SS to the German locale block in remove_accents(), directly after the existing ß → ss entry.

The uppercase ß was officially standardized in German orthography in 2017 (DIN 5008 revision). This is an edge case — ẞ appears almost exclusively in all-caps contexts such as street names and official documents — but it is included for completeness and consistency: every other character in the German locale block has both its upper and lowercase form mapped.

Without this fix, ẞ falls through without a mapping and is returned unchanged, leading to URL-encoded characters in slugs:

remove_accents( 'STRAẞE', 'de_DE' );
// Before: STRAẞE  (ẞ unchanged)
// After:  STRASSE
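
To show what "URL-encoded" means for this character, here is a rough sketch using PHP's built-in rawurlencode(). WordPress builds slugs with its own helpers, so this only approximates the effect and is not the actual slug output; the variable names are illustrative.

```php
<?php
// U+1E9E is three bytes in UTF-8 (E1 BA 9E), so an unmapped capital Eszett
// becomes three percent-encoded bytes. File assumed to be saved as UTF-8.
$unmapped = rawurlencode( 'STRAẞE' );
$mapped   = rawurlencode( 'STRASSE' );

echo $unmapped, "\n"; // STRA%E1%BA%9EE
echo $mapped, "\n";   // STRASSE
```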

## Use of AI Tools

This patch was developed with the assistance of Claude Code (Anthropic). The change, test case, and Trac ticket were reviewed and approved by the contributor before submission.

#2 @dmsnell
4 weeks ago

  • Milestone changed from Awaiting Review to 7.0
  • Type changed from enhancement to defect (bug)

#3 @dmsnell
4 weeks ago

  • Owner set to dmsnell
  • Resolution set to fixed
  • Status changed from new to closed

In 61855:

Formatting: Transform “ẞ” for German locales in remove_accents().

The capital Eszett was standardized in German orthography in 2017, DIN 5008, but WordPress has only been transforming the lowercase version.

This patch adds the uppercase variant to the list and transforms it to “SS” for more-appropriate slug and permalink generation.

Developed in: https://github.com/WordPress/wordpress-develop/pull/11188
Discussed in: https://core.trac.wordpress.org/ticket/64821

Props apermo, dmsnell.
Fixes #64821.
