khwilliamson
Contributor

In C, it is undefined behavior for a program to declare or define a symbol whose name begins with an underscore followed by a capital letter or a second underscore; such names are reserved for the implementation in every scope. Any other name beginning with an underscore (for example, one followed by a lowercase letter) is reserved at file scope. C++ likewise reserves every name with a leading underscore in the global namespace. Our headers need to be able to compile with C++, so the rule for headers is to never begin a symbol with an underscore. Some people compile core perl using C++, and sometimes we move symbols into headers. Therefore a reasonable rule is to not begin file-scoped symbols with an underscore.
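
To make the rule concrete, here is a minimal sketch of the naming convention. The names `count_set_bits_` and `count_set_bits` are hypothetical, chosen only for illustration; they are not symbols from the perl core.

```c
#include <stddef.h>

/* Reserved spellings, shown only in comments:
 *   _Helper    underscore + capital letter: reserved in every scope
 *   __helper   double underscore:           reserved in every scope
 *   _helper    underscore + lowercase:      reserved at file scope
 */

/* Hypothetical internal helper. The trailing underscore still marks it
 * as special, but the name is not reserved in any scope. */
static size_t count_set_bits_(unsigned int word)
{
    size_t count = 0;
    while (word != 0) {
        word &= word - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Public wrapper; callers never use the trailing-underscore name. */
size_t count_set_bits(unsigned int word)
{
    return count_set_bits_(word);
}
```

Because the trailing-underscore spelling is valid in any scope, the helper could later move into a header without being renamed.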

There are a hundred-ish symbols in the perl core that begin with an underscore, not all of them currently in file scope. This series of commits renames almost all of them to instead have a single trailing underscore (thus retaining a visual clue that these are special in some way). It skips the ones that already have a pull request in progress (submitted, or a WIP on my box) and the ones that are generated by scripts, as those are a bit more complicated. The symbols changed here are the ones that are simple to do.

Some of the symbols changed here are ones I introduced, out of ignorance of the C standard's wording on these.

Each commit changes a single symbol.

The consequences of them being the way they are now are minimal. There would be a symbol clash only if a C implementation changed to use one of our symbols, or if we got ported to a new C compiler that already uses one. The odds of these being problems are fairly low. Yet they are non-zero, and we do have existing symbol clashes with other software that they have had to work around.

I got tired of running into these symbols and being reminded that these aren't strictly legal, and that I was responsible for some of them. So I ended up with this p.r., removing nearly all of them at once.

The commits here change even symbols that aren't currently file-level, and hence are legal. I did this for several reasons:

  1. Consistency. This is not only for appearance: if most of the symbols had used trailing underscores when I started on this project, I would have gotten the hint to do so too. With this change, other people creating new symbols are less likely to create potentially clashing ones.
  2. Some people, including me, think the trailing underscore reads better generally.
  3. You never have to worry, when moving things around, about whether a symbol is going into file scope and hence becoming undefined behavior.

  • This set of changes does not require a perldelta entry.

@jkeenan
Contributor

jkeenan commented Aug 16, 2025

On a simple unthreaded build on Linux, this p.r. is failing for me here:

cc -c   -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Werror=pointer-arith -Werror=vla -Wextra -Wno-long-long -Wno-declaration-after-statement -Wc++-compat -Wwrite-strings -Wno-use-after-free -O2   -DVERSION=\"0.48\" -DXS_VERSION=\"0.48\" -fPIC "-I../.."  -DPERL_EXT_RE_BUILD -DPERL_EXT_RE_DEBUG -DPERL_EXT re.c
In file included from ../../perl.h:3315,
                 from re.xs:8:
../../sv_inline.h: In function ‘Perl_SvPV_helper’:
../../sv.h:1401:41: error: ‘svcur’ undeclared (first use in this function); did you mean ‘svcur_’?
 1401 |             &(((XPV*) MUTABLE_PTR(SvANY(svcur)))->xpv_cur);             \
      |                                         ^~~~~
../../handy.h:95:41: note: in definition of macro ‘MUTABLE_PTR’
   95 | #  define MUTABLE_PTR(p) ({ void *p_ = (p); p_; })
      |                                         ^
../../sv.h:1401:35: note: in expansion of macro ‘SvANY’
 1401 |             &(((XPV*) MUTABLE_PTR(SvANY(svcur)))->xpv_cur);             \
      |                                   ^~~~~
../../sv_inline.h:928:19: note: in expansion of macro ‘SvCUR’
  928 |             *lp = SvCUR(sv);
      |                   ^~~~~
../../sv.h:1401:41: note: each undeclared identifier is reported only once for each function it appears in
 1401 |             &(((XPV*) MUTABLE_PTR(SvANY(svcur)))->xpv_cur);             \
      |                                         ^~~~~
../../handy.h:95:41: note: in definition of macro ‘MUTABLE_PTR’
   95 | #  define MUTABLE_PTR(p) ({ void *p_ = (p); p_; })
      |                                         ^
../../sv.h:1401:35: note: in expansion of macro ‘SvANY’
 1401 |             &(((XPV*) MUTABLE_PTR(SvANY(svcur)))->xpv_cur);             \
      |                                   ^~~~~
../../sv_inline.h:928:19: note: in expansion of macro ‘SvCUR’
  928 |             *lp = SvCUR(sv);
      |                   ^~~~~
make[1]: *** [Makefile:340: re.o] Error 1
make[1]: Leaving directory '/home/jkeenan/gitwork/perl2/ext/re'
Unsuccessful make(ext/re): code=512 at make_ext.pl line 584.
make: *** [makefile:593: lib/auto/re/re.so] Error 2
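
One plausible reading of the diagnostic above (an assumption, not a confirmed diagnosis) is an incomplete rename: a temporary inside a macro was renamed to the trailing-underscore form, but a use of it in the macro body was not. The sketch below reproduces that pattern in miniature. `struct demo_body`, `DEMO_CUR`, and `demo_len` are hypothetical names and do not reflect the real definitions in sv.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for a string body; not perl's real XPV layout. */
struct demo_body {
    size_t cur;
};

/* An incomplete rename would declare the temporary as 'bodycur_' but
 * still read 'bodycur->cur' in the body, which fails to compile with
 * "'bodycur' undeclared; did you mean 'bodycur_'?".
 *
 * Consistent form: the declaration and every use share one spelling. */
#define DEMO_CUR(p, out)                              \
    do {                                              \
        const struct demo_body *bodycur_ = (p);       \
        (out) = bodycur_->cur;                        \
    } while (0)

/* Small wrapper so the macro is exercised from ordinary code. */
size_t demo_len(const struct demo_body *body)
{
    size_t len;
    DEMO_CUR(body, len);
    return len;
}
```

Since each commit renames a single symbol, a mismatch like this would be confined to one commit and should be straightforward to locate.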

@khwilliamson added the "Use merge commit" label (Don't merge this p.r. from github; it contains multiple related commits. Instructions in perlgit) on Aug 17, 2025
@bulk88
Contributor

bulk88 commented Aug 18, 2025

This commit should be immediately rejected. It is breaking over thirty years of C source code compatibility of Perl 5. This commit is as much vandalism as attempting a commit changing the P5P repo to the Raku 6 .tar.gz.

KHW first needs to find a commercial C compiler and its .h files, on some Linux, BSD, Unix, or Windows OS, that shows how the underscores in P5 C token names are harmful or break the building of the interp.

@khwilliamson
Contributor Author

> This commit should be immediately rejected. It is breaking over thirty years of C source code compatibility of Perl 5. This commit is as much vandalism as attempting a commit changing the P5P repo to the Raku 6 .tar.gz.

Perhaps you are entirely correct in your assessment. But you make this claim without any concrete evidence that would allow someone else to evaluate its accuracy.

@rjbs
Member

rjbs commented Aug 18, 2025

bulk88, I think your suggestion here is that somebody downstream might be using the _Foo names in their code, and could have their code broken. Is that right? If so, can you cite an example or provide more details of a likely scenario?

Also, calling it vandalism -- the deliberate destruction or defacement of property -- is not helpful. Nobody is going to believe that Karl is showing up here to vandalize perl.

@khwilliamson
Contributor Author

Unless I made a mistake, no code outside core control should be using these names. That is in fact the point of the underscores, to mark these as special, internal.
