-
Notifications
You must be signed in to change notification settings - Fork 589
Create new OP_MULTIPARAM
to implement subroutine signatures
#23574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Weirdly, while github CI and most commentors on IRC report test failures in In case it's somehow related to build options, the script I use to configure and build is: test -f config.sh && rm config.sh
test -f Policy.sh && rm Policy.sh
./Configure -des \
-Dusedevel \
-Dprefix=$HOME/perl5/perlbrew/perls/bleadperl/ \
-DDEBUGGING \
-Uversiononly \
-Dman1dir=none -Dman3dir=none \
-Dusethreads \
-Dinc_version_list=none \
-Doptimize='-gdwarf-2 -g3' \
"$@"
make -j8
make -j8 test_prep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message: "This two" should be "These two"
op.c
Outdated
*/ | ||
varop->op_next = defop; | ||
defexpr->op_next = varop; | ||
param->padix = allocmy("", 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perl_allocmy
currently does not support empty names because it blindly uses name[1]
. It needs to be changed to check len
first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocmy()
is just a wrapper around pad_add_name_pvn()
with some extra sanity checking and flag-setting (I think it exists just for a convenience from the parser). I've instead changed this code to call pad_add_name_pvn()
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding more asserts in #23583 which would have caught this
482ac37
to
e961b24
Compare
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: Perl#23574 (comment)
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: #23574 (comment)
e961b24
to
5dd6e3a
Compare
EXTEND(SP, (IV)(3 + nparams + 1)); | ||
mPUSHu(p->min_args); | ||
mPUSHu(p->n_positional); | ||
PUSHs(sv_2mortal(p->slurpy ? newSVpvf("%c", p->slurpy) : &PL_sv_no)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why run a 1 byte long string through a printf engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because offhand I don't believe we have a newSVpvc
function. Can you suggest an alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newSVpvn(&p->slurpy, 1)
?
Also, why are we mortalizing PL_sv_no
? Couldn't we do this (warning: untested code, written in the browser):
PUSHs(p->slurpy ? newSVpvn_flags(&p->slurpy, 1, SVs_TEMP) : &PL_sv_no);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very true, there is missing Perl in C infrastructure in this area. I've seen various builds from various decades of Spidermonkey/V8 having all 1 char long string objects, fo all of low 7 bit printable ASCII or 0x00-0x7F ,pre-generated burned into HW RO memory as C structs. Perl 5 calls it a SvIMMORTAL. Its funny because demerphq very quietly added something similar for all of 1 char long, 0x00-0xff, to Perl 5, like 8 years ago, but its always # define off by default and nobody has paid attention to the feature ever since. And Configure/config.h doesn't know about it.
So currently in perl 5, there are 2 normalish ways to do this. and maybe 1 @bulk88 high-speed way to do this, I'd have to research if the cpan visible exported constant array I'm thinking could be inside libperl actually exists.
I'd say option 1 is see 60% of the time, option 2 is 40% in core.
normal option 1
char c = 'z';
sv = newpvn(&c, 1);
normal option 2
sv = newpvn(" ", STRLENs(" ")); // or sv = newpvn("\0", STRLENs("\0"));
SvPVX(sv)[0] = 'z'; // safe b/c we know the above isn't going to do
// COW tricks on us (and if it does it probably needs another dedicated name)
// ive personally smoke tested with only 2 lines SEGVing after a make test
// in all of blead where newSVpvs("some hw const c str");
// doesn't make a Newx() block inside SvPVX() but a
// SvLEN() == 0 no cow or SvLEN() == 0 with COW
// POSIX::.pm's 2000+ newCONSTSUB_flags() calls on initial load
// badly needs this for `SvPVX(cv)`'s buffer which is always 1 byte long string `""`
// not 2000 16 byte Newx() blocks with CUR=0 and LEN=16.
bulk option that probably doesn't exist inside libperl, lack of infrastructure
// below probably needs to be CPAN visible, if the fantasy wishlist feature
// of nukeing a module's *main::MyPkg:: and then de-mmaping the
// XS .so and .dll files from address space at any time actually was
// safe, we'd rather not have CPAN shared libs creating this kind of
// SV * POK with a "dbl quote" hardware literal string from their .rodata
// instead of libperl's .rodata if we can help it
const char ** PL_1_ch_strs[256] = {..... ,"1", "2", "3", .... "a", "b", "c", ,,,,};
sv = newSVtype(SVt_PV);
//calling this below is really bad, this whole sequence needs a dedicated Perl_newSV*() func
// or some more secret bitfield flags for Perl_newSVpvn_flags();
// or don't be scared and just use SvPV_set(); SvCUR_set(); yourself
// SvLEN_set(,0); i belive is redundant, the body allocators zero out
// SV body all non ghost fields IME
usepvn(sv, PL_1_ch_strs[c], 1, SV_I_DID_ADD_THE_NUL_CHAR);
SvLEN(sv, 0);
SvFLAGS(sv) |= (SVf_ISCOW | SVs_STATIC);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better infrastructure, most or all src code instance of "abcd"
, "abc"
, "ab"
, "a"
need to be exterminated. Those are not strings, they are call integers or CPU registers.
sv = newSVpvc('exit');
sv = newSVpvc('INIT');
See what I did?
All LE CPU will need a secret byte order flipper macro inside the #define newSVpvc(_chrs)
macro.
The real .i
machine code decl of the above is
extern SV* Perl_newSVpvc_x(pTHX_ U32 chrs);
sv = newSVpvc('exitexit');
isn't ISO C and way too rare as a vendor extension to pretend that (a U64
multi char literal) exists anywhere on any C compiler Perl can target now, historically, or in the future.
sv = newSVpvc8('exit','exit');
is possible, day code we have such rapid short string comparison/parsing macros. newSVpvc8
internally on i386 is
extern SV* Perl_newSVpvc8u32x2_x(pTHX_ U32 chrlo, U32 chrhi);
and on x64
extern SV* Perl_newSVpvc8u64_x(pTHX_ U64 chrs);
https://www.altium.com/ 's ISO C99 compliant compiler for ARM32/64 has uint2_t
, uint3_t
, uint7_t
, and a whole bunch of funny stuff built into its official C grammar/BNF notation file. Its non-ISO extensions are all GCC-style syntax.
https://valhalla.altium.com/Learning-Guides/GU0122%20C-to-Hardware%20Compiler%20User%20Manual.pdf
I want to crack a joke after reading its entire manual, if Perl 5 was compiled with that C compiler, the output will binary describing transistor gates. There is now a Perl CPU executing the Perl ISA. The Perl CPU's socket pins are an RS-232 serial port called STDIN
.
I'm not sure after reading that C compiler's manual and IDE, if Altium can generate an ELF/EFI binary of ARM machine code or not or it only generates transistor gates for random unmodified desktop POSIX software, and that transistor gate language can only execute in BOCHS
or QEMU
, or be emailed to Taiwan to get back as an actual microchip (big $$$).
Altium's C99 compiler has USA and EU govt certificates to generate GUI-aware binaries that draw automobile digital speedometers and aircraft pilot LCD GUIs. Enough said who their customers are. The last time I rented a Nissan, I found out the radio was a 5 year old no-Google Android 6.0 OS, on a "this year" 4 month old car.
Nissan bought the cheapest android SOC chips they could find, that were heading to be melted or burned anyway, because they were obsolete, and nobody, not at any cost, will solder them into a phone, not even a $40 phone.
The Nissans's dashboard has a 1 way serial port that pushes data to Android 6 (tire pressure icon, etc), the radio has absolutely no electrical way to talk back to the GUI on the instrument console. Proper design IMO.
{ | ||
struct op_multiparam_aux *p = (struct op_multiparam_aux *)aux; | ||
UV nparams = p->n_positional; | ||
EXTEND(SP, (IV)(3 + nparams + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the MEXTEND mortal stack stretcher macro here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other cases (OP_MULTIDEREF
, OP_MULTICONCAT
) don't do those either; I was following similar structure. Perhaps in a separate commit we could add that later. Though this is only during B
's inspection, it's not a hot performance path. I think the slight performance cost is fine for the simpler code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because bad code exists elsewhere in core already, is not a reason to copy the bad code to new code in new places and cite the bad code as proof your new code is correct because nobody had to time to refactor the bad code over the years.
Write new code correctly to the Perl C API the first time so nobody has to clean it up after you.
nparams
is a fixed known ahead length. realloc()
the mortal stack ONCE with MEXTEND()
and don't bounds check and then possibly grow the stack in units of 1 SV*
at a time, on every iteration through the loop.
Its as silly as writing this in C.
STRLEN i=1;
STRLEN nparams;
char * pv = SvPV(sv, nparams);
char * ptr = malloc(1);
while (i <= nparams) {
ptr = realloc(ptr, i++);
}
for(UV parami = 0; parami < nparams; parami++) | ||
mPUSHu(p->param_padix[parami]); | ||
mPUSHu(p->slurpy_padix); | ||
XSRETURN(3 + nparams + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace with PUTBACK; return;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, just following similarity with the other cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dangerous C code.
XSRETURN()
is only for use with with the ST(1234)
absolute index assignment macro.
or these macros
#define XST_mIV(i,v) (ST(i) = sv_2mortal(newSViv(v)) )
#define XST_mUV(i,v) (ST(i) = sv_2mortal(newSVuv(v)) )
#define XST_mNV(i,v) (ST(i) = sv_2mortal(newSVnv(v)) )
#define XST_mPV(i,v) (ST(i) = sv_2mortal(newSVpv(v,0)))
#define XST_mPVN(i,v,n) (ST(i) = newSVpvn_flags(v,n, SVs_TEMP))
#define XST_mNO(i) (ST(i) = &PL_sv_no )
#define XST_mYES(i) (ST(i) = &PL_sv_yes )
#define XST_mUNDEF(i) (ST(i) = &PL_sv_undef)
_*_PUSH_*_()
class macros are already tracking the number of SV*
you wrote to your portion of the stack.
You"ve never read
https://perldoc.perl.org/perlxs#Returning-SVs,-AVs-and-HVs-through-RETVAL
or
https://perldoc.perl.org/perlguts#Subroutines
before writing this huge commit.
I wouldn't blame anyone for saying those 2 sections are unreadable, because the first time I read them a long time ago they were unreadable to me. In 2025 they are still unreadable for me. They confuse new people more than they help new people.
So if after reading the official API docs, you still don't understand, go watch my videos on how to use the Perl Stack https://www.youtube.com/results?search_query=%22Writing+XS+in+Plain+C%22
I included graphics explaining it.
if(namepv) | ||
sv_catpvf(retsv, ":%s=%" UVf, namepv, paramidx); | ||
else | ||
sv_catpvf(retsv, ":(anon)=%" UVf, paramidx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define PadnamePV(pn) (pn)->xpadn_pv
#define PadnameLEN(pn) (pn)->xpadn_len
#define PadnameUTF8(pn) 1
#define PadnameSV(pn) newSVpvn_flags(PadnamePV(pn), PadnameLEN(pn), SVs_TEMP|SVf_UTF8)
":(anon)=%"
can be de-duped to 1 branch above with ":(%s)=%". "anon",
in theory PadnameLEN(pn)
and sv_catXXX()
family calls should be taken advantage of here, rather than repeatedly going through the printf engine.
IDK and IDC enough, and probably its impossible to write a bug ticket with a failure/defect demo, about it, but I see UTF8 flag is perma-on in the data source API, but we arent propagating it to the higher level. And this is XS::APItest anyways, so perfection isnt critical. but someone might look at this in the future for "best practices" ideas, and then copy paste quicky hacky code into a more visible API.
lib/B/Deparse.pm
Outdated
# Look for OP_NULL[OP_PARAMTEST[OP_PARAMSTORE]] | ||
if ($o->name eq 'null' and $o->flags & OPf_KIDS and | ||
$o->first->name eq 'paramtest' and | ||
$o->first->first->name eq 'paramstore') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$o->first
cache the retval of the method or key ($first = $o->first)->name
.
|
||
length $sig[$parami] > 1 ? | ||
( $sig[$parami] .= ' ' ) : | ||
( $sig[$parami] = '$' ); # intentionally no trailing space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont look up same thing over and over in an array, do a $sref = \$sig[$parami];
or $s = $sig[$parami];
. probably the first is easier in this sub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance improvement of that indirection will be absolutely miniscule given this is deparse of a signatured sub; hardly a hot path. I prefer the clarity of the code as written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, you can click close / hide on your side for this bubble.
|
||
my $defop = $paramtest->first->first; | ||
if ($defop->name eq "stub") { | ||
$sig[$parami] .= "$assign"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y the ""s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symmetry with the other case below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quoting poorly written old doesn't make it right.
You don't need "Name $name\n"
interpolation here so don't use "dbl"
quotes. Perl Parser/yylex
/toke.c
have to do more work on startup to parse a ""
string lit vs a ''
string lit. Perl 5 doesn't have AOT compilation.
I have no idea how $sig[$parami] .= "$assign";
is any different from $sig[$parami] .= $assign;
. It looks like a bug.
If .= "$assign";
has a technical rational like XSS/bobby drop tables/GetMagic/overload.pm/taint -T, you need to document it the technical reason for future people to know.
char slurpy; /* the sigil of the slurpy var (or null) */ | ||
OP *elemops; /* NULL, or an OP_LINESEQ of individual element and fence ops */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move field char slurpy; to the bottom of the struct. don't have hidden alignment holes.
* zero or more arguments. | ||
*/ | ||
UV next_argix; /* the argument index of the next parameter we add */ | ||
UV opt_params; /* number of optional scalar parameters */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using 64 bit integers to store an arrays length on i386 CPUs? Since when is void *
8 bytes long on a Pentium 3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking a little bit more, I think this API should be counting things not in types STRLEN/Size_t, but in type U16 (perfection design/ideal design) or type I32/U32 (some user's bizzare and unreasonable definition of "production code"). Since this API represents pieces of .pl/.pm src code should be U32 is the maximum non-fuzzing/non-pentesting amount of named @_ args legally possible. Above that, above 4.2 billion named my $named+args_from_AT_UNDERSCOE;
, yy_lex()
is being fed infinite PP src code nonsense on FD STDIN
generated by another pen testing .pl
script running in another perl process on a 64 bit CPU.
So these counts ideally should be U16 on all 32 and 64b CPUs. With appropriate overflow/fatal effor bounds checks as needed, Or a compromise design for unrealistic PP src, code, of type U32 on all 32 and 64b CPUs.
Some years ago, there was a legit production PP code ticket filed and fixed, so 64b Perl can do this $obj->meth(@{$ref_to_arr_with_4_dot_5_billion_scalars});
. But this API in this PR, represents fixed length, named, incoming arguments,.
It doesn't represent or implement basic day 1 of Perl lang @_
variable length upto infinity (aka OOM) incoming arguments feature. So it doesn't need to count to 4.9 billion SV*
s on 64bit CPUs. Type U16
or max U32
should be used. Anything larger type is just creating a permanent ocean of 0x00
bytes that are wasteful. So no Size_t
, it should be U16
or I32
or U32
.
I32
can be a good choice efficient choice if negative non-0 value or sign bit on is used to mean some specific behavior vs the behavior of a positive non-0 value.
Perl C API historically likes to like negative I32
s or negative SSize_t
to mean the object's string/array/contents are UTF8-on/UTF8-yes. instead of creating a whole new U8 or U32 somewhere, just to transport a single 1/0 bit, which indicate UTF-yes vs UTF-no flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use UVs prettymuch universally for these sorts of things everywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't know the difference between type IV
and type size_t
you shouldn't be writing C code ever.
That also means you have no clue how bytes long/bits long any of C's fundamental builtin scalar types are.
Note Perl only compiles on not-antique real OSes and real CPUs that normal people can buy with cash. So Intel 8088/286 CPUs, 29 bit CPUs and 69 bit CPUs don't apply to the Perl 5 project, regardless if they are allowed by the C language spec in 2025.
if(signature->elemops) | ||
op_free(signature->elemops); | ||
if(signature->params) { | ||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i386 doesnt have 8 byte pointers, use Size_t.
/* handle '$=' special case */ | ||
if(varop) | ||
if(padix) | ||
yyerror("Optional parameter lacks default expression"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was struct yy_parser_signature_param *param = subsignature_push_param();
leaked if this error branch was entered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's been stored in the params
array of the current struct yy_parser_signature
, which will be tidied up on unwind by the deferred call to destroy_subsignature_context()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm then
op.c
Outdated
UV end_argix = signature->next_argix; | ||
|
||
struct op_multiparam_aux *aux = (struct op_multiparam_aux *)PerlMemShared_malloc( | ||
sizeof(struct op_multiparam_aux) + end_argix * sizeof(PADOFFSET)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I know PEMDAS but a () would be nice and make it easier to read, on what gets multiplied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. Will add.
struct yy_parser_signature_param *params = signature->params; | ||
UV max_argix = 0; | ||
|
||
for(UV parami = 0; parami < signature->nparams; parami++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no 64 bit pointers on ARM32/i386.
PADOFFSET padix = param->padix; | ||
|
||
while(max_argix < argix) { | ||
aux->param_padix[max_argix] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't double deref in a loop. CC has no idea if &(aux->param_padix)
and &(aux->param_padix[5])
happened to be the same mem addr.
IDK what realistic values are for max_argix, but if its over 16 x U32/U64 or 32 x U32/U64 call libc's memset aka perl Zero(), below that, this loop will win in speed b/c its move 4/8 aligned bytes at a time, while all libc memsets, need 4-8 branches to figure out alignment and if to use 2/4/8/16 sse/32 avx/64 avx512 loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As written elsewhere, honestly I don't expect many signatured functions to ever have more than about 4 actual parameters. People don't usually write the sort of code that declares hundreds of parameters. Optimising for large memset-style operations is pointless here, vs the clarity of the code as written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would 256 or 65K aka 2^16 or U32 hard coded Perl API limits be appropriate here?
Keeping 6 bytes or 4 bytes of null bytes times 2-4x per struct times 100s/1000s of malloc structs in each perl process becomes very wasteful. CPU cache reasons mostly nowadays, RAM, eghhh, private bare metal you personally own has GBs nowadays. But Heroku/AWS/and its clones DO NOT have GBs and GBs worth of free HW RAM. They have 128MB/256MB of phy ram per customer VM. So this definitely still matters nowadays.
So coding it correctly whether its 2005 or 2025 or 1995 needs to be done. This is the interp, not a random CPAN/PAUSE tarball with XS code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://stackoverflow.com/questions/4582012/maximum-number-of-parameters-in-function-declaration
C/C++ say 128 I8 and 256 U8 are good enough max function argument (or wont C/C++ compile) limits for everyone.
I would say for bizzare/fuzzing/trolling use cases, 65K arguments in a prototype is enough of a limit. U32 4 billion ISO C arguments or PP Perl 5 arguments in a prototype, needs a demo use case from the bug reporter of how they the bug reporter have the HW/CPU wall time/Perl interp arch limits to load through STDIO from disk, PP parse/compile that .pm
, and enter the runloop without a SEGV or a Windows/Linux kernel OOMing them for reaching 8/16/32GBs of malloc() blocks backed by swap file storage inside 1 perl process. Note in 2020-2022 someone did file a real bug ticket that they wanted 4.5 billion SV*
s from an AV*
(my $ret = $self->somemeth(@{$arrref_to_4_5_billion_scalars});
) to be sent through @_
in a subroutine on 64 bit CPUs. That was feture was implemented with tests backing it for stable Perl.
4.5 billion SVs in perl array at runtime sounds reasonable for ETL workloads, but a PP source code file would need to be how big of a disk file to declare 4.5 billion PP variables/arguments?
Just bounds check the API on source code ascii string intake and cap it to U16 I16 or I32 U32. Which developer is going to hand type the values for 71000 arguments each time to invoke a Pure Perl method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perl 5's current anti abuse/anti fuzzing limits are mostly 1 byte U8 limits. I searched the codebase for earlier anti abuse "reasonable PP src code" decisions.
https://github.com/Perl/perl5/blob/blead/regcomp.h#L394 2 billion limit
https://github.com/Perl/perl5/blob/blead/perl.h#L1601 100 limit
https://github.com/Perl/perl5/blob/blead/perl.h#L1612 64 commas in a row limit
https://github.com/Perl/perl5/blob/blead/mro_core.c#L255 100 limit
https://github.com/Perl/perl5/blob/blead/perl.h#L9265 1000 limit for subroutine invoke opening (
s or {
s in a row
UV n_positional; /* = the number of mandatory + optional scalar parameters, not counting a final slurpy */ | ||
char slurpy; | ||
PADOFFSET *param_padix; /* points at storage allocated along with the struct itself, immediately following */ | ||
PADOFFSET slurpy_padix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort all struct fields by alignment and size, dont use 64 bit ptrs on 32 bit CPUs
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ | ||
for(IV ix = startix; ix < endix; ix++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AV*s use SSize_t not IV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved the previous code, which also used an IV
. I wasn't going to change the type because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just moved the previous code, which also used an
IV
. I wasn't going to change the type because of that.
You didn't move the previous code, you rewrote it.
This is the previous code.
/* must be AV or HV */
assert(!(o->op_flags & OPf_STACKED));
argc = ((IV)AvFILL(defav) + 1) - ix;
/* This is a copy of the relevant parts of pp_aassign().
*/
if ((o->op_private & OPpARGELEM_MASK) == OPpARGELEM_AV) {
IV i;
if (AvFILL((AV*)targ) > -1) {
/* target should usually be empty. If we get get
* here, someone's been doing some weird closure tricks.
* Make a copy of all args before clearing the array,
* to avoid the equivalent of @a = ($a[0]) prematurely freeing
* elements. See similar code in pp_aassign.
*/
for (i = 0; i < argc; i++) {
SV **svp = av_fetch(defav, ix + i, FALSE);
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef,
(SV_DO_COW_SVSETSV|SV_NOSTEAL));
You changed old code
for (i = 0; i < argc; i++) {
to new code
for(IV ix = startix; ix < endix; ix++) {
That was not a whitespace change and was not a copy paste move.
This is a full blown rewrite/refactoring/cleanup change. You can't rename every C auto variable in a function, but still leave in all the prior coding errors from the past in the new code.
If the src line is a green shadow, the commit author takes responsibility for the contents of the green shadow.
If there are "legacy code" problems in the green shadow, the author needs to document in a comment why they are not correcting errors in the code in the new commit/diff file. The comment should include 1 of these words TODO
FIXME
BUG
to let future people know this pageful of code currently has minor coding problems, that need to be cleaned up, or watched out for by future devs who whose eyes land on that page of code, while they are hunting down or diagnosing some random bugs/defects/flaws, related or unrelated.
The terms "zero extend", "sign extend", "truncation", "integer promotion", and "signed vs unsigned comparison", are the usual C "criminals" involved in committing the crime (bug ticket) being in investigated.
AV*
s type is SSize_t
Line 337 in 945b008
Perl_av_store(pTHX_ AV *av, SSize_t key, SV *val) |
declaring a variable of any type except SSize_t
is always wrong.
This is not JavaScript code. This is C code.
C doesn't not have 1 and only 1 numeric class a dev can use. C does not come with a variable type called Number
.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number
{ | ||
for(IV ix = startix; ix < endix; ix++) { | ||
SV **svp = av_fetch(av, ix, FALSE); | ||
SV *newsv = newSVsv_flags(svp ? *svp : &PL_sv_undef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes alot of work for sv_setsv_flags()
to figure out &PL_sv_undef
is empty. This isn't Chrome or NodeJS, its C. It won't constant fold. Grep the perl repo for the correct way to make a new read write undef SV*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I just moved the existing code. We can tidy it up later.
#define av_refresh_elements_range(av, startix, endix) S_av_refresh_elements_range(aTHX_ av, startix, endix) | ||
STATIC void | ||
S_av_refresh_elements_range(pTHX_ AV *av, IV startix, IV endix) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the av_extend()? so we aren't constantly going inside libc's realloc every 1-5 SVs. I forgot the exact increment unit av_store() does on a grow event, but even wanting know that count is UB/bad coding practices. av_extend() it once. We know the max length, we aren't iterating a HV or a SQL DB or a readdir() or something off an AJAX socket here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to av_extend()
. We don't alter the size of the array, we're just refreshing SVs at certain indices within it. This is an in-place alteration that preserves the size.
pp.c
Outdated
|
||
UV parami; | ||
for(parami = 0; parami < nparams; parami++) { | ||
PADOFFSET padix = aux->param_padix[parami]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor out of the loop reading's struct aux's param_padix field over and over , nothing will be reallocing it afaik
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, will do.
} | ||
|
||
SV **valp = av_fetch(defav, parami, FALSE); | ||
SV *val = valp ? *valp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is valp retval going to be NULL?
SV*
val should be set to NULL instead of &PL_sv_undef
here if SV**
was NULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good question. The same question could be applied to pp_argelem
where I copied this from. Perhaps there could be holes in @_
. It's trying to act defensively and treat those holes as just undef
, as it should.
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
SvSetMagicSV(*padentry, val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go read the source code for macro
SvSetMagicSV()
and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
Also from the comment above, if right side SV is NULL, there is a better way to take an existing SV ptr and set it to undef. grep the repo's root .c and .h files to learn how. Since you want to fire SetMagic here (why and when can there be SMG here?), make sure the correct set to undef macro or fn either fires SMG for you or fire SMG yourself right after it returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll have to explain more what you mean by that comment.
In any case again this is all behaviour copied from the existing pp_argelem
which I observe has run without reported issue for the past decade or so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 2492 in 3bd2e0d
#define SvSetMagicSV(dst,src) \ |
Line 2464 in 3bd2e0d
#define SvSetSV_and(dst,src,finally) \ |
now read
Line 2468 in 3bd2e0d
if (LIKELY((dst_) != (src_))) { \ |
Read line 2468 carefully, Read line 2468 a 2nd time. Now look at this line.
SvSetMagicSV(*padentry, val);
Go read the source code for macro SvSetMagicSV() and tell me if its optimizations apply or don't apply here. You need to set a breakpoint in your C debugger and examine left and right side SV ptrs to fix this line either through a code change or a comment saying you verified certain behavior can (if so when and % chance of it happening in typical production code), or that certain behavior will always happen here.
Answer above question. What are the addresses inside (*padentry, val
?
|
||
if(av_count(av)) { | ||
/* see "target should be empty" comments in pp_argelem above */ | ||
av_refresh_elements_range(defav, parami, parami + argc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should retval of av_count(av) be passed as arg 4 to av_refresh_elements_range()? does av_refresh_elements_range() need that info? dont look it up twice. IDK enough to say what to do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous code I copied it from didn't make use of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then nvm
if (UNLIKELY(TAINT_get) && !SvTAINTED(val)) | ||
TAINT_NOT; | ||
|
||
av_store(av, avidx++, newSVsv(val)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newSVsv is not cow aware, see your code from above how to make it cow aware. dont use 64b ptrs on i386/arm32. newSVsv(&PL_sv_undef) is not how to alloc a new SV with value undef. use av_simple variants unless TIEARRAY can happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm; that's unfortunate. Given how often we'd want to do "the normal thing", perhaps it suggests we should make a COW-aware version; maybe newSVsv_cow()
or somesuch? It gets a pain to have to remember the exact names of all these flags every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your right on that one too, but the question why #ifdef PERL_CORE
newSVsv()
has COW disabled
Line 2630 in 3bd2e0d
#define newSVsv(sv) newSVsv_flags((sv), SV_GMAGIC|SV_NOSTEAL) |
Line 2165 in 3bd2e0d
#ifdef PERL_CORE |
Line 2271 in 3bd2e0d
#define sv_setsv(dsv, ssv) \ |
but sv_setsv(dsv, ssv)
in the same exact PERL_CORE
compile enviroment has COW turrned on, is something that has been nagging my brain for a while.
The topic "Why?" someone, maybe me, maybe you, maybe richard, really needs , its own new GH ticket about the question, and then a deep dive done to answer, the ? is it safe yet to make newSVsv()
make COW 255s everywhere or not.
There might be slightly bit rotten areas of libperl, where some C code, blindly does
nsv = newSVsv(oldsv);
Renew(SvPVX(nsv), SvCUR(nsv)+3, char);
SvLEN_set(nsv, SvCUR(nsv)+3);
SvPVX(nsv)[SvCUR(nsv)] = ',';
SvCUR_set(nsv, SvCUR(nsv)+1);
There are 2 problem lines in libperl, I know of, where SVt_PVCV
only code calls Safefree(SvPVX(cv));
that I've had to fix, for my unsubmitted PRs, to make sub AUTOLOAD
have its SvPVX(cv)
SvPOK_on(cv)
allow HEK COWs and COW 255s.
Your and my question needs a deep dive done on the topic.
argc -= 2; | ||
|
||
if (UNLIKELY(SvGMAGICAL(key))) | ||
key = sv_mortalcopy(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define sv_mortalcopy(sv) Perl_sv_mortalcopy_flags(aTHX_ sv, SV_GMAGIC|SV_DO_COW_SVSETSV)
looks good on this line
SV **svp; | ||
|
||
svp = av_fetch(defav, parami, FALSE); parami++; | ||
SV *key = svp ? *svp : &PL_sv_undef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What on earth does hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
mean?
how about executing hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
5 or 9 times in a row, each time with a different addr in var "val"? what does that mean? why is the previous sentence valid correct behavior and not a panic/heap corruption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A SEGV here would be just as good and more efficient than an actual croak()
fn call located here and its C string that nobody will ever see unless they use a disassembler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What on earth does
hv_store_ent(hv, &PL_sv_undef, newSVsv(val), 0);
mean?
Same thing as
my $key = undef;
$hv{$key} = $val;
I.e. treat the key as an empty string but provoke the "uninitialized" (sic) warning. See hv.c
line 545.
Executing it 5 or 9 times in a row will have the effect of storing a copy of the final val in the hash, having first thrown away the first 4 or 8 temporary copies.
Exactly as per
$hv{+undef} = $_ for @five_or_nine_values;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, so its to start making console noise, sounds ok to me
|
||
if(!ok) | ||
return cLOGOP->op_other; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic tree makes no sense. test each c auto or malloc backed variable exactly once in this if/else tree. var
OP*
PL_op should be saved to a C auto at the very top of this PP func to stop multiple rereads.
PL_op is an lvalue but it is not a variable. If you believe its a C variable goto the top of pp.c and add this line. it will be harmless.
#include "EXTERN.h"
#define PERL_IN_PP_C
#include "perl.h"
#include "keywords.h"
#include "invlist_inline.h"
#include "reentr.h"
#include "regcharclass.h"
#undef PL_op /* <<<<< Add this line and push this commit to blead, its a C variable right? */
/* variations on pp_null */
PP(pp_stub)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what most of that comment is about, but I'll pull out the PL_op->op_private
into a U8 priv
as many ops seem to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dTARGET;
U8 priv = PL_op->op_private;
bool ok = TARG && !SvPADSTALE(TARG); //read one
if (ok && (priv & OPpPARAM_IF_UNDEF) && !SvOK(TARG)) // read two
ok = false; // write three
if (ok && (priv & OPpPARAM_IF_FALSE) && !SvTRUE(TARG)) // read four
ok = false; //write five
if(!ok) // this line is unreadable, how did we get here?
return cLOGOP->op_other;
it should look something like this but check my logic yourself I mightve made subtle mistakes.
#define dTARGET SV * GETTARGET // for reference
#define GETATARGET targ = (PL_op->op_flags & OPf_STACKED ? sp[-1] : PAD_SV(PL_op->op_targ))
/////////////
#define cLOGOP cLOGOPx(PL_op)
PP(pp_paramtest)
{
dTARGET; // fuses CSE with next line
OP *op = PL_op;
if(TARG && !SvPADSTALE(TARG)) {
U8 priv = op->op_private;
if(priv & OPpPARAM_IF_UNDEF) {
if (!SvOK(TARG))
return cLOGOPx(op)->op_other;
}
// can OPpPARAM_IF_UNDEF and OPpPARAM_IF_FALSE be on at the same time?
else if(priv & OPpPARAM_IF_FALSE) {
if( !SvTRUE(TARG)))
return cLOGOPx(op)->op_other;
}
}
return op->op_next;
}
also change if( !SvTRUE(TARG)))
to if( !SvTRUE_NN(TARG)))
for efficiency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok = false;
P5P source code doesn't use C++'s all lower case false
, instead P5P source code uses all upper case FALSE
inside the interp. FALSE
is "some .h file"'s #define
, maybe P5P's #define
, maybe host OS's #define
, maybe P5P did a #undef FALSE \n #define FALSE ((unsigned char)0)
on some CC. Who knows but its alot more portable and following Perl C API coding style.
false
is probably a BNF grammer/lexer token that can not be controlled by any .h
file, or a C++ overloaded operator/overloaded literal/overloaded class.
Who knows what false
really is on every current CC ancient CC and future CC operation in pedantic C, vendor extension C, pedantic C++ and vendor extension C++, on any version of Windows, VMS , Z/OS ,ancient commercial Unix, FOSS Unix, or future commercial Unix (Apple).
Easier and more simple to not think what false
does, or means, or try to answer all the permutations of all the questions above.
Perl 5 in C code just writes FALSE
everywhere, not false
. Problem solved.
Perl 5 C code doesn't use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false
became available for use as a result of requiring C99. New code does use the lower case version. It's not a problem to use the lower case.
SV *value = POPs; | ||
|
||
SvPADSTALE_off(TARG); | ||
SvSetMagicSV(TARG, value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
research with a step debugger and BPs if the macro is appropriate or should be sv_setsv(dsv, ssv) + SvSMG(dsv)
|
||
assert(PadnamePV(PadnamelistARRAY(PL_comppad_name)[padix])[0] == sigil); | ||
|
||
param->padix = padix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a /* this is a different struct member */
or /*different var */
comment, or maybe rename padix
the c auto to pad_ix
. these 2 lines took too many seconds with eyeballs to figure out they arent the same line accidentally written twice. makes it easier to read by future ppl.
5dd6e3a
to
0f590aa
Compare
I have made some slight improvements based on a few of @bulk88's comments that I could understand and actually made sense. Most of the rest I couldn't see the relevance of (including a long rambling about C compilers and Nissan cars?!), or other things such as subtle details about possibly-64bit UV values used as array indexes that were copied from other code anyway. Most of the rest of the latter may be valid things to look into, but were already present in the code before I started. They should be looked at separately, either as commits before this one is merged, or afterwards. |
Recent discussons on github [1] found a bug when calling this function as allocmy("", 0, 0) This ought not be allowed. The length must be at least 2, because the function checks the first two characters of `name`. [1]: Perl#23574 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks reasonable, though I think bulk88 had some good points about UV vs variants of size_t
, and possibly on the SvSetMagicSV().
But this is safe and these can be fixed later after investigation.
|
||
if(signature->nparams >= signature->params_size) { | ||
signature->params_size *= 2; | ||
Renew(signature->params, signature->params_size, struct yy_parser_signature_param); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this memblock lasts a very short time on startup right?
So signature->params_size *= 2;
is over-allocating as a performance optimization right?
I dont see a problem with that, I just want a confirm this over-alloc-ing is intentional and not a underflow/overflow bounds checking accident.
op.c
Outdated
signature->elems = 0; | ||
signature->optelems = 0; | ||
signature->next_argix = 0; | ||
signature->opt_params = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch Newx() to Newxz() above instead of DIY zeroing. The authors of Foo LibC know malloc ptrs are aligned and the exact internal/secret alignment (which can be higher than type __m128i
or NV
) and can be zeroed in units of 4, 8, 16 (SSE) or 32 (AVX) per CPU ins, so they can/may/could/do zero memory faster than you can, and def faster than memset() can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah fair.
OP *paramstore; | ||
OP *paramtest = (OP *)alloc_LOGOP(OP_PARAMTEST, | ||
paramstore = newUNOP(OP_PARAMSTORE, 0, defexpr), | ||
defexpr_start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would indent this a dif way like
OP *paramtest = (OP *)
alloc_LOGOP(OP_PARAMTEST,
paramstore = newUNOP(OP_PARAMSTORE, 0, defexpr),
defexpr_start);
or (chk sample for precedence bugs)
OP *paramstore = paramstore = newUNOP(OP_PARAMSTORE, 0, defexpr);
OP *paramtest = (OP *)alloc_LOGOP(OP_PARAMTEST, paramstore, defexpr_start);
assert(!SvMAGICAL(defav)); | ||
UV argc = (UV)(AvFILLp(defav) + 1); | ||
|
||
S_check_argc(aTHX_ argc, aux->params, aux->opt_params, aux->slurpy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop S_check_argc()'s 1st argument argc
, move and dedupe this code (2 places)
AV *defav = GvAV(PL_defgv); /* @_ */
assert(!SvMAGICAL(defav));
UV argc = (UV)(AvFILLp(defav) + 1);
into S_check_argc, then change ret type of S_check_argc() from void
to UV
(discussed elsewhere).
UV argc = S_check_argc(aTHX_ aux->params, aux->opt_params, aux->slurpy);
The previous names were just copied from similar fields in previous signature handling code. These new names are a better fit for their current and intended future purpose.
By defining a helper term for optional defaulting expression we can remove one of the three alternate cases down to just two. This will help when adding more code to these action blocks in future,
These two new helper functions will be useful for sharing behaviour with upcoming code, when adding OP_MULTIPARAM.
…gument processing The specific behaviour of this obscure modification case is not documented or relied upon elsewhere in code. Rather than attempt to preserve this cornercase, it's easier just to no longer test for it, as upcoming changes will alter the values that are visible.
0f590aa
to
97e435b
Compare
Creates a new UNOP_AUX op type, `OP_MULTIPARAM`, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops like `OP_MULTIDEREF` and `OP_MULTICONCAT` where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation. Also adds a LOGOP, `OP_PARAMTEST` and UNOP `OP_PARAMSTORE` which are responsible for implementing the default expressions of optional parameters. These use the `SvPADSTALE` flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack. This change is carefully designed to support two future ideas: * Named parameters as per PPC0024 https://github.com/Perl/PPCs/blob/main/ppcs/ppc0024-signature-named-parameters.md * "no-snails"; the performance optimisation that avoids storing incoming argument values in the `@_` AV and instead consumes them directly from the stack
97e435b
to
5d78994
Compare
Creates a new UNOP_AUX op type,
OP_MULTIPARAM
, that handles all of the initial behaviour of assigning values to parameters of a subroutine signature out of values passed by the caller. This is created in a similar style to other multi-ops likeOP_MULTIDEREF
andOP_MULTICONCAT
where the op's aux structure contains a sub-program of sorts, which describes all of the small details of operation.Also adds a LOGOP,
OP_PARAMTEST
and UNOPOP_PARAMSTORE
which are responsible for implementing the default expressions of optional parameters. These use theSvPADSTALE
flag set on pad lexicals used as parameters to remember whether assignment has happened, ensuring that missing vs present-but-undef can be detected in a way that does not depend on counting arguments on the stack.This change is carefully designed to support two future ideas:
Named parameters as per PPC0024
"no-snails"; the performance optimisation that avoids storing incoming argument values in the
@_
AV and instead consumes them directly from the stack--
This set of changes currently does not have a perldelta entry, because it doesn't directly make any changes that are end-user visible. However, since it is quite a significant internal change, I could be talked into writing an entry in the "internal changes" section anyway. Reviewers: Thoughts?