> >> #ifdef USE_ITHREADS
> >> /* XXX: perhaps we can optimize this further. At the moment when
> >>  * perl w/ ithreads is used, we always deparse the anon subs
> >>  * before storing them and then eval them each time they are
> >>  * used. This is because we don't know whether the same perl that
> >>  * compiled the anonymous sub is used to run it.
> >    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > Can you explain under what circumstance it actually happens, or even
> > matters? AFAICT the sub is compiled into an optree as soon as the
> > module is loaded.
[...]
> Very simple: modperl can't store those CVs, since it can store only one CV per handler, but there are many interpreters and each one of them will have a different CV of the same compiled anon sub.
>
> There are several cases we need to deal with.
>
> 1) compile time definition in the perl module
>
>   $r->push_handlers(PerlTransHandler => sub { .... });
>
> there is no way this can be precompiled before the run time.
>
> 3) .conf inlined handlers:
>
>   #httpd.conf:
>   PerlTransHandler 'sub { ... }'
>
> Here the handler is not compiled until the run-time
Thanks Stas for the clear explanation. Although I don't fully understand how the interpreter pools operate yet, I'm still of the opinion that the best way to resolve this problem is to ensure the same interpreter always runs whatever extra hooks it creates (eg pool cleanups, filters/handlers added, etc.).
> Now, let's say we could somehow store those CVs in each interpreter, rather than mod_perl's handlers (which I can't see how is it possible w/o turning anon subs into named, which may have some bad side-effects). We still have problems:
>
> it's possible that the module was compiled during PerlTranshandler, by interpreter A, but interpreter B was selected to run PerlResponseHandler (which is the anon sub). What do we do now? B has no idea what that anon-sub is, as it didn't compile it.
Is it really ok for interpreter B to deparse A's sub here?
> Thanks Stas for the clear explanation. Although I don't fully understand how the interpreter pools operate yet,
Feel free to ask additional questions, Joe. I have explained that several times on the modperl list, but never had a chance to thoroughly document it. One day...
> I'm still of the opinion that the best way to resolve this problem is to ensure the same interpreter always runs whatever extra hooks it creates (eg pool cleanups, filters/handlers added, etc.).
That will be only possible if you drop the pools functionality, the interpreter scopes, etc.
I was thinking that something is flawed in my explanation, but no, I've checked mp1 source and in mod_perl.c it has:
#if 0
    /* XXX: CV lookup cache disabled for now */
    if(!is_method && defined_sub) { /* cache it */
        MP_TRACE_h(fprintf(stderr,
                           "perl_call: caching CV pointer to `%s'\n",
                           (anon ? "__ANON__" : SvPV(sv,na))));
        SvREFCNT_dec(sv);
        sv = (SV*)newRV((SV*)cv); /* let newRV inc the refcnt */
    }
#endif
so I suppose caching of anon CVs compiled at run-time never worked.
>> Now, let's say we could somehow store those CVs in each interpreter, rather than mod_perl's handlers (which I can't see how is it possible w/o turning anon subs into named, which may have some bad side-effects. We still have problems:
>>
>> it's possible that the module was compiled during PerlTranshandler, by interpreter A, but interpreter B was selected to run PerlResponseHandler (which is the anon sub). What do we do now? B has no idea what that anon-sub is, as it didn't compile it.
>
> Is it really ok for interpreter B to deparse A's sub here?
It doesn't do that. A deparses the function and stores its source in the mod_perl struct. Then B compiles it before running it.
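As a rough illustration of that deparse-then-recompile round trip (this is not mod_perl's actual C code), the sketch below uses the core B::Deparse module's coderef2text() to turn an anon sub into source text and then evals that text into a fresh CV, much as a second interpreter would have to. The variable names are made up for the example, both "interpreters" are really the same perl here, and the approach only works for subs that do not close over lexical variables.

```perl
use strict;
use warnings;
use B::Deparse ();

# An anon sub with no closed-over lexicals (a closure would not survive this).
my $handler = sub { my ($n) = @_; return $n * 2 };

# "Interpreter A": turn the compiled CV back into source text.
# coderef2text() returns the body of the sub, braces included.
my $source = B::Deparse->new->coderef2text($handler);

# "Interpreter B": compile that text again into a brand-new CV.
my $rebuilt = eval "sub $source" or die "recompile failed: $@";

# The two code refs are distinct CVs that behave the same way.
print "same CV? ", ($rebuilt == $handler ? "yes" : "no"), "\n";
print "result: ", $rebuilt->(21), "\n";
```

The cost being discussed in this thread is visible here: every use of the handler in a different interpreter pays for a string eval, and the result depends on B::Deparse reproducing the sub faithfully.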
> > I'm still of the opinion that the best way to resolve this problem is to ensure the same interpreter always runs whatever extra hooks it creates (eg pool cleanups, filters/handlers added, etc.).
>
> That will be only possible if you drop the pools functionality, the interpreter scopes, etc.
Obviously I don't see why that is so. In fact, I see that it is already implemented in parts of mod_perl, for instance when registering a pool-cleanup callback:
Here the currently active interpreter has its refcount incremented by mpxs_apr_pool_cleanup_register(). AFAICT the interpreter is not allowed to reenter the thread pool until this cleanup has run. I don't see how PerlInterpScope influences this behavior at all.
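To make the refcount argument concrete, here is a toy model in plain Perl. It is only a sketch of the idea as described in the paragraph above: the ToyPool package and its acquire/pin/release methods are hypothetical names invented for this example and do not correspond to mod_perl internals.

```perl
use strict;
use warnings;

# Toy model only: names and behaviour are hypothetical, not mod_perl internals.
package ToyPool;

sub new {
    my ($class, $size) = @_;
    my @idle = map { +{ id => $_, refcnt => 0 } } 1 .. $size;
    return bless { idle => \@idle }, $class;
}

# Take an idle interpreter out of the pool and give the caller one reference.
sub acquire {
    my ($self) = @_;
    my $interp = shift @{ $self->{idle} } or die "no idle interpreter";
    $interp->{refcnt}++;
    return $interp;
}

# Registering a callback pins the interpreter with an extra reference.
sub pin {
    my ($self, $interp) = @_;
    $interp->{refcnt}++;
    return;
}

# Drop one reference; only at zero does the interpreter become idle again.
sub release {
    my ($self, $interp) = @_;
    push @{ $self->{idle} }, $interp if --$interp->{refcnt} == 0;
    return;
}

sub idle_count { scalar @{ $_[0]{idle} } }

package main;

my $pool   = ToyPool->new(2);
my $interp = $pool->acquire;      # a handler starts running
$pool->pin($interp);              # the handler registers a cleanup callback
$pool->release($interp);          # handler finished, cleanup still pending
print "idle after handler: ", $pool->idle_count, "\n";
$pool->release($interp);          # the cleanup has run
print "idle after cleanup: ", $pool->idle_count, "\n";
```

In this model the pinned interpreter stays out of the idle list until the cleanup has run, which is the behaviour being claimed for the pool-cleanup case; whether the same holds under every interpreter scope setting is exactly what is disputed in this thread.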
I still believe that this is the right approach, and again I wonder why this cannot also be used in the situations you cited earlier in this thread.
[...]
> > Is it really ok for interpreter B to deparse A's sub here?
>
> It doesn't do that. A deparses the function and stores its source in the mod_perl struct. Then B compiles it before running it.
Joe Schaefer wrote:
> Stas Bekman <stas@stason.org> writes:
>
>> Joe Schaefer wrote:
>
> [...]
>
>>> I'm still of the opinion that the best way to resolve this problem is to ensure the same interpreter always runs whatever extra hooks it creates (eg pool cleanups, filters/handlers added, etc.).
>>
>> That will be only possible if you drop the pools functionality, the interpreter scopes, etc.
>
> Obviously I don't see why that is so. In fact, I see that it is already implemented in parts of mod_perl, for instance when registering a pool-cleanup callback:
>
> Here the currently active interpreter has its refcount incremented by mpxs_apr_pool_cleanup_register(). AFAICT the interpreter is not allowed to reenter the thread pool until this cleanup has run. I don't see how PerlInterpScope influences this behavior at all.
>
> I still believe that this is the right approach, and again I wonder why this cannot also be used in the situations you cited earlier in this thread.
Anti-Example 1: the parent interpreter compiles the anon-sub. Which interpreter are you going to stick to that anon-sub?
Anti-Example 2: PerlPreConnectionHandler compiles the anon-sub, and that sub is to run in PerlResponseHandler. If the interpreter scope is set to 'handler' (and if there is a different pool for each phase), the interpreter that ran PerlPreConnectionHandler can't possibly come anywhere near PerlResponseHandler.
In any case, this is not feasible, the way things are initialized now, since anon-subs aren't accessible from perl once you've compiled them (i.e. if you didn't store it somewhere, you can't find it anymore). And you can't store them in mod_perl, since it can store only one entry per registered handler - it can't store N entries.
Just try to take each of the 3 cases I've explained earlier and see how you can compile the sub and where you can store it (even if you could stick the perl interpreter to it).
Regarding the example of pool cleanup, I believe that it's broken when the interpreter scope functionality is used.
As mentioned before hardly any of these interpreter scope/pool features are exercised, so...
> > Obviously I don't see why that is so. In fact, I see that it is already implemented in parts of mod_perl, for instance when registering a pool-cleanup callback:
> >
> > Here the currently active interpreter has its refcount incremented by mpxs_apr_pool_cleanup_register(). AFAICT the interpreter is not allowed to reenter the thread pool until this cleanup has run. I don't see how PerlInterpScope influences this behavior at all.
> >
> > I still believe that this is the right approach, and again I wonder why this cannot also be used in the situations you cited earlier in this thread.
>
> Anti-Example 1: parent interpreter compiles the anon-sub, which interpreter are you going to stick to that anon-sub?
A: As I stated, I expect the interpreter which invokes cleanup_register() to also be the one which executes the anonymous sub during $sp's pool cleanup. Are you saying that doesn't work if the interpreter is actually a clone, and not the parent?
Bear in mind I'm learning about threads|ithreads as I go here, so please don't interpret my inquisitiveness as anything other than naivete. I'm just trying to sort all this out so I can write a few useful patches.
As I slowly digest the rest of your post, it looks like I'll derive the most benefit from it by writing a few tests first.
Joe Schaefer wrote:
> Stas Bekman <stas@stason.org> writes:
>
>> Joe Schaefer wrote:
>
> [...]
>
>>> Obviously I don't see why that is so. In fact, I see that it is already implemented in parts of mod_perl, for instance when registering a pool-cleanup callback:
>>>
>>> Here the currently active interpreter has its refcount incremented by mpxs_apr_pool_cleanup_register(). AFAICT the interpreter is not allowed to reenter the thread pool until this cleanup has run. I don't see how PerlInterpScope influences this behavior at all.
>>>
>>> I still believe that this is the right approach, and again I wonder why this cannot also be used in the situations you cited earlier in this thread.
>>
>> Anti-Example 1: parent interpreter compiles the anon-sub, which interpreter are you going to stick to that anon-sub?
>
> A: As I stated, I expect the interpreter which invokes cleanup_register() to also be the one which executes the anonymous sub during $sp's pool cleanup. Are you saying that doesn't work if the interpreter is actually a clone, and not the parent?
I think what happens is this:
If the (very top) parent perl interpreter is attached to the cleanup hook (which only makes sense for cleanup callbacks registered at the server startup), there is no problem, since when those callbacks are called, there are no more threads and all the calls will be serialized and that interpreter will never be used at the same time by two threads (since there are no threads on shutdown).
But you can't stick that very top parent perl interpreter to an anon-sub, because it's not thread safe. If you did, several threads might try to use the same interpreter at the same time at run time. Moreover, you aren't even allowed to use that very top parent interpreter anywhere but at startup and shutdown.
I hope that answers the question. I'm not sure why you are asking about the clone, when there is already a problem with the top-level perl, in the simplest case where there are no clones.
> Bear in mind I'm learning about threads|ithreads as I go here, so please don't interpret my inquisitiveness as anything other than naivete. I'm just trying to sort all this out so I can write a few useful patches.
No problem, Joe. Just ask questions and I'll try to answer them.
> As I slowly digest the rest of your post, it looks like I'll derive the most benefit from it by writing a few tests first.
> Joe Schaefer wrote:
> > Stas Bekman <stas@stason.org> writes:
[...]
> >> Anti-Example 1: parent interpreter compiles the anon-sub, which interpreter are you going to stick to that anon-sub?
> >
> > A: As I stated, I expect the interpreter which invokes cleanup_register() to also be the one which executes the anonymous sub during $sp's pool cleanup. Are you saying that doesn't work if the interpreter is actually a clone, and not the parent?
>
> I think what happens is this:
>
> If the (very top) parent perl interpreter is attached to the cleanup hook (which only makes sense for cleanup callbacks registered at the server startup), there is no problem, since when those callbacks are called, there are no more threads and all the calls will be serialized and that interpreter will never be used at the same time by two threads (since there are no threads on shutdown).
>
> but you can't stick that very top parent perl interpreter to an anon-sub, because it's not thread safe.
Agreed, in the abstract. So on to the next question- when does the parent interpreter have to deal with such anon-subs? Certainly during server config, when it encounters things like
PerlResponseHandler 'sub { print "foo\n"; return OK }'
Anywhere else besides the Perl*Handlers? Do the <Perl> sections have a similar problem? If not, then a simple way to address this would be to have the parent interpreter not compile such strings, which might avoid the deparse overhead.
> If you did, several threads may try to use the same interpreter at the same time at run time. Moreover you aren't even allowed to use that very top parent interpreter, nowhere but at the startup and shutdown.
>
> I hope that answers the question. I'm not sure why you are asking about the clone, when there is already a problem with the top-level perl, in the simplest case where there are no clones.
I thought there was always at least one clone, and that the parent interpreter was reserved for just doing startup/shutdown/cloning:
When the server is started, a Perl interpreter is constructed, compiling any code specified in the configuration, just as 1.0 does. This interpreter is referred to as the "parent" interpreter. Then, for the number of PerlInterpStart configured, a (thread-safe) clone of the parent interpreter is made (via perl_clone()) and added to the pool of interpreters. This clone copies any writeable data (e.g. the symbol table) and shares the compiled syntax tree.
Joe Schaefer wrote:
> Stas Bekman <stas@stason.org> writes:
>
>> Joe Schaefer wrote:
>>
>>> Stas Bekman <stas@stason.org> writes:
>
> [...]
>
>>>> Anti-Example 1: parent interpreter compiles the anon-sub, which interpreter are you going to stick to that anon-sub?
>
>>> A: As I stated, I expect the interpreter which invokes cleanup_register() to also be the one which executes the anonymous sub during $sp's pool cleanup. Are you saying that doesn't work if the interpreter is actually a clone, and not the parent?
>>
>> I think what happens is this:
>>
>> If the (very top) parent perl interpreter is attached to the cleanup hook (which only makes sense for cleanup callbacks registered at the server startup), there is no problem, since when those callbacks are called, there are no more threads and all the calls will be serialized and that interpreter will never be used at the same time by two threads (since there are no threads on shutdown).
>>
>> but you can't stick that very top parent perl interpreter to an anon-sub, because it's not thread safe.
>
> Agreed, in the abstract. So on to the next question- when does the parent interpreter have to deal with such anon-subs? Certainly during server config, when it encounters things like
>
>   PerlResponseHandler 'sub { print "foo\n"; return OK }'
Nope, it doesn't compile them at server startup; it just stores the source code in the modperl struct. B::Deparse has nothing to do with this case. B::Deparse is only used when perl code pushes an anon-sub:
$r->push_handlers(PerlTransHandler => sub { .... });
> Anywhere else besides the Perl*Handlers? Do the <Perl> sections have a similar problem? If not, then a simple way to address this would be to have the parent interpreter not compile such strings, which might avoid the deparse overhead.
You mean:
<Perl>
$s->push_handlers(PerlPreConnectionHandler => sub { .... });
</Perl>
that's exactly the same as having it in the plain perl module, with the only difference that the <Perl> sections are always executed right away, whereas a plain perl module will get compiled right away when PerlModule is encountered, only if perl interpreter was started early, i.e. if a <Perl> section or PerlLoadModule was encountered before.
>> If you did, several threads may try to use the same interpreter at the same time at run time. Moreover you aren't even allowed to use that very top parent interpreter, nowhere but at the startup and shutdown.
>>
>> I hope that answers the question. I'm not sure why you are asking about the clone, when there is already a problem with the top-level perl, in the simplest case where there are no clones.
>
> I thought there was always at least one clone, and that the parent interpreter was reserved for just doing startup/shutdown/cloning:
>
>   When the server is started, a Perl interpreter is constructed, compiling any code specified in the configuration, just as 1.0 does. This interpreter is referred to as the "parent" interpreter. Then, for the number of PerlInterpStart configured, a (thread-safe) clone of the parent interpreter is made (via perl_clone()) and added to the pool of interpreters. This clone copies any writeable data (e.g. the symbol table) and shares the compiled syntax tree.
That's correct (in the case of threaded mpm), but again the problem exists before the first clone is done, since it's quite possible that the parent interpreter is the one that compiles the perl code with anon sub, in which case you can't attach the parent perl interpreter to that compiled code.
> > So on to the next question- when does the parent interpreter have to deal with such anon-subs? Certainly during server config, when it encounters things like
> >
> >   PerlResponseHandler 'sub { print "foo\n"; return OK }'
>
> Nope. it doesn't compile them at the server startup, it just stores the source code in the modperl struct. B::Deparse has nothing to do with this case. B::Deparse is only used with the case of the perl code pushing an anon-sub:
>
>   $r->push_handlers(PerlTransHandler => sub { .... });
OK- thanks for the correction and clarification!
> > Do the <Perl> sections have a similar problem? If not, then a simple way to address this would be to have the parent interpreter not compile such strings, which might avoid the deparse overhead.
>
> You mean:
>
> <Perl>
> $s->push_handlers(PerlPreConnectionHandler => sub { .... });
> </Perl>
Yup, exactly!
This difference (merely compiling code vs. actually executing the code) *does* make a difference in what I'm proposing, since I want the interpreter which *executes* push_handlers() (& its relatives) to stick around and execute the callback also. Obviously that proposal doesn't quite work when the parent interpreter is the one which executes the code, so I will eventually need to amend my proposal so it deals with that. But let's leave that issue unresolved for the moment...
> that's exactly the same as having it in the plain perl module, with the only difference that the <Perl> sections are always executed right away, whereas a plain perl module will get compiled right away when PerlModule is encountered, only if perl interpreter was started early, i.e. if a <Perl> section or PerlLoadModule was encountered before.
[...]
> That's correct (in the case of threaded mpm), but again the problem exists before the first clone is done, since it's quite possible that the parent interpreter is the one that compiles the perl code with anon sub, in which case you can't attach the parent perl interpreter to that compiled code.
I'm still confused by this line of reasoning, because it seems you're not distinguishing between compile time vs runtime, but the problem might just be a difference in terminology. Perhaps this will help- suppose
"C" = perl code which contains an anon sub "S", e.g. using the above
$s->push_handlers(PerlPreConnectionHandler => sub { .... });
(C represents the whole line, and S is just the anon sub).
Please clarify/comment/correct the following:
If interpreter I compiles C, but does not execute it, and then I clones J, J will not have any problem executing C. However, it is desirable that J should also be the interpreter which executes S - the anon sub which was created by J's execution of C. Because if we guarantee that only J will execute S, we avoid the (slow, and occasionally unreliable) deparse/reparse cycle that a different interpreter would require. If we cannot make that guarantee, J must deparse S when it executes C (and observing that I actually compiled both C and S).
Joe Schaefer wrote:
[...]
>> That's correct (in the case of threaded mpm), but again the problem exists before the first clone is done, since it's quite possible that the parent interpreter is the one that compiles the perl code with anon sub, in which case you can't attach the parent perl interpreter to that compiled code.
>
> I'm still confused by this line of reasoning, because it seems you're not distinguishing between compile time vs runtime, but the problem might just be a difference in terminology. Perhaps this will help- suppose
>
> "C" = perl code which contains an anon sub "S", e.g. using the above
>
>   $s->push_handlers(PerlPreConnectionHandler => sub { .... });
>
> (C represents the whole line, and S is just the anon sub).
>
> Please clarify/comment/correct the following:
>
> If interpreter I compiles C, but does not execute it, and then I clones J, J will not have any problem executing C.
J will never get to run C since it was already compiled by I. (I suppose you meant S at the end of that sentence.)
(I'd rather have called them P(arent) and W(orker) but let's stick to I and J)
J has a problem executing S. CV/I is stored in the modperl struct. After the perl_clone() call, CV/J is not the same as CV/I, so J can't call the CV address generated by I (it would be a sure segfault).
This problem could be solved for example by pushing the return value of sub {} into a scalar behind the scenes. so the code:
package Foo; $s->push_handlers(PerlPreConnectionHandler => sub { .... });
Of course we aren't going to manipulate the code, just introduce a new scalar which will store that value. If we do that, then when perl_clone is called the new variable will correctly contain a pointer to the new anon-sub. So all we need to store in the mod_perl struct is the name of that scalar.
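A minimal sketch of that idea in plain Perl, under stated assumptions: the variable name $Foo::__anonsub1 follows the naming used later in this message, while resolve_handler() and $stored_name are hypothetical helpers invented for the example. It only shows that a stored name is enough to find the code ref again in whichever interpreter holds the variable; it does not involve perl_clone or mod_perl itself.

```perl
use strict;
use warnings;

package Foo;

# The anon sub's return value is kept in a package scalar "behind the scenes".
our $__anonsub1 = sub { return "handled: $_[0]" };

package main;

# What would be stored instead of a CV pointer: just the scalar's name.
my $stored_name = 'Foo::__anonsub1';

# Hypothetical helper: look the code ref up by name in the current interpreter.
sub resolve_handler {
    my ($name) = @_;
    no strict 'refs';                 # symbolic lookup of the package scalar
    my $cv = ${$name};
    die "no code ref in \$$name" unless ref($cv) eq 'CODE';
    return $cv;
}

print resolve_handler($stored_name)->('request'), "\n";
```

Because the lookup goes through the symbol table rather than a raw pointer, each cloned interpreter would find its own copy of the sub under the same name.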
I think that will work, if one can ensure that all anon-subs are compiled before perl_clone is called. But that can't be ensured.
> However, it is desirable that J should also be the interpreter which executes S - the anon sub which was created by J's execution of C. Because if we guarantee that only J will execute S, we avoid the (slow, and occasionally unreliable) deparse/reparse cycle that a different interpreter would require. If we cannot make that guarantee, J must deparse S when it executes C (and observing that I actually compiled both C and S).
Sorry, but I can't see how sticking the interpreter to the sub can work in the PerlInterpScope environment.
I think the simplest solution we could try in order to avoid B::Deparse is to ensure that any anon-subs are compiled before perl_clone (i.e. croak if any anon-sub is encountered after the startup phase). Combined with my idea from above (if it proves to work), that will ensure that anon-subs are handled identically to named subs, since $Foo::__anonsub1 will function as a name for the sub, and the problem is solved.
The same solution can be applied for:
PerlTransHandler 'sub { ... }'
by compiling it after startup (not waiting until run time); perl_clone will then clone those correctly. But this is just an optimization for the inlined handlers.