`pcrejit` ( 3 )

Perl-compatible regular expressions

JIT STACK FAQ

(1) Why do we need JIT stacks?

PCRE (and its JIT compiler) is a recursive, depth-first engine, so it needs a stack onto which the local data of the current node is pushed before its child nodes are checked. On some platforms it is difficult to allocate real machine stack. On PowerPC, for example, the stack chain must be updated every time the stack is extended. This is possible, but the updates cost time and reduce performance. So we do the recursion in memory.
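As a rough illustration of "doing the recursion in memory", the sketch below walks a tree depth-first while keeping the pending nodes on a heap-allocated stack instead of the machine stack. It is not taken from PCRE: `struct node` and `sum_tree()` are hypothetical names chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tree node, for illustration only (not a PCRE structure). */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

/* Sum all values depth-first without recursive calls: pending nodes are
 * kept on a heap-allocated stack instead of the machine stack.
 * Returns 0 on success, -1 if memory runs out. */
int sum_tree(const struct node *root, long *out)
{
    size_t cap = 16, top = 0;
    const struct node **stack = malloc(cap * sizeof *stack);
    long sum = 0;

    assert(out != NULL);
    if (stack == NULL)
        return -1;
    if (root != NULL)
        stack[top++] = root;

    while (top > 0) {
        const struct node *n = stack[--top];
        sum += n->value;

        /* Make room for up to two children before pushing them. */
        if (top + 2 > cap) {
            const struct node **grown =
                realloc(stack, 2 * cap * sizeof *stack);
            if (grown == NULL) {
                free(stack);
                return -1;
            }
            stack = grown;
            cap *= 2;
        }
        /* Push right first so that the left subtree is visited first. */
        if (n->right != NULL)
            stack[top++] = n->right;
        if (n->left != NULL)
            stack[top++] = n->left;
    }

    free(stack);
    *out = sum;
    return 0;
}
```

This sketch grows its stack with realloc(), which may move the array. That is harmless here because nothing keeps pointers into the array, but it would not be for a stack whose frames are referenced by address.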

(2) Why don't we simply allocate blocks of memory with malloc()?

Modern operating systems have a nice feature: they can reserve a range of address space without allocating memory for it. Memory pages can then be safely allocated inside this range, so the stack can grow without its data being moved (this is important because pointers into the stack must stay valid). Thus we can reserve 1M of address space and use only a single memory page (usually 4K) if that is enough, yet still grow up to 1M at any time if needed. A block obtained from malloc() cannot, in general, be grown in place like this.
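On POSIX systems, one way to get the reserve-then-commit behaviour described here is mmap() with PROT_NONE followed by mprotect(). The sketch below is an assumption about how such a region could be built, not a copy of PCRE's implementation; `struct region` and the `region_*` functions are hypothetical names.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable region, for illustration only. */
struct region {
    unsigned char *base;   /* start of the reserved address range */
    size_t reserved;       /* bytes of address space reserved */
    size_t committed;      /* bytes currently readable and writable */
};

/* Reserve address space without making any of it accessible yet. */
int region_reserve(struct region *r, size_t max_bytes)
{
    void *p = mmap(NULL, max_bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->reserved = max_bytes;
    r->committed = 0;
    return 0;
}

/* Make the first new_bytes (rounded up to whole pages) usable in place.
 * The base address never changes, so existing pointers stay valid. */
int region_grow(struct region *r, size_t new_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t want = (new_bytes + page - 1) / page * page;

    if (want > r->reserved)
        return -1;              /* would exceed the reservation */
    if (want <= r->committed)
        return 0;               /* already large enough */
    if (mprotect(r->base + r->committed, want - r->committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->committed = want;
    return 0;
}

/* Give the whole reservation back to the operating system. */
void region_release(struct region *r)
{
    munmap(r->base, r->reserved);
    r->base = NULL;
    r->reserved = 0;
    r->committed = 0;
}
```

A caller would reserve 1M once with `region_reserve(&r, 1 << 20)`, call `region_grow()` as the stack deepens, and rely on `r.base` staying the same throughout.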

(3) Who "owns" a JIT stack?

The owner of the stack is the user program, not the JIT-studied pattern or anything else. While a stack is in use by pcre_exec() (that is, it is assigned to the pattern currently being matched), the user program must ensure that no other thread uses that stack, otherwise the threads would overwrite the same memory area. The best practice for multithreaded programs is to allocate a stack for each thread, and return this stack through the JIT callback function.
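The per-thread pattern can be sketched with C11 thread-local storage. To keep the example free of library dependencies, `struct demo_stack` below is a hypothetical stand-in for a real JIT stack, and both function names are invented for this sketch. In a real PCRE 8.x program the stack would come from pcre_jit_stack_alloc() and the callback would be registered with pcre_assign_jit_stack().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a JIT stack; hypothetical, not PCRE's real type. */
struct demo_stack {
    void *memory;
    size_t size;
};

/* One stack pointer per thread (C11 thread-local storage). */
static _Thread_local struct demo_stack *thread_stack = NULL;

/* Callback in the style the FAQ describes: invoked when a match starts,
 * it returns the stack owned by the calling thread, creating it lazily.
 * Returns NULL if the stack cannot be allocated. */
struct demo_stack *stack_for_this_thread(void *unused)
{
    (void)unused;
    if (thread_stack == NULL) {
        struct demo_stack *s = malloc(sizeof *s);
        if (s == NULL)
            return NULL;
        s->size = 32 * 1024;           /* arbitrary size for the sketch */
        s->memory = malloc(s->size);
        if (s->memory == NULL) {
            free(s);
            return NULL;
        }
        thread_stack = s;
    }
    return thread_stack;
}

/* Each thread calls this after its last match and before it exits. */
void release_stack_for_this_thread(void)
{
    if (thread_stack != NULL) {
        free(thread_stack->memory);
        free(thread_stack);
        thread_stack = NULL;
    }
}
```

Because the callback looks up the calling thread's own stack, the same studied pattern can be shared by several threads without any two of them ever matching on the same stack.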

(4) When should a JIT stack be freed?

You can free a JIT stack at any time, as long as it will not be used by pcre_exec() again. Assigning a stack to a pattern only sets a pointer. There is no reference counting or any other magic. You can free the patterns and stacks in any order, at any time. Just do not call pcre_exec() with a pattern that points to an already freed stack, as that will cause a segmentation fault. (Also, do not free a stack that pcre_exec() is currently using in another thread.) You can also replace the stack for a pattern at any time. You can even free the previous stack before assigning a replacement.

(5) Should I allocate/free a stack every time before/after calling pcre_exec()?

No, because this is too costly in terms of resources. However, you could implement a scheme that releases the stack if it has not been used for, say, two minutes. The JIT callback can help to achieve this without keeping a list of the currently JIT-studied patterns.
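One possible shape for such an idle-release scheme is sketched below. Since the callback is consulted whenever a match starts, it can both recreate a released stack on demand and record when the stack was last needed, so no pattern has to be updated. All names here (`struct stack_slot`, `slot_acquire()`, `slot_release_if_idle()`) are hypothetical, malloc() stands in for the real stack allocation, and the caller passes in the current time so the logic is easy to test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define IDLE_LIMIT_SECONDS 120   /* the "two minutes" from the text */

/* Stand-in stack slot; hypothetical, not a PCRE type. */
struct stack_slot {
    void *stack;        /* NULL while the stack is released */
    time_t last_used;   /* time of the most recent acquire */
};

/* Called from the JIT callback: allocate on demand and stamp the time.
 * Returns NULL if the allocation fails. */
void *slot_acquire(struct stack_slot *slot, time_t now)
{
    assert(slot != NULL);
    if (slot->stack == NULL)
        slot->stack = malloc(32 * 1024);   /* placeholder allocation */
    slot->last_used = now;
    return slot->stack;
}

/* Called periodically by the owner while no match is running on the slot.
 * Returns 1 if the stack was released, 0 otherwise. */
int slot_release_if_idle(struct stack_slot *slot, time_t now)
{
    assert(slot != NULL);
    if (slot->stack == NULL)
        return 0;
    if (difftime(now, slot->last_used) < IDLE_LIMIT_SECONDS)
        return 0;
    free(slot->stack);
    slot->stack = NULL;
    return 1;
}
```

The sketch assumes the idle check runs in the thread that owns the slot, or is otherwise serialized with matching, so that a stack is never freed while a match is using it.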

(6) OK, the stack is for long term memory allocation. But what happens if a pattern causes stack overflow with a stack of 1M? Is that 1M kept until the stack is freed?

Especially on embedded systems, it might be a good idea to release memory sometimes without freeing the stack. There is no API for this at the moment. A function that returns the memory currently allocated for a stack, and another that releases memory (shrinks the stack), would probably be a good idea if someone needs this.

(7) This is too much of a headache. Isn't there any better solution for JIT stack handling?

No, thanks to Windows. If POSIX threads were used everywhere, we could throw out this complicated API.

Original text at man7.org
