Upstream "mark-delay" change from flambda-backend.
Hack to work around an accounting problem: artificially catch up work_counter at the start of any slice when it falls very far behind alloc_counter.
In https://github.com/ocaml/ocaml/pull/13580#issuecomment-3092253963 jmid reports that he needed to tweak the GC verbosity setting to avoid getting spammed by minor-gc messages when debugging an assertion failure.
The other sub-phases of the minor GC all use `caml_gc_log` rather than CAML_GC_MESSAGE, and do not seem to cause similar spamming issues. Making the code consistent avoids inconsistent verbosity levels in end-user scripts.
runtime: free the minor heap when leaving STW participants
The reserved address space for the minor heap area is a global resource shared by all domains; each domain owns a portion of it, within which it commits a part for its minor heap. (Having contiguous address space allows for an efficient Is_young check.) When we need more reserved space because the user increased the minor heap size, we use a STW section to change the reservation: each domain in the STW section first decommits its minor heap, a single domain changes the reserved area, and then each domain re-commits its minor heap.
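As an illustration of why contiguity matters, here is a minimal sketch of a range-based young check. The names `toy_minor_heaps_start`, `toy_minor_heaps_end`, `toy_set_reservation` and `toy_is_young` are hypothetical stand-ins, not the runtime's actual definitions; the only point is that a single pair of bounds covers the minor heaps of all domains.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the bounds of the global reserved area. */
static uintptr_t toy_minor_heaps_start;
static uintptr_t toy_minor_heaps_end;

/* Record the bounds of the (toy) global reservation. */
void toy_set_reservation(uintptr_t start, uintptr_t end) {
  assert(start <= end);
  toy_minor_heaps_start = start;
  toy_minor_heaps_end = end;
}

/* Because every domain's segment lives inside one contiguous range,
   two comparisons are enough; no per-domain lookup is needed. */
int toy_is_young(uintptr_t addr) {
  return addr >= toy_minor_heaps_start && addr < toy_minor_heaps_end;
}
```

The exact comparison operators used by the real check may differ; this sketch uses a half-open range for simplicity.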
If a domain does not participate in STW sections, the boundaries of its minor heap will change without the domain decommitting the previous minor heap first. If the same domain structure is used for a newly spawned domain later on, it will start by decommitting its minor heap following the new boundaries, which is incorrect as it never committed this address range in the first place.
(In practice, calling `caml_mem_decommit` incorrectly in this way does not appear to crash the program. I think this is because `decommit` has a fairly liberal behavior: it will happily do nothing if the memory range is not committed. The code remains logically wrong, and could become a hard failure if other parts of the runtime change in reasonable ways later on.)
The present commit ensures that we systematically decommit the minor heap of each domain when it leaves the set of STW participants. This way, only STW-active domains have their minor heap allocated, and changing the minor heap address space within a STW section works as intended.
(I tried to remove the new call to `free_minor_heap` in `domain_terminate`, and checked that the testsuite fails in debug mode when the `allocate_minor_heap` call in `domain_create` later on notices an already-committed minor heap.)
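The protocol described in this commit message can be sketched with a toy model. Everything here is hypothetical (the `toy_` names, the fixed number of domains, sizes counted in words with no real memory mapping); it only models the rule that a domain decommits when it leaves the participant set, so that a STW resize never finds stale committed memory.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DOMAINS 4

/* Hypothetical per-domain state. */
struct toy_domain {
  int participating;      /* is the domain in the STW participant set? */
  size_t seg_start;       /* start of the segment owned by this domain */
  size_t committed_words; /* committed minor heap size, 0 if none */
};

struct toy_domain toy_domains[TOY_DOMAINS];

void toy_commit(struct toy_domain *d, size_t words) {
  assert(d->committed_words == 0); /* mirrors the debug-mode check */
  d->committed_words = words;
}

void toy_decommit(struct toy_domain *d) { d->committed_words = 0; }

/* Leaving the participant set: decommit first (the fix in this commit). */
void toy_domain_terminate(struct toy_domain *d) {
  toy_decommit(d);
  d->participating = 0;
}

void toy_domain_create(struct toy_domain *d, size_t words) {
  d->participating = 1;
  toy_commit(d, words);
}

/* STW resize: participants decommit, the boundaries change, participants
   re-commit. Returns the number of non-participating domains that still
   hold committed memory laid out for the old boundaries. */
int toy_stw_resize(size_t new_seg_words) {
  size_t saved[TOY_DOMAINS] = {0};
  int stale = 0;
  for (int i = 0; i < TOY_DOMAINS; i++) {
    if (toy_domains[i].participating) {
      saved[i] = toy_domains[i].committed_words;
      toy_decommit(&toy_domains[i]);
    }
  }
  for (int i = 0; i < TOY_DOMAINS; i++) {
    toy_domains[i].seg_start = (size_t)i * new_seg_words;
    if (!toy_domains[i].participating && toy_domains[i].committed_words != 0)
      stale++;
  }
  for (int i = 0; i < TOY_DOMAINS; i++) {
    if (toy_domains[i].participating)
      toy_commit(&toy_domains[i], saved[i]);
  }
  return stale;
}
```

In this model, replacing `toy_domain_terminate` with a version that skips `toy_decommit` makes `toy_stw_resize` report a stale domain, and the later `toy_domain_create` trips the assertion in `toy_commit`.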
This fixes a bug in the interaction between polymorphic variants and polymorphic parameters. The actual bug fix is just changing two falses to trues but I also changed the instance_poly interface to make similar mistakes less likely to happen.
The OPTIONAL_BYTECODE_TOOLS, OPTIONAL_NATIVE_TOOLS and OPTIONAL_LIBRARIES variables should be used to affect build and installation, not definitions. If ocamltest et al were disabled, then the definitions of these programs were omitted, which prevented the reproducible generation of dependency information.
Running config.status works correctly, but individually requesting links in otherlibs/dynlink did not, because the names were specified using a shell variable (i.e. at configure-time) instead of an m4sh variable (i.e. at autoconf-time).
The current codebase uses 'caml_minor_heaps_{start,end}' for the boundaries of a global address space that is reserved, 'dom->caml_minor_heap_area_{start,end}' for a 'minor heap area', a segment of this address space that is owned by each domain, and then finally 'dom->young_{start,end}' for the prefix of this segment that is actually committed and used as the minor heap of each domain. Some comments refer to the latter as the 'minor heap arena', following terminology from the Retrofitting Parallelism into OCaml paper.
On a suggestion by KC, I am trying to make the naming scheme more regular by consistently using 'reservation' for a reserved block of address space:
- Use 'minor heaps reservation' for the global reservation. Its boundaries remain stored in 'caml_minor_heaps_{start,end}' to avoid compatibility issues in third-party code.
- Use 'minor heap reservation' for the per-domain segment of the global reservation. Its boundaries are stored in 'dom->minor_heap_reservation_{start,end}'.
- Use 'minor heap' for the prefix of the minor heap reservation that is actually committed, whose boundaries remain 'dom->young_{start,end}'.
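The three nested ranges in this naming scheme can be pictured with a small sketch. The struct and function names prefixed with `toy_` are hypothetical, and the fields are plain integers rather than the runtime's pointer types; only the field names `minor_heap_reservation_{start,end}` and `young_{start,end}` come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* 'minor heaps reservation': the global reserved range. */
struct toy_minor_heaps_reservation {
  uintptr_t start, end;
};

/* Per-domain ranges, using the field names from the new scheme. */
struct toy_domain_ranges {
  /* 'minor heap reservation': this domain's segment. */
  uintptr_t minor_heap_reservation_start, minor_heap_reservation_end;
  /* 'minor heap': the committed prefix of that segment. */
  uintptr_t young_start, young_end;
};

/* Check that global reservation, per-domain reservation and minor heap
   nest as described, for a domain whose minor heap is committed. */
int toy_ranges_well_nested(const struct toy_minor_heaps_reservation *g,
                           const struct toy_domain_ranges *d) {
  return g->start <= d->minor_heap_reservation_start
      && d->minor_heap_reservation_end <= g->end
      && d->young_start == d->minor_heap_reservation_start /* a prefix */
      && d->young_start <= d->young_end
      && d->young_end <= d->minor_heap_reservation_end;
}
```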
My PR #14158, merged today, introduced a bug in the logic to resize the minor heaps reservation. It added the following to the `free_minor_heap_arena` function:
domain_state->minor_heap_wsz = 0;
Doing this is correct when we are freeing the minor heap arena of a domain that is leaving the STW participant set (the focus of #14158); it is also correct in
    int caml_reallocate_minor_heap_arena(asize_t wsize)
    {
      free_minor_heap_arena();
      return allocate_minor_heap_arena(wsize);
    }
which is called to change the size of the memory area, so zeroing it in `free` before setting it in `allocate` is fine. However, it is *not* correct in
      if (allocate_minor_heap_arena(Caml_state->minor_heap_wsz) < 0) {
        caml_fatal_error("Fatal error: No memory for minor heap arena");
      }
    }
This function changes the global minor heaps reservation during a STW event where each domain first deallocates its arena and then reallocates it in the new reservation. The problem is that `free_minor_heap_arena` now changes the value of `Caml_state->minor_heap_wsz` to 0, so the re-allocation that follows will try to allocate a 0-word arena (in fact 512 words, due to the page-alignment normalization logic).
This bug can only be encountered by calling `caml_update_minor_heap_max`, so it affects few programs.
I see two approaches to fix it:
1. we could remove the zeroing of `minor_heap_wsz`, and instead use the previous check `young_start == NULL && young_end == NULL` to detect uninitialized arenas
2. ... or we do assume that `free_minor_heap_arena` will unset the arena size (which is reasonable), and we preserve the desired size value within the `stw_resize_minor_heaps_reservation` function.
The present commit implements approach (2). I prefer to avoid the situation that (1) would create, where `free` leaves the state only partially initialized and correctness depends on that.
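The bug and approach (2) can be reproduced in a toy model. The `toy_` names and the two-field state are hypothetical simplifications, and the page-alignment normalization mentioned above is deliberately left out, so the buggy path ends up with a 0-word arena here rather than a 512-word one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-domain state. */
struct toy_state {
  size_t minor_heap_wsz; /* desired and current arena size, in words */
  int committed;         /* 1 when an arena is allocated */
};

/* Freeing unsets the arena size, as after #14158. */
void toy_free_arena(struct toy_state *s) {
  s->committed = 0;
  s->minor_heap_wsz = 0;
}

int toy_allocate_arena(struct toy_state *s, size_t wsz) {
  s->minor_heap_wsz = wsz;
  s->committed = 1;
  return 0;
}

/* Buggy resize: reads the size after free has already zeroed it. */
int toy_resize_buggy(struct toy_state *s) {
  toy_free_arena(s);
  return toy_allocate_arena(s, s->minor_heap_wsz);
}

/* Approach (2): preserve the desired size across the free. */
int toy_resize_fixed(struct toy_state *s) {
  size_t wsz = s->minor_heap_wsz;
  toy_free_arena(s);
  return toy_allocate_arena(s, wsz);
}
```

Saving the size in a local keeps `toy_free_arena` free to reset the whole state, which is the property approach (2) is meant to preserve.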
Memprof.start replaces any existing profile in the domain; add Memprof.is_sampling
The change to Memprof.start increases its compositionality while conforming to the previous behaviour (it simply fails in fewer situations). This is necessary for us to implement the Memprof interface on top of the package memprof-limits.
The new function is_sampling is for clients that do want to fail early, e.g. when detecting that two Memprof clients are interfering.
runtime: host aligned fibers inside the fiber cache whenever possible (#14169)
* When growing a fiber, zero the alignment word before computing the next size, so that the new size fits inside the fiber cache.
* Add an assertion to check that small fibers are using the cache.
Add runtime counters EV_C_MINOR_PROMOTED_WORDS and EV_C_MINOR_ALLOCATED_WORDS.
EV_C_MINOR_PROMOTED_WORDS reports words promoted by minor GC and EV_C_MINOR_ALLOCATED_WORDS reports words allocated by minor GC. Both have equivalent byte counters.
Update the documentation for EV_C_MINOR_PROMOTED and EV_C_MINOR_ALLOCATED to qualify the scope of the values reported.