Cleanup of the CSet chooser class: standardize on uints for region num and indexes (instead of int, jint, etc.), make the method / field naming style more consistent, remove a lot of dead code.
Reviewed-by: johnc, brutisso
Change the type of fields / variables / etc. that represent region counts and indices from size_t to uint.
Reviewed-by: iveresov, brutisso, jmasa, jwilhelm
Added log levels "fine", "finer" and "finest". Let PrintGC map to "fine" and PrintGCDetails map to "finer". Separated out the per-worker information in the G1 logging to the "finest" level.
Reviewed-by: stefank, jwilhelm, tonyp, johnc
Tiered compilation has increased the number of nmethods in the code cache. This has, in turn, significantly increased the number of marked nmethods processed during the StrongRootsScope destructor. Create a specialized version of CodeBlobToOopClosure for G1 which places only those nmethods that contain pointers into the collection set onto the marked nmethods list.
Reviewed-by: iveresov, tonyp
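A hedged sketch of the filtering idea from the change above, using simplified stand-in types (Oop, Nmethod, and the in_cset predicate are illustrative, not HotSpot API):

```cpp
#include <functional>
#include <vector>

// Simplified stand-ins for the HotSpot types; names are illustrative only.
struct Oop { void* addr; };

struct Nmethod {
  std::vector<Oop> embedded_oops;  // oops compiled into the code blob
};

// Only nmethods that contain at least one pointer into the collection set
// go on the marked-nmethods list, so the StrongRootsScope destructor has
// far fewer entries to walk.
void collect_marked_nmethods(const std::vector<Nmethod*>& all_nmethods,
                             const std::function<bool(const Oop&)>& in_cset,
                             std::vector<Nmethod*>& marked) {
  for (Nmethod* nm : all_nmethods) {
    for (const Oop& o : nm->embedded_oops) {
      if (in_cset(o)) {
        marked.push_back(nm);  // record once, then stop scanning this nmethod
        break;
      }
    }
  }
}
```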
Make sure that MutableNUMASpace::ensure_parsability() only calls CollectedHeap::fill_with_object() with valid sizes, and make sure CollectedHeap::filler_array_max_size() returns a value that can be converted to an int without overflow.
Reviewed-by: azeemj, jmasa, iveresov
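A minimal sketch of the overflow guard, assuming a hypothetical array header size of two words (the real constants differ):

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Assumed layout constant for the sketch; the real header size differs.
const size_t ARRAY_HDR_WORDS = 2;

// Largest filler-array size (in words) whose element count still fits in
// an int, mirroring the constraint on CollectedHeap::filler_array_max_size().
size_t filler_array_max_size_in_words() {
  return ARRAY_HDR_WORDS + (size_t)INT_MAX;
}

// fill_with_object() must only ever see sizes in the valid range; callers
// such as ensure_parsability() are responsible for splitting larger gaps.
void fill_with_object(size_t words) {
  assert(words >= ARRAY_HDR_WORDS && "filler too small");
  assert(words <= filler_array_max_size_in_words() && "filler too large");
  int length = (int)(words - ARRAY_HDR_WORDS);  // cannot overflow now
  (void)length;  // a real implementation would format the dummy array here
}
```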
Make two G1 cmd line flags available in product builds: G1HeapWastePercent (previously called: G1OldReclaimableThresholdPercent) and G1MixedGCCountTarget (previously called: G1MaxMixedGCNum). Also changed the default of the former from 1% to 5% and the default of G1OldCSetRegionLiveThresholdPercent to 90%.
Reviewed-by: azeemj, jwilhelm, johnc
Attempting to initiate a marking cycle when allocating a humongous object can, if a marking cycle is successfully initiated by another thread, result in the allocating thread spinning until the marking cycle is complete. Eliminate a deadlock between the main ConcurrentMarkThread, the SurrogateLocker thread, the VM thread, and a mutator thread waiting on the SecondaryFreeList_lock for free regions to become available by not manipulating the pending list lock during the prologue and epilogue of the cleanup pause.
Reviewed-by: brutisso, jcoomes, tonyp
Revamp of the mechanism that chooses old regions for inclusion in the CSet. It simplifies the code and introduces min and max bounds on the number of old regions added to the CSet at each mixed GC to avoid pathological cases. It also ensures that when we do a mixed GC we'll always find old regions to add to the CSet (i.e., it eliminates the case, possible today, where a mixed GC collects no old regions).
Reviewed-by: johnc, brutisso
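One plausible shape of the min/max bounding, sketched with illustrative names (the real chooser also factors in reclaimable bytes and pause-time goals):

```cpp
#include <algorithm>

typedef unsigned int uint;

// candidates: old regions still worth collecting;
// target: the G1MixedGCCountTarget-style number of mixed GCs to spread over.
uint old_regions_for_this_mixed_gc(uint candidates, uint target,
                                   uint hard_max_per_gc) {
  // Min bound: spread the candidates over the target number of mixed GCs
  // (rounding up), and take at least one so a mixed GC never collects
  // zero old regions.
  uint min_regions = (candidates + target - 1) / target;
  if (min_regions == 0) min_regions = 1;
  // Max bound: cap the count so a single mixed GC cannot blow the pause.
  return std::min(min_regions, hard_max_per_gc);
}
```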
Replace calls to os::javaTimeMillis(), used to update the milliseconds since the last GC, with an equivalent that uses a monotonically non-decreasing time source.
Reviewed-by: ysr, jmasa
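A sketch of the substitution using std::chrono::steady_clock as the monotonically non-decreasing source (the HotSpot code uses its own os:: time routines):

```cpp
#include <chrono>

typedef long long jlong;  // HotSpot-style alias, for flavor only

jlong monotonic_millis() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(
      steady_clock::now().time_since_epoch()).count();
}

jlong _time_of_last_gc_ms = 0;  // updated at the end of each GC

jlong millis_since_last_gc() {
  // A non-decreasing source guarantees this difference is never negative,
  // unlike wall-clock time, which can be stepped backwards.
  return monotonic_millis() - _time_of_last_gc_ms;
}
```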
If we try to schedule an initial-mark GC in order to explicitly start a concurrent marking cycle and it gets pre-empted by another GC, we should retry the attempt as long as it is appropriate for the GC cause.
Reviewed-by: brutisso, johnc
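A self-contained sketch of the retry loop; every name here is an illustrative stub, not HotSpot API:

```cpp
#include <cstdio>

static int g_preemptions_left = 2;  // pretend two other GCs win the race first

static bool try_schedule_initial_mark_gc() {
  // Stub: fails while another GC gets scheduled ahead of us.
  return g_preemptions_left-- <= 0;
}

static bool cause_wants_concurrent_cycle() {
  return true;  // e.g. System.gc() with -XX:+ExplicitGCInvokesConcurrent
}

void request_concurrent_cycle() {
  // If another GC pre-empts the initial-mark pause, retry for as long as
  // the GC cause still calls for a concurrent marking cycle.
  while (cause_wants_concurrent_cycle()) {
    if (try_schedule_initial_mark_gc()) {
      std::printf("initial-mark pause scheduled\n");
      return;
    }
    std::printf("pre-empted by another GC, retrying\n");
  }
}
```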
Some minor profile-based optimizations. Reduce the number of branches and branch mispredicts by removing some virtual calls through closure specialization and by refactoring some conditional statements.
Reviewed-by: brutisso, tonyp
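A sketch of the closure-specialization pattern referred to above: the generic path pays one virtually dispatched, mispredict-prone call per element, while the template path lets the compiler inline the concrete closure:

```cpp
// Generic path: one virtually dispatched call per element.
struct OopClosure {
  virtual void do_oop(void** p) = 0;
  virtual ~OopClosure() {}
};

void iterate_generic(void** begin, void** end, OopClosure* cl) {
  for (void** p = begin; p != end; ++p) cl->do_oop(p);  // indirect call
}

// Specialized path: the concrete closure type is known at compile time,
// so do_oop() can be inlined and the indirect branch disappears.
template <class ClosureType>
void iterate_specialized(void** begin, void** end, ClosureType* cl) {
  for (void** p = begin; p != end; ++p) cl->do_oop(p);  // inlined call
}
```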
Re-enable survivors during the initial-mark pause. Afterwards, the concurrent marking threads have to scan them and mark everything reachable from them. The next GC will have to wait for the survivors to be scanned.
Reviewed-by: brutisso, johnc
Remove the separate counting phase of concurrent marking by tracking the amount of marked bytes and the cards spanned by marked objects in marking task/worker thread local data structures, which are updated as individual objects are marked.
Reviewed-by: brutisso, tonyp
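A sketch of the per-task accounting that makes the separate counting phase unnecessary; the structure layout is assumed for illustration:

```cpp
#include <cstddef>
#include <vector>

const size_t CARD_SIZE_BYTES = 512;  // HotSpot's card size

struct MarkingTaskStats {
  size_t marked_bytes;              // bytes of all objects this task marked
  std::vector<bool> cards_spanned;  // one flag per card covered by the heap

  explicit MarkingTaskStats(size_t num_cards)
    : marked_bytes(0), cards_spanned(num_cards, false) {}

  // Updated as each object is marked (obj_size > 0), so no separate
  // counting pass is needed at the end of concurrent marking.
  void note_marked(size_t obj_addr, size_t obj_size) {
    marked_bytes += obj_size;
    size_t first = obj_addr / CARD_SIZE_BYTES;
    size_t last = (obj_addr + obj_size - 1) / CARD_SIZE_BYTES;
    for (size_t c = first; c <= last; ++c) {
      cards_spanned[c] = true;  // caller guarantees c < num_cards
    }
  }
};
```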
During an initial mark pause, signal the Concurrent Mark thread after the pause output from PrintGC/PrintGCDetails is complete.
Reviewed-by: tonyp, brutisso
There is a high number of mispredicted branches associated with calling BitMap::iterate() from within CMBitMapRO::iterate(). Implement a version of CMBitMapRO::iterate() directly, using inline-able routines.
Reviewed-by: tonyp, iveresov
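A sketch of a directly inline-able bitmap walk, assuming the GCC/Clang __builtin_ctzll intrinsic; the Visitor shape is illustrative:

```cpp
#include <cstddef>
#include <cstdint>

// Visitor must provide: bool do_bit(size_t index); returning false aborts.
template <class Visitor>
inline bool iterate_bits(const uint64_t* words, size_t num_words, Visitor& v) {
  for (size_t w = 0; w < num_words; ++w) {
    uint64_t bits = words[w];
    while (bits != 0) {
      size_t bit = (size_t)__builtin_ctzll(bits);  // index of lowest set bit
      if (!v.do_bit(w * 64 + bit)) return false;
      bits &= bits - 1;  // clear the bit we just visited
    }
  }
  return true;
}
```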
This change simplifies the interaction between GC and concurrent marking. By disabling survivor spaces during the initial-mark pause we don't need to propagate marks of objects we copy during each GC (since we never need to copy an explicitly marked object).
Reviewed-by: johnc, brutisso
Store the "next chunk start index" in the length field of the to-space object, instead of the from-space object, so that we can always reliably read the size of all from-space objects.
Reviewed-by: johnc, ysr, jmasa
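A single-threaded sketch of the encoding (the real code claims chunks atomically); keeping the bookkeeping in the to-space copy leaves the from-space length, and hence the object's size, readable throughout:

```cpp
struct ArrayOop {
  int length;  // element count, or (in the to-space copy, mid-copy)
               // the start index of the next chunk to claim
};

const int CHUNK = 1024;  // assumed chunk size in elements

// Claim the next chunk of the array to copy. The bookkeeping lives in the
// to-space copy, so the from-space original keeps its real length and its
// size can always be read reliably.
int claim_chunk(ArrayOop* to_space, int real_length) {
  int start = to_space->length;  // encoded "next chunk start index"
  int next = start + CHUNK;
  to_space->length = (next < real_length) ? next : real_length;
  return start;  // caller copies elements [start, min(next, real_length))
}
```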
Parallelize the removal of self-forwarding pointers etc. by wrapping the code in a HeapRegion closure, which is then wrapped inside an AbstractGangTask.
Reviewed-by: tonyp, iveresov
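A sketch of the wrapping pattern with simplified stand-ins for HeapRegionClosure and AbstractGangTask; the claim counter models how workers divide the regions:

```cpp
#include <atomic>
#include <vector>

struct HeapRegion { /* region metadata elided */ };

struct HeapRegionClosure {
  // e.g. remove the self-forwarding pointers in one region
  virtual void do_heap_region(HeapRegion* r) = 0;
  virtual ~HeapRegionClosure() {}
};

// Plays the AbstractGangTask role: each worker claims regions until the
// shared counter runs past the end.
struct RemoveSelfForwardsTask {
  std::vector<HeapRegion*>& regions;
  std::atomic<size_t> next_region{0};

  explicit RemoveSelfForwardsTask(std::vector<HeapRegion*>& rs)
    : regions(rs) {}

  void work(unsigned worker_id, HeapRegionClosure& cl) {
    (void)worker_id;  // a real task would log or time per worker
    for (size_t i = next_region.fetch_add(1); i < regions.size();
         i = next_region.fetch_add(1)) {
      cl.do_heap_region(regions[i]);
    }
  }
};
```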
Introduce a flag that is set when a heap expansion attempt during a GC fails, so that we do not constantly attempt to expand the heap when it is going to fail anyway. This not only prevents the excessive ergo output (which is generated when a region allocation fails) but also avoids excessive and ultimately unsuccessful expansion attempts.
Reviewed-by: jmasa, johnc
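A sketch of the guard flag; the expand() stub and the names are illustrative:

```cpp
#include <cstddef>

struct HeapSketch {
  bool _expand_heap_after_alloc_failure = true;  // reset at the start of each GC

  bool expand(size_t /*bytes*/) {
    return false;  // stub: pretend the OS refused to give us more memory
  }

  // Called when a region allocation fails during a GC.
  bool try_expand_on_alloc_failure(size_t bytes) {
    if (!_expand_heap_after_alloc_failure) {
      return false;  // an earlier attempt in this GC already failed; skip
                     // both the retry and the ergo log noise
    }
    if (!expand(bytes)) {
      _expand_heap_after_alloc_failure = false;  // remember the failure
      return false;
    }
    return true;
  }
};
```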
Change variables representing the number of GC workers to uint from int and size_t. Change the parameter in work(int i) to work(uint worker_id).
Reviewed-by: brutisso, tonyp
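A before/after sketch of the signature change:

```cpp
typedef unsigned int uint;

// Before: signed index with an uninformative name.
struct GangTaskBefore {
  virtual void work(int i) = 0;
  virtual ~GangTaskBefore() {}
};

// After: unsigned, self-describing parameter; worker counts are uint too.
struct GangTaskAfter {
  virtual void work(uint worker_id) = 0;
  virtual ~GangTaskAfter() {}
};
```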
Serialize the worker threads that are generating output during parallel heap verification to make sure the output is consistent.
Reviewed-by: brutisso, johnc, jmasa
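A sketch of the serialization, using a std::mutex as a stand-in for the HotSpot lock: workers verify their regions in parallel but print under the lock so output from different threads never interleaves:

```cpp
#include <cstdio>
#include <mutex>

static std::mutex g_verify_output_lock;  // illustrative name

void report_verification_failure(unsigned worker_id, const char* msg) {
  std::lock_guard<std::mutex> guard(g_verify_output_lock);
  std::printf("worker %u: %s\n", worker_id, msg);  // whole line, atomically
}
```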