Detailed breakdown of time spent in the evacuation failure handling phases to make the "Other" time roughly correspond to the sum of its parts.
Reviewed-by: jwilhelm, jmasa
The initialization for the remembered set summary data structures used the wrong thread count, i.e. number of worker threads instead of number of refinement threads.
Reviewed-by: brutisso
Add memory consumption breakdown on a per region type in the G1 remembered set summary statistics. This simplifies remembered set memory consumption analysis.
Reviewed-by: brutisso
Add a list of nmethods to the RSet for a region that contain references into the region. Skip scanning the code cache during root scanning and scan the nmethod lists during RSet scanning instead.
Reviewed-by: tschatzl, brutisso, mgerdin, twisti, kvn
In a full GC, move the verification after the GC to after RSet rebuilding. Verify RSet entries during a full GC under control of a flag.
Reviewed-by: tschatzl, brutisso
Fixed the output of G1SummarizeRSetStats: too small datatype for the number of concurrently processed cards, added concurrent remembered set thread time retrieval for Linux and Windows (BSD uses os::elapsedTime() now), and other cleanup. The information presented during VM operation is now relative to the previous output, not always cumulative if G1SummarizeRSetStatsPeriod > 0. At VM exit, the code prints a cumulative summary.
Reviewed-by: johnc, jwilhelm
Refactor G1's hot card cache and card counts table into their own files. Simplify the card counts table, including removing the encoding of the card index in each entry. The card counts table now has a 1:1 correspondence with the cards spanned by heap. Space for the card counts table is reserved from virtual memory (rather than C heap) during JVM startup and is committed/expanded when the heap is expanded. Changes were also reviewed-by Vitaly Davidovich.
Reviewed-by: tschatzl, jmasa
Remove PermGen, allocate meta-data in metaspace linked to class loaders, rewrite GC walking, rewrite and rename metadata to be C++ classes
Co-authored-by: Stefan Karlsson <stefan.karlsson@oracle.com>
Co-authored-by: Mikael Gerdin <mikael.gerdin@oracle.com>
Co-authored-by: Tom Rodriguez <tom.rodriguez@oracle.com>
Reviewed-by: jmasa, stefank, never, coleenp, kvn, brutisso, mgerdin, dholmes, jrose, twisti, roland
7151532: DCmd for hotspot native memory tracking
Implementation of native memory tracking phase 1, which tracks VM native memory usage, and related DCmd
Reviewed-by: acorn, coleenp, fparain
Some minor profile based optimizations. Reduce the number of branches and branch mispredicts by removing some virtual calls, through closure specalization, and refactoring some conditional statements.
Reviewed-by: brutisso, tonyp
Change variables representing the number of GC workers to uint from int and size_t. Change the parameter in work(int i) to work(uint worker_id).
Reviewed-by: brutisso, tonyp
Parallelize the serial code that was used to mark objects reachable from survivor objects in the collection set. Some minor improvments in the timers used to track the freeing of the collection set along with some tweaks to PrintGCDetails.
Reviewed-by: tonyp, brutisso
Major cleanup of the G1CollectorPolicy class. It removes a lot of unused fields and methods and also consolidates replicated information (mainly various ways of counting the number of CSet regions) into one copy.
Reviewed-by: johnc, brutisso
7102445: G1: Unnecessary Resource allocations during RSet scanning
Add a new per-worker thread line in the PrintGCDetails output. GC Worker Other is the difference between the elapsed time for the parallel phase of the evacuation pause and the sum of the times of the sub-phases (external root scanning, mark stack scanning, RSet updating, RSet scanning, object copying, and termination) for that worker. During RSet scanning, stack allocate DirtyCardToOopClosure objects; allocating these in a resource area was causing abnormally high GC Worker Other times while the worker thread freed ResourceArea chunks.
Reviewed-by: tonyp, jwilhelm, brutisso
During remembered set scanning, the reference processor could discover a reference object whose referent was in the process of being copied and so may not be completely initialized. Do not perform reference discovery during remembered set scanning.
Reviewed-by: tonyp, ysr
G1 now uses two reference processors - one is used by concurrent marking and the other is used by STW GCs (both full and incremental evacuation pauses). In an evacuation pause, the reference processor is embedded into the closures used to scan objects. Doing so causes causes reference objects to be 'discovered' by the reference processor. At the end of the evacuation pause, these discovered reference objects are processed - preserving (and copying) referent objects (and their reachable graphs) as appropriate.
Reviewed-by: ysr, jwilhelm, brutisso, stefank, tonyp
Remove two unnecessary iterations over the collection set which are supposed to prepare the RSet's of the CSet regions for parallel iterations (we'll make sure this is done incrementally). I'll piggyback on this CR the removal of the G1_REM_SET_LOGGING code.
Reviewed-by: brutisso, johnc
We should only undirty cards after we decide that they are not on a young region, not before. The fix also includes improvements to the verify_dirty_region() method which print out which cards were not found dirty.
Reviewed-by: johnc, brutisso
7007446: G1: expand the heap with a single step, not one region at a time
Changed G1CollectedHeap::expand() to expand the committed space by calling VirtualSpace::expand_by() once rather than for every region in the expansion amount. This allows the success or failure of the expansion to be determined before creating any heap regions. Introduced a develop flag G1ExitOnExpansionFailure (false by default) that, when true, will exit the VM if the expansion of the committed space fails. Finally G1CollectedHeap::expand() returns a status back to it's caller so that the caller knows whether to attempt the allocation.
Reviewed-by: brutisso, tonyp
An evacuation failure while copying the roots caused an object, A, to be forwarded to itself. During the subsequent RSet updating a reference to A was processed causing the reference to be added to the RSet of A's heap region. As a result of adding to the remembered set we ran into the issue described in 6930581 - the sparse table expanded and the RSet scanning code walked the cards in one instance of RHashTable (_cur) while the occupied() counts the cards in the expanded table (_next).
Reviewed-by: tonyp, iveresov
During RSet updating, when ParallelGCThreads is zero, references that point into the collection set are added directly the referenced region's RSet. This can cause the sparse table in the RSet to expand. RSet scanning and the "occupied" routine will then operate on different instances of the sparse table causing the assert to trip. This may also cause some cards added post expansion to be missed during RSet scanning. When ParallelGCThreads is non-zero such references are recorded on the "references to be scanned" queue and the card containing the reference is recorded in a dirty card queue for use in the event of an evacuation failure. Employ the parallel code in the serial case to avoid expanding the RSets of regions in the collection set.
Reviewed-by: iveresov, ysr, tonyp
The per-worker _new_refs array is used to hold references that point into the collection set. It is populated during RSet updating and subsequently processed. In the event of an evacuation failure it processed again to recreate the RSets of regions in the collection set. Remove the per-worker _new_refs array by processing the references directly. Use a DirtyCardQueue to hold the cards containing the references so that the RSets of regions in the collection set can be recreated when handling an evacuation failure.
Reviewed-by: iveresov, jmasa, tonyp
During concurrent refinment, filter cards in young regions after it has been determined that the region has been allocated from and the young type of the region has been set.
Reviewed-by: iveresov, tonyp, jcoomes
Implemented block-based work stealing. Moved copying during the rset scanning phase to the main copying phase. Made the size of rset table depend on the region size.
Reviewed-by: apetrusenko, tonyp
The first worker thread is delayed when entering the GC because it clears the card count table that is used in identifying hot cards. Replace the card count table with a dynamically sized evicting hash table that includes an epoch based counter.
Reviewed-by: iveresov, tonyp