Instead of globally waiting for completion of strong nmethod processing during evacuation, synchronize the nmethods processing on a per-nmethod basis so that only one thread processes one nmethod at once using a state. This state indicates what work (strong/weak processing) needs to be done and what has already been done.
Reviewed-by: sjohanss, kbarrett
Before scanning the heap for roots into the collection set, merge them into a single remembered set (card table) and do work distribution based on location like other collectors do.
Reviewed-by: kbarrett, lkorinth
Collectors like G1 implementing non-contiguous generations previously used an inexact but conservative area for discovery. Concurrent and STW reference processing could discover the same reference multiple times, potentially missing referents during evacuation. So these collectors had to take extra measures while concurrent marking/reference discovery has been running. This change makes discovery exact for G1 (and any collector using non-contiguous generations) so that concurrent discovery and STW discovery discover on strictly disjoint memory areas. This means that the mentioned situation can not occur any more, and extra work is not required any more too.
Reviewed-by: kbarrett, sjohanss