21a6cf848d
Reviewed-by: dlong, dholmes, jpai
Copyright (c) 2015, 2021, Oracle and/or its affiliates. All rights reserved. DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. This code is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 only, as published by the Free Software Foundation. This code is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License version 2 for more details (a copy is included in the LICENSE file that accompanied this code). You should have received a copy of the GNU General Public License version 2 along with this work; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA or visit www.oracle.com if you need additional information or have any questions. DESCRIPTION The purpose of this library is gathering diagnostic information on test failures and timeouts. The library runs platform specific tools, which are configured in the way described below. The collected data will be available in HTML format next to JTR files. The library uses JTHarness Observer and jtreg TimeoutHandler extensions points. DEPENDENCES The library requires jtreg 4b13+ and JDK 15+. BUILDING The library is built using the top level build-test-failure-handler target and is automatically included in the test image and picked up by hotspot and jdk test makefiles. CONFIGURATION Properties files are used to configure the library. They define which actions to be performed in case of individual test failure or timeout. Each platform family uses its own property file (named '<platform>.properties'). For platform independent actions, 'common.properties' is used. Actions to be performed on each failure are listed in 'environment' property. Extra actions for timeouts are listed in 'onTimeout'. Each action is defined via the following parameters: - 'javaOnly' -- run the action only for java applications, false by default - 'app' -- an application to run, mandatory parameter - 'args' -- application command line arguments, none by default - 'params' -- a structure which defines how an application should be run, described below Actions listed in 'onTimeout' are "patterned" actions. Besides the parameters listed above, they also have 'pattern' parameter -- a string which will be replaced by PID in 'args' parameter before action execution. 'params' structure has the following parameters: - repeat -- how many times an action will be run, 1 by default - pause -- delay in ms between iterations, 500 by default - timeout -- time limitation for iteration in ms, 20 000 by default - stopOnError -- if true, an action will be interrupted after the first error, false by default From '<platform>.properties', the library reads the following parameters - 'config.execSuffix' -- a suffix for all binary application file names - 'config.getChildren' -- a "patterned" action used to get the list of all children For simplicity we use parameter values inheritance. This means that we are looking for the most specified parameter value. If we do not find it, we are trying to find less specific value by reducing prefix. For example, if properties contains 'p1=A', 'a.p1=B', 'a.b.p1=C', then parameter 'p1' will be: - 'C' for 'a.b.c' - 'B' for 'a.c' - 'A' for 'b.c' RUNNING To enable the library in jtreg, the following options should be set: - '-timeoutHandlerDir' points to the built jar ('jtregFailureHandler.jar') - '-observerDir' points to the built jar - '-timeoutHandler' equals to jdk.test.failurehandler.jtreg.GatherProcessInfoTimeoutHandler - '-observer' equals to jdk.test.failurehandler.jtreg.GatherDiagnosticInfoObserver In case of environment issues during an action execution, such as missing application, hung application, lack of disk space, etc, the corresponding warning appears and the library proceeds to next action. EXAMPLES $ ${JTREG_HOME}/bin/jtreg -jdk:${JAVA_HOME} \ -timeoutHandlerDir:./image/lib/jtregFailureHandler.jar \ -observerDir:./image/lib/jtregFailureHandler.jar \ -timeoutHandler:jdk.test.failurehandler.jtreg.GatherProcessInfoTimeoutHandler\ -observer:jdk.test.failurehandler.jtreg.GatherDiagnosticInfoObserver \ ${WS}/hotspot/test/ TESTING There are a few make targets for testing the failure_handler itself. - Everything in `test/failure_handler/Makefile` - The `test-failure-handler` target in `make/RunTests.gmk` - The `test` target in `make/test/BuildFailureHandler.gmk` All of these targets are written for manual testing only. They rely on manual inspection of generated artifacts and cannot be run as part of a CI. They are tests which timeout, crash, fail in various ways and you can check the failure_handler output yourself. They might also leave processes running on your machine so be extra careful about cleaning up.