mirror of https://github.com/mongodb/mongo
SERVER-106474 Enforce a maximum idle timeout for resmoke tasks on required variants (#40934)
GitOrigin-RevId: 1df2ed0c9ad0a0aaf89008b9b5fa0830e7fdc609
parent
3be04ed957
commit
851e20dbaf
|
|
@ -253,6 +253,10 @@ WORKSPACE.bazel @10gen/devprod-build @svc-auto-approve-bot
|
||||||
/buildscripts/smoke_tests/**/server_storage_engine_integration.yml @10gen/server-storage-engine-integration @svc-auto-approve-bot
|
/buildscripts/smoke_tests/**/server_storage_engine_integration.yml @10gen/server-storage-engine-integration @svc-auto-approve-bot
|
||||||
/buildscripts/smoke_tests/**/server_ttl.yml @10gen/server-ttl @svc-auto-approve-bot
|
/buildscripts/smoke_tests/**/server_ttl.yml @10gen/server-ttl @svc-auto-approve-bot
|
||||||
|
|
||||||
|
# The following patterns are parsed from ./buildscripts/tests/OWNERS.yml
|
||||||
|
/buildscripts/tests/ @10gen/devprod-build @svc-auto-approve-bot
|
||||||
|
/buildscripts/tests/test_evergreen_task_timeout.py @10gen/devprod-correctness @svc-auto-approve-bot
|
||||||
|
|
||||||
# The following patterns are parsed from ./buildscripts/tests/burn_in_tests_end2end/OWNERS.yml
|
# The following patterns are parsed from ./buildscripts/tests/burn_in_tests_end2end/OWNERS.yml
|
||||||
/buildscripts/tests/burn_in_tests_end2end/ @10gen/devprod-correctness @svc-auto-approve-bot
|
/buildscripts/tests/burn_in_tests_end2end/ @10gen/devprod-correctness @svc-auto-approve-bot
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -50,7 +50,7 @@ DEFAULT_NON_REQUIRED_BUILD_TIMEOUT = timedelta(hours=2)
|
||||||
|
|
||||||
# An idle timeout will expire in the presence of an exceptionally long running test in a resmoke task.
|
# An idle timeout will expire in the presence of an exceptionally long running test in a resmoke task.
|
||||||
# This helps prevent the introduction of new long-running tests in required build variants.
|
# This helps prevent the introduction of new long-running tests in required build variants.
|
||||||
DEFAULT_REQUIRED_BUILD_IDLE_TIMEOUT = timedelta(minutes=16)
|
MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT = timedelta(minutes=16)
|
||||||
|
|
||||||
|
|
||||||
class TimeoutOverride(BaseModel):
|
class TimeoutOverride(BaseModel):
|
||||||
|
|
@ -296,14 +296,14 @@ class TaskTimeoutOrchestrator:
|
||||||
LOGGER.info("Overriding configured timeout", idle_timeout_secs=override.total_seconds())
|
LOGGER.info("Overriding configured timeout", idle_timeout_secs=override.total_seconds())
|
||||||
determined_timeout = override
|
determined_timeout = override
|
||||||
|
|
||||||
elif self._is_required_build_variant(variant) and (
|
if self._is_required_build_variant(variant) and (
|
||||||
determined_timeout is None or determined_timeout > DEFAULT_REQUIRED_BUILD_IDLE_TIMEOUT
|
determined_timeout is None or determined_timeout > MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT
|
||||||
):
|
):
|
||||||
LOGGER.info(
|
LOGGER.info(
|
||||||
"Overriding required-builder idle timeout",
|
"Overriding required-builder idle timeout",
|
||||||
idle_timeout_secs=DEFAULT_REQUIRED_BUILD_IDLE_TIMEOUT.total_seconds(),
|
idle_timeout_secs=MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT.total_seconds(),
|
||||||
)
|
)
|
||||||
determined_timeout = DEFAULT_REQUIRED_BUILD_IDLE_TIMEOUT
|
determined_timeout = MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT
|
||||||
|
|
||||||
return determined_timeout
|
return determined_timeout
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,8 @@
|
||||||
|
version: 2.0.0
|
||||||
|
filters:
|
||||||
|
- "*":
|
||||||
|
approvers:
|
||||||
|
- 10gen/devprod-build
|
||||||
|
- "test_evergreen_task_timeout.py":
|
||||||
|
approvers:
|
||||||
|
- 10gen/devprod-correctness
|
||||||
|
|
@ -409,7 +409,7 @@ class TestDetermineIdleTimeout(unittest.TestCase):
|
||||||
build_variant="variant-required",
|
build_variant="variant-required",
|
||||||
display_name="! required",
|
display_name="! required",
|
||||||
timeout_override=None,
|
timeout_override=None,
|
||||||
expected_timeout=under_test.DEFAULT_REQUIRED_BUILD_IDLE_TIMEOUT,
|
expected_timeout=under_test.MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT,
|
||||||
)
|
)
|
||||||
|
|
||||||
def test_prefer_shorter_that_default_on_required_variants(self):
|
def test_prefer_shorter_that_default_on_required_variants(self):
|
||||||
|
|
|
||||||
|
|
@ -1,35 +1,30 @@
|
||||||
# Evergreen Task Timeouts
|
# Evergreen Task Timeouts
|
||||||
|
|
||||||
## Type of timeouts
|
## Types of timeouts
|
||||||
|
|
||||||
There are two types of timeouts that [evergreen supports](https://github.com/evergreen-ci/evergreen/wiki/Project-Commands#timeoutupdate):
|
There are two types of timeouts that [Evergreen supports](https://github.com/evergreen-ci/evergreen/wiki/Project-Commands#timeoutupdate):
|
||||||
|
|
||||||
- **Exec timeout**: The _exec_ timeout is the overall timeout for a task. Once the total runtime for
|
- **Exec Timeout**: The _exec timeout_ is the overall timeout for a task. Once the total runtime for a test exceeds this value, the timeout logic will be triggered. This value is specified by `exec_timeout_secs` in the Evergreen configuration.
|
||||||
a test hits this value, the timeout logic will be triggered. This value is specified by
|
- **Idle Timeout**: The _idle timeout_ is the amount of time Evergreen will wait for output to be generated before considering the task hung and triggering the timeout logic. This value is specified by `timeout_secs` in the Evergreen configuration.
|
||||||
**exec_timeout_secs** in the evergreen configuration.
|
|
||||||
- **Idle timeout**: The _idle_ timeout is the amount of time in which evergreen will wait for
|
|
||||||
output to be created before it considers the task hung and triggers timeout logic. This value
|
|
||||||
is specified by **timeout_secs** in the evergreen configuration.
|
|
||||||
|
|
||||||
**Note**: In most cases, **exec_timeout** is usually the more useful of the timeouts.
|
**Note**: In most cases, the **exec timeout** is the more useful of the two timeouts.
|
||||||
|
|
||||||
## Setting the timeout for a task
|
## Setting the timeout for a task
|
||||||
|
|
||||||
There are a few ways in which the timeout can be determined for a task running in evergreen.
|
There are several ways to set the timeout for a task running in Evergreen.
|
||||||
|
|
||||||
- **Specified in 'etc/evergreen.yml'**: Timeout can be specified directly in the 'evergreen.yml' file,
|
### Specifying timeouts in the Evergreen YAML configuration
|
||||||
both on tasks and build variants. This can be useful for setting default timeout values, but is limited
|
|
||||||
since different build variants frequently have different runtime characteristics and it is not possible
|
|
||||||
to set timeouts for a task running on a specific build variant.
|
|
||||||
|
|
||||||
- **etc/evergreen_timeouts.yml**: The 'etc/evergreen_timeouts.yml' file for overriding timeouts
|
Timeouts can be specified directly in the `evergreen.yml` (and related) files, both for tasks and build variants. This approach is useful for setting default timeout values but is limited because different build variants often have varying runtime characteristics. This means it is not possible to set timeouts for a specific task running on a specific build variant using only this method.
|
||||||
for specific tasks on specific build variants. This provides a work-around for the limitations of
|
|
||||||
specifying the timeouts directly in the 'evergreen.yml'. In order to use this method, the task
|
|
||||||
must run the "determine task timeout" and "update task timeout expansions" functions at the beginning
|
|
||||||
of the task evergreen definition. Most resmoke tasks already do this.
|
|
||||||
|
|
||||||
- **buildscripts/evergreen_task_timeout.py**: This is the script that reads the 'etc/evergreen_timeouts.yml'
|
### Overrides: [etc/evergreen_timeouts.yml](../../etc/evergreen_timeouts.yml)
|
||||||
file and calculates the timeout to use. Additionally, it will check the historic test results of the
|
|
||||||
task being run and see if there is enough information to calculate timeouts based on that. It can
|
The `etc/evergreen_timeouts.yml` file allows overriding timeouts for specific tasks on specific build variants. This workaround helps address the limitations of directly specifying timeouts in `evergreen.yml`. To use this method, the task must include the `determine task timeout` and `update task timeout expansions` functions at the beginning of its Evergreen definition. Many Resmoke tasks already incorporate these functions.
|
||||||
also be used for more advanced ways of determining timeouts (e.g. the script is used to set much
|
|
||||||
more aggressive timeouts on tasks that are run in the commit-queue).
|
### Resmoke tasks: [buildscripts/evergreen_task_timeout.py](../../buildscripts/evergreen_task_timeout.py)
|
||||||
|
|
||||||
|
This script reads the `etc/evergreen_timeouts.yml` file to calculate the appropriate timeout settings. Additionally, it checks historical test results for the task being run to determine if enough information is available to calculate timeouts based on past data. The script also supports more advanced methods of determining timeouts, such as applying aggressive timeout measures for tasks executed in the commit queue or on required build variants. In cases of conflict, the commit queue and required build variant limits take precedence over the previous two methods.
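
The precedence rule for required build variants can be illustrated with a minimal sketch. It mirrors the condition shown in the `evergreen_task_timeout.py` hunk of this commit, but the helper name `cap_idle_timeout` and its `is_required_variant` parameter are hypothetical and are not part of the real script. Only the constant name and its 16-minute value come from the diff.

```python
from datetime import timedelta
from typing import Optional

# Name and value taken from the diff in this commit.
MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT = timedelta(minutes=16)


def cap_idle_timeout(
    determined_timeout: Optional[timedelta], is_required_variant: bool
) -> Optional[timedelta]:
    """Hypothetical helper: clamp an idle timeout on required build variants.

    `determined_timeout` stands for whatever an override or historic data
    produced earlier; None means nothing was determined.
    """
    if is_required_variant and (
        determined_timeout is None
        or determined_timeout > MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT
    ):
        # The required-variant maximum wins over a missing or longer value.
        return MAXIMUM_REQUIRED_BUILD_IDLE_TIMEOUT
    # Shorter timeouts, and all timeouts on other variants, pass through unchanged.
    return determined_timeout


# A 30-minute override on a required variant is reduced to the maximum.
capped = cap_idle_timeout(timedelta(minutes=30), is_required_variant=True)
```

Because the check is a plain `if` rather than an `elif` attached to the override branch, a configured override that exceeds the maximum is still clamped on required variants, while a shorter override is preserved.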
|
||||||
|
|
||||||
|
### Compile tasks: [evergreen/generate_override_timeout.py](../../evergreen/generate_override_timeout.py)
|
||||||
|
|
||||||
|
This script is used for compile tasks defined in files such as `etc/evergreen_yml_components/tasks/compile_tasks.yml` and `etc/evergreen_yml_components/tasks/compile_tasks_shared.yml`. The script reads the `etc/evergreen_timeouts.yml` file and calculates appropriate timeouts. The Evergreen function `override task timeout` then runs this script to update the timeouts accordingly.
|
||||||
|
|
|
||||||