Commit 9b967cc
OCTRL-949 [core] Improve reaction to controlled nodes becoming unreachable
Includes:
- fixed copy-pasted log messages: "received executor failed" -> "received agent failed"
- added an operator log in case of connection issues to a Mesos slave
- allowed agent and executor IDs to be re-registered for a Task once they come back (they are removed when an Agent/Executor failure is received). Effectively, this allows an environment to be torn down correctly, fixing at least some of the leftover task issues (OCTRL-611).
- added documentation about configuring the node-down timeout
1 parent 4451613 commit 9b967cc
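The re-registration behaviour described in the commit message can be illustrated with a minimal Go sketch. This is not the code from the commit: the `Task` type, its fields, and the method names below are hypothetical, and the sketch assumes that an agent failure clears both IDs while an executor failure clears only the executor ID. It only shows the idea that cleared IDs can be filled in again once the agent or executor is heard from, so that the task becomes addressable for teardown.

```go
package main

import "fmt"

// Task is a hypothetical, simplified stand-in for a controlled task.
// An empty ID means the corresponding agent or executor is considered lost.
type Task struct {
	Name       string
	AgentID    string
	ExecutorID string
}

// OnAgentFailure clears both IDs (assumption: a lost agent implies a lost executor).
func (t *Task) OnAgentFailure() {
	t.AgentID = ""
	t.ExecutorID = ""
}

// OnExecutorFailure clears only the executor ID.
func (t *Task) OnExecutorFailure() {
	t.ExecutorID = ""
}

// Reregister fills in IDs that were previously cleared, leaving known IDs untouched.
// It reports whether anything changed.
func (t *Task) Reregister(agentID, executorID string) bool {
	changed := false
	if t.AgentID == "" && agentID != "" {
		t.AgentID = agentID
		changed = true
	}
	if t.ExecutorID == "" && executorID != "" {
		t.ExecutorID = executorID
		changed = true
	}
	return changed
}

// Reachable reports whether the task can currently be addressed,
// for example to receive a teardown command.
func (t *Task) Reachable() bool {
	return t.AgentID != "" && t.ExecutorID != ""
}

func main() {
	task := &Task{Name: "example-task", AgentID: "agent-1", ExecutorID: "exec-1"}

	task.OnAgentFailure()
	fmt.Println(task.Reachable()) // false

	// The node comes back and an update from the same agent/executor arrives.
	task.Reregister("agent-1", "exec-1")
	fmt.Println(task.Reachable()) // true
}
```

Without the re-registration step, a task whose IDs were cleared would stay unaddressable even after the node returns, which is one plausible way for leftover tasks to survive an environment teardown.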
File tree
4 files changed: +24 -3 lines changed
- core
- environment
- task
- docs/handbook
| Diff | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 116-128 | 116-128 | lines 119 and 125 each replaced (2 removed, 2 added) |
| 2 | 1047-1052 | 1047-1059 | 7 lines added (new lines 1050-1056) |
| 3 | 246-251 | 246-254 | 3 lines added (new lines 249-251) |
| 4 | 1 | 1-12 | line 1 replaced by 12 lines (1 removed, 12 added) |