TEZ-4689: Introduce Node abstraction for DAGAppMaster instead of sepagrate NodeManager-related fields#461
TEZ-4689: Introduce Node abstraction for DAGAppMaster instead of sepagrate NodeManager-related fields#461Aggarwal-Raghav wants to merge 4 commits intoapache:masterfrom
Conversation
| private final int nmPort; | ||
| private final int nmHttpPort; | ||
| private String nmHost; | ||
| private int nmPort; |
There was a problem hiding this comment.
nmPort is not used anywhere but still keeping getAppNMPort() public api
| if (!isLocal) { | ||
| this.nmHost = nodeContext.getNodeHostString(); | ||
| int nmHttpPort = Integer.parseInt(nodeContext.getNodeHttpPortString()); | ||
| this.containerLogs = |
There was a problem hiding this comment.
this.containerLogs is making use of nmHttpPort and nmHost, moved it from constructor to serviceInit() and made nmHttpPort as local variable
c7923b1 to
fca17d7
Compare
| DAGProtos.ConfigurationProto confProto = amExtensions.loadConfigurationProto(); | ||
| TezUtilsInternal.addUserSpecifiedTezConfiguration(conf, confProto.getConfKeyValuesList()); | ||
|
|
||
| NodeContext nodeContext = new YarnNodeManagerContext(); |
There was a problem hiding this comment.
I don't think we need to wrap in if statement for checking if framework is yarn. As the YarnNodeManagerContext is using supplier it won't be evaluated immediately and evaluation will happen only it !isLocal.
NOTE: in tez-am docker image tez.local.mode = true
There was a problem hiding this comment.
I think we should assume that tez.local.mode is going to become false with this whole initiative eventually, when reaching the optimal state of separate AM and Task containers
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
…rate NodeManager-related fields
fca17d7 to
5e12682
Compare
This comment was marked as outdated.
This comment was marked as outdated.
tez-dag/src/main/java/org/apache/tez/dag/app/LocalNodeContext.java
Outdated
Show resolved
Hide resolved
This comment was marked as outdated.
This comment was marked as outdated.
| // Install the tez class loader, which can be used add new resources | ||
| TezClassLoader.setupTezClassLoader(); | ||
| Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler()); | ||
| final String pid = System.getenv().get("JVM_PID"); |
There was a problem hiding this comment.
while I like using ProcessHandle instead of JVM_PID, I'm not sure if this is 100% correct
I'm concerned by the fact that JVM_PID is used in yarn too, see in hadoop code, e.g. https://github.com/apache/hadoop/blob/f49c49aa083b9324ae432902f94a7aaf98f543ea/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java#L126
also, not 100% sure if JVM_PID belongs to the actual Java process id or the launch_container.sh shell script provided by yarn: even if the shell script must be the parent of the jvm process, they are not the same
There was a problem hiding this comment.
I'll investigate on this and get back to you on this.
There was a problem hiding this comment.
So i tested by putting both
LOG.info("######");
LOG.info("System.getenv().get(\"JVM_PID\") value is: {}", old_pid);
LOG.info("ProcessHandle.current().pid() value is : {}", pid);
LOG.info("######");
the JVM_PID belongs to launch_container.sh and
ProcessHandler one is DagAppMaster
Screenshot confirming the same.


Will revert this change. Thanks for pointing this!!
| public int getAppNMPort() { | ||
| return nmPort; | ||
| return nodeContext.getNodePort(); | ||
| } | ||
|
|
||
| public int getAppNMHttpPort() { | ||
| return nmHttpPort; | ||
| return nodeContext.getNodeHttpPort(); |
There was a problem hiding this comment.
is there a chance we can remove getAppNMPort and getAppNMHttpPort methods altogether?
There was a problem hiding this comment.
i thought so but they are public api, but its a tez major version change so we can do so. I wanted to discuss the same with you. Let me know.
there are few variables that can be made final and local to method instead of class varaibles in DagAppMaster. I wanted to clean that up and use JDK 21 features like enhanced switch but the change would have been outside the scope of PR. I can create a separte PR for code hygiene?
There was a problem hiding this comment.
code hygiene patches are always welcome
IMO you can remove these methods here, and do more in separate PRs
There was a problem hiding this comment.
sure. will update PR shortly and raise separate JIRA/PR for code hygiene. Thanks
|
🎊 +1 overall
This message was automatically generated. |
|
hm, how is this possible: I thought sealed interfaces are present since JDK17, so that must not be a problem...however, it compiled successfully, but the checkstyle:checkstyle target was the one that failed, I hope the same can be reproduced locally |
I'm doing some investigation on checkstyle issue as part of #462 . but using checkstyle 3.6.0 and removing puppycrawler explicit dependency, then its working. |
| NodeContext nodeContext = | ||
| (frameworkService instanceof YarnServerFrameworkService) | ||
| ? new YarnNodeManagerContext() | ||
| : null; |
There was a problem hiding this comment.
I think instead of an instanceof check we should let ServerFrameworkService interface handle this: the advantage of separate framework service implementations is to eliminate these type checks (the ZkStandaloneServerFrameworkService can simply return null or a LocalNodeContext with dummy values)
| } | ||
|
|
||
| @Override | ||
| public String getNodeHostString() { |
There was a problem hiding this comment.
could this be nodeHost? I believe "String" doesn't add any value here, it could be eliminated from variable names as well as interface
| private final Supplier<String> nodeHostStringSupplier; | ||
| private final Supplier<Integer> nodePortStringSupplier; | ||
| private final Supplier<Integer> nodeHttpPortStringSupplier; |
There was a problem hiding this comment.
I think we can also remove "Supplier" suffix from the variable names: by looking at the variables, it should be clear that they are Suppliers
|
@Aggarwal-Raghav : left some minor comments, but this looks good so far |
Thanks! Will update PR I'm some time. |
|
🎊 +1 overall
This message was automatically generated. |
No description provided.