[SPARK-55267][INFRA] Make sbt less verbose in github actions #54052

Closed
gaogaotiantian wants to merge 2 commits into apache:master from gaogaotiantian:quiet-sbt

@gaogaotiantian
Contributor

What changes were proposed in this pull request?

For all GitHub Actions jobs, do not print sbt warning/info logs to stdout.

Why are the changes needed?

We print thousands of warning messages and no one pays attention to them. These lines flood stdout, which makes finding the interesting part very difficult. GitHub also takes seconds just to load the log because there are so many lines.

If we don't care about it, just don't print it.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Let's see the result of CI.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions

JIRA Issue Information

=== Improvement SPARK-55267 ===
Summary: Make sbt less verbose on github actions
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the BUILD label Jan 29, 2026
@HyukjinKwon
Member

Hm, actually I think we had better let it complain. People actually care about it.

@gaogaotiantian
Contributor Author

> Hm, actually I think we had better let it complain. People actually care about it.

You mean the 3000 lines of warnings? I don't think anyone can tell whether their new code adds new warnings. With 3000 lines of warnings, people stop caring about them, and I don't see any effort to reduce the number. The only way to make people care about warnings is to have 0 warnings.

We produce the same 3000 lines of warnings for every job - at least let me turn them off for pyspark tests. If people do care about these warnings, they can find them in build jobs or java test jobs.

@HyukjinKwon
Member

people actually take a look and fix them :-). e.g., @LuciferYang

@gaogaotiantian
Contributor Author

Okay then how about we control the behavior with an env var - and enable that on pyspark related actions. I bet @LuciferYang does not fix stuff based on pyspark actions :) The same warnings should be in pure build actions and sql actions. If people are not complaining about the overhead to scroll over 3000 lines they can keep that :)
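A minimal sketch of the opt-in idea above, assuming the build is launched from a Python helper. The variable name `SPARK_SBT_QUIET` and the function `sbt_command` are hypothetical and do not come from this PR; `--error` is sbt's command-line option for error-only logging, and the sketch assumes Spark's `./build/sbt` wrapper forwards it unchanged.

```python
import os

# Hypothetical variable name; the thread does not specify one.
QUIET_ENV_VAR = "SPARK_SBT_QUIET"


def sbt_command(goals, env=None):
    """Build an sbt invocation, lowering the log level only when opted in."""
    env = os.environ if env is None else env
    cmd = ["./build/sbt"]
    if env.get(QUIET_ENV_VAR, "").lower() in ("1", "true"):
        # Error-only logging; assumes the wrapper passes the flag through to sbt.
        cmd.append("--error")
    return cmd + list(goals)


if __name__ == "__main__":
    # A pyspark job would set the variable; build and SQL jobs would not.
    print(sbt_command(["Test/package"], env={QUIET_ENV_VAR: "1"}))
    print(sbt_command(["Test/package"], env={}))
```

With this shape, only the workflows that export the variable lose the warnings, so the full output stays available in the pure build and SQL jobs.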

@HyukjinKwon
Member

wdyt @dongjoon-hyun @LuciferYang ?

@LuciferYang
Contributor

Yes, I do pay attention to compilation warning logs: After fixing some compilation warnings, such as those related to the use of deprecated APIs, I confirm whether the fixes meet expectations by checking the output results from Github Actions. Therefore, I don't recommend only printing logs at the error level.

@LuciferYang
Contributor

LuciferYang commented Jan 30, 2026

> Okay then how about we control the behavior with an env var - and enable that on pyspark related actions. I bet @LuciferYang does not fix stuff based on pyspark actions :) The same warnings should be in pure build actions and sql actions. If people are not complaining about the overhead to scroll over 3000 lines they can keep that :)

Is it possible to ensure that these compilation logs are printed normally in at least one Task?

@gaogaotiantian
Contributor Author

> Is it possible to ensure that these compilation logs are printed normally in at least one Task?

Pull request checks are comprehensive so you'll find the logs in many build tasks like - https://github.com/gaogaotiantian/spark/actions/runs/21501114389/job/61948012217 . That should help you with comparing logs.

If you don't need the commit check on master (after a PR is merged to master, a test will run to check that specific commit) to have the log, this is sufficient and always available.

@LuciferYang
Contributor

> > Is it possible to ensure that these compilation logs are printed normally in at least one Task?
>
> Pull request checks are comprehensive so you'll find the logs in many build tasks like - https://github.com/gaogaotiantian/spark/actions/runs/21501114389/job/61948012217 . That should help you with comparing logs.
>
> If you don't need the commit check on master (after a PR is merged to master, a test will run to check that specific commit) to have the log, this is sufficient and always available.

What I mean is that you could try to find a way to let at least one Task print the full log while suppressing the logs of the other Tasks.

@HyukjinKwon
Member

Let's just leave it for now. Even if we disable this, we still have to download the log, right? I think we had better make the test report nicer: https://github.com/apache/spark/blob/master/.github/workflows/test_report.yml

@gaogaotiantian
Contributor Author

gaogaotiantian commented Jan 30, 2026

> What I mean is that you could try to find a way to let at least one Task print the full log while suppressing the logs of the other Tasks.

That is more difficult because tasks do not communicate with each other very well. It would be easier if we can turn them off for pyspark tests.

> Let's just leave it for now. Even if we disable this, we still have to download the log, right? I think we had better make the test report nicer

No, we don't need to download the log anymore; it will be on the GitHub Actions result page. We print out information when a test fails. At least for python that's good enough for debugging.

For example, for this failure - https://github.com/apache/spark/actions/runs/21474225951/job/61854343710 - if you scroll down to the bottom, you'll see what the issue is (I have not thought about how to fix it properly, but I know what's going on).

Or this one (in the same workflow) - https://github.com/apache/spark/actions/runs/21474225951/job/61854343627 - where the log is printed out to help you understand what's going on. I never download the log. It's just painful to scroll to the bottom (and my computer sometimes freezes when loading 3k lines of log).

@yaooqinn
Member

FYI, #54069. Hope this feature helps.

@gaogaotiantian
Contributor Author

I think a separate test result report definitely helps. However, that is based on the XML we generate from tests, right? That means it does not contain information from our infra. For example, our test script dumps the python thread stack traces when the test itself times out - there will be no XML file, but we have the information in the test log.
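To illustrate why such information only exists in the log, here is a rough sketch of a watchdog that dumps every thread's stack when a test body overruns its deadline. It uses the standard-library `faulthandler` module as an assumption; the thread does not say how Spark's test script actually produces the dump, and `dump_all_thread_stacks` and `run_with_watchdog` are hypothetical names.

```python
import faulthandler
import tempfile
import threading
import time


def dump_all_thread_stacks():
    """Return the current stack of every Python thread as text."""
    # faulthandler writes to a file descriptor, so use a real temporary file.
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


def run_with_watchdog(func, timeout_s):
    """Run func in a thread; return a stack dump if it overruns, else None."""
    worker = threading.Thread(target=func, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # The test never finished, so no report file exists; the dump is all we have.
        return dump_all_thread_stacks()
    return None


if __name__ == "__main__":
    dump = run_with_watchdog(lambda: time.sleep(1), timeout_s=0.1)
    print(dump)
```

Because the hung test never reaches the point where a result file is written, the dump can only be surfaced through stdout, which is the point made above.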

@yaooqinn
Member

I am not very familiar with the python infra. Here is an example of how a python test failure would be summarized: https://github.com/yaooqinn/spark/actions/runs/21505022749#summary-61961633818

@HyukjinKwon
Member

I still don't feel strongly about it. It seems @allisonwang-db approved this, so I'll leave it to her.

@gaogaotiantian
Contributor Author

This is closed in favor of #54524

HyukjinKwon pushed a commit that referenced this pull request Mar 3, 2026
### What changes were proposed in this pull request?

In GitHub Actions, group the sbt build output so it is collapsible.

### Why are the changes needed?

This PR replaces #54052. The sbt build outputs thousands of lines that many people do not care about. It lags the webpage and makes it difficult for people to find the part they are interested in.

However, some people are interested in the messages, so making `sbt build` completely quiet might not be the best solution.

By making it collapsible, I think everyone gets what they want.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Check the result of the CI of this PR to see the difference.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #54524 from gaogaotiantian/group-sbt-build-message.

Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
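
The grouping described in this commit message can be sketched with GitHub Actions' `::group::` and `::endgroup::` workflow commands, which fold everything printed between them into one collapsible section of the job log. This is only an illustration of the mechanism, not the code merged in #54524; `log_group` and `run_grouped` are hypothetical helper names.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def log_group(title, out=None):
    """Wrap everything printed inside the block in a collapsible log group."""
    out = sys.stdout if out is None else out
    print(f"::group::{title}", file=out, flush=True)
    try:
        yield
    finally:
        # Always close the group, even if the wrapped step raises.
        print("::endgroup::", file=out, flush=True)


def run_grouped(title, cmd):
    """Run a command with its output folded under one group; return its exit code."""
    with log_group(title):
        return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Stand-in for the real sbt invocation.
    run_grouped("sbt build", [sys.executable, "-c", "print('[warn] ...')"])
```

The warnings are still in the log for anyone who expands the section, while the test output after `::endgroup::` stays visible by default.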