[HTTPCLIENT-1843] - Delegate compression handling to Apache Commons Compress#580
[HTTPCLIENT-1843] - Delegate compression handling to Apache Commons Compress#580arturobernalg wants to merge 0 commit into
Conversation
garydgregory
left a comment
There was a problem hiding this comment.
1.27.1 is the current version of Commons Compress.
a5a5890 to
48de43f
Compare
changed. |
|
|
||
| private static final Map<String, String> COMPRESSION_ALIASES; | ||
| static { | ||
| final Map<String, String> aliases = new HashMap<>(); |
There was a problem hiding this comment.
@arturobernalg
Why just those stream types?
How about using:
org.apache.commons.compress.compressors.CompressorStreamProvider.getInputStreamCompressorNames()org.apache.commons.compress.compressors.CompressorStreamProvider.getOutputStreamCompressorNames()- Should I add an API to return
org.apache.commons.compress.compressors.CompressorStreamFactory.ALL_NAMES?
| public InputStream getContent() throws IOException { | ||
| if (super.isStreaming()) { | ||
| if (content == null) { | ||
| content = getDecompressingStream(); |
There was a problem hiding this comment.
If we do not lock, we should document this class as not thread-safe.
300da7f to
0c68966
Compare
|
HI @garydgregory |
There was a problem hiding this comment.
@arturobernalg
Thank you for your update.
- I have comments embedded throughout.
- There is some lazy-initialization code here and there, this should be called out in the class Javadoc making the class thread-unsafe, we have an annotation for that:
org.apache.hc.core5.annotation.ThreadingBehavior.
| final GzipDecompressingEntity gunzipe = new GzipDecompressingEntity(out); | ||
| Assertions.assertEquals("some kind of text", EntityUtils.toString(gunzipe, StandardCharsets.US_ASCII)); | ||
| } | ||
|
|
There was a problem hiding this comment.
Remove extra blank lines, the comments suffice.
d20a4d8 to
4cdc08a
Compare
HI @garydgregory |
a6aab71 to
ebfbd8c
Compare
5b84d95 to
9793eaf
Compare
| * @since 5.5 | ||
| */ | ||
| @Contract(threading = ThreadingBehavior.UNSAFE) | ||
| public class DecompressEntity extends HttpEntityWrapper { |
There was a problem hiding this comment.
The class name is mismatched with the "CompressingEntity" name, either they should both be "...ing" names, or not.
There was a problem hiding this comment.
I git-hub resolved most of the comment I wrote. I think we should consistently name ALL our new classes either with or without the "ing" word part, either [Compress|Decompress]Entity or [Compressing|Decompressing]Entity. I think "ing" is better and in keeping with our existing other HttpEntity implementations like IncomingHttpEntity and DecompressingEntity (conflicting).
I also think that all this new compression/decompression code should live in its own package, maybe org.apache.hc.client5.http.entity.compress.
Please @garydgregory take another look. |
garydgregory
left a comment
There was a problem hiding this comment.
Hi @arturobernalg
Thank you for the updates! 😄
I added more comments throughout.
We're very close I think 😄
TY!
| */ | ||
| public String getFormattedName(final String name) { | ||
| if (name == null || name.isEmpty()) { | ||
| LOG.warn("Compression name is null or empty"); |
There was a problem hiding this comment.
This API should not do any logging, especially at such a high level. This is just a string-to-string conversion API, it has no business rasing warnings IMO.
| Args.notNull(entity, "Entity"); | ||
| Args.notNull(contentEncoding, "Content Encoding"); | ||
| if (!isSupported(contentEncoding, false)) { | ||
| LOG.warn("Unsupported decompression type: {}", contentEncoding); |
There was a problem hiding this comment.
This class should not do any logging IMO, especially warnings, it should throw an illegal argument exception or let the call site deal with a null result.
| private Set<String> fetchAvailableInputProviders() { | ||
| final Set<String> inputNames = compressorStreamFactory.getInputStreamCompressorNames(); | ||
| return inputNames.stream() | ||
| .map(String::toLowerCase) |
There was a problem hiding this comment.
This might have some odd results depending on the locale context (https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/) if you do not use a Locale like the PR does above using ROOT.
| private Set<String> fetchAvailableOutputProviders() { | ||
| final Set<String> outputNames = compressorStreamFactory.getOutputStreamCompressorNames(); | ||
| return outputNames.stream() | ||
| .map(String::toLowerCase) |
| * | ||
| * @return a set of available output stream compression providers in lowercase. | ||
| */ | ||
| private Set<String> fetchAvailableOutputProviders() { |
There was a problem hiding this comment.
The "fetch" method name prefix is unusual and surprising to me, I would follow the "get" naming convention.
| * Common base class for decompressing {@link HttpEntity} implementations. | ||
| * | ||
| * @since 4.4 | ||
| * @deprecated |
There was a problem hiding this comment.
The Javadoc tag is missing its comment to redirect reader.
| try { | ||
| compressorStream = CompressingFactory.INSTANCE.getCompressorOutputStream(contentEncoding, outStream); | ||
| } catch (final CompressorException e) { | ||
| throw new IOException("Error initializing decompression stream", e); |
There was a problem hiding this comment.
Don't you mean "Error initializing compression stream"?
| // Write compressed data | ||
| super.writeTo(compressorStream); | ||
| // Close the compressor stream after writing | ||
| compressorStream.close(); |
There was a problem hiding this comment.
Could this method be rewritten to use a try-with-resources block?
| } | ||
|
|
||
| /** | ||
| * Returns the formatted name of the provided compression format. |
There was a problem hiding this comment.
I don't understand what this Javadoc means. What's a "formatted" name? Maybe the API is misnamed? Is the job of the method to map from a content type value to a Commons Compress key? Is it something else? I think the comment should at least make it clear, which should guide us to a better API name.
| final String formattedName = getFormattedName(name); | ||
| return isSupported(formattedName, false) | ||
| ? createCompressorInputStream(formattedName, inputStream, noWrap) | ||
| : inputStream; |
There was a problem hiding this comment.
The Javadoc and implementation don't match:
- Javadoc states decompression is performed
- The code doesn't decompress if the format isn't supported.
Either the Javadoc needs updating, or, more likely, we should throw an exception indicating the operation is not supported for that format type name.
I must say I find the API and Javadoc confusing, we are creating a "Compressor" input stream in order to "decompress" with a "compression" format! What?
- Maybe calling the API
getDecompressorInputStreamwould help because it "decompresses". I expect a "Compressor" to compress, not the other way around. - Maybe using the term "format type name" instead "compression format" would remove the need for the reader to constantly parse "compress" vs. "decompress" in the text.
- Maybe only using one term "decompress" or "compress" for one method (Javadoc and code) would alleviate the issue.
There was a problem hiding this comment.
Thanks, @garydgregory . Agreed—the Javadoc and implementation don’t match. We’ll:
-
Throw an exception on unsupported formats instead of returning the original stream, aligning with Javadoc expectations.
-
Rename to getDecompressorInputStream to reflect decompression.
-
Standardize to “format type” instead of “compression format” to clarify.
These changes should make both the API and docs clearer. Thanks for flagging this.
There was a problem hiding this comment.
HI @garydgregory
Can you do another pass??
Thank you.
197c21a to
63088bc
Compare
63088bc to
153155f
Compare
garydgregory
left a comment
There was a problem hiding this comment.
- TY for the work 👍
- Minor: For the Javadoc
sincetags, I think the next feature release will be5.5since the current release is5.4.1and the git master POM already has 5.5 in it. - Side note: In Commons Compress,
CompressorExceptionusually wraps anIOException, so it's too bad that we wrap it in another IOException here ;-) Since compression and decompression is IO, I'll see about making CompressorException extend IOException.
ea08574 to
363c7ca
Compare
Hi @garydgregory , Let me know if there’s anything else to address. |
|
Hi @arturobernalg, What do you think about apache/commons-compress#605 |
Hi @garydgregory |
|
Thank you for the review on apache/commons-compress#605, I merged that PR. |
All right. Let me know if we need to do another change. |
363c7ca to
15a49eb
Compare
|
Hi @garydgregory , do I need to do anything else for PR #580? Any timing to validate it? Thanks! |
|
@ok2c would you mind to validate this PR. |
@arturobernalg Sure. Will do. |
ok2c
left a comment
There was a problem hiding this comment.
@arturobernalg Unfortunately CompressingFactory in its present form does not work if commons-compress is not present on the classpath, but it should be
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/compress/compressors/deflate/DeflateCompressorInputStream
at org.apache.hc.client5.http.impl.classic.ContentCompressionExec.<init>(ContentCompressionExec.java:90)
at org.apache.hc.client5.http.impl.classic.ContentCompressionExec.<init>(ContentCompressionExec.java:106)
at org.apache.hc.client5.http.impl.classic.HttpClientBuilder.build(HttpClientBuilder.java:1048)
at org.apache.hc.client5.http.impl.classic.HttpClients.createDefault(HttpClients.java:57)
at org.example.App.main(App.java:16)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.compress.compressors.deflate.DeflateCompressorInputStream
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 5 more
Your code must use reflection to see if commons-compress is present before trying to call any of its functions and it must not fail if it is not present.
There is still a lot of work that needs to be done here.
| </dependency> | ||
| <dependency> | ||
| <groupId>org.apache.commons</groupId> | ||
| <artifactId>commons-compress</artifactId> |
There was a problem hiding this comment.
@arturobernalg This is wrong. commos-compress must be optional
| <groupId>org.apache.commons</groupId> | ||
| <artifactId>commons-compress</artifactId> | ||
| <version>${commpress.version}</version> | ||
| <optional>true</optional> |
There was a problem hiding this comment.
@arturobernalg I do not think this has any effect. Anyway, dependencyManagement section is the wrong place to declare artifact's optionality. This needs to be declared in the module's dependecies section, not here.
649c60b to
cf33879
Compare
363b15c to
c5bd9af
Compare
This PR addresses HTTPCLIENT-1843, allowing Apache HttpClient to delegate compression handling to Apache Commons Compress. The goal is to provide broader support for various compression algorithms while reducing custom implementation and maintenance.