Keep the symmetry with set_request_body(), which encodes the unicode object (Robot Framework defaults to unicode for all strings) into an instance of str: decode the response into an instance of unicode in get_response_body(). Failure to do so causes UnicodeDecodeErrors in all keywords that compare the response body against a given string if either of the two contains non-ASCII characters.
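The symmetry described above can be sketched roughly as follows. This is only an illustration, not the actual HttpLibrary code: the class `BodyHolder`, its `receive()` helper and the UTF-8 default are assumptions made for the example. It is written so that it also runs on Python 3, where `str`/`bytes` play the roles of Python 2's `unicode`/`str`.

```python
# -*- coding: utf-8 -*-


class BodyHolder(object):
    """Hypothetical stand-in for the library's request/response state."""

    def __init__(self, charset="utf-8"):
        # Assumption: UTF-8 unless stated otherwise.
        self.charset = charset
        self._request_body = None
        self._response_body = None

    def set_request_body(self, text):
        # Text (unicode) is encoded to bytes before it reaches the HTTP layer.
        self._request_body = text.encode(self.charset)

    def receive(self, raw_bytes):
        # Hypothetical helper: store the raw bytes of a response.
        self._response_body = raw_bytes

    def get_response_body(self):
        # Mirror image of set_request_body(): bytes are decoded back to text,
        # so comparisons against Robot Framework strings are text vs. text.
        return self._response_body.decode(self.charset)


holder = BodyHolder()
holder.set_request_body(u"gr\u00fc\u00df")
holder.receive(holder._request_body)  # pretend the server echoes the body
assert holder.get_response_body() == u"gr\u00fc\u00df"
```

With both directions handled explicitly, a keyword that compares the response body against a given string never has to rely on an implicit ASCII conversion.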
Robot Framework defaults to unicode objects for all strings created. As
a result, httplib.py:848 has unicode += str (msg += message_body), which
crashes with UnicodeDecodeError when the body contains non-ASCII
characters:
  File "/usr/lib64/python2.7/httplib.py", line 1001, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python2.7/httplib.py", line 1035, in _send_request
    self.endheaders(body)
  File "/usr/lib64/python2.7/httplib.py", line 997, in endheaders
    self._send_output(message_body)
  File "/usr/lib64/python2.7/httplib.py", line 848, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 32: ordinal not in range(128)
A similar stack trace has been posted as part of Python bug #11898 [1]
(duplicated by #12398 [2]) which has later been closed as a programming
error.
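For illustration, the decoding step behind this kind of traceback can be reproduced in isolation. In Python 2, concatenating a unicode object with a str makes the interpreter decode the str with the ASCII codec first; the sketch below performs that decode explicitly so that it behaves the same on Python 2 and 3. The request line and body are made-up sample values, not taken from the failing test.

```python
# -*- coding: utf-8 -*-

# Sample body whose UTF-8 encoding starts with the byte 0xef (assumed data).
body = u"\ufeffhello".encode("utf-8")

# What Python 2 does implicitly for `unicode += str`: decode the bytes as ASCII.
try:
    body.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
else:
    message = None

print(message)

# Keeping both sides as bytes avoids the implicit decode altogether.
request_line = u"POST / HTTP/1.1\r\n\r\n"  # hypothetical header block
fixed = request_line.encode("ascii") + body
```

The same principle applies to the headers and body handed to httplib: if everything is already an encoded byte string, no implicit conversion is attempted.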
Both robotframework-httplibrary and webtest treat the 'Content-Type'
header specially, which results in surprising behavior in the tests:
had the 'Accept' header been removed from the provided test case, no
UnicodeDecodeError would have been thrown in httplib.py:848.
Turning unicode objects into str objects is the right thing to do
according to the HTTP standard ([3], obsoleted by [5]) because it
requires conformity to MIME [4] (hint by [6]).
[1] http://bugs.python.org/issue11898
[2] http://bugs.python.org/issue12398
[3] https://tools.ietf.org/html/rfc2616
[4] https://tools.ietf.org/html/rfc2047
[5] https://tools.ietf.org/html/rfc7230
[6] https://stackoverflow.com/a/5426648
Hi @peritus,
please consider these fixes regarding Unicode in request/response bodies for merging.
It seems likely that this PR fixes the issues mentioned by @jinlxz in #18 (comment).
Btw, while debugging this I found that the special treatment regarding 'Content-Type' originally
added in 7cf67c8 for POST and in 04c9552 for PUT may have become superfluous.
Thanks
Oliver