Use memcpy in uninitialized_meow #6161
Open
AlexGuteniev wants to merge 3 commits into microsoft:main from
memcpy in uninitialized_copy, uninitialized_copy_n, uninitialized_move, uninitialized_move_n
memcpy in uninitialized_meow
Resolves #6029
`memcpy` behaves the same as `memmove` when it is emitted as an ordinary library call. However, intrinsic optimizations for a known size are possible for `memcpy` but not for `memmove`. These optimizations happen when the size is known at compile time (a compile-time constant or a propagated constant) and is small enough (the boundary is on the order of a kilobyte).
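To make the distinction concrete, here is a minimal sketch of the two calls with a compile-time-constant size. The `Packet` type and both function names are hypothetical and exist only for illustration; whether a given compiler actually expands either call inline depends on the compiler and its settings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size record; 32 bytes is far below the
// roughly-a-kilobyte boundary mentioned in the description.
struct Packet {
    unsigned char bytes[32];
};

// The size is a compile-time constant and the caller promises that the
// ranges do not overlap, so the compiler may expand this call inline
// instead of calling the library function.
inline void copy_no_overlap(Packet& dst, const Packet& src) noexcept {
    assert(&dst != &src); // memcpy requires non-overlapping ranges
    std::memcpy(&dst, &src, sizeof(Packet));
}

// Same constant size, but memmove must also be correct for overlapping
// ranges; per the description, it does not get the same expansion.
inline void copy_may_overlap(Packet& dst, const Packet& src) noexcept {
    std::memmove(&dst, &src, sizeof(Packet));
}
```

Both functions produce the same bytes in `dst`; the only difference is the overlap guarantee the compiler can rely on. Comparing the generated assembly for the two is one way to check what a particular toolchain does.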
The uninitialized algorithms require non-overlapping ranges: see [uninitialized.copy]/1 and the corresponding paragraphs for the others. An overlap between initialized memory and uninitialized memory would have been strange anyway.
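As an illustration of why the precondition holds naturally in practice, the sketch below copies into freshly allocated raw storage, which cannot overlap the already-initialized source. The function name is hypothetical; only standard library calls are used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Copies four ints into newly allocated, uninitialized storage and
// returns their sum. The destination is a separate allocation, so the
// source and destination ranges are disjoint by construction.
inline int sum_after_uninitialized_copy() {
    const int src[4] = {1, 2, 3, 4};
    std::allocator<int> alloc;
    int* raw = alloc.allocate(4); // uninitialized storage
    int* end = std::uninitialized_copy(src, src + 4, raw);
    assert(end == raw + 4);

    int sum = 0;
    for (int* p = raw; p != end; ++p) {
        sum += *p;
    }

    // int is trivially destructible, so no destructor calls are needed
    // before the storage is released.
    alloc.deallocate(raw, 4);
    return sum;
}
```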
`_Copy_memcpy`/`_Copy_memcpy_n` did not exist, so I've added them. They might be useful in other places besides the uninitialized algorithms, as some other algorithms also have a useful non-overlap guarantee, but for now let's focus on the uninitialized algorithms.

Benchmark results show an overall improvement, although at a larger size, 800 bytes, the result flips. The compiler's internal threshold for always using the library `memcpy` should probably be set to a lower value. For very large amounts of data, where the compiler does emit a `memcpy` call, the results before and after the change match.
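For readers unfamiliar with this kind of helper, the following is a rough sketch of what a `memcpy`-based copy for trivially copyable elements could look like. It is not the code from this PR: the names `copy_memcpy_sketch` and `copy_memcpy_n_sketch`, their signatures, and the raw-pointer interface are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a memcpy-based copy helper; not the STL's code.
// Copies [first, last) to dest and returns one past the last element written.
// Precondition: the source and destination ranges do not overlap.
template <class T>
T* copy_memcpy_sketch(const T* first, const T* last, T* dest) noexcept {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(first <= last);
    const std::size_t count = static_cast<std::size_t>(last - first);
    if (count != 0) { // avoid handing null pointers to memcpy for empty ranges
        std::memcpy(dest, first, count * sizeof(T));
    }
    return dest + count;
}

// Counted variant, mirroring the "_n" flavour of the algorithms.
template <class T>
T* copy_memcpy_n_sketch(const T* first, std::size_t count, T* dest) noexcept {
    return copy_memcpy_sketch(first, first + count, dest);
}
```

When the element count is a constant visible to the optimizer, the byte count passed to `memcpy` is constant too, which is the situation where the inline expansion described earlier can apply.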