Skip to content

Implement end-to-end encryption#168

Open
KubaZ2 wants to merge 68 commits intoalphafrom
feature/e2ee
Open

Implement end-to-end encryption#168
KubaZ2 wants to merge 68 commits intoalphafrom
feature/e2ee

Conversation

@KubaZ2
Copy link
Member

@KubaZ2 KubaZ2 commented Sep 12, 2025

No description provided.

@github-actions
Copy link

The documentation preview is available at https://preview.netcord.dev/168.

KubaZ2 and others added 27 commits September 12, 2025 22:19
Refactored encryption/decryption APIs to accept media type and return detailed result codes. Updated error handling in VoiceOutStream and VoiceClient for retries and exceptions. Changed native library reference from "dave.dll" to "libdave.dll" and removed the old DLL. Improves flexibility and error reporting for voice operations.
Add session.OnSpeaking call on Speaking event to notify DaveSession of active speakers. Refactor ClientDisconnect handling to ensure user disconnect event is awaited before calling session.OnClientDisconnect, improving event flow and consistency.
Refactored Dave interop to use BufferHandle<T> SafeHandle for native buffer management, replacing raw pointer outputs with safer handles and spans. Updated P/Invoke signatures to use Span/ReadOnlySpan where possible. Added new EncryptorHasKeyRatchet and EncryptorIsPassthroughMode APIs. Updated VoiceClient and VoiceCommands to use new buffer patterns and float PCM audio. Updated libdave.dll for compatibility. Improves memory safety and modernizes interop usage.
Introduce TryEncrypt and TryDecrypt methods to IVoiceEncryption, allowing encryption and decryption to signal failure via return value instead of exceptions. Refactor Aes256GcmRtpSizeEncryption and XChaCha20Poly1305RtpSizeEncryption to implement these methods. Update VoiceClient to use TryDecrypt, improving error handling and memory management. Simplify LibsodiumException and adjust WebSocketClient and tests to accommodate the new API.
Refactor VoiceOutStream to encapsulate encryption in a new EncryptDave method, introducing DaveEncryptionResult for buffer/resource management. Replace manual array pooling and in-line encryption with a safer, clearer pattern using using statements and IDisposable. Update to use ReadOnlyMemory/Span for buffers, rename GetMaxCiphertextByteSize to GetMaxCiphertextSize, and switch to async sending with cancellation support. Improve error handling, code clarity, and resource safety throughout the voice pipeline.
Refactored DaveEncryptorException into a separate file under the NetCord.Gateway.Voice namespace. Removed DaveDecryptorException. Updated VoiceOutStream to throw DaveEncryptorException via a local static method for improved clarity and static analysis support.
@KubaZ2 KubaZ2 marked this pull request as ready for review February 4, 2026 15:36
Refactored the echo logic in VoiceModule to use voiceClient.SendVoice instead of writing to an output stream. Added handling to skip packets with null timestamps, ensuring that packet loss is mirrored to echo recipients. Removed outStream creation and usage.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 55 out of 56 changed files in this pull request and generated 12 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Refactored delay computation in SpeedNormalizingStream for clarity and correctness, passing pre-calculated millisecond values to Task.Delay and Thread.Sleep. Updated VoiceCommands to specify frame duration when creating OpusEncodeStream, ensuring consistent audio encoding configuration.
Replaced GetValueOrDefault with the null-coalescing operator (??) for assigning _externalSocketAddressDiscoveryTimeout, simplifying the default value logic when ExternalSocketAddressDiscoveryTimeout is null.
- Make OpusEncodeStream and helper classes sealed for clarity.
- Separate Opus and PCM buffer sizes in OpusEncodeStreamInternal.
- Use correct buffer sizes for encoding to improve safety and efficiency.
- Override Flush/FlushAsync to send silence frames for clean stream end.
- Simplify VoiceOutStream flush methods; silence now handled upstream.
- Add 2s delay after playback loop in VoiceCommands.
- Minor code style improvements and test path comment.
Marked OpusDecodeStream, SpeedNormalizingStream, and VoiceOutStream as sealed to prevent inheritance. Renamed _bufferSize to _pcmBufferSize in OpusDecodeStream and updated all related usages for improved clarity and consistency. These changes enhance code readability and enforce immutability of the stream classes.
Expanded PlayAsync to accept pcmFormat, voiceChannels, opusApplication, and frameDuration for customizable audio streaming. Updated OpusEncodeStream and ffmpeg invocation to use these parameters, supporting both float and short PCM formats. Refactored related code for consistency.
RecordAsync now accepts optional pcmFormat and voiceChannels
parameters, allowing callers to customize the audio format and
number of channels. ffmpeg arguments and OpusDecodeStream
construction have been updated to use these new options.
Removed commented-out code related to audio streaming and Opus encoding/decoding in the VoiceCommands class. This cleanup does not affect current functionality.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 55 out of 56 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

NetCord/Gateway/WebSocketClient.cs:1037

  • InvokeEventWithDisposalAsync disposes data (via disposeData(data)) before awaiting the returned ValueTasks. If any handler returns an incomplete task and relies on resources owned by data (e.g., pooled buffers), this can lead to use-after-return/data corruption. Dispose should happen only after all handler tasks have completed (e.g., in a finally after await HandleTasksAsync(...)).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Several P/Invoke method declarations in Dave.cs have been commented out, disabling their usage while preserving the original code for future reference. These methods include native interop functions related to session, commit, welcome, encryptor, and decryptor operations. No other logic or structure was modified.
Replaced hardcoded "opus" library names with a DllName constant in LibraryImport attributes for maintainability. Removed [UnmanagedCallConv] attributes to rely on default calling conventions. No functional changes.
Simplify interop signatures by removing SuppressGCTransition and UnmanagedCallConv attributes. Introduce DllName constant for "libsodium" to avoid hardcoding. No functional changes to method signatures.
Replaced hardcoded "libzstd" strings with a DllName constant for all native library imports. Removed [UnmanagedCallConv] attributes and added [SuppressGCTransition] to IsError for improved performance and maintainability.
The [UnmanagedCallConv] attribute was removed from the
CryptoSignEd25519VerifyDetached method declaration, so the
calling convention for the unmanaged import is no longer
explicitly specified.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 58 out of 59 changed files in this pull request and generated 13 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 857 to 870
/// <summary>
/// Creates a stream that you can write to to send voice. Each write must be exactly one Opus frame.
/// </summary>
/// <param name="normalizeSpeed">Whether to normalize the voice sending speed.</param>
/// <param name="configuration">The configuration of the voice stream.</param>
/// <returns></returns>
public Stream CreateOutputStream(bool normalizeSpeed = true)
public Stream CreateVoiceStream(VoiceStreamConfiguration? configuration = null)
{
Stream stream = new VoiceOutStream(this);
var frameDuration = configuration?.FrameDuration ?? Opus.DefaultFrameDuration;
var normalizeSpeed = configuration?.NormalizeSpeed ?? true;

Stream stream = new VoiceOutStream(this, frameDuration);
if (normalizeSpeed)
stream = new SpeedNormalizingStream(stream);
stream = new SpeedNormalizingStream(stream, frameDuration);
return stream;
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateVoiceStream replaces the previous CreateOutputStream API. Since this is a public surface, consider keeping CreateOutputStream(...) as an [Obsolete] wrapper forwarding to CreateVoiceStream to avoid breaking existing consumers unexpectedly.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to 5
namespace NetCord.Gateway.Voice;

namespace NetCord.Gateway.Voice;

public sealed class OpusEncoder : IDisposable
public readonly struct OpusEncoder : IDisposable
{
private readonly OpusEncoderHandle _encoder;
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpusEncoder changed from class to readonly struct. This is a breaking API change and also changes ownership semantics (struct copies share the same underlying handle). If compatibility matters, consider keeping the class (or providing an obsolete class wrapper) and/or preventing accidental copying (e.g., by making it a sealed class or documenting copy/dispose behavior clearly).

Copilot uses AI. Check for mistakes.
Comment on lines +14 to 32
[SuppressGCTransition]
[LibraryImport(DllName, EntryPoint = "ZSTD_isError")]
[return: MarshalAs(UnmanagedType.Bool)]
public static partial bool IsError(nuint code);

[LibraryImport("libzstd", EntryPoint = "ZSTD_getErrorName", StringMarshalling = StringMarshalling.Utf8)]
[UnmanagedCallConv(CallConvs = [typeof(System.Runtime.CompilerServices.CallConvCdecl)])]
[LibraryImport(DllName, EntryPoint = "ZSTD_getErrorName", StringMarshalling = StringMarshalling.Utf8)]
public static partial string GetErrorName(nuint code);

[LibraryImport("libzstd", EntryPoint = "ZSTD_createDStream")]
[UnmanagedCallConv(CallConvs = [typeof(System.Runtime.CompilerServices.CallConvCdecl)])]
[LibraryImport(DllName, EntryPoint = "ZSTD_createDStream")]
public static partial DStreamHandle CreateDStream();

[LibraryImport("libzstd", EntryPoint = "ZSTD_freeDStream")]
[UnmanagedCallConv(CallConvs = [typeof(System.Runtime.CompilerServices.CallConvCdecl)])]
[LibraryImport(DllName, EntryPoint = "ZSTD_freeDStream")]
public static partial nuint FreeDStream(nint zds);

[LibraryImport("libzstd", EntryPoint = "ZSTD_initDStream")]
[UnmanagedCallConv(CallConvs = [typeof(System.Runtime.CompilerServices.CallConvCdecl)])]
[LibraryImport(DllName, EntryPoint = "ZSTD_initDStream")]
public static partial nuint InitDStream(DStreamHandle zds);

[LibraryImport("libzstd", EntryPoint = "ZSTD_decompressStream")]
[UnmanagedCallConv(CallConvs = [typeof(System.Runtime.CompilerServices.CallConvCdecl)])]
[LibraryImport(DllName, EntryPoint = "ZSTD_decompressStream")]
public static partial nuint DecompressStream(DStreamHandle zds, ref Buffer output, ref Buffer input);
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These LibraryImport declarations don’t specify the native calling convention. For zstd’s C API this is typically cdecl; relying on the platform default can cause stack corruption on Windows. Consider adding an explicit calling convention (e.g., [UnmanagedCallConv(CallConvs = [typeof(CallConvCdecl)])]).

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +15
[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_encrypt")]
internal static partial int CryptoAeadXChaCha20Poly1305IetfEncrypt(ref byte c, ref ulong clen_p, ref byte m, ulong mlen, ref byte ad, ulong adlen, ref byte nsec, ref byte npub, ref byte k);

[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_decrypt")]
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These LibraryImport declarations don’t specify a calling convention. Libsodium’s C API is typically cdecl; using the platform default can corrupt the stack on Windows. Make the calling convention explicit (e.g., CallConvCdecl).

Suggested change
[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_encrypt")]
internal static partial int CryptoAeadXChaCha20Poly1305IetfEncrypt(ref byte c, ref ulong clen_p, ref byte m, ulong mlen, ref byte ad, ulong adlen, ref byte nsec, ref byte npub, ref byte k);
[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_decrypt")]
[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_encrypt", CallingConvention = CallingConvention.Cdecl)]
internal static partial int CryptoAeadXChaCha20Poly1305IetfEncrypt(ref byte c, ref ulong clen_p, ref byte m, ulong mlen, ref byte ad, ulong adlen, ref byte nsec, ref byte npub, ref byte k);
[LibraryImport(DllName, EntryPoint = "crypto_aead_xchacha20poly1305_ietf_decrypt", CallingConvention = CallingConvention.Cdecl)]

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +37
/// <param name="frameDuration">The duration of the Opus frame, in milliseconds.</param>
/// <returns>The maximum number of bytes that an Opus frame of the specified duration can occupy.</returns>
public static int GetMaxOpusFrameSize(float frameDuration) => (int)(frameDuration * 2) * (MaxBitrate / 8 / 1000) / 2;
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetMaxOpusFrameSize uses integer division (MaxBitrate / 8 / 1000) which truncates (e.g. 63.75 -> 63) and underestimates the true maximum frame size. This can cause OpusEncode* to fail with BufferTooSmall. Compute using floating-point or integer math that rounds up (ceiling) to guarantee the buffer is large enough.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to 5
namespace NetCord.Gateway.Voice;

namespace NetCord.Gateway.Voice;

public sealed class OpusDecoder : IDisposable
public readonly struct OpusDecoder : IDisposable
{
private readonly OpusDecoderHandle _decoder;
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpusDecoder changed from class to readonly struct. This is a breaking API change and can lead to confusing copy/dispose semantics (multiple copies referencing the same handle). Consider retaining the class API (or adding an obsolete wrapper) or clearly documenting the value-type ownership model.

Copilot uses AI. Check for mistakes.
Comment on lines +1010 to 1019
private protected ValueTask InvokeEventWithDisposalAsync<T>(ImmutableList<Func<T, ValueTask>> handlers, T data, Action<T> disposeData, [CallerArgumentExpression(nameof(handlers))] string handlersName = "") where T : allows ref struct
{
int count = handlers.Count;

if (count is 0)
return default;

var data = dataFunc(rawData);

var tasks = ArrayPool<ValueTask>.Shared.Rent(count);

for (int i = 0; i < count; i++)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In InvokeEventWithDisposalAsync, ensure the disposeData callback is invoked only after all handler ValueTasks have completed. Disposing immediately after scheduling handlers can return pooled buffers (or otherwise invalidate data) while handlers are still running, causing corruption/use-after-free.

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +176
[LibraryImport(DllName, EntryPoint = "daveMaxSupportedProtocolVersion")]
public static partial ushort MaxSupportedProtocolVersion();

[LibraryImport(DllName, EntryPoint = "daveFree")]
public static partial void Free(nint ptr);

[LibraryImport(DllName, EntryPoint = "daveSessionCreate")]
public static partial SessionHandle SessionCreate(void* context, ReadOnlySpan<byte> authSessionId, delegate* unmanaged<byte*, byte*, void*, void> mlsFailureCallback, void* userData);

[LibraryImport(DllName, EntryPoint = "daveSessionDestroy")]
public static partial void SessionDestroy(nint session);

[LibraryImport(DllName, EntryPoint = "daveSessionInit")]
public static partial void SessionInit(SessionHandle session, ushort version, ulong groupId, ReadOnlySpan<byte> selfUserId);

[LibraryImport(DllName, EntryPoint = "daveSessionReset")]
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The libdave P/Invokes don’t specify a calling convention. If the library exports use cdecl (typical for C), relying on the platform default can cause runtime crashes on Windows. Consider adding an explicit calling convention (e.g., CallConvCdecl) to these imports.

Suggested change
[LibraryImport(DllName, EntryPoint = "daveMaxSupportedProtocolVersion")]
public static partial ushort MaxSupportedProtocolVersion();
[LibraryImport(DllName, EntryPoint = "daveFree")]
public static partial void Free(nint ptr);
[LibraryImport(DllName, EntryPoint = "daveSessionCreate")]
public static partial SessionHandle SessionCreate(void* context, ReadOnlySpan<byte> authSessionId, delegate* unmanaged<byte*, byte*, void*, void> mlsFailureCallback, void* userData);
[LibraryImport(DllName, EntryPoint = "daveSessionDestroy")]
public static partial void SessionDestroy(nint session);
[LibraryImport(DllName, EntryPoint = "daveSessionInit")]
public static partial void SessionInit(SessionHandle session, ushort version, ulong groupId, ReadOnlySpan<byte> selfUserId);
[LibraryImport(DllName, EntryPoint = "daveSessionReset")]
[LibraryImport(DllName, EntryPoint = "daveMaxSupportedProtocolVersion", CallingConvention = CallingConvention.Cdecl)]
public static partial ushort MaxSupportedProtocolVersion();
[LibraryImport(DllName, EntryPoint = "daveFree", CallingConvention = CallingConvention.Cdecl)]
public static partial void Free(nint ptr);
[LibraryImport(DllName, EntryPoint = "daveSessionCreate", CallingConvention = CallingConvention.Cdecl)]
public static partial SessionHandle SessionCreate(void* context, ReadOnlySpan<byte> authSessionId, delegate* unmanaged<byte*, byte*, void*, void> mlsFailureCallback, void* userData);
[LibraryImport(DllName, EntryPoint = "daveSessionDestroy", CallingConvention = CallingConvention.Cdecl)]
public static partial void SessionDestroy(nint session);
[LibraryImport(DllName, EntryPoint = "daveSessionInit", CallingConvention = CallingConvention.Cdecl)]
public static partial void SessionInit(SessionHandle session, ushort version, ulong groupId, ReadOnlySpan<byte> selfUserId);
[LibraryImport(DllName, EntryPoint = "daveSessionReset", CallingConvention = CallingConvention.Cdecl)]

Copilot uses AI. Check for mistakes.
Comment on lines +728 to +736
int datagramLength = frame.Length + encryption.Expansion + 12;

var datagramArray = ArrayPool<byte>.Shared.Rent(datagramLength);

WriteDatagram(frame.Span, new(datagramArray, 0, datagramLength), encryption, sequenceNumber, timestamp, ssrc);

await connection.SendAsync(new(datagramArray, 0, datagramLength), cancellationToken).ConfigureAwait(false);

ArrayPool<byte>.Shared.Return(datagramArray);
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SendVoiceAsync rents a datagram buffer from ArrayPool but only returns it on the success path. If SendAsync throws (including cancellation), the buffer will leak from the pool. Wrap the send in a try/finally and always return the rented array.

Copilot uses AI. Check for mistakes.
Comment on lines +763 to +772
int datagramLength = frame.Length + encryption.Expansion + 12;

var datagramArray = ArrayPool<byte>.Shared.Rent(datagramLength);

WriteDatagram(frame, new(datagramArray, 0, datagramLength), encryption, sequenceNumber, timestamp, ssrc);

connection.Send(new(datagramArray, 0, datagramLength));

ArrayPool<byte>.Shared.Return(datagramArray);
}
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SendVoice rents a datagram buffer from ArrayPool but only returns it after connection.Send. If Send throws, the buffer won’t be returned to the pool. Use try/finally to ensure the array is always returned.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant