Skip to content

ALSA: invalid handling of capture overruns #730

@codepainters

Description

@codepainters

I'm working on an application using cpal over ALSA to continuously capture audio stream on Raspberry Pi. I use both RPI and PC for testing. Note: I'm using a version from master branch, as I need both i16 and i32 sample formats, with i32 not being supported in stable version.

Unfortunately I'm experiencing odd behavior upon capture overruns (which I trigger setting buffer size too low).

  1. The most severe case is when poll() return EPOLERR which is not handled properly.
476794 1669022562.665530 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476794 1669022562.666440 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT,      0x7f6b10495f88) = 0
476794 1669022562.666532 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) =      0
476794 1669022562.666577 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES,      0x7f6b10496760) = 0
476794 1669022562.666707 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476794 1669022562.669590 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])
476794 1669022562.669765 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN|POLLERR}])

And then I no longer get any more callbacks from cpal, but my application is eating up 100% of CPU core, as poll() is executed in a tight loop.

The problem is here:

cpal/src/host/alsa/mod.rs

Lines 746 to 753 in 7776c66

let stream_type = match stream.channel.revents(&descriptors[1..])? {
alsa::poll::Flags::OUT => StreamType::Output,
alsa::poll::Flags::IN => StreamType::Input,
_ => {
// Nothing to process, poll again
return Ok(PollDescriptorsFlow::Continue);
}
};

This match doesn't recognize POLLIN|POLLERR and returns PollDescriptorsFlow::Continue, causing the worker thread to poll() again immediately, leading to a tight loop.

The problem seems obvious, I'll try to come with a pull request.

  1. Another scenario is when the second ioctl call fails.
476936 1669022662.246764 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5, revents=POLLIN}])
476936 1669022662.247721 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f70e2810f88) = 0
476936 1669022662.247884 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = -1 EPIPE (Broken pipe)
476936 1669022662.248019 ioctl(5, SNDRV_PCM_IOCTL_PREPARE, 0) = 0
476936 1669022662.248112 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.448722 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.449074 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.649849 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79
476936 1669022662.650342 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 0 (Timeout)
476936 1669022662.851166 write(1, "Die! A backend-specific error has occurred: `alsa::poll()` spuriously returned\n", 79) = 79

In this scenario I no longer get any data callbacks, instead I get error callback periodically (poll timeout).

I believe the problem is here:

cpal/src/host/alsa/mod.rs

Lines 604 to 609 in 7776c66

PollDescriptorsFlow::XRun => {
if let Err(err) = stream.channel.prepare() {
error_callback(err.into());
}
continue;
}

The call to prepare() passes OK, but the stream is stopped - according to ALSA, this call changes the state to SND_PCM_STATE_PREPARED, there also need to be a call to start() to transition to SND_PCM_STATE_RUNNING and re-enable data flow.

I'm a bit confused here about what is the intended cpal behavior here:

  • should I assume that stream is broken when my error callback is called for the first time, drop the Stream and create a new one? In such case I believe cpal should do some cleanup (e.g. stop the polling loop).
  • should I assume that the error is temporary and the data flow will continue? In such case start() call is missing.
  1. The third scenario is when ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) fails with EPIPE:
476728 1669022521.635527 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN}])
476728 1669022521.636362 ioctl(5, SNDRV_PCM_IOCTL_STATUS_EXT, 0x7f8b2368af88) = 0
476728 1669022521.636521 ioctl(5, SNDRV_PCM_IOCTL_HWSYNC, 0x98) = 0
476728 1669022521.636681 ioctl(5, SNDRV_PCM_IOCTL_READI_FRAMES, 0x7f8b2368b760) = -1 EPIPE (Broken pipe)
476728 1669022521.636777 write(1, "Die! A backend-specific error has occurred: ALSA function 'snd_pcm_readi' failed with error 'EPIPE: Broken pipe'\n", 113) = 113
476728 1669022521.636823 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])
476728 1669022521.636860 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])
476728 1669022521.636892 poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN|POLLERR|POLLNVAL}], 2, 200) = 1 ([{fd=5,        revents=POLLIN|POLLERR}])

Here, the error from process_input() is properly delivered to my error callback, but then we enter the tight loop scenario, due to POLLERR not being recognized.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions