
PlayPcmWin, an open-source audio player for Windows



  • 10 years later...
  • 8 months later...

Hello @yamamoto2002,

 

We've been reducing the footprint of the entire OS over the past year; so far it's down to around 17 processes / 250MB

 

http://jplay.eu/forum/index.php?/topic/4410-windows-11-pe-audiophile-creation-guide/?p=62884

Quote

processes=17
threads=148
handles=3718
CPU=0.23%

 

http://jplay.eu/forum/index.php?/topic/4410-windows-11-pe-audiophile-creation-guide/?p=62885

Quote

SIZE= 257 mb (Ultralite WITHOUT WinXShell or PECMD)

 

Since avoiding any components related to the .NET Framework is essential for minimizing the size, I'm more interested in the command-line version (i.e. PlayPcm, since it's written in C++), and I found an older version here

 

https://web.archive.org/web/20151220044103id_/https://bitspersampleconv2.googlecode.com/files/PlayPcm200.zip

 

Since it didn't come with many features back then, I'm just wondering if there is any simple way to compile the last updated version, from September 2021?

 

https://sourceforge.net/p/playpcmwin/code/HEAD/tree/PlayPcmWin/PlayPcm/

 

I guess something like this might work just fine, but I'm not exactly sure about that

 

https://mjt.hatenadiary.com/entry/20170218/p1

 

Maybe it would look somewhat similar to this?

 

https://github.com/elishacloud/dxwrapper/issues/126

git clone --branch master --recursive https://github.com/elishacloud/dxwrapper.git
cd dxwrapper
msbuild /p:WindowsTargetPlatformVersion=%Version_Number%;Platform=win32;Configuration=Release /t:Clean,Build dxwrapper.sln
On 1/27/2023 at 12:44 PM, seeteeyou said:

Since it didn't come with many features back then, I'm just wondering if there is any simple way to compile the last updated version, from September 2021?

 

https://sourceforge.net/p/playpcmwin/code/HEAD/tree/PlayPcmWin/PlayPcm/

 

The PlayPcmWin project was originally hosted on Google Code, but Google Code shut down in 2015; SourceForge is now the main repository, and a somewhat older mirror is on GitHub.

 

This is the PlayPcm console description page: https://sourceforge.net/p/playpcmwin/wiki/PlayPcmConsole/

Open PlayPcmVs2019.sln with Visual Studio 2019, set the Release x64 build target, and press F5 to build and run PlayPcm.exe.

The zipped executable PlayPcm107.zip is uploaded to this page: https://sourceforge.net/projects/playpcmwin/files/others/

 

The PlayPcm console program was inspired by the play-exclusive program of Matthew van Eerde; it is even simpler: https://github.com/mvaneerde/blog/tree/develop/play-exclusive

 

PlayPcm.exe is currently 44544 bytes. If you strip functionality such as DSF/DSDIFF decoding and the experimental (and unnecessary) large-page memory allocation code, a 22000 to 23000 byte EXE file is possible, but I think it is a waste of time. Use Sysinternals VMMap to measure the actual program footprint: many DLLs (28MB in total) are loaded into the VM space to run PlayPcm.


 

Sunday programmer since 1985

Developer of PlayPcmWin

3 hours ago, yamamoto2002 said:

PlayPcm.exe is currently 44544 bytes. If you strip functionality such as DSF/DSDIFF decoding and the experimental (and unnecessary) large-page memory allocation code, a 22000 to 23000 byte EXE file is possible, but I think it is a waste of time. Use Sysinternals VMMap to measure the actual program footprint: many DLLs (28MB in total) are loaded into the VM space to run PlayPcm.

 

Many thanks for your help.

 

First of all, I'm just wondering whether large-page memory allocation is similar to something like this?

 

https://github.com/eladkarako/7z_bundle/tree/ef339c2db0119bdfa8f29208671f92ad1cb9f277#readme

Quote
  1. linker patch for x32/x64 exe/dlls with LARGEADDRESSAWARE to be able to allocate addresses larger than 2 gigabytes.
  2. multiple embedded manifest patches:
  • 2.1. adding missing Windows 10/11 compatibility section, this will allow the applications/dlls to run outside of the vista-virtualization mode. this speeds up the applications and RAM allocation.
  • 2.2. ready to use unlimited path-length support. user would need new W10/11 and to opt-in longPathAware support with a one-time registry patch as explained in here: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry
  • 2.3. full DPI awareness for clear text in the UI-applications and context-menu.
  • 2.4. using segmented-heap by default, the new efficient implementation of stack in Windows 10/11. very useful for 7zip, makes memory allocation more efficient.

 


 

Secondly, I checked the HRESULT handling in main.cpp and found that only 44.1kHz / 48kHz / 176.4kHz are listed.

 

Please feel free to correct me if I'm mistaken: I'm assuming that's only meant for testing purposes, and that we should therefore be able to play files above 176.4kHz.

 

In that case, do you expect both 705.6kHz and 768kHz files to be compatible with the PlayPcm console program by default? Basically, WASAPI seems to have issues with some software players while others have been OK so far

 

https://community.roonlabs.com/t/chord-mojo-downsampling-to-352-8/188357/2

Quote

On my Win11 machine, foobar2000 is able to play 705.6kHz files with WASAPI exclusive. Roon, on the other hand, fails to access the device and works only with ASIO at that rate.

 

https://addictedtoaudio.com.au/blogs/what-we-think/in-search-of-768khz

Quote

I used JRiver Media Center to send it to the Topping E30 DAC, and it duly displayed “768.0 PCM” on its front panel display (see the photo up top). That worked using the ASIO driver, the WASAPI driver and even Windows Kernel Streaming. It would not work using Direct Sound.

 

https://audiophilestyle.com/forums/topic/58090-holo-audio-may-dac/page/54/#comment-1145517  

On 7/6/2021 at 12:49 AM, Gavin1977 said:

WASAPI limits to 384kHz in Windows, but I get 768kHz using WASAPI in HQPlayer desktop..

 


 

If getting 705.6kHz and 768kHz to work reliably proves to be too much of a challenge, do you think ASIO could be an even better choice? It looks like you conducted some experiments several years ago

 

https://sourceforge.net/p/playpcmwin/code/HEAD/tree/PlayPcmWin/00experiments/AsioIO/

 

XMOS also wrote a little something for ASIO

 

https://github.com/xmos/xplay

 

Here's yet another example by Fukuroda-san

 

https://gist.github.com/fukuroder/7658921

 

Both of them are written in C++, and hopefully they're somewhat useful for reference.

 

Other examples are also available on GitHub, though many of them require the BASS audio library

 

https://www.un4seen.com/bassasio.html

https://github.com/naudio/NAudio/tree/master/NAudio.Asio

https://github.com/aidan-g/BASS_GAPLESS/tree/master/bass_gapless_asio

https://github.com/Raimusoft/FoxTunes/tree/master/FoxTunes.Output.Bass.Asio

https://github.com/ManagedBass/ManagedBass/tree/master/src/AddOns/BassAsio

https://github.com/koobar/RabbitTune/tree/master/RabbitTune.AudioEngine/AudioOutputApi

 


 

BTW, I would prefer not to debate the merits of playing 768kHz files; just FYI, they're created by PGGB 256

 

https://audiophilestyle.com/forums/topic/62699-a-toast-to-pggb-a-heady-brew-of-math-and-magic/page/51/#comment-1227948

5 hours ago, seeteeyou said:

First of all, I'm just wondering whether large-page memory allocation is similar to something like this?

 

https://github.com/eladkarako/7z_bundle/tree/ef339c2db0119bdfa8f29208671f92ad1cb9f277#readme

 

No, that is the large-address-aware feature for 32-bit Windows apps. By default, a 32-bit Windows app can use up to 2GB of user-land VM; specifying the LARGEADDRESSAWARE flag to the linker increases the app's VM space to 3.5GB. BUT there is a DLL loaded at the 2GB boundary, so the VM space is fragmented and a contiguous memory block larger than 2GB is not available even with the LARGEADDRESSAWARE flag. For further info, please read https://github.com/yamamoto2002/bitspersampleconv2/issues/42
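For reference, the LARGEADDRESSAWARE bit lives in the COFF file header's Characteristics field (bit 0x0020), so it can be inspected without any Microsoft tooling. A minimal sketch (my own, not part of PlayPcm; assumes a little-endian host):

```cpp
// Checks the IMAGE_FILE_LARGE_ADDRESS_AWARE bit (0x0020) in a PE file's
// COFF Characteristics field, without needing <windows.h>.
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Returns true if the PE image in `image` has the LARGEADDRESSAWARE flag set.
bool isLargeAddressAware(const std::vector<uint8_t>& image) {
    if (image.size() < 0x40) return false;
    // e_lfanew: offset of the "PE\0\0" signature, stored at 0x3C of the DOS header.
    uint32_t peOffset;
    std::memcpy(&peOffset, &image[0x3C], 4);
    if (peOffset + 24 > image.size()) return false;
    if (std::memcmp(&image[peOffset], "PE\0\0", 4) != 0) return false;
    // The COFF header follows the signature; Characteristics is the last
    // 2 bytes of the 20-byte COFF header (signature + 4 + 18).
    uint16_t characteristics;
    std::memcpy(&characteristics, &image[peOffset + 4 + 18], 2);
    return (characteristics & 0x0020) != 0;  // IMAGE_FILE_LARGE_ADDRESS_AWARE
}
```

The same bit is what `dumpbin /headers` reports as "Application can handle large (>2GB) addresses".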

 

Large-page memory is an x64 processor feature that increases the memory page size; it reduces the page table size and the frequency of TLB reload events, so it generally improves efficiency. The improvement is very minuscule in a music playback scenario, though, and it needs special user privileges, so an extra step is necessary to run the app, which degrades the user experience to some extent. I tested large-page memory in PlayPcm and decided not to implement it in PlayPcmWin for this reason.
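To put rough numbers on "minuscule": with standard 4KiB pages, the ~28MB footprint measured above needs thousands of page-table entries, while 2MiB large pages cover it with a handful. Back-of-the-envelope arithmetic (my own, not from PlayPcm):

```cpp
#include <cassert>
#include <cstdint>

// Pages needed to map `bytes` with a given page size (rounding up).
constexpr uint64_t pagesFor(uint64_t bytes, uint64_t pageSize) {
    return (bytes + pageSize - 1) / pageSize;
}

// ~28 MB of mapped DLLs, per the vmmap measurement mentioned above.
constexpr uint64_t footprint = 28ull * 1024 * 1024;
constexpr uint64_t small4k = pagesFor(footprint, 4 * 1024);        // 7168 pages
constexpr uint64_t large2m = pagesFor(footprint, 2 * 1024 * 1024); // 14 pages
```

Fewer entries means fewer TLB misses, but an audio player touches its buffers at a leisurely rate, so the saved reloads are negligible in practice.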

 

5 hours ago, seeteeyou said:

Secondly, I checked the HRESULT handling in main.cpp and found that only 44.1kHz / 48kHz / 176.4kHz are listed.

 

Please feel free to correct me if I'm mistaken: I'm assuming that's only meant for testing purposes, and that we should therefore be able to play files above 176.4kHz.

 

In that case, do you expect both 705.6kHz and 768kHz files to be compatible with the PlayPcm console program by default?

 

I think the sample rate limit of PlayPcm WAV playback is 2147483647 Hz.

44.1/48/176.4 kHz are the frequencies used for device testing.
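That 2147483647 Hz figure is consistent with the sample rate being carried in a 32-bit field in the WAV fmt chunk. A small sketch (struct defined locally for illustration, mirroring the WAVEFORMATEX layout) showing that 768kHz is nowhere near the limit:

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for the fmt-chunk fields of a WAV file (matching the
// layout of WAVEFORMATEX); defined locally so this compiles without <windows.h>.
struct FmtChunk {
    uint16_t formatTag;      // 1 = PCM
    uint16_t channels;
    uint32_t samplesPerSec;  // 32-bit field: 768000 fits with room to spare
    uint32_t avgBytesPerSec;
    uint16_t blockAlign;
    uint16_t bitsPerSample;
};

FmtChunk make768k() {
    FmtChunk f{};
    f.formatTag      = 1;
    f.channels       = 2;
    f.samplesPerSec  = 768000;
    f.bitsPerSample  = 32;
    f.blockAlign     = f.channels * f.bitsPerSample / 8;  // 8 bytes per frame
    f.avgBytesPerSec = f.samplesPerSec * f.blockAlign;    // 6,144,000 B/s
    return f;
}
```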

 

I have had WAV files with sample rates on the order of MHz, at 16-bit depth, to view in WavSpectra. Such WAV files are convenient for checking the performance of a PCM-to-1bit format converter. I don't yet have a playback device that can play 2.8MHz 16-bit PCM.

 

I tested 705.6 kHz PCM playback before with PlayPcmWin (it has very similar playback code). As for the maximum number of channels, I tested 26 channels with a Profire 2626.

 

 


19 hours ago, yamamoto2002 said:

I tested 705.6 kHz PCM playback before with PlayPcmWin (it has very similar playback code).

 

Thank you again for answering my questions. I also saw this page on your wiki

 

https://sourceforge.net/p/playpcmwin/wiki/RF64 WAVE/

Quote

4GB limitation is serious problem for high resolution audio or multichannel audio. For example, 768kHz 32bit 2ch PCM (or 192kHz 32bit 8ch PCM) exceeds 4GB in 11 minutes.
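The 11-minute figure in the quote is easy to verify: 768kHz x 32bit x 2ch is 6,144,000 bytes per second, and 2^32 bytes divide out to roughly 699 seconds. A quick check (my own arithmetic, not from the wiki):

```cpp
#include <cassert>
#include <cstdint>

// Seconds of audio until the data size reaches the 4 GiB RIFF limit.
double secondsTo4GiB(uint32_t sampleRate, uint32_t bytesPerSample, uint32_t channels) {
    const double bytesPerSec = double(sampleRate) * bytesPerSample * channels;
    return 4294967296.0 / bytesPerSec;  // 2^32 bytes
}
```

Both examples from the wiki (768kHz 32bit 2ch and 192kHz 32bit 8ch) have the same byte rate, so both hit the limit at the same ~11.6 minutes.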

 

Just wondering whether Sony Wave64 is also able to support both 705.6 and 768 kHz, by any chance?

 

http://martinleese.epizy.com/MyTemporaryDownloads/Sony_Wave64.pdf

 

I'm interested in that particular audio container format because it is claimed to sound superior by multiple sources

 

https://www.whatsbestforum.com/threads/ive-finally-made-up-my-mind-about-flac-vs-uncompressed-sigh-hard-drive-makers-rejoice.33345/#post-737665

Quote

https://www.audioshark.org/computer-digital-audio-11/any-sound-quality-difference-between-flac-wav-15491-post-255146.html#post255146

Quote

I believe it truly depends on the server and playback mechanism. I ripped to WAV64 at 32/44.1 for Redbook CD. The sound difference on my music server/DAC combo was significant between FLAC, WAV and W64.

 

https://forum.audiogon.com/discussions/absolute-top-tier-dac-for-standard-res-redbook-cd/post?postid=1742865#1742865

Quote

I have a Memory Player installed with JRiver and I ripped some cd's to WAVE 64, as Sam included some WAVE64 files on my unit and I really liked how they sounded, compared to the same albums I have on cd and WAV.

 

https://www.thememoryplayer.net/2016-features

Quote

Typical use is conversion of entire libraries to a better sounding file type or format. For example, ALL lossless compression is highly jittered. The Resampler can convert, say, FLAC to W64, and dramatically improve its fidelity.

 

http://v2.stereotimes.com/post/laufer-teknik-memory-player-mini

Quote

Songs added to your music drive from thumb drive aren’t processed nor auto-added to JRiver MC, so you need to process them with the Mini’s Upsampler to bring them up to 32-bit Wave64 format where the Memory Player’s software does its magic.

So, what was CP doing during our listening sessions? He was dropping processed Wave64 files onto ’Burn Memory’ slots in his Memory Player 32, then dropping that converted file into JRiver, and playing from JRiver. This provides a noticeable improvement over just playing from JRiver referencing the Music drive.

 

FFmpeg has supported *.w64 for a long time

 

https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/w64.c

https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/w64.h

https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/wavdec.c

 

Yet another implementation, with either public-domain or MIT licensing

 

https://github.com/mackron/dr_libs/blob/master/dr_wav.h

https://github.com/mackron/dr_libs/tree/master/tests/wav

https://mackron.github.io

Quote

dr_wav is an open source library for decoding WAV files. It's written in C in a single file with no dependencies except for the C standard library, and is released into the public domain.

 

Do you expect any difficulties in creating something called "WWW64Reader"?

 

My programming skills (or lack thereof) are very limited, since I only took a few classes about 20 years ago; I haven't really touched a compiler since graduation, except for trying a couple of things with GitHub Actions last year.

27 minutes ago, seeteeyou said:

Just wondering whether Sony Wave64 is also able to support both 705.6 and 768 kHz, by any chance?

I looked into the Wave64, RF64, and original WAV formats, and all of them support high-resolution sample frequencies:

up to 2GHz 16bit 1ch, 1GHz 16bit 2ch, 536MHz 32bit 2ch, or 134MHz 32bit 7.1ch. There is no limitation around 700kHz.
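Hedging a guess at where those ceilings come from: they match the 32-bit average-byte-rate field overflowing, i.e. max rate = 2^32 / (bytes per sample x channels). A quick check of the four figures (my own arithmetic, not from the format specs):

```cpp
#include <cassert>
#include <cstdint>

// Maximum sample rate before a 32-bit average-byte-rate field overflows.
constexpr uint64_t maxRate(uint32_t bytesPerSample, uint32_t channels) {
    return (1ull << 32) / (uint64_t(bytesPerSample) * channels);
}
```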

 

33 minutes ago, seeteeyou said:

I'm interested in that particular audio container format because it is claimed to sound superior by multiple sources

I don't believe it. The WAV header data is stripped off when the file is read into memory, and the difference disappears at an early stage of sound data processing, before playback starts (the app picks up all the data needed for playback from the header, namely sample rate, number of channels, and sound duration, then discards it, so the header itself no longer exists in memory), and the header structure is never sent to the DAC.
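A minimal sketch of the point (my own illustration, not PlayPcmWin's actual reader): a WAV parser copies the few numeric fields it needs out of the header, and the header bytes play no further part in playback.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// The only information playback code keeps from a WAV header.
struct StreamInfo { uint32_t sampleRate; uint16_t channels; uint16_t bits; };

// Walks RIFF chunks, copies the three fmt fields, and returns; the header
// bytes themselves are never kept, which is why container choice cannot
// affect the PCM that reaches the device.
bool parseWavHeader(const std::vector<uint8_t>& d, StreamInfo* out) {
    if (d.size() < 12 || std::memcmp(&d[0], "RIFF", 4) != 0 ||
        std::memcmp(&d[8], "WAVE", 4) != 0)
        return false;
    size_t pos = 12;
    while (pos + 8 <= d.size()) {
        uint32_t sz;
        std::memcpy(&sz, &d[pos + 4], 4);
        if (std::memcmp(&d[pos], "fmt ", 4) == 0 && pos + 8 + 16 <= d.size()) {
            std::memcpy(&out->channels,   &d[pos + 10], 2);
            std::memcpy(&out->sampleRate, &d[pos + 12], 4);
            std::memcpy(&out->bits,       &d[pos + 22], 2);
            return true;
        }
        pos += 8 + sz + (sz & 1);  // chunks are word-aligned
    }
    return false;
}
```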

 


  • 9 months later...

Good day. I just checked this page on SourceForge, since I've been reading quite a bit about anything related to remote control

 

https://sourceforge.net/p/playpcmwin/wiki/PPWRemote/

 

Obviously even Android 7.0 would be considered "fairly large" when the size of Win11PE can be reduced this drastically

 

http://jplay.eu/forum/index.php?/topic/4410-windows-11-pe-audiophile-creation-guide/?p=63862

Quote

I have removed "iertutil.dll" and "imageres.dll" , contained in \windows\system32, which has allowed me to get a boot.wim size of 99.7 MB, with my ASIO drivers and F2k and BugHead installed. Without ASIO drivers and without audio apps the size is 84.4 MB !!

 

Therefore I'm more interested in something like the RFB protocol. Some of the oldest / smallest 64-bit executables (i.e. VNC viewers) I found after extracting these:

 

vnc-E4_2_7-x64_win32_viewer.zip

TurboVNC64-1.0.exe

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2006-11-07 00:23:12 .....      1076728      1076728  vnc-E4_2_7-x64_win32_viewer.exe
2010-10-20 19:22:52 ....A       521728       521728  vncviewer.exe

 

How about some of the most "ancient" releases, from back when we were only getting 32-bit executables?

 

https://web.archive.org/web/20021004135501id_/http://www.realvnc.com/dist/vnc-3.3.4-x86_win32.exe

https://web.archive.org/web/20000816060724id_/http://www.uk.research.att.com/vnc/dist/vnc-3.3.3r2_x86_win32.zip

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2002-09-20 16:51:46 ....A       233472       233472  vncviewer.exe
1999-10-12 16:52:57 ....A       176128       176128  vnc_x86_win32/vncviewer/vncviewer.exe

 

At least in theory, we could attempt to compile the source code so that we end up with 64-bit executables

 

https://web.archive.org/web/20051029062240id_/http://www.realvnc.com/dist/vnc-3.3.7-winsrc.zip

https://github.com/petersenna/pvnc#readme 

 


 

Actually I'd like to ask you some questions about Tcl/Tk since you've been a programmer for many years.

 

Despite the fact that running Tcl scripts should require nothing more than tclsh

 

https://wiki.tcl-lang.org/page/Running+tcl+scripts+directly+on+Windows+command+line+

https://modules.readthedocs.io/en/stable/INSTALL-win.html

Quote

Modules consists of one Tcl script so to run it from a user shell the only requirement is to have a working version of tclsh (version 8.5 or later) available on your system. tclsh is a part of Tcl.

 

While we might be able to get away with using version 8.2 all the way from 1999

 

https://github.com/tcltk/tclapps/blob/master/apps/tkvnc/tkvnc.tcl#L40

package require Tk 8.2

 

https://archive.today/aS42

Quote

Tcl 8.2.0 released 1999/8/18

 

Do you really think such an old version could still be compiled for modern versions of Windows, so that a 64-bit tclsh executable might still work with TkVNC, by any chance?

 

https://wiki.tcl-lang.org/page/TkVNC

Quote

Wow, a 13 Kb portable VNC client!

 

BTW, I already checked several binary installers of Tcl/Tk, but so far only fairly new ones are still available. They're well over 1 MB, while vncviewer.exe from the 64-bit version of TurboVNC 1.0 turned out to be only 0.5 MB

 

https://www.tcl.tk/software/tcltk/bindist.html

https://wiki.tcl-lang.org/page/Binary+Distributions

 

More importantly, I'm not sure whether anything written in Tcl/Tk would consume more system resources than alternatives written in C++.

 


 

Other than the tiny little TkVNC mentioned above, are you aware of any 64-bit executables that are genuinely "super compact", or could even better solutions be found somewhere else?

 

BTW, I read something interesting about RealVNC recently and that's the reason why I took my time to learn more about the history of the RFB protocol

 

https://www.realvnc.com/en/blog/realvnc-raspberry-pi-prize-finalist-kunikatsu-takase-robot-carer/

https://avsc.jp/info.html

Quote

■ Announcement: The 2023 RealVNC Raspberry Pi Prize
"Teleoperation of a Robotic Grabber for Personal Use via Internet", developed by our technical advisor Kunikatsu Takase, received a silver-equivalent award at The 2023 RealVNC Raspberry Pi Prize.

  • 2 months later...
17 hours ago, yamamoto2002 said:

I don't have interest in Kernel Streaming

In your opinion, is the use of a WASAPI-to-KS or ASIO-to-KS wrapper justified?

The question is prompted by the recently introduced KS mode support in the Audirvana player.


WASAPI exclusive mode was introduced in Windows Vista, and it can send unaltered PCM to the audio device reliably.

 

On Windows XP and earlier versions, Windows audio has a fundamental design flaw: in some user scenarios, sample rate conversion is performed without notifying the user.

 

If you are still running Windows XP, Windows 2000, Windows Me, or Windows 98, Kernel Streaming is one method to send unaltered PCM to the audio device. Still, you should check the output bit pattern using a PCM recorder if bit-perfectness is important.
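The recorder check can be as simple as aligning the captured stream with the source and comparing samples. A toy sketch (my own, assuming 32-bit samples and a known capture offset):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compares a loopback-recorded buffer against the source samples and
// returns the index of the first differing sample, or -1 if bit-perfect.
// `offset` skips the capture latency so the recording lines up.
long firstMismatch(const std::vector<int32_t>& src,
                   const std::vector<int32_t>& rec, size_t offset) {
    for (size_t i = 0; i < src.size() && i + offset < rec.size(); ++i)
        if (rec[i + offset] != src[i]) return long(i);
    return -1;
}
```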

 

I'm not a fan of audio stream wrapper libraries in general: they always introduce unavoidable delay, sometimes sound stuttering, random instability, and sometimes silent bit truncation, and the bug fix requests come to the app developer :D One of the reasons I decided not to use the ASIO library in my apps is the existence of ASIO4ALL.


6 hours ago, yamamoto2002 said:

If you are still running Windows XP, Windows 2000, Windows Me, or Windows 98, Kernel streaming

Windows Server 2016 and Win11PE. DSD-over-PCM encapsulation can be used as a simple method to check output bit pattern perfectness: in case of DoP marker corruption, the target device (DAC or DDC) switches from DSD to PCM mode.

6 hours ago, pm325 said:

Windows Server 2016 and Win11PE. DSD-over-PCM encapsulation can be used as a simple method to check output bit pattern perfectness: in case of DoP marker corruption, the target device (DAC or DDC) switches from DSD to PCM mode.

 

The DoP signal has a small magnitude (at most 4.7% of full-scale PCM), so the test does not cover the signal level range from 4.8% to 100% magnitude. For example, checking the bit-perfectness of a signal path with DoP cannot detect the bit altering of Limiter APO, which starts to act at ≥ 95% magnitude PCM signal.

On 16-bit PCM, sometimes only -32768 (0x8000) is corrupted and rounded to -32767 due to an implementation error.

 


On 1/21/2024 at 8:29 AM, yamamoto2002 said:

WASAPI exclusive mode was introduced in Windows Vista, and it can send unaltered PCM to the audio device reliably.

Quote

 

WASAPI in exclusive mode has almost the same efficiency as KS but may be affected by system layer implementation (for example, there are some problems in Win7 fixed in Win8). The "low-level" and "high-level" terms are quite relative. In one case, interface level may represent its universality and usability (the simpler, the higher level). In another case, the level may represent interface features and efficiency (the more features or the higher efficiency, the lower level).

WASAPI, DirectSound and MME interfaces can be considered "high-level" only in comparison to Kernel Streaming because they are built on top of KS. In modern Windows versions, these interfaces are often considered "low-level" because higher-level ones (AudioGraph, MediaCapture, MediaElement, XAudio2) are offered. Meanwhile, XAudio2 is often called "low-level" because it offers hardware-close streaming control.

 

Kernel Streaming protocols

To exchange audio data and control the stream, KS driver and client must negotiate a protocol. There are two different streaming protocols used between KS client and KS driver: "legacy" and "realtime".

Legacy (or "standard streaming") is the native KS streaming protocol, available in all KS implementations starting from Windows 98. Audio data are passed via buffer chains, as in the MME interface. To send and receive each buffer in the chain, a switch to kernel mode must be performed. The higher the processing event frequency, the lower the latency, and the greater the overhead.

"Realtime" protocol ("looped streaming" or "RT Audio mode") was added in Windows 6.x+ (Vista, Server 2008, Win7 and Win8). Audio data are passed via single circular buffer (usually located in the hardware) that is directly accessible to user-mode client. No periodic kernel mode switching is required to write data on playback and read them on recording. If the driver supports a position register, no kernel mode switching is required to obtain current playback/recording position.

Inside each protocol, different processing modes can be negotiated between the driver and client.

Don't confuse the "realtime" in protocol naming with real-time performance/streaming. Since all audio streaming protocols are designed and used for playback or recording real world audio signals, they all definitely work in real time. The "realtime" term in protocol naming has a meaning like "more suitable for real time processing", "very low latency" etc.

Although standard streaming protocol is supported by all Windows versions that support KS, particular drivers support it by their own choice. Starting from Windows 6.x (Vista, Server 2008, Win7 and Win8), RT Audio is considered preferred for most embedded hardware. Only USB audio drivers still support legacy protocol because there is no direct access from the CPU to internal circular buffer inside USB device.

 

Stream processing modes

Represent various peculiarities of Kernel Streaming protocol used by KS clients.

  • Looped - a looped (circular) data buffer is used.
    In legacy KS protocol, a buffer chain is normally used, when completed buffer parts are being returned to client, and new parts are being submitted in reply. In looped mode, buffer parts are never returned until explicitly requested, so the driver continuously loops submitted parts.
    In real-time protocol, a single circular buffer is the only way to interchange data between driver and client, so this mode is always indicated for RT streams.
  • Event notification - the driver signals events to notify the client about stream progress. Currently, the RT protocol allows specifying up to two events, signaled as the corresponding half of the looped buffer is completed.
    If events are not used, clients have to poll stream position with a sufficient frequency.
  • Packet mode - stream data are submitted and completed in packets (parts of circular buffer). Packet mode is a kind of flow control. Currently, the system supports only two packets (halves) per buffer.
    In packet mode, both the driver and its client maintain packet counters to check stream integrity and detect potential data overflows/underflows.
    Without packet mode, data are submitted and completed in portions of any size, and only client can detect overflows/underflows by the stream position maintained by the driver. The driver never knows how much data are submitted or completed by the client.
  • Clock register - a hardware (or emulated) register is used by the client to read stream's clock information directly, without issuing a special API request and switching from user mode to kernel mode, and then back to user mode.
  • Position register - a hardware (or emulated) register is used by the client to read directly current stream position.

Looped mode can be used in both legacy and RT protocols. Other modes are used in RT protocol only.

Event notifications are supported by Windows 7 and later. Packet mode is supported in Windows 10 and later.

 

https://github.com/dechamps/FlexASIO/blob/master/BACKENDS.md


Quote

 

The Windows audio engine itself (WASAPI) uses Kernel Streaming internally to communicate with audio device drivers. It logically follows that any device that behaves as a normal Windows audio device de facto comes with a WDM driver that implements Kernel Streaming (usually not directly but through a PortCls miniport driver). Calls made through any of the standard Windows audio APIs (MME, DirectSound, WASAPI) eventually become Kernel Streaming calls as they cross the boundary into kernel mode and enter the Windows audio device driver.

In the typical case, the only client of the Windows audio device driver is the Windows audio engine. It is, however, technically possible, albeit highly atypical, for an application to issue Kernel Streaming requests directly, bypassing the Windows audio engine and talking to the Windows kernel directly. This is what the WDM-KS PortAudio backend does.

Given the above, Kernel Streaming offers the most direct path to the audio device among all PortAudio backends, but comes with some downsides:

  • Many audio outputs only handle a single stream at a time, because many audio devices do not support hardware mixing. (In Kernel Streaming terms, their pins only support one instance at a time.) These devices can therefore only be used by one KS client at a time, making Kernel Streaming an exclusive backend in this case. Because the Windows audio engine is itself a KS client, it is usually not possible to access an audio device using KS if the Windows audio engine is already using it. Use the device option to select a device that the Windows audio engine is not currently using.
  • Kernel Streaming is a very flexible API, which also makes it quite complicated. Even just enumerating audio devices involves quite a bit of complex, error-prone logic to be implemented in the application (here, in the PortAudio KS backend). Different device drivers implement KS calls in different ways, report different topologies, and even different ways of handling audio buffers. This presents a lot of opportunities for things to go wrong in a variety of different ways depending on the specific audio device used. Presumably this is the reason why most applications do not attempt to use KS directly, and the reason why Microsoft does not recommend this approach.

Note that the list of devices that the PortAudio WDM-KS backend exposes might look a bit different from the list of devices shown in the Windows audio settings. This is because the Windows audio engine generates its own list of devices (or, more specifically, audio endpoint devices) by interpreting information returned by Kernel Streaming. When using KS directly this logic is bypassed, and the PortAudio WDM-KS backend uses its own logic to discover devices. Furthermore, the concept of a device "name" is specific to the Windows audio engine and does not apply to KS, which explains why PortAudio WDM-KS device names do not necessarily match the Windows audio settings.

In principle, similar results should be obtained when using WASAPI Exclusive and Kernel Streaming, since they both offer exclusive access to the hardware. WASAPI is simpler and less likely to cause problems, but Kernel Streaming is more direct and more flexible. Furthermore, their internal implementation in PortAudio are very different. Therefore, the WASAPI Exclusive and WDM-KS PortAudio backends might behave somewhat differently depending on the situation.

The WDM-KS backend cannot redirect the stream if the default Windows audio device changes while streaming.

 

 

Just my 2c.


Windows Audio Subsystem

Windows Audio Subsystem includes several components, the most important of which are the following:

  • ks.sys - common Kernel Streaming kernel-mode library. Provides common routines to process various KS requests and objects. Used as a helper by most KS drivers.

  • portcls.sys - kernel-mode Port Class Driver. Offers a framework to simplify KS driver development. Performs most typical KS operations, while the actual device driver (called "Miniport Driver") provides device-specific operations only.

  • ksproxy.ax - user-mode component that wraps KS filters to represent them as DirectShow filters. Thanks to this, every device that has a KS driver automatically becomes accessible from a DirectShow filter graph, with the minimum possible overhead.

  • AudioDG.exe - System Audio Engine. Communicates with KS device drivers, mixes sounds played back by applications, splits sounds to be recorded by applications, performs format conversion etc.

  • Audiosrv.dll - System audio services. Perform various device/endpoint maintenance tasks.

 

System Audio Engine

The System Audio Engine is a system code that supports most of system audio features. It is hosted by the AudioDG (Audio Device Graph [Isolation]) process.

System Audio Engine acts as a "proxy" to each WDM/KS audio driver accessed via WASAPI, MME, DirectSound and other higher-level interfaces in shared mode. When an application uses a shared connection mode, a separate pin instance is implicitly created for the System Audio Engine. See Audio layering issues for details.

Additionally, the Engine hosts Audio Processing Objects (APOs) implementing local and global audio effects (LFX/GFX).

Before Win 6.x, the same role was played by the KMixer (kernel-mode audio mixer), a system kernel-mode audio component (a special kind of an audio driver), a part of the Windows 98/ME and 2k/XP/2k3 audio subsystem.

 

System audio services

Starting from Vista, the system has a dedicated Audio Service (AudioSrv), named Windows Audio in the service list. This service maintains audio endpoint properties.

Audio endpoint database is built by the Windows Audio Endpoint Builder service (AudioEndpointBuilder). This service queries all audio pins exposed by KS filters and creates an endpoint for each pin.

These services run in a Service Host process container (svchost.exe). An instance of such a process may run several different services. To help find the appropriate service, the VAC driver shows service tags in its event log.

In some cases, restarting System Audio Service may help to eliminate some audio endpoint problems without rebooting the entire system.

 

Service tags

Most Windows services run in a dedicated Service Host process container (svchost.exe). An instance of such a process may run several different services. Each service acts on behalf of its container process. When a service accesses a device, the device driver can determine only the process (PID) and thread (TID) identifiers, not the service name. To identify a particular service, the driver may access the Service Tag, a numeric identifier of the service. The VAC driver shows service tags in its event log.

Unfortunately, a driver cannot access the Service Manager database to identify the name of the service. To identify the service by its tag, use the third-party "sctagquery" command-line utility. For example, if the PID is 184 and the service tag is 12, enter the following command line under an administrator account:

sctagqry -n 12 -p 184

 

Shared and exclusive pin access

Most audio device drivers support only a single instance of each capture or render pin (they are single-client drivers). To allow several applications to access these pins at the same time, an intermediate (proxy) layer is required. In Windows, this layer is provided by the System Audio Engine: MME (always) and DirectSound/WASAPI (by default) connections are established in shared mode, in which the engine creates a single pin instance for itself and all clients are connected to the engine, not directly to the filter and the pin. The System Audio Engine chooses an appropriate format for the pin instance and converts audio data between the pin format and the client stream formats. This mode is convenient but often not efficient enough.
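The mixing and format-conversion work described above can be illustrated with a small sketch. This is not System Audio Engine or VAC source code; `mixSamples` and the 16-bit saturation rule are illustrative assumptions about what any shared-mode mixer must do: sum the client streams and clamp the result to the sample format's range.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Mix several 16-bit PCM streams into one, saturating on overflow:
// roughly what a shared-mode mixer must do before handing a single
// stream to the one pin instance. Streams are assumed to have equal
// length and format already (the engine converts formats beforehand).
std::vector<int16_t> mixSamples(const std::vector<std::vector<int16_t>>& streams)
{
    if (streams.empty()) return {};
    std::vector<int16_t> out(streams[0].size(), 0);
    for (size_t i = 0; i < out.size(); ++i) {
        int32_t acc = 0;                      // widen to avoid wraparound
        for (const auto& s : streams) acc += s[i];
        acc = std::min<int32_t>(32767, std::max<int32_t>(-32768, acc));
        out[i] = static_cast<int16_t>(acc);
    }
    return out;
}
```

In exclusive mode none of this happens: the application's samples reach the pin unaltered, which is exactly why that mode is preferred for bit-perfect playback.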

DirectSound (in Windows 5.x), WASAPI (in Windows 6.x+) and WDM/KS (in all systems) support exclusive pin access modes, in which the pin instance is created for the requesting application only. No other clients (applications, or even system sounds) are allowed to share this instance. The pin is instantiated with the requested format, and no format conversion is performed between the client application and the driver. This mode is efficient but not convenient enough, because only a single application can use the pin at a time. If the driver supports multiple pin instances, as VAC does, there is no such restriction.

In implementing multi-client pin access, VAC behaves like the System Audio Engine in shared mode: it mixes playback streams together, distributes cable data among recording streams and performs format conversions. So the most efficient way to use VAC is to connect to Virtual Cables in exclusive access mode whenever possible.

In WASAPI, exclusive access mode is supported in two forms: polling (also called "push" for playback and "pull" for recording) and event-driven notification. In polling mode, the client periodically queries the status of the stream to determine when to write or read the next portion of audio data. In notification mode, the driver signals an event each time room (for playback) or data (for recording) becomes available. In addition to more optimal CPU utilization, notification mode allows the use of very small KS buffers (down to 1 ms).
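The relationship between KS buffer size and latency mentioned above is simple arithmetic; the helper below (an illustrative function, not part of any Windows API) shows why a 48-frame buffer at 48 kHz yields the 1 ms figure.

```cpp
#include <cstdint>

// A buffer of N frames at sample rate R is serviced every N/R seconds,
// so its per-period latency in milliseconds is 1000 * N / R.
// (Illustrative helper, not a WASAPI call.)
double periodMs(uint32_t frames, uint32_t sampleRate)
{
    return 1000.0 * frames / sampleRate;
}
```

Polling mode generally needs larger buffers, since the client must wake often enough to refill the buffer before the device underruns; event notification removes that guesswork, which is what makes 1 ms periods practical.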

 

PortCls

PortCls stands for "Port Class Driver". It is a Windows kernel-mode module (portcls.sys) implementing the most common multimedia driver functions, intended to simplify drivers for particular multimedia hardware. A driver based on PortCls functionality is called a "minidriver" or "miniport driver". PortCls receives all KS client requests and some system internal requests, translates them, and passes them to the miniport driver. So a miniport driver must implement only device-specific code.

In Windows XP and later systems, on multi-CPU/core hardware, PortCls has some bugs. To avoid the problems linked to them, VAC implements a workaround, processing most streaming WavePci requests without calling into PortCls. Processing can be switched back to the PortCls engine for particular cables using cable configuration parameters.

 

Port class driver port/miniport types

VAC, like most other audio drivers, is built on the "miniport driver" model, in which the driver binary module contains only the code that handles driver-specific functions. Common functions are handled by the standard Windows Port Class Driver module. The "port" and "miniport" terms refer to internal system interfaces provided for software module communication. They are not related to the hardware ports used for device connection, or to the I/O ports used for low-level device communication.

To communicate with audio miniport driver, Port Class Driver provides three internal port (interface) types:

  • WaveCyclic - intended for legacy audio adapters with a single circular hardware buffer common for all clients. It is the simplest (and usually most stable) interface but also the slowest one.

  • WavePci - intended for adapters with multiple bus mastering buffers individual for each client and internal hardware mixing support. Can provide less latency than WaveCyclic but port/miniport communication is much more complex and may cause problems in some cases.

  • WaveRT - intended for modern adapters having one or more circular hardware buffers directly accessible to user-mode clients. It is the most efficient interface having almost no overhead.

WaveCyclic and WavePci exist in all Kernel Streaming implementations. WaveRT was introduced in Windows Vista so it is not available in XP and older versions.

For a user-mode Kernel Streaming client (including System Audio Engine), audio drivers that use WaveCyclic or WavePci port interfaces are indistinguishable. In Windows terms, they support a "standard streaming protocol". Kernel Streaming version of Audio Repeater application calls such drivers "legacy". On the contrary, drivers using WaveRT port interface support "looped streaming protocol" and are considered "realtime". Audio Repeater calls them "RT Audio".

Most modern audio drivers for embedded (hidden under the cover) hardware support the RT Audio protocol. USB audio drivers usually support the legacy one.


Thank you for providing info about KS. My knowledge was somewhat out of date, and your post contained the latest updates.

I'd like to add that the Vista and later versions of DirectSound are just a wrapper layer built on WASAPI shared mode; there is nothing "direct" about it other than the name itself, and it is deprecated and replaced by XAudio2.

 

It is interesting that the document says some WASAPI Exclusive instability problems of Windows 7 were fixed in Windows 8.

Also, I read somewhere that WASAPI in the original (gold) release of Vista was full of bugs and practically unusable, and that it stabilized somewhat with SP1.

 

Around 2010, PCs of very poor performance were still in use, and sound device vendors provided poorly implemented device drivers; these small performance deficiencies did cause playback glitches.

 

In recent years, sound devices use the stable Microsoft-provided driver and computer performance has improved in general, so experiencing sound stuttering is rare; it happens only when something goes seriously wrong.

 

In general, WASAPI exclusive event mode is much more stable than WASAPI exclusive push mode, and in my opinion event mode should be the default mode for apps. But the foobar2000 WASAPI component used WASAPI exclusive push mode to send PCM when it was released, and this may have damaged the reputation of WASAPI exclusive mode. Several years ago it gained support for WASAPI exclusive event mode as an option.

Sunday programmer since 1985

Developer of PlayPcmWin

17 hours ago, yamamoto2002 said:

I'd like to add that the Vista and later versions of DirectSound are just a wrapper layer built on WASAPI shared mode; there is nothing "direct" about it other than the name itself, and it is deprecated and replaced by XAudio2.

Similarly, for applications that use WASAPI Exclusive output, a wrapper built on Kernel Streaming is always at work.

 

17 hours ago, yamamoto2002 said:

It is interesting that the document says some WASAPI Exclusive instability problems of Windows 7 were fixed in Windows 8.

Information taken from the following link:

https://vac.muzychenko.net/en/manual/glossary.htm#Interface

https://vac.muzychenko.net/en/manual/winbugs.htm#PROPOSEDATAFORMAT_Requests_With_Invalid_Parameters

https://vac.muzychenko.net/en/manual/layers.htm

You can contact the developer directly and forward the question to him via feedback form:

https://vac.muzychenko.net/en/support.htm

 

A quick search turns up a link to an example of WASAPI Exclusive instability in event mode under Windows 7 Enterprise x64 SP1:

https://www.mediamonkey.com/forum/viewtopic.php?t=62311

_________________________________________________

Yamamoto-san, thank you for the beautiful player!

50 minutes ago, pm325 said:

Similarly, for applications that use WASAPI Exclusive output, a wrapper built on Kernel Streaming is always at work.

 

No; DirectSound was introduced as part of the DirectX software development kit, I remember somewhere around 1996 or 1997. From Windows 95 OSR2 or Windows 98 through XP it actually was more direct, but DirectSound on Windows Vista and later is built on top of WASAPI shared mode and has about 30 ms of delay; it is a legacy feature, left just for backward compatibility with old apps, and is not recommended for new app development. I read somewhere that the WASAPI shared mode delay was reduced to 15 ms on Windows 10, but I don't have a pointer to the source.

 

from https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ee416960(v=vs.85)

Quote

Microsoft suggests that existing code that uses the legacy APIs be rewritten to use the new APIs if possible.

 

Sunday programmer since 1985

Developer of PlayPcmWin

20 minutes ago, yamamoto2002 said:

No; DirectSound was introduced as part of the DirectX software development kit

My post has nothing to do with DirectSound. I apologize for the sudden change of context. The point was that WASAPI Exclusive output is essentially implemented through KS too (de facto).


In WASAPI shared mode, the app copies PCM data to the shared-mode buffer exposed by the API; the OS then resamples it to the shared-mode sample rate, filters it with APOs, mixes in sounds from other apps, adjusts the gain, and sends it to the endpoint of the audio device.
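As a rough sketch of the resampling step in this shared-mode chain, a naive linear-interpolation resampler is shown below. The real engine uses a much higher-quality converter; `resampleLinear` is purely an illustrative assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstddef>
#include <vector>

// Convert a mono float stream from srcRate to dstRate by linear
// interpolation between neighboring input samples. This only sketches
// the concept of the shared-mode sample rate conversion stage.
std::vector<float> resampleLinear(const std::vector<float>& in,
                                  uint32_t srcRate, uint32_t dstRate)
{
    if (in.empty()) return {};
    size_t outLen = static_cast<size_t>(
        static_cast<uint64_t>(in.size()) * dstRate / srcRate);
    std::vector<float> out(outLen);
    for (size_t i = 0; i < outLen; ++i) {
        double srcPos = static_cast<double>(i) * srcRate / dstRate;
        size_t i0 = static_cast<size_t>(srcPos);
        size_t i1 = std::min(i0 + 1, in.size() - 1);
        double frac = srcPos - static_cast<double>(i0);
        out[i] = static_cast<float>(in[i0] * (1.0 - frac) + in[i1] * frac);
    }
    return out;
}
```

This stage is exactly what exclusive mode bypasses: when the stream format matches the pin format, no interpolation of any kind touches the samples.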

 

On the other hand, WASAPI exclusive mode is a very thin layer and should not have significant overhead. WASAPI exclusive wakes up the app's playback thread when its PCM buffer can accommodate new PCM samples and exposes the memory pointer for writing PCM data; the app writes PCM to the buffer, and the PCM data is then sent to the audio device unaltered. I'd like to measure, in my spare time, how many CPU cycles are spent on these wrapper tasks.
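The exclusive event-driven handshake described above can be modeled portably, without the actual IAudioClient/IAudioRenderClient calls. In this sketch, `FakeRenderBuffer` and its members are invented names: the "device" reports how much room is free, and on each wakeup the app fills exactly that much of the cyclic buffer.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstddef>
#include <vector>

// Illustrative model of the exclusive-mode event handshake, not real
// WASAPI code: each "buffer ready" event means freeFrames frames of
// room are available, and the app writes directly into the buffer.
struct FakeRenderBuffer {
    std::vector<int16_t> frames;  // device-visible cyclic buffer
    size_t writePos = 0;
    size_t freeFrames;            // room reported at each event
    explicit FakeRenderBuffer(size_t n) : frames(n, 0), freeFrames(n) {}

    // One wakeup of the playback thread: write up to freeFrames samples
    // starting at srcPos, wrapping around the cyclic buffer as needed.
    size_t onBufferReady(const std::vector<int16_t>& pcm, size_t srcPos) {
        size_t n = std::min(freeFrames, pcm.size() - srcPos);
        for (size_t i = 0; i < n; ++i)
            frames[(writePos + i) % frames.size()] = pcm[srcPos + i];
        writePos = (writePos + n) % frames.size();
        return n;  // frames actually written this period
    }
};
```

Nothing here resamples, mixes, or adjusts gain, which is the point: the thin exclusive layer only moves bytes and signals readiness.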

 

This image is from http://blogs.msdn.com/b/windows_multimedia_jp/archive/2010/06/28/4-windows7.aspx

On this chart, the DirectShow HW Mix path is discontinued and becomes one of the audio sessions of WASAPI shared mode.

[Figure: Windows 7 audio engine architecture diagram]

 

 

This is an informative page on Windows audio engine internals. I read his blog while writing PlayPcm (the console version) in 2010.

https://matthewvaneerde.wordpress.com/tag/audio/

 

 

Sunday programmer since 1985

Developer of PlayPcmWin

17 hours ago, yamamoto2002 said:

In WASAPI shared mode, the app copies PCM data to the shared-mode buffer exposed by the API; the OS then resamples it to the shared-mode sample rate, filters it with APOs, mixes in sounds from other apps, adjusts the gain, and sends it to the endpoint of the audio device.

Totally agree and this is supported by the information in the links previously provided.

17 hours ago, yamamoto2002 said:

On the other hand, WASAPI exclusive mode is a very thin layer and should not have significant overhead. WASAPI exclusive wakes up the app's playback thread when its PCM buffer can accommodate new PCM samples and exposes the memory pointer for writing PCM data; the app writes PCM to the buffer, and the PCM data is then sent to the audio device unaltered.

 

2 hours ago, yamamoto2002 said:

My understanding is that,

when the audio device is WaveRT, the PCM write buffer that WASAPI exclusive exposes is the cyclic buffer memory address of Fig. 1, mapped into userland VM

From your point of view, WASAPI exclusive mode has direct access to the output device buffer. However, the WaveRT port/miniport driver is part of the Kernel Streaming implementation (along with WaveCyclic and WavePci).

On 1/23/2024 at 5:32 AM, pm325 said:

portcls.sys - kernel-mode Port Class Driver. Offers a framework to simplify KS driver development. Performs most typical KS operations, while the actual device driver (called "Miniport Driver") provides device-specific operations only.

 

On 1/23/2024 at 5:32 AM, pm325 said:

WaveCyclic and WavePci exist in all Kernel Streaming implementations. WaveRT was introduced in Windows Vista so it is not available in XP and older versions.

For a user-mode Kernel Streaming client (including System Audio Engine), audio drivers that use WaveCyclic or WavePci port interfaces are indistinguishable. In Windows terms, they support a "standard streaming protocol". Kernel Streaming version of Audio Repeater application calls such drivers "legacy". On the contrary, drivers using WaveRT port interface support "looped streaming protocol" and are considered "realtime".

 

3 hours ago, yamamoto2002 said:

This is from the documentation of the WaveRT port driver.

From https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/introducing-the-wavert-port-driver

Introducing the WaveRT Port Driver


In Windows Vista and later operating systems, support is provided for a wave real-time (WaveRT) port driver that achieves improved performance but uses a simple cyclic buffer for rendering and capturing audio streams.

The improved performance of the WaveRT port driver includes the following characteristics:

  • Low-latency during wave-capture and wave-rendering

  • A glitch-resilient audio stream

Like the WaveCyclic and WavePci port drivers in earlier versions of Microsoft Windows, the WaveRT port driver provides the generic functionality for a kernel streaming (KS) filter. The WaveRT port driver provides support for audio devices that can do the following:

  • They can connect to a system bus, for example the PCI Express bus.

  • They can playback or record wave data (audio data that is described by a WAVEFORMATEX or WAVEFORMATEXTENSIBLE structure).

  • They can use the improved scheduling support that is available in Windows Vista, to reduce the latency of an audio stream.

If you want your audio device to take advantage of the improvements in audio offered in Windows, your audio device must be able to play or capture audio data with little or no intervention by the driver software during streaming. A properly designed audio device that uses the WaveRT port driver requires little or no help from the driver software from the time the audio stream enters the run state until it exits from that state.

The main client of the WaveRT port driver is the audio engine running in shared mode. For more information about the Windows Vista audio engine, see the Exploring the Windows Vista Audio Engine topic.

_____________________________________________________________________________________________________________

From https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/understanding-the-wavert-port-driver

Understanding the WaveRT Port Driver


The WaveRT port driver combines the simplicity of the previous WaveCyclic port driver with the hardware-accelerated performance of the WavePci port driver.

The WaveRT port driver eliminates the need to continually map and copy audio data by providing its main client (typically, the audio engine) with direct access to the data buffer. This direct access also eliminates the need for the driver to manipulate the data in the audio stream. The WaveRT port driver thus accommodates the needs of the direct memory access (DMA) controllers that some audio devices have.

To distinguish itself from other wave-render and wave-capture devices, the WaveRT port driver registers itself under KSCATEGORY_REALTIME in addition to KSCATEGORY_AUDIO, KSCATEGORY_RENDER and KSCATEGORY_CAPTURE. This self-registration occurs during the installation of the adapter driver.

In Windows Vista and later operating systems, when the operating system starts and the audio engine is initialized, the audio engine enumerates the KS filters that represent the audio devices. During the enumeration, the audio engine instantiates the drivers for the audio devices that it finds. This process results in the creation of filter objects for these devices. For WaveRT audio devices, the resulting filter object has the following components:

  • An instance of the WaveRT port driver to manage the generic system functions for the filter

  • An instance of the WaveRT miniport driver to handle all the hardware-specific functions of the filter

After the filter object is created, the audio engine and the WaveRT miniport driver are ready to open an audio stream for the type of audio processing needed. To prepare the KS filter for audio rendering (playback), for example, the audio engine and the WaveRT miniport driver do the following to open a playback stream:

  1. The audio engine opens a pin on the KS filter, and the WaveRT miniport driver creates an instance of the pin. When the audio engine opens the pin, it also passes the wave format of the stream to the driver. The driver uses the wave format information to select the proper buffer size in the next step.

  2. The audio engine sends a request to the miniport driver for a cyclic buffer of a particular size to be created. The term cyclic buffer refers to the fact that when the buffer position register reaches the end of the buffer in a playback or record operation, the position register can automatically wrap around to the beginning of the buffer. Unlike the WaveCyclic miniport driver that sets up a contiguous block of physical memory, the WaveRT miniport driver does not need a buffer that is contiguous in physical memory. The driver uses the KSPROPERTY_RTAUDIO_BUFFER property to allocate space for the buffer. If the hardware of the audio device cannot stream from a buffer of the requested size, the driver works within the resource limitations of the audio device to create a buffer that is the closest in size to the originally requested size. The driver then maps the buffer into the DMA engine of the audio device and makes the buffer accessible to the audio engine in user-mode.

  3. The audio engine schedules a thread to periodically write audio data to the cyclic buffer.

  4. If the hardware of the audio device does not provide direct support for cyclic buffers, the miniport driver periodically reprograms the audio device to keep using the same buffer. For example, if the hardware does not support buffer looping, the driver must set the DMA address back to the start of the buffer each time it reaches the end of the buffer. This update can be done in either an interrupt service routine (ISR) or a high-priority thread.

The resulting configuration supplies a glitch-resilient audio signal on audio device hardware that either supports cyclic buffers or works with the miniport driver to regularly update its hardware.
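The wraparound behavior in steps 2 and 4 above amounts to modular arithmetic on the position register; the helper below is an illustrative sketch, not driver code.

```cpp
#include <cstddef>

// When the play/record position register advances past the end of the
// cyclic buffer, it continues from the beginning: new position is
// simply (pos + advanceBy) modulo the buffer size.
size_t advancePosition(size_t pos, size_t advanceBy, size_t bufferSize)
{
    return (pos + advanceBy) % bufferSize;
}
```

When the hardware lacks buffer looping, the miniport driver must perform this reset itself, in an ISR or high-priority thread, each time the DMA address reaches the end of the buffer, as step 4 notes.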

To prepare a KS filter for audio capture (recording), the audio engine and the WaveRT miniport driver use similar steps to open a record stream.

One of the performance improvements provided by the WaveRT port driver is a reduction in the delay in the end-to-end processing of the audio stream during wave-render or wave-capture. This delay is referred to as stream latency.

For more information about these two types of stream latency (render and capture), see the related WDK topics.

For information about how to develop a WaveRT miniport driver that complements the WaveRT port driver, see the Developing a WaveRT Miniport Driver topic.

_____________________________________________________________________________________________

From https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/topology-port-driver

Topology Port Driver


The Topology port driver exposes the topology of the audio adapter's mixing hardware. For example, the hardware that mixes the playback streams from the wave renderer and MIDI synthesizer in a typical adapter can be modeled as a set of control nodes (volume, mute, and sum) plus the data paths that connect the nodes. This topology is exposed as a set of controls and mixer lines by the Windows multimedia mixer API (see Kernel Streaming Topology to Audio Mixer API Translation). The adapter driver provides a corresponding Topology miniport driver that binds to the Topology port driver to form a topology filter.

The Topology port driver exposes an IPortTopology interface to the miniport driver. IPortTopology inherits the methods from base interface IPort; it provides no additional methods.

The Topology port and miniport driver objects communicate with each other through their respective IPortTopology and IMiniportTopology interfaces.

________________________________________________________________________________________________

From https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/low-latency-audio

Low Latency Audio



This article discusses audio latency changes in Windows 10.

...

Declare the minimum buffer size

A driver operates under various constraints when moving audio data between Windows, the driver, and the hardware. These constraints may be due to the physical hardware transport that moves data between memory and hardware, or due to the signal processing modules within the hardware or associated DSP.

Beginning in Windows 10, version 1607, the driver can express its buffer size capabilities using the DEVPKEY_KsAudio_PacketSize_Constraints2 device property. This property allows the user to define the absolute minimum buffer size that is supported by the driver, and specific buffer size constraints for each signal processing mode. The mode-specific constraints need to be higher than the driver's minimum buffer size, otherwise they're ignored by the audio stack.

For example, the following code snippet shows how a driver can declare that the absolute minimum supported buffer size is 2 ms, but default mode supports 128 frames, which corresponds to 3 ms if we assume a 48-kHz sample rate.

C++
 
//
// Describe buffer size constraints for WaveRT buffers
//
static struct
{
    KSAUDIO_PACKETSIZE_CONSTRAINTS2 TransportPacketConstraints;
    KSAUDIO_PACKETSIZE_PROCESSINGMODE_CONSTRAINT AdditionalProcessingConstraints[1];
} SysvadWaveRtPacketSizeConstraintsRender =
{
    {
        2 * HNSTIME_PER_MILLISECOND,                // 2 ms minimum processing interval
        FILE_BYTE_ALIGNMENT,                        // 1 byte packet size alignment
        0,                                          // no maximum packet size constraint
        2,                                          // 2 processing constraints follow
        {
            STATIC_AUDIO_SIGNALPROCESSINGMODE_DEFAULT,          // constraint for default processing mode
            128,                                                // 128 samples per processing frame
            0,                                                  // NA hns per processing frame
        },
    },
    {
        {
            STATIC_AUDIO_SIGNALPROCESSINGMODE_MOVIE,            // constraint for movie processing mode
            1024,                                               // 1024 samples per processing frame
            0,                                                  // NA hns per processing frame
        },
    }
};

More in-depth information regarding these structures is available in the WDK documentation.

Also, the sysvad sample shows how to use these properties, in order for a driver to declare the minimum buffer for each mode.
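The time units in the snippet above may be worth spelling out: KSAUDIO_PACKETSIZE_CONSTRAINTS2 expresses intervals in HNS (100-nanosecond) units, with HNSTIME_PER_MILLISECOND equal to 10000, while per-mode constraints are given in frames. The converters below are illustrative helpers, not part of the WDK; they show that the 128-frame default-mode constraint at 48 kHz is about 2.7 ms, which the article rounds to 3 ms.

```cpp
#include <cstdint>

// HNS = 100-nanosecond units, the time base used by the constraints
// structure above. 1 ms = 1,000,000 ns = 10,000 HNS units.
constexpr int64_t kHnsPerMillisecond = 10000;

constexpr int64_t msToHns(int64_t ms) { return ms * kHnsPerMillisecond; }

// Convert a frame count at a given sample rate to milliseconds.
constexpr double framesToMs(uint32_t frames, uint32_t sampleRate)
{
    return 1000.0 * frames / sampleRate;
}
```

So the snippet's `2 * HNSTIME_PER_MILLISECOND` minimum interval is 20000 HNS units, and the 128-frame constraint is strictly above that 2 ms floor, as the text requires.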
