
HQ Player



@Miska Hi Miska,

After months of running MCH DSD256 upsampling, I found an issue that might be related to your filters' CUDA API calls (or the EC modulators?). I can reproduce it on both builds (i.e. x86 and amd).

Here's the log:

[Attached image: log screenshot]

My system is a 12th Gen i9-12900K with an EVGA 3080 Ti, Ubuntu Server 20.04 HWE, and NVIDIA driver 510.47.03.

Music stalls at random times when using the gauss-* filter series with any EC modulator for MCH DSD256 (non-EC modulators can run 24x7, though). I checked the syslog and it showed an Xid 31 error message, which indicates a GPU memory page fault, possibly caused by the user application.
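For anyone who wants to look for the same thing in their own logs, here is a minimal Python sketch that picks Xid reports out of kernel log lines. It assumes the driver writes them in the usual `NVRM: Xid (<device>): <code>, ...` form; the function name `find_xid_events` and the sample lines are made up for illustration and are not taken from the log above.

```python
import re

# Pattern for NVIDIA driver Xid reports as they usually appear in kernel logs
# (assumed format: "NVRM: Xid (<device>): <code>, <details>").
XID_RE = re.compile(r"NVRM: Xid \((?P<dev>[^)]*)\): (?P<code>\d+),\s*(?P<detail>.*)")


def find_xid_events(lines, code=None):
    """Return (device, xid_code, detail) tuples for log lines reporting an Xid.

    If `code` is given, only events with that Xid number are returned.
    """
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if not match:
            continue
        xid = int(match.group("code"))
        if code is None or xid == code:
            events.append((match.group("dev"), xid, match.group("detail").strip()))
    return events


# Illustrative sample lines (hypothetical, invented for this sketch).
sample = [
    "Apr  2 10:15:01 host kernel: [ 1234.5] NVRM: Xid (PCI:0000:01:00): 31, pid=4242, Ch 00000008, MMU Fault",
    "Apr  2 10:15:02 host kernel: [ 1235.0] usb 1-1: new high-speed USB device",
]
print(find_xid_events(sample, code=31))
```

On Ubuntu the input would normally come from something like `open("/var/log/syslog")`, and comparing the timestamps of the matches with the moment playback stalls should show whether the two coincide.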

2 channels is fine: gauss-* series filters plus DSD5EC or AMSDM7EC can do DSD1024 with the GPU 24x7 on my system. The error only happens with multichannel.

Would you help me verify this issue? Thank you very much! Guapo

41 minutes ago, Miska said:

I suspect you are running out of GPU RAM... Since cudaMalloc() fails...

Here's a recent screenshot, taken while I was testing the x86 build's AVX512 performance, which is why all E-cores were disabled.

8ch DSD256 from a 48 kHz source, using gauss-hires-ip + 5EC with IR wav room corrections.

It took less than 600 MB of GPU memory...
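A figure like this can also be logged over time instead of read from a screenshot. The sketch below is one possible way to do it in Python: it calls `nvidia-smi` with its CSV query options and parses the result. The helper names `parse_memory_csv` and `read_gpu_memory` are hypothetical, and the sample numbers are invented for illustration, not measured on this system.

```python
import subprocess

# nvidia-smi query that prints "used, total" in MiB, one row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse rows of 'used, total' (MiB) into a list of (used_mib, total_mib)."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_gpu_memory():
    """Run nvidia-smi and return per-GPU (used_mib, total_mib).

    Requires the NVIDIA driver and nvidia-smi to be installed.
    """
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


# Hypothetical output for a single 12 GB card; real values will differ.
print(parse_memory_csv("587, 12288\n"))
```

Calling `read_gpu_memory()` in a loop while playback runs would show whether usage climbs before a stall or stays flat.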

[Attached image: GPU memory usage screenshot]

6 minutes ago, Miska said:

 

I'm a bit lost on what case fails then? What is the difference between failing and non-failing cases?

The previous screenshot was from one of the failing cases; I just wanted to show you the GPU usage. Music played for a couple of minutes and then stalled. I have to restart the hqplayer daemon to recover.

10 minutes ago, Miska said:

 

OK, a bit of mystery. From the CUDA code point of view, it doesn't know if there are two channels or 8. But memory consumption is certainly 4x higher on 8 channels than on 2 channels.

Understood. Here's a screenshot of 12ch DSD256 with a non-EC modulator, which runs without any issues and could go 24x7. GPU memory utilization was still low.

[Attached image: screenshot of 12ch DSD256 with a non-EC modulator]

 

The issue only happens when using any EC modulator for 6ch or 8ch DSD256 (beyond that channel count, only non-EC modulators can do DSD256 on my system).

8 minutes ago, Miska said:

That is even more strange, because that doesn't affect GPU work at all. (modulators don't use GPU)

Understood that the GPU is not related to the modulators, which is why I feel so lost with this issue. 😅 I really have no idea which part or step I missed. I've been using non-EC modulators for MCH music playback so far, but I really like the 5EC's SQ on Anubis...

58 minutes ago, bobflood said:

You need to get it up to about 4 Ghz to get DSD256.

It wasn't a CPU frequency issue in my case… it's about the GPU API calls.
3~4 GHz is definitely required by the AMD build for 8-channel DSD256, but 2~3 GHz is sufficient for the x86 build in the same scenario on the 12th Gen i9-12900K. AVX512 really shines here.
As for maximum 2-channel performance, the AMD build is the winner: at 5.1 GHz it can do DSD1024 or x48 with the DSD5EC or AMSDM7EC modulators. 😄
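For reference, the output rates mentioned in this thread can be worked out with a few lines of Python. This is only a sketch: it assumes DSD256 and DSD1024 mean 256 and 1024 times a base rate, and that "x48" refers to the 48 kHz base family rather than 44.1 kHz. The function name `dsd_rate_hz` is made up for illustration.

```python
def dsd_rate_hz(multiple, base_hz=44_100):
    """Sample rate in Hz of a 1-bit DSD stream at `multiple` times the base rate."""
    return multiple * base_hz


# Print the per-channel rate for each DSD multiple and base-rate family.
for name, multiple in [("DSD256", 256), ("DSD1024", 1024)]:
    for base in (44_100, 48_000):
        rate = dsd_rate_hz(multiple, base)
        print(f"{name} @ {base} Hz base: {rate / 1e6:.4f} MHz per channel")
```

Under these assumptions, going from DSD256 to DSD1024 quadruples the number of modulator output samples per second per channel, and the 48 kHz family adds roughly another 9% on top of the 44.1 kHz family.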

19 minutes ago, Fredc said:

Quite a few DACs out there have a galvanically isolated USB input (e.g. Chord Qutest, iFi Pro iDSD). Wouldn't that do?

You need the USB cable to deliver 5V to your DAC because the DAC's USB controller chip needs it. It's not just a matter of galvanic isolation.

24 minutes ago, Miska said:

According to information I have found, it shouldn't. But the information is scarce so it is hard to say for sure until someone tests and reports.

My previous RTX 3070 (8 GB) was an LHR version. I didn't see a large performance penalty compared to my current 3080 Ti (12 GB). The limitation was VRAM, not the GPU's computing power.

Here's my old test data, 3070 LHR doing AMSDM7EC -> DSD1024

[Attached image: RTX 3070 LHR test data]

 

Here's my current 3080 Ti (non-LHR):

[Attached image: 3080 Ti test data]

  • 2 weeks later...
56 minutes ago, pis99 said:

I have 12900K running all cores at 5.3, DDR5 at 5200 with CUDA off load from RTX-A6000. I can only do this with DSD5EC module.

Just curious... What brand/model of Z690 motherboard are you using? Your spec is better than mine, but with DSD5EC I can upsample to DSD1024 / x48 using all filters except 1x sinc-L (due to the 12 GB GPU memory limit), and with AMSDM7EC I can do DSD1024 / x48 using 80% of the filters...

Glad I left a message on page 1000!

  • 3 weeks later...
3 minutes ago, sledwards said:

Is the initial Jammy release of desktop CUDA compatible?

I'm not sure about 22.04 Desktop but Server is fully CUDA compatible.

 

4 minutes ago, sledwards said:

Any reason I should not use 22.04 LTS instead of 20.04?

I personally suggest 22.04. It has better support for Z690.

4 hours ago, sledwards said:

Does the jammy version of embedded (hqplayerd_4.32.0-134avx2_amd64.deb) require Rocm to function with CUDA?

This initial release does not include ROCm (Jammy is not supported yet; there isn't even PGP signing data), so it's not required for HQPe installation.

  • 2 weeks later...
42 minutes ago, GMG said:

Any benefit or is the running the additional thread for the NAA just taking up resources?

I run it this way, because USB output through ALSA increases the utilization of the primary cores. Letting NAA handle USB on the least-used cores squeezes more MHz out of the primary cores for higher DSD rates.
