
HQ Player



1 hour ago, Bob Stern said:

I infer full TCP/IP stack offloading is implemented in the ethernet controllers used in all Macs and in modern Synology and QNAP NAS's?

 

On the NAS side it doesn't matter, because the CPU there only runs the NAS file service functions and nothing else. Instead of a NAS I just run a regular PC server, because I want some extra functions that NAS devices don't usually offer, such as full disk encryption.

 

I have not checked what Ethernet controllers Macs use these days... My old Mac Mini has a Broadcom controller, but I have not paid much attention to what offload features it has, since the machine uses an external Thunderbolt HDD for content.

 

My Windows-based server has a Qualcomm Atheros Killer E2201 on the Gigabyte motherboard. It is a somewhat older model; you can find their current lineup here:

https://www.killernetworking.com

This is probably the closest among their current models:

https://www.killernetworking.com/products/killer-e2500/

 

On many boards Intel's I210-family is quite popular:

https://www.intel.com/content/www/us/en/embedded/products/networking/i210-ethernet-controller-family-brief.html

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
On 4/17/2019 at 7:32 AM, Miska said:

 

OK, strange, I just checked and I don't get anything in /var/log/syslog or "journalctl -a" on Ubuntu Studio 18.04. In fact, normal userid shouldn't be able to spam journal/syslog in first place due to access right restrictions...

 

Hi Miska,

 

I'm very sorry for posting a premature conclusion about your suggestion. I tried it again later, and I'm very happy to let you know that the problem is completely gone. My initial attempt failed simply because I wrongly presumed that changing the environment variable would take effect on the fly. After restarting HQPlayer, I confirmed that the log messages were completely silenced.

 

Thank you very much.

 

# BTW, I'll report my positive results with the i7-9700K CPU handling the poly-sinc-xtr filter for DSD512 without CUDA support in another thread later.

Link to comment
On 4/17/2019 at 8:51 PM, Miska said:

 

It also performs pretty nicely at 705.6/768k, but then you have its digital volume control and modulator in the path. Since it is an SDM DAC, there's no direct path for PCM.

 

No need to set DAC Bits; it can accept 32-bit, and that is the best input for its internal DSP.

 

I see that you actually own an ADI-2 DAC. Is there a big difference in sound between sending 16x PCM vs DSD256 via HQPlayer, or are the improvements only subtle?

Link to comment
4 hours ago, Yviena said:

I see that you actually own an ADI-2 DAC. Is there a big difference in sound between sending 16x PCM vs DSD256 via HQPlayer, or are the improvements only subtle?

 

I actually have two ADI-2 Pros. Objectively, the difference gets smaller at 16x PCM rates. I don't really listen to it much in PCM mode, so it is hard to say much about how the sound differs; I mostly run it at DSD256 in DSD Direct mode. After a quick refresher listen, PCM sounds a little flatter/shallower to me, which is quite typical. But you really need to listen for yourself. Overall, the ADI-2 is very neutral.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
9 hours ago, Miska said:

 

I actually have two ADI-2 Pros. Objectively, the difference gets smaller at 16x PCM rates. I don't really listen to it much in PCM mode, so it is hard to say much about how the sound differs; I mostly run it at DSD256 in DSD Direct mode. After a quick refresher listen, PCM sounds a little flatter/shallower to me, which is quite typical. But you really need to listen for yourself. Overall, the ADI-2 is very neutral.

 

I see... do you know if it also supports 48 kHz family DSD?

Link to comment
4 hours ago, Yviena said:

I see... do you know if it also supports 48 kHz family DSD?

 

Yes, both 44.1 and 48 families work.

 

If you would like to compare PCM and DSD Direct, set the volume control to -3.5 dB. This gives you the same output level for both PCM and DSD. Since the DSD spec allows for +3.15 dB short-term peaks, this is a sensible headroom choice from AKM.
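As a rough illustration of the headroom arithmetic (my reading of the numbers above, not something spelled out in the post): with 3.5 dB of attenuation, a +3.15 dB short-term peak still lands slightly below full scale. The function and variable names here are made up for the example.

```python
def db_to_ratio(db: float) -> float:
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

# Values taken from the post above.
volume_db = -3.5      # suggested volume setting for the comparison
dsd_peak_db = 3.15    # short-term peak allowance quoted for DSD

# Level of a +3.15 dB peak after applying 3.5 dB of attenuation.
margin_db = volume_db + dsd_peak_db

print(f"-3.5 dB as amplitude ratio:   {db_to_ratio(volume_db):.3f}")
print(f"+3.15 dB as amplitude ratio:  {db_to_ratio(dsd_peak_db):.3f}")
print(f"Peak level after attenuation: {margin_db:+.2f} dB")
```

So the two figures differ by about a third of a dB, which is presumably why -3.5 dB is described as a sensible choice.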

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
3 hours ago, Miska said:

 

Yes, both 44.1 and 48 families work.

 

If you would like to compare PCM and DSD Direct, set the volume control to -3.5 dB. This gives you the same output level for both PCM and DSD. Since the DSD spec allows for +3.15 dB short-term peaks, this is a sensible headroom choice from AKM.

 

BTW Miska, about the thread scheduling issue with closed-form + DSD256: it seems that it also negatively affects the GPU output somehow. I get graphical corruption on my 4K monitor while the GPU is under load unless I untick cores 0-4. I have tried changing the DP cable and borrowing another Nvidia GPU, but only unticking the cores or changing to a different filter fixed it.

I initially thought my monitor was dying, but it only happened while HQPlayer was active.

 

 

Link to comment
7 hours ago, Yviena said:

BTW Miska, about the thread scheduling issue with closed-form + DSD256: it seems that it also negatively affects the GPU output somehow. I get graphical corruption on my 4K monitor while the GPU is under load unless I untick cores 0-4. I have tried changing the DP cable and borrowing another Nvidia GPU, but only unticking the cores or changing to a different filter fixed it.

I initially thought my monitor was dying, but it only happened while HQPlayer was active.

 

I have never seen that kind of effect on any of the four machines where I test CUDA offload. But what do you mean by unticking cores?

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
3 hours ago, Miska said:

 

I have never seen that kind of effect on any of the four machines where I test CUDA offload. But what do you mean by unticking cores?

 

I mean threads 0-4 in the process affinity setting in Task Manager for the HQPlayer process. I think the corruption happens because 4096x2160 is at the edge of what DP can do; I had a hard time finding a cable that actually worked. But I've confirmed that it only happens with closed-form + DSD256, and it happens on both of my PCs.

 

Is there anything I can do to capture the threading issue and send it to you, or does that need special software?

 

 

Link to comment
10 hours ago, Yviena said:

I mean threads 0-4 in the process affinity setting in Task Manager for the HQPlayer process. I think the corruption happens because 4096x2160 is at the edge of what DP can do; I had a hard time finding a cable that actually worked. But I've confirmed that it only happens with closed-form + DSD256, and it happens on both of my PCs.

 

Mmh, sounds like you are messing with something you shouldn't be touching. What OS are you using? None of my supported OSes have such controls.

 

10 hours ago, Yviena said:

Is there anything I can do to capture the threading issue and send it to you, or does that need special software?

 

Yes, use a supported OS in its supported form, and the problem likely won't appear.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment

Miska, et al,

 

My DSD collection has one master folder. Inside that folder are 233 subfolders, and there is an unknown number of subfolders within those 233 folders. When I do "scan tree", HQPlayer does not import all of my DSD albums. How can I adjust the import settings to scan multiple folders deep?

 

Thanks

Link to comment
48 minutes ago, Miska said:

 

Mmh, sounds like you are messing with something you shouldn't be touching. What OS are you using? None of my supported OSes have such controls.

 

 

Yes, use a supported OS in its supported form, and the problem likely won't appear.

 

Uhm, Windows has had thread/core affinity since the Vista days; you can find the option in Windows Task Manager > Details. I'm using Windows 10 1709, and the problem happens even on a clean install, so it can't really be anything I'm doing.

Link to comment
8 hours ago, Yviena said:

Uhm, Windows has had thread/core affinity since the Vista days; you can find the option in Windows Task Manager > Details. I'm using Windows 10 1709, and the problem happens even on a clean install, so it can't really be anything I'm doing.

 

OK, maybe that messes up the CUDA framework/driver. It falls into the same category as touching the process priority, because it screws up HQPlayer's internal thread priority management. I have not seen this with a stock Windows 10 install when not touching anything through Task Manager.

 

Do you have the latest Nvidia driver?

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
8 hours ago, DancingSea said:

My DSD collection has one master folder. Inside that folder are 233 subfolders, and there is an unknown number of subfolders within those 233 folders. When I do "scan tree", HQPlayer does not import all of my DSD albums. How can I adjust the import settings to scan multiple folders deep?

 

Scanning traverses the entire tree, including all subfolders, without any limit on depth. If something is not imported, the reason lies elsewhere. For example, DST-compressed DFF files are ignored.
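One way to check a library for this case is to walk the tree and look at the compression type recorded in each DFF file. The sketch below is only an illustration and says nothing about how HQPlayer itself scans. It assumes the usual DSDIFF layout as I understand it (an `FRM8` container of form type `DSD `, with a `CMPR` chunk inside the `PROP` chunk and 8-byte big-endian chunk sizes), and the function names and the `.dff` extension filter are my own choices.

```python
import os
import struct

def _compression_from_prop(f, prop_size):
    """Scan the local chunks of a PROP chunk for CMPR and return its 4-byte type."""
    end = f.tell() + prop_size
    if f.read(4) != b"SND ":
        return None
    while f.tell() + 12 <= end:
        head = f.read(12)
        if len(head) < 12:
            return None
        ckid, size = head[:4], struct.unpack(">Q", head[4:12])[0]
        if ckid == b"CMPR":
            ctype = f.read(4)
            return ctype if len(ctype) == 4 else None
        f.seek(size + (size & 1), os.SEEK_CUR)  # chunks are padded to even length
    return None

def dff_compression(path):
    """Return b'DSD ' or b'DST ' (or another ID) for a DSDIFF file, else None."""
    try:
        with open(path, "rb") as f:
            header = f.read(16)  # 'FRM8' + 8-byte size + form type
            if len(header) < 16 or header[:4] != b"FRM8" or header[12:16] != b"DSD ":
                return None
            while True:
                head = f.read(12)
                if len(head) < 12:
                    return None
                ckid, size = head[:4], struct.unpack(">Q", head[4:12])[0]
                if ckid == b"PROP":
                    return _compression_from_prop(f, size)
                f.seek(size + (size & 1), os.SEEK_CUR)
    except (OSError, OverflowError, ValueError):
        return None

def find_dst_files(root):
    """Walk root at any depth and list .dff files marked as DST compressed."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".dff"):
                full = os.path.join(dirpath, name)
                if dff_compression(full) == b"DST ":
                    hits.append(full)
    return sorted(hits)
```

Calling `find_dst_files()` on the master folder would then list the DST-compressed files; whether those account for the missing albums still has to be checked against what HQPlayer actually imported.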

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
8 hours ago, Miska said:

 

OK, maybe that messes up the CUDA framework/driver. It falls into the same category as touching the process priority, because it screws up HQPlayer's internal thread priority management. I have not seen this with a stock Windows 10 install when not touching anything through Task Manager.

 

Do you have the latest Nvidia driver?

 

Yes, I have the 425.xx driver. It happens when I don't touch the core/thread affinity in Task Manager, but like I said, unchecking the first 4 threads (core 0 + SMT and core 1 + SMT) does fix it. Turning CUDA offload on also fixes it.

After further testing, it does seem to be an issue with the first 4 threads/cores, as in my case unchecking just threads 2-3 greatly alleviates the problem.

 

Closed-form filters + DSD256:

1. All threads ticked = worst behavior: stuttering in games, frame skipping in videos, etc.

2. CUDA offload enabled = OK

3. Threads 0-3 unticked in the HQPlayer process = OK

4. Linux = OK

It's weird that you can't reproduce it on your setup, but maybe there is something in my setup that triggers it even on a clean install... as both of my processors (2700X and 1600X) have the issue.
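For anyone trying to reproduce case 3, "unticking" threads in Task Manager amounts to clearing bits in the process affinity mask. Below is a small sketch of how that mask is formed; the helper name is made up, and the thread counts assume 16 logical CPUs for the 2700X and 12 for the 1600X. Keep in mind Miska's advice in this thread not to touch affinity in normal use; this is for diagnosis only.

```python
def affinity_mask(total_threads: int, excluded: range) -> int:
    """Bitmask with one bit per logical CPU; excluded CPUs have their bit cleared."""
    mask = (1 << total_threads) - 1
    for cpu in excluded:
        mask &= ~(1 << cpu)
    return mask

# 16 logical CPUs, threads 0-3 unticked.
print(hex(affinity_mask(16, range(0, 4))))  # 0xfff0
# 12 logical CPUs, threads 0-3 unticked.
print(hex(affinity_mask(12, range(0, 4))))  # 0xff0
```

The hexadecimal value is the form that `start /affinity` in cmd.exe expects, so the same test can be repeated without clicking through Task Manager each time.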

Link to comment

Hello. New to HQPlayer here. I have a perhaps rather specific question about HQPlayer's performance (I am a computer engineer working with software, but not in the digital signal processing field). I am using HQPlayer on an old MacBook Air, mid-2011, i7-2677M, two cores.

 

This CPU is Sandy Bridge; it has AVX (but not AVX2/FMA). As far as I can see, FP MUL and ADD are on different ports, so I assume 4 double-precision adds and muls can be performed each cycle. Its base frequency is 1.8 GHz, so 1.8e+9 * 2 (cores) * 4 = 14.4e+9; let's say roughly 16 billion (16e+9) double-precision add+mul operations can be performed every second.

 

Take for example the sinc-M filter with 1M length, operating on 44.1 kHz stereo with 16x upsampling. If I am not wrong (assuming only 1/16 of the samples are non-zero in the zero-inserted 16x oversampled data stream), 1M/16 = 62500 double-precision add+mul operations have to be done for every sample in the upsampled data stream, which is 705.6 kHz stereo, so 1411200 samples per second. The total number of operations needed per second is therefore 62500 ops * 1411200 samples/sec, approximately 88 billion (88e+9) ops/sec.

 

So my question is this. Selecting the sinc-M filter in HQPlayer and playing a 44.1 kHz stereo song upsampled to 705.6 kHz uses ~30% of the CPU (per top output, so considering two cores, it is actually 15%), while according to the above calculation the required work is well above what this CPU can do even in ideal conditions. How does this even work? Does HQPlayer always use FFT convolution, so that my calculation above is not actually applicable? If so, is there any disadvantage to using FFT convolution rather than direct convolution?
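The estimate above can be reproduced as a quick sketch. All figures come from the post; the variable names are illustrative, "1M" is taken as one million taps, and the 4 add+mul pairs per cycle per core is the post's own assumption rather than a measured number.

```python
# Peak throughput estimate for the i7-2677M as described in the post.
base_clock_hz = 1.8e9   # base frequency
cores = 2
madd_per_cycle = 4      # assumed: 4 double-precision add+mul pairs per cycle per core
peak_ops = base_clock_hz * cores * madd_per_cycle

# Direct (time-domain) convolution cost when zero-valued inputs are skipped.
filter_taps = 1_000_000  # "1M" taken as one million taps
ratio = 16               # 44.1 kHz -> 705.6 kHz
channels = 2
out_rate_hz = 44_100 * ratio
taps_per_output = filter_taps // ratio  # non-zero input samples per output sample
needed_ops = taps_per_output * out_rate_hz * channels

print(f"peak:   {peak_ops / 1e9:.1f} G add+mul/s")
print(f"needed: {needed_ops / 1e9:.1f} G add+mul/s")
print(f"needed / peak: {needed_ops / peak_ops:.1f}")
```

With these inputs, direct convolution would need several times the theoretical peak of the CPU, which is the mismatch the question is about.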

Link to comment
4 hours ago, Yviena said:

Yes, I have the 425.xx driver. It happens when I don't touch the core/thread affinity in Task Manager, but like I said, unchecking the first 4 threads (core 0 + SMT and core 1 + SMT) does fix it. Turning CUDA offload on also fixes it.

After further testing, it does seem to be an issue with the first 4 threads/cores, as in my case unchecking just threads 2-3 greatly alleviates the problem.

 

Closed-form filters + DSD256:

1. All threads ticked = worst behavior: stuttering in games, frame skipping in videos, etc.

2. CUDA offload enabled = OK

3. Threads 0-3 unticked in the HQPlayer process = OK

4. Linux = OK

It's weird that you can't reproduce it on your setup, but maybe there is something in my setup that triggers it even on a clean install... as both of my processors (2700X and 1600X) have the issue.

 

Sounds like a serious OS/driver problem. What GPU model do you have? Overall, the CPU + GPU combination doesn't sound extraordinarily rare. I would expect Nvidia to test their drivers and frameworks on both Intel and AMD platforms. Do you have more than one x16 PCIe slot? Are you using the primary PCIe slot designated for the GPU?

 

I have one Ryzen 7 system with first generation Ryzen (on ASRock mobo) and IIRC, GTX 1070. I'll test this again tomorrow, but I don't remember seeing any problems with games or HQPlayer on it. The machine has both Windows 10 and Ubuntu 18.04 LTS to test on. Display output corruption sounds like something going seriously wrong with the GPU. Most likely a cache coherency issue where CPU/GPU caches are not properly flushed and synchronized.

 

If Linux works fine (assuming you have new enough Nvidia drivers there too, and thus CUDA available), it indicates that this is not a hardware problem but a software problem.

 

To minimize variables it is good to stick with stock Win 10 + latest Nvidia drivers + latest HQPlayer. This way the environment is deterministic and known for Nvidia drivers/framework and HQPlayer.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
1 hour ago, mete said:

So my question is this. Selecting the sinc-M filter in HQPlayer and playing a 44.1 kHz stereo song upsampled to 705.6 kHz uses ~30% of the CPU (per top output, so considering two cores, it is actually 15%), while according to the above calculation the required work is well above what this CPU can do even in ideal conditions. How does this even work? Does HQPlayer always use FFT convolution, so that my calculation above is not actually applicable? If so, is there any disadvantage to using FFT convolution rather than direct convolution?

 

Not always, but sinc-M is processed as complex convolution. The disadvantage is that it limits the possible rate conversion factors.
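To illustrate why frequency-domain processing changes the operation count, here is a generic textbook sketch showing that multiplying spectra gives the same output as direct convolution. It says nothing about HQPlayer's actual implementation; the function names, the toy signal and the 3-tap filter are all made up for the example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. The inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def direct_convolve(a, b):
    """Time-domain convolution: len(a) * len(b) multiply-adds."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def fft_convolve(a, b):
    """Same result via pointwise multiplication of zero-padded spectra."""
    m = len(a) + len(b) - 1
    n = 1
    while n < m:
        n *= 2
    fa = fft([complex(v) for v in a] + [0j] * (n - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (n - len(b)))
    prod = [p * q for p, q in zip(fa, fb)]
    return [v.real / n for v in fft(prod, inverse=True)[:m]]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
taps = [0.25, 0.5, 0.25]
print(direct_convolve(signal, taps))
print([round(v, 9) for v in fft_convolve(signal, taps)])
```

Direct convolution costs on the order of N*M multiply-adds, while the FFT route costs on the order of N log N, so for million-tap filters the per-sample estimate in the question no longer applies.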

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
2 hours ago, Miska said:

 

Sounds like a serious OS/driver problem. What GPU model do you have? Overall, the CPU + GPU combination doesn't sound extraordinarily rare. I would expect Nvidia to test their drivers and frameworks on both Intel and AMD platforms. Do you have more than one x16 PCIe slot? Are you using the primary PCIe slot designated for the GPU?

 

I have one Ryzen 7 system with first generation Ryzen (on ASRock mobo) and IIRC, GTX 1070. I'll test this again tomorrow, but I don't remember seeing any problems with games or HQPlayer on it. The machine has both Windows 10 and Ubuntu 18.04 LTS to test on. Display output corruption sounds like something going seriously wrong with the GPU. Most likely a cache coherency issue where CPU/GPU caches are not properly flushed and synchronized.

 

If Linux works fine (assuming you have new enough Nvidia drivers there too, and thus CUDA available), it indicates that this is not a hardware problem but a software problem.

 

To minimize variables it is good to stick with stock Win 10 + latest Nvidia drivers + latest HQPlayer. This way the environment is deterministic and known for Nvidia drivers/framework and HQPlayer.

 

Hmm, after further testing it seems that it also happens with the xtr filter, but less often. I tested another game and there were no issues there, so no need to mess with threading. The game where there are problems and I need to mess with threading is Path of Exile, so I think some specific games do their own thing in thread scheduling, leading to problems for other software.

 

The GPU is a 1080 Ti, and it's in the first PCIe slot.

With nothing changed in Task Manager, just giving focus to the PoE game lowers HQPlayer's CPU usage to 8%, so yeah, I think I found the cause... and it's the game itself.

At least you can test it and see if the same happens to you, as it's free.

Link to comment
11 hours ago, Yviena said:

With nothing changed in Task Manager, just giving focus to the PoE game lowers HQPlayer's CPU usage to 8%, so yeah, I think I found the cause... and it's the game itself.

 

But does this happen with HQPlayer alone, without running any other applications/games?

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
2 hours ago, Miska said:

 

But does this happen with HQPlayer alone, without running any other applications/games?

 

Running HQPlayer by itself is fine; it seems the game somehow triggers limiting of HQPlayer's CPU usage to 8-10%, even when the CPU load is not maxed out.

Link to comment
2 hours ago, Yviena said:

Running HQPlayer by itself is fine; it seems the game somehow triggers limiting of HQPlayer's CPU usage to 8-10%, even when the CPU load is not maxed out.

 

Or is it maxing out the GPU instead, if you have CUDA offload enabled?

 

Prior to the Pascal generation (GTX 1080), Nvidia's handling of multitasking on GPUs was quite poor. They have further improved this in Turing (RTX 2080). This handling may also depend on whether games use Direct3D vs OpenGL, or whether they use things like PhysX.

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
10 minutes ago, Miska said:

 

Or is it maxing out the GPU instead, if you have CUDA offload enabled?

 

Prior to the Pascal generation (GTX 1080), Nvidia's handling of multitasking on GPUs was quite poor. They have further improved this in Turing (RTX 2080). This handling may also depend on whether games use Direct3D vs OpenGL, or whether they use things like PhysX.

GPU usage is around 30% (game + HQPlayer) with CUDA offload on.

 

Edit: turning G-Sync off fixes the graphical corruption that occurs when HQPlayer is sending DSD while the game is active, but it doesn't fix the capping of CPU usage while the game is focused when running the xtr filter.

So the monitor corruption is fixed now, but the other issue remains.

 

I also tried running another free game (Dota) to check, and the HQPlayer CPU usage capping doesn't happen there.

 

 

Link to comment
4 hours ago, Yviena said:

GPU usage is around 30% (game + HQPlayer) with CUDA offload on.

 

Edit: turning G-Sync off fixes the graphical corruption that occurs when HQPlayer is sending DSD while the game is active, but it doesn't fix the capping of CPU usage while the game is focused when running the xtr filter.

So the monitor corruption is fixed now, but the other issue remains.

 

I also tried running another free game (Dota) to check, and the HQPlayer CPU usage capping doesn't happen there.

 

I think this is likely all related to the Nvidia driver and the way each piece of software uses the GPU. The combination seems to be something Nvidia hasn't fully optimized/tested.

 

Unfortunately, trying to report this to Nvidia in a way that would actually reach the developers is pretty futile. I've tried it myself a couple of times. It gets stonewalled at the first-level support tier, which doesn't have the slightest clue about anything really technical.

 

The root cause is likely that Nvidia doesn't expect CUDA users to run anything other than a CUDA application. So running something else that seriously loads the GPU at the same time as CUDA is likely outside their "typical use cases" list.

 

Here the GPU is trying to switch between visual GPU and GPGPU tasks, and the two end up disturbing each other. A possible solution would be to have more than one GPU, but that would require me to add an option for explicitly selecting the GPU to use (doable), while the one used for visual tasks is the one that has the display attached. I've only briefly done some small tests with the display connected to the i7-7700K's integrated GPU and the Nvidia GPU without any display (this doesn't need extra software hassle). But with a number of motherboards one could actually have two Nvidia cards as well (this needs some config options). For this to work, they would need to be NOT set up as SLI/NVLink.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
