Best Nvidia CUDA Card for HQPlayer

Miska · July 26, 2017

5 hours ago, sbenyo said:

Is it for sure that dual-gpu (e.g. gtx 1070, 1080) is not supported?

At least not explicitly. Depending on how Nvidia implemented some of the CUDA functionality, multiple GPUs could possibly work when you also have convolution enabled. But since I don't have any multi-GPU machines, I have not checked whether this happens or not.

john925 · August 4, 2017

Is eGPU possible for cuda offload?

mirekti · August 4, 2017

28 minutes ago, john925 said:

Is eGPU possible for cuda offload?

I'd say it is, but:

1. TB3 has its limits and you lose some of your GPU power. From what I've read, the better the card, the more you lose.

2. External GPU cases are loud (in case you plan to keep it in the same room).

3. The eGPU cases are expensive.

bigbear2003 · August 24, 2017

I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization.

Miska · August 30, 2017

On 8/24/2017 at 5:11 AM, bigbear2003 said:

I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization.

What are your output rate and filter settings?

Ipoci · September 12, 2017

On 8/24/2017 at 4:11 AM, bigbear2003 said:

I tried the 940MX on my Lenovo laptop and I don't see any improvement in CPU utilization.

Did you check with nvidia-smi whether HQPlayer is using the card or not?
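For anyone who wants to script that check, here is a minimal Python sketch. It assumes nvidia-smi is on the PATH and uses its CSV query mode for compute processes; the helper names (`parse_compute_apps`, `find_process`, `query_compute_apps`) and the sample line are only illustrative, not anything HQPlayer provides.

```python
import csv
import io
import subprocess


def parse_compute_apps(csv_text):
    """Parse CSV rows of `pid, process_name, used_memory` into dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, name, used = (f.strip() for f in fields)
        if not pid.isdigit():
            continue  # skip rows without a numeric PID
        rows.append({"pid": int(pid), "name": name, "used_memory": used})
    return rows


def find_process(rows, needle="hqplayer"):
    """Return the rows whose process name contains `needle` (case-insensitive)."""
    return [r for r in rows if needle.lower() in r["name"].lower()]


def query_compute_apps():
    """Ask nvidia-smi for current compute processes (needs an Nvidia driver)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    # Illustrative sample shaped like the CSV output, not a real measurement.
    sample = "4660, /usr/bin/hqplayer, 95 MiB\n"
    print(find_process(parse_compute_apps(sample)))
```

On a machine with the driver installed, `find_process(query_compute_apps())` returns an empty list when no process with "hqplayer" in its name holds a CUDA context.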

My laptop is an HP zBook G2 17" with an i7-4940MX and an Nvidia Quadro K5100M. In my case, HQPlayer always uses the Nvidia card, or at least seems to, even if I deselect "cuda offload". So CPU% is always the same across the cores: 1-3% for plain DSD64.

Have a nice day, Massimiliano

ferenc · October 1, 2017

On 2017. 07. 26. at 10:40 PM, Miska said:

At least not explicitly. Depending on how Nvidia implemented some of the CUDA functionality, multiple GPUs could possibly work when you also have convolution enabled. But since I don't have any multi-GPU machines, I have not checked whether this happens or not.

This should be interesting with 16x 1080 Ti cards, for example:

https://www.onestopsystems.com/product/4u-value-gpu-accelerator-system

yamamoto2002 · December 20, 2017

It seems the double-precision arithmetic of the Titan V card is 22 times faster than that of the GTX 1080.

Hoang Anh · May 11, 2018

Since my PC (Core i5-6400, 12 GB RAM) doesn't support a GTX 1060 well (it needs more power), I intend to use a GTX 1050 with it. Is there any benefit? Since I know that a too-slow GPU will bottleneck HQPlayer, I also want to ask whether there will be an option to choose what percentage of the work is split between the GPU and the CPU, to make sure it works best.

Hoang Anh · May 13, 2018

I hope AMD or older Quadro cards (the Quadro 6000, for example; used ones are very cheap now and have good FP64) will be supported in the next version of HQPlayer. There will be more choice.

yamamoto2002 · June 12, 2018

This is my Titan V result.

About 6 TFLOPS double precision, one-seventh of the first-generation Earth Simulator supercomputer.

I hope upcoming GeForce Volta products will have some double-precision capability.

louisxiawei · June 12, 2018

26 minutes ago, yamamoto2002 said:

This is my Titan V result.

About 6 TFLOPS double precision, one-seventh of the first-generation Earth Simulator supercomputer.

I hope upcoming GeForce Volta products will have some double-precision capability.

What a beast! Have you run any heavy HQPlayer filter settings with it? Something like upsampling 44.1/16 → 48 x 512 using the poly-sinc-xtr filter?

Meanwhile, AMD and Intel are having the "multi-core" CPU competition. All good for HQPlayer.

yamamoto2002 · June 14, 2018

On 6/12/2018 at 11:31 PM, louisxiawei said:

What a beast! Have you run any heavy HQPlayer filter settings with it? Something like upsampling 44.1/16 → 48 x 512 using the poly-sinc-xtr filter?

No.

It seems that, in order to run on Volta, CUDA programs must be compiled with the latest version of the CUDA Toolkit, which dropped support for older Fermi-based GPUs such as the GeForce GTX 580 or Quadro 6000. I'm not sure whether this affects HQPlayer.

> Meanwhile AMD and Intel are having the "multi-core" CPU competition. All good for HQplayer

Yes, it is a good thing. On Windows, apps that are not processor-group-aware can handle up to 64 cores (or 64 hyper-threads). The process affinity mask is 64 bits wide, with one bit per core (or hyper-thread), so it can express up to 64 cores (or hyper-threads). With a 32-core, 64-thread CPU, all the available affinity mask bits are used, and the free performance improvement a multi-threaded app gets from additional CPU cores ends there. If this trend continues and, say, a 64-core, 128-thread CPU arrives, apps will have to be rewritten to use multiple processor groups to squeeze out all the CPU resources.
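The arithmetic behind that limit can be sketched in a few lines of Python. This is only an illustration of the 64-bit mask described above, not Windows API code; the function names are made up for the example, and the group count is a lower bound that assumes groups are filled to 64 entries each.

```python
import math

MASK_BITS = 64  # width of the process affinity mask described above


def full_affinity_mask(logical_processors):
    """Mask with one bit set per logical processor, capped at 64 bits."""
    usable = min(logical_processors, MASK_BITS)
    return (1 << usable) - 1


def processor_groups_needed(logical_processors):
    """Minimum number of 64-entry groups covering all logical processors."""
    return math.ceil(logical_processors / MASK_BITS)


for n in (8, 64, 128):
    print(n, hex(full_affinity_mask(n)), processor_groups_needed(n))
```

With 128 logical processors the mask is already saturated at 64 set bits, and at least two groups are needed to reach the rest.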

Miska · June 15, 2018

10 hours ago, yamamoto2002 said:

It seems that, in order to run on Volta, CUDA programs must be compiled with the latest version of the CUDA Toolkit, which dropped support for older Fermi-based GPUs such as the GeForce GTX 580 or Quadro 6000. I'm not sure whether this affects HQPlayer.

The latest HQPlayer Desktop 3.21 is compiled with the latest CUDA 9.2, but earlier versions compiled with CUDA 9.1 already had full support for Volta. Availability of the latest CUDA toolkit actually delayed my release, because Microsoft's update to Visual Studio 2017 broke the previous CUDA toolkit version... But the CUDA-Z test application you are running is compiled against CUDA 5 or 6 or something really old.

10 hours ago, yamamoto2002 said:

Yes, it is a good thing. On Windows, apps that are not processor-group-aware can handle up to 64 cores (or 64 hyper-threads). The process affinity mask is 64 bits wide, with one bit per core (or hyper-thread), so it can express up to 64 cores (or hyper-threads). With a 32-core, 64-thread CPU, all the available affinity mask bits are used, and the free performance improvement a multi-threaded app gets from additional CPU cores ends there. If this trend continues and, say, a 64-core, 128-thread CPU arrives, apps will have to be rewritten to use multiple processor groups to squeeze out all the CPU resources.

Luckily I have very little Windows specific code and no limitations for number of CPU cores. So no need to rewrite anything...

And I can also warmly recommend using Linux...

yamamoto2002 · June 15, 2018

3 hours ago, Miska said:

But the CUDA-Z test application you are running is compiled against CUDA 5 or 6 or something really old.

Thanks for your reply. I now understand CUDA binary forward compatibility, and things are cleared up:

https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html

Quote

Applications that already include PTX versions of their kernels should work as-is on Volta-based GPUs. Applications that only support specific GPU architectures via cubin files, however, will need to be updated to provide Volta-compatible PTX or cubins.

So CUDA-Z contains PTX code for forward compatibility and, if you are lucky enough, the PTX code runs well on future architectures.
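To make the rule in that quote concrete, here is a rough Python sketch of the decision. It assumes the usual CUDA compatibility rules: a cubin runs on GPUs with the same major compute capability and an equal or higher minor, while PTX can be JIT-compiled by the driver for any equal or newer capability. The function name `can_run` and the example capability tuples are hypothetical, chosen just for illustration.

```python
def can_run(gpu_cc, cubin_ccs=(), ptx_ccs=()):
    """Return how a binary could run on a GPU of compute capability gpu_cc.

    gpu_cc and the entries of cubin_ccs / ptx_ccs are (major, minor) tuples.
    Returns "cubin", "ptx-jit", or None if nothing embedded is compatible.
    """
    # A cubin runs natively on the same major version, equal or higher minor.
    for major, minor in cubin_ccs:
        if major == gpu_cc[0] and minor <= gpu_cc[1]:
            return "cubin"
    # PTX can be JIT-compiled for any equal or newer compute capability.
    for cc in ptx_ccs:
        if tuple(cc) <= tuple(gpu_cc):
            return "ptx-jit"
    return None


VOLTA = (7, 0)  # compute capability of Volta GPUs such as the Titan V

# An old app shipping Kepler cubins plus PTX still runs on Volta via JIT.
print(can_run(VOLTA, cubin_ccs=[(3, 0)], ptx_ccs=[(3, 0)]))
# An app shipping only a Pascal cubin has nothing Volta can use.
print(can_run(VOLTA, cubin_ccs=[(6, 1)]))
```

This ignores details such as driver version requirements, so treat it as a mental model rather than a compatibility checker.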

yamamoto2002 · June 24, 2018

On 6/15/2018 at 6:29 PM, Miska said:

Luckily I have very little Windows specific code and no limitations for number of CPU cores. So no need to rewrite anything...

Last I checked, on Windows one process can handle up to 64 hyper-threads. So, in order to handle 128 hyper-threads, another worker process has to be created, each process creates 64 threads, and the two processes communicate via inter-process communication. This is a significant rewrite from the casual multi-threading code of a Sunday programmer.

Miska · June 24, 2018

2 hours ago, yamamoto2002 said:

Last I checked, on Windows one process can handle up to 64 hyper-threads. So, in order to handle 128 hyper-threads, another worker process has to be created, each process creates 64 threads, and the two processes communicate via inter-process communication. This is a significant rewrite from the casual multi-threading code of a Sunday programmer.

I'm not a casual multi-threading Sunday programmer... I also know how to program HPC clusters / supercomputers.

But I still recommend going for Linux if you have anything more than a traditional average PC. It is not about my programming, it is about Microsoft's.

Anyway, this is going off-topic for CUDA & HQPlayer.

yamamoto2002 · June 30, 2018

Thank you for your reply. I found that Linux has much better multi-threading support for a casual Sunday programmer. There are also several cross-platform libraries to overcome this kind of OS-specific quirk. And sorry for the topic drift.

2a3set · July 20, 2018

Wondering how to enable the K20 for CUDA offload correctly in Ubuntu 16.

The hqplayer process is using the correct GPU, but with CUDA offload it hiccups more (using an E5-2680 v2 10-core Xeon).

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
|    1      4660      C   /usr/bin/hqplayer                             95MiB |
+-----------------------------------------------------------------------------+

Whitigir · December 2, 2018

I am going to buy a cheap CUDA-capable GT 1030 with a passive heatsink and hope it can do some of the work here.

Miska · December 4, 2018

On 7/20/2018 at 10:32 PM, 2a3set said:

Wondering how to enable the K20 for CUDA offload correctly in Ubuntu 16.

The hqplayer process is using the correct GPU, but with CUDA offload it hiccups more (using an E5-2680 v2 10-core Xeon).

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:02:00.0 N/A |                  N/A |
| 40%   41C    P8    N/A /  N/A |     95MiB /  2000MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 00000000:04:00.0 Off |                  Off |
| N/A   77C    P0    59W / 235W |    106MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
|    1      4660      C   /usr/bin/hqplayer                             95MiB |
+-----------------------------------------------------------------------------+

Is this with the latest HQPlayer version? If so, your driver is too old (396); you need the latest driver (>= 410) for CUDA 10 support. Does HQPlayer report that the offload is enabled when you start playback?
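A quick way to automate that driver check is to pull the version out of the nvidia-smi header. The Python sketch below assumes the header contains a "Driver Version: X.Y" field as in the output quoted above, and uses the 410 threshold mentioned for CUDA 10; the function names are hypothetical.

```python
import re

MIN_DRIVER_MAJOR = 410  # threshold quoted above for CUDA 10 support


def driver_major(smi_text):
    """Extract the major driver version from nvidia-smi's header, or None."""
    match = re.search(r"Driver Version:\s*(\d+)\.", smi_text)
    return int(match.group(1)) if match else None


def supports_cuda10(smi_text):
    """True if the reported driver major version is at least 410."""
    major = driver_major(smi_text)
    return major is not None and major >= MIN_DRIVER_MAJOR


# Header line taken from the output quoted in this post.
header = "| NVIDIA-SMI 396.37 Driver Version: 396.37 |"
print(driver_major(header), supports_cuda10(header))
```

For the 396.37 driver shown above this reports that the driver is below the threshold.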

Now the GPU utilization is shown as 0% so things are probably working as they should...

Miska · December 4, 2018

5 hours ago, Miska said:

Now the GPU utilization is shown as 0% so things are probably working as they should...

Ehh; should be "not working as they should"...

simonklp · December 6, 2018

On Tuesday, December 04, 2018 at 11:14 PM, Miska said:

Ehh; should be "not working as they should"...

Hi Miska, for HQP Desktop, I understand that there is a message at the bottom left corner telling us that it is enabled during the first few seconds when the music starts playing.

But for HQPE, how do we know that the CUDA offload is working properly? Thanks.

Miska · December 6, 2018

3 hours ago, simonklp said:

But for HQPE, how do we know that the CUDA offload is working properly?

I've now added an indication of this to the front page status table, and also improved the logging for it.

simonklp · December 6, 2018

26 minutes ago, Miska said:

I've now added an indication of this to the front page status table, and also improved the logging for it.

Hi Miska, noted with thanks. Do you mean that the front page of the web configuration includes this status table?
