wealas (posted March 5, 2019):
Just set up a Titan Z with HQPe 4.9 on Ubuntu. The system shows two GPUs and only one is used by HQPe. It shows about 25% GPU load from time to time when doing 2-channel 44.1 PCM to DSD256 using convolution.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.43       Driver Version: 418.43       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 00000000:03:00.0 Off |                  N/A |
| 55%   82C    P2    74W / 189W |    943MiB /  6083MiB |     25%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 00000000:04:00.0 Off |                  N/A |
| 42%   66C    P8    35W / 189W |     11MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       645      C   /usr/bin/hqplayerd                           932MiB |
+-----------------------------------------------------------------------------+
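To check over time whether the second GPU ever picks up load, the same information can be polled in machine-readable form. This is only a minimal sketch, assuming `nvidia-smi` is on the PATH and that the card reports numeric utilization (some cards return a "not supported" string instead, which this parser does not handle). The helper names `parse_gpu_csv` and `query_gpus` are made up for illustration, and the sample string simply copies the values from the table above.

```python
import subprocess

QUERY = "index,name,utilization.gpu,memory.used"

def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, name, util, mem = [field.strip() for field in line.split(",")]
        gpus.append({"index": int(index), "name": name,
                     "util_pct": int(util), "mem_mib": int(mem)})
    return gpus

def query_gpus():
    """Run nvidia-smi once and return one dict per GPU (needs the NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_gpu_csv(out)

# Sample text mirroring the two GPUs in the table (name left truncated as posted).
SAMPLE = "0, GeForce GTX TIT..., 25, 943\n1, GeForce GTX TIT..., 0, 11\n"
busy = [g["index"] for g in parse_gpu_csv(SAMPLE) if g["util_pct"] > 0]
print(busy)
```

Calling `query_gpus()` in a loop during playback would show whether GPU 1 stays idle the whole time.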
acousticmood (posted March 1, 2020):
Old topic, but here goes. I have an i7 quad-core machine (sonicTransporter, Windows version) from Small Green Computer. I'm looking for more processing power to run higher sampling rates (limited right now to 512). Can someone recommend a card from NVIDIA's current lineup? Thanks for your help.
Miska (posted March 1, 2020):
3 hours ago, acousticmood said:
"Old topic, but here goes. I have an i7 quad-core machine (sonicTransporter, Windows version) from Small Green Computer. I'm looking for more processing power to run higher sampling rates (limited right now to 512). Can someone recommend a card from NVIDIA's current lineup?"

Are you looking to go to DSD1024, or what is the goal exactly? Any details on the DAC to be used?

Signalyst - Developer of HQPlayer
Pulse & Fidelity - Software Defined Amplifiers
acousticmood (posted March 2, 2020):
Using an ultraRendu feeding an Oppo 205 USB DAC. This unit will take DSD512 and PCM up to 768. I can only get 512 PCM. So I don't need DSD1024, but I would like to get as much as my DAC will handle. Thanks.
Miska (posted March 2, 2020):
For higher rates the CPU usually becomes the limiting factor, and a GPU cannot solve that problem. The GPU helps with running demanding filters and the convolution engine. But if the CPU falls short on clock cycles per output sample for the modulator...
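A rough way to picture the "clock cycles per output sample" point is to divide a core's clock by the output rate. This is a back-of-the-envelope sketch only: the 3.2 GHz clock is an illustrative assumption, DSDn is taken as n × 44.1 kHz, and the actual per-sample cost of any HQPlayer modulator is not stated in this thread, so the numbers show how the budget shrinks, not whether a given CPU is fast enough.

```python
BASE_RATE_HZ = 44_100  # DSD rates here are multiples of the CD sample rate

def cycles_per_output_sample(cpu_hz, dsd_multiple):
    """Clock cycles one core has available per 1-bit output sample of one channel."""
    return cpu_hz / (BASE_RATE_HZ * dsd_multiple)

CPU_HZ = 3.2e9  # assumed clock speed, for illustration only

for multiple in (256, 512, 1024):
    budget = cycles_per_output_sample(CPU_HZ, multiple)
    print(f"DSD{multiple}: {budget:.1f} cycles per sample")
```

Each doubling of the output rate halves the budget, and adding a GPU does not change this figure.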
CharlieMB (posted March 16, 2020):
Does the amount of memory matter? 3, 6, 8 GB? Perhaps more memory is not better and adds heat for nothing.
Miska (posted March 17, 2020):
3 hours ago, CharlieMB said:
"Does the amount of memory matter? 3, 6, 8 GB? Perhaps more memory is not better and adds heat for nothing."

You mean GPU memory? In typical stereo cases without digital room correction, HQPlayer uses less than 1 GB of GPU RAM. Something else may use GPU memory too. Typically, however, the amount of memory depends on the GPU model; slower ones also have less RAM.
CharlieMB (posted March 17, 2020):
Yes, I meant GPU memory. Thanks. So then I assume 4 GB should be safe, and 2 GB iffy.

Right now I have 96/24 PCM to DSD256 running Poly-sinc-xtr-2s and DSD7-256+fs with only an Nvidia Quadro K620 and an Intel i7-3930K @ 3.2 GHz. Redbook also works at those settings. This is the best sound I've had to date. This card has only 24 GFlops of FP64 performance, and what's nice is it only burns 45 watts. But it won't do these settings with ASDM7-512+.

I'd like to try something around 5x that performance, or ~125 GFlops, which is about 1/3 of a GTX 1080, with less power consumption than a 1080. The purpose is to try more of the one-stage filters or get further toward DSD512 (some work at 512 even now).

From this thread I know that I need to look for graphics cards / drivers supporting CUDA 3.0 or above.

If I install a second card, will HQPlayer allow me to choose which card to offload to? And if not, will it automatically pick the more powerful card?

Thanks again, and in advance.
Miska (posted March 18, 2020):
4 hours ago, CharlieMB said:
"Right now I have 96/24 PCM to DSD256 running Poly-sinc-xtr-2s and DSD7-256+fs with only an Nvidia Quadro K620 and an Intel i7-3930K @ 3.2 GHz. Redbook also works at those settings. This is the best sound I've had to date. This card has only 24 GFlops of FP64 performance, and what's nice is it only burns 45 watts. But it won't do these settings with ASDM7-512+."

If you only changed the modulator and not the output rate, then this is not related to the GPU but to the CPU. So it is good to check with some light filters first that your CPU can deal with the modulators you want.

4 hours ago, CharlieMB said:
"I'd like to try something around 5x that performance, or ~125 GFlops, which is about 1/3 of a GTX 1080, with less power consumption than a 1080. The purpose is to try more of the one-stage filters or get further toward DSD512 (some work at 512 even now)."

Since doubling the output rate roughly doubles the processing load, that should do. But some filters, like single-stage xtr, are heavy. Even more so if you cannot stay within the same rate family but instead need to go from 48/96/192k to 44.1x512 output. This depends on the DAC, so it is good to take this detail into account.

4 hours ago, CharlieMB said:
"If I install a second card, will HQPlayer allow me to choose which card to offload to? And if not, will it automatically pick the more powerful card?"

There is now an environment variable, DSP_CUDA_DEVICE, that allows you to specify the numeric id of the card you want to use. If it is not specified, the CUDA framework picks the card it thinks is best for the purpose.
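One way to set that variable is from a small launcher. This is a minimal sketch under stated assumptions: the daemon path `/usr/bin/hqplayerd` is the one visible in the `nvidia-smi` process list at the top of this thread, the id is assumed to be the zero-based GPU index, and how the service is normally started on a given system is not covered here. `build_env` and `launch` are hypothetical helper names.

```python
import os
import subprocess

def build_env(device_id, base_env=None):
    """Return a copy of the environment with DSP_CUDA_DEVICE set to device_id."""
    env = dict(os.environ if base_env is None else base_env)
    env["DSP_CUDA_DEVICE"] = str(device_id)
    return env

def launch(device_id, binary="/usr/bin/hqplayerd"):
    """Start the player pinned to one GPU; the binary path is an assumption."""
    return subprocess.Popen([binary], env=build_env(device_id))

if __name__ == "__main__":
    # Show what would be set, without launching anything.
    print(build_env(1, base_env={})["DSP_CUDA_DEVICE"])
```

If the daemon runs as a system service, the same variable would have to be set in that service's environment instead; the launcher above only illustrates the idea.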
CharlieMB (posted March 19, 2020):
On 3/17/2020 at 9:10 PM, Miska said:
"If you only changed the modulator and not the output rate, then this is not related to the GPU but to the CPU. So it is good to check with some light filters first that your CPU can deal with the modulators you want."

Thanks. Yes, I only changed the modulator. Come to think of it, other than to simply try it, I don't know why I would like to play with ASDM7-512+ when I'm only running DSD256. Anyway, it does work with lighter modulators, IIRC.

On 3/17/2020 at 9:10 PM, Miska said:
"Since doubling the output rate roughly doubles the processing load, that should do. But some filters, like single-stage xtr, are heavy. Even more so if you cannot stay within the same rate family but instead need to go from 48/96/192k to 44.1x512 output. This depends on the DAC, so it is good to take this detail into account."

Thanks. Noted. BTW, I have the Spring 2 KTE, which does any rate. Still, my intention is to always stay within family to reduce load.

On 3/17/2020 at 9:10 PM, Miska said:
"There is now an environment variable, DSP_CUDA_DEVICE, that allows you to specify the numeric id of the card you want to use."

Ahhh, thanks. Your answer is important because it allows me/us to buy a second card and know it will be used.

Does your 1080 get loud during playback? (I have the same ASUS ROG Strix GTX 1080 for my workstation and it's too long to fit in my playback machine.)

Thanks again.
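The "stay within family" idea comes down to whether the output rate is an integer multiple of the source rate. A small sketch of that arithmetic follows, assuming the two base rates are 44.1 kHz and 48 kHz and that "x512" output in the 48k family means 48 kHz × 512; the function names are illustrative and not part of HQPlayer.

```python
BASE_RATES = (44_100, 48_000)

def rate_family(rate_hz):
    """Return the base rate (44100 or 48000) that rate_hz is an integer multiple of."""
    for base in BASE_RATES:
        if rate_hz % base == 0:
            return base
    raise ValueError(f"{rate_hz} Hz is not a multiple of 44.1k or 48k")

def same_family_output(source_hz, multiple):
    """Output rate in the source's own family, e.g. 96k -> 48k x 512."""
    return rate_family(source_hz) * multiple

for source in (44_100, 96_000, 192_000):
    out = same_family_output(source, 512)
    print(source, "->", out, "ratio", out // source)

# Crossing families gives a non-integer ratio: 44.1k x 512 from a 96k source.
print((44_100 * 512) / 96_000)
```

A 96k source to 48k×512 is a clean 256× step, while 96k to 44.1k×512 is a fractional 235.2× ratio, which is the heavier case described above.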
Miska (posted March 20, 2020):
10 hours ago, CharlieMB said:
"Does your 1080 get loud during playback? (I have the same ASUS ROG Strix GTX 1080 for my workstation and it's too long to fit in my playback machine.)"

No, I have never heard anything notable from it, given that it is in a Fractal Design Define case (which has sound-proofing panels).
CharlieMB (posted March 20, 2020):
Wow. Yesterday I was able to go from Redbook 44.1K / 16 to DSD512 using Closed-form-fast and ASDM7 512+ fs. Again, this was with a Quadro K620 and "auto rate family." Closed-form might have also worked, but -fast sounded better.
CharlieMB (posted March 20, 2020):
To shop for a card, is it better to focus on FP64 performance over number of CUDA cores? The two are usually related, but sometimes the FP64 performance surprises, especially when looking at older cards.
Miska (posted March 20, 2020):
1 hour ago, CharlieMB said:
"To shop for a card, is it better to focus on FP64 performance over number of CUDA cores? The two are usually related, but sometimes the FP64 performance surprises, especially when looking at older cards."

The number of CUDA cores is a pretty bogus figure. And actually the official FP64 performance figures are too, except within the same product family. So the FP64 performance figures are only useful when comparing relative performance of products within the same architecture generation.
CharlieMB (posted March 21, 2020):
On 3/20/2020 at 5:46 PM, Miska said:
"So the FP64 performance figures are only useful when comparing relative performance of products within the same architecture generation."

I understood this to mean that Quadro K (e.g., K6000) can't be compared with confidence against Quadro P. As a corollary it would follow that Quadro and GTX families can't be compared with confidence either.

Thanks.
semente (posted March 21, 2020):
On 1/30/2016 at 3:53 PM, Miska said:
"I made a small write-up here about the GPU installation and initial testing results: Upgrade GPU for more CUDA processing power - Blogs - Computer Audiophile"

@The Computer Audiophile could you please resuscitate this link/thread? Thanks.

"Science draws the wave, poetry fills it with water" Teixeira de Pascoaes
HQPlayer Desktop / Mac mini → Intona 7054 → RME ADI-2 DAC FS (DSD256)
Miska (posted March 21, 2020):
23 minutes ago, CharlieMB said:
"I understood this to mean that Quadro K (e.g., K6000) can't be compared with confidence against Quadro P."

Exactly, the figures compared directly are badly misleading. For example, the newer generations handle multitasking much better than the old ones, and HQPlayer is an example of heavy multitasking. The reason for some of the performance differences is totally unknown; Nvidia doesn't really tell much about what exactly is going on under the CUDA hood or how the GPU really works inside.

23 minutes ago, CharlieMB said:
"As a corollary it would follow that Quadro and GTX families can't be compared with confidence either."

Since Quadro and GTX/RTX are essentially the same, with some fusing and driver differences, you can compare them as long as you stick to the same generation. So you can, for example, compare Turing-generation Quadro/Titan/RTX versions to each other directly. But you cannot compare figures of Turing and Pascal to each other directly.
The Computer Audiophile (posted March 21, 2020):
1 hour ago, semente said:
"@The Computer Audiophile could you please resuscitate this link/thread? Thanks."

Done.

Founder of Audiophile Style | My Audio Systems
jabbr (posted April 5, 2020):
ATM (see my server thread), I'm not seeing worthwhile benefits per dollar from CUDA with HQPlayer, as opposed to the newer CPUs with AVX512.

Custom room treatments for headphone users.
Miska (posted April 5, 2020):
18 hours ago, jabbr said:
"ATM (see my server thread), I'm not seeing worthwhile benefits per dollar from CUDA with HQPlayer, as opposed to the newer CPUs with AVX512."

For some cases I find it very useful, especially for offloading convolution, multichannel, and some filter cases. Or if you just want to free up CPU resources for other tasks.
jabbr (posted April 5, 2020):
Yeah, I spoke too soon. With certain settings I'm getting up to 45% GPU utilization in nvidia-smi.
sdolezalek (posted April 5, 2020):
I found that, in order to get the CUDA benefit: a) you have to have your HQPlayer settings just right (in my case that means Multicore DSP checked, and CUDA Offload and Adaptive Output Rate grayed); and b) your downstream equipment needs to be able to handle the traffic and buffer accordingly (in my case, before I upgraded to an ultraRendu I was getting dropouts because it looked like my computer was flooding the DAC and dropping packets).

I discovered this by looking at the loads on 1) the CPU, 2) the CUDA card, and 3) the network as HQPlayer was playing (in an NAA configuration). What I saw was a periodic overload of the CPU, the GPU, and the network (in part due to my also having both Roon and HQPlayer doing their own thing on the main computer, thereby each loading the CPU and the network).

Synology NAS>i7-6700/32GB/NVIDIA QUADRO P4000 Win10>Qobuz+Tidal>Roon>HQPlayer>DSD512> Fiber Switch>Ultrarendu (NAA)>Holo Audio May KTE DAC> Bryston SP3 pre>Levinson No. 432 amps>Magnepan (MG20.1x2, CCR and MMC2x6)
Snoozer (posted June 11, 2020):
Miska, could you please give me a hand here?

A little bit of background: I have a license for your embedded solution. My use case is upsampling everything to DSD256 with whichever filters my machine is capable of at the moment of playing music. The CPU is an i9-9900K. I currently use an Audiolinux headless solution, just for the sake of being able to load the latest CUDA drivers and to use a GPU under Linux.

On top of that, I have in mind to use HQPlayer as my crossover solution with 4 ways (8 channels) and a multichannel OKTO DAC. A FIR filter for every channel will be in place. CUDA offload is the idea.

Will an RTX 2070 be enough to run the crossover filters? As I understand it from this very thread and others, the GPU will benefit crossover filtering and room correction (this last one I have in mind to use soon as well).

As a side topic: I have some original DSD material. How would HQPlayer apply crossover filters to that? I can imagine converting DSD to PCM, applying filters, and then going back to DSD. Is that how it would work?

Cheers, and thanks for your help.
Miska (posted June 11, 2020):
3 hours ago, Snoozer said:
"I currently use an Audiolinux headless solution, just for the sake of being able to load the latest CUDA drivers and to use a GPU under Linux."

Headless Ubuntu Server is the supported way to do that...

3 hours ago, Snoozer said:
"On top of that, I have in mind to use HQPlayer as my crossover solution with 4 ways (8 channels) and a multichannel OKTO DAC. A FIR filter for every channel will be in place. CUDA offload is the idea."

Note that with such a DAC you can (maybe) do max DSD64 with 8 channels. If you want 8 channels of DSD256, you need something like exaSound (doesn't work directly on normal Linux) or Merging NADAC/Hapi/Horus.

3 hours ago, Snoozer said:
"As a side topic: I have some original DSD material. How would HQPlayer apply crossover filters to that? I can imagine converting DSD to PCM, applying filters, and then going back to DSD. Is that how it would work?"

If the source is DSD64 and the output is DSD256, convolution filters are processed at the DSD64 rate and then the result is upsampled to DSD256. Convolution is always at the source rate, except for the DSD -> PCM playback case. So if the source is DSD256 and the output is DSD64, convolution is performed at the DSD256 rate.
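The rule in that last answer can be written down as a tiny lookup. This is only a sketch of the rule as stated in the post, not of HQPlayer's internals: `convolution_rate` is a hypothetical name, DSDn is taken as n × 44.1 kHz, and the DSD-to-PCM case returns `None` because the post names it as an exception without giving the rate.

```python
def convolution_rate(source_format, source_hz, output_format):
    """Rate at which convolution runs, following the rule quoted above.

    Returns None for a DSD source with PCM output, since the post only says
    that case is an exception and does not state the rate used.
    """
    if source_format == "DSD" and output_format == "PCM":
        return None
    return source_hz

DSD64 = 44_100 * 64
DSD256 = 44_100 * 256

print(convolution_rate("DSD", DSD64, "DSD"))   # DSD64 source, any DSD output
print(convolution_rate("DSD", DSD256, "DSD"))  # DSD256 source downsampled to DSD64
print(convolution_rate("DSD", DSD64, "PCM"))   # exception case, rate not given
```

For crossover filters this means the filter coefficients would need to be valid at the source rate rather than the final output rate.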
Kal Rubinson (posted June 11, 2020):
1 hour ago, Miska said:
"Note that with such a DAC you can (maybe) do max DSD64 with 8 channels."

The OKTO DAC8 will do DSD128.

Kal Rubinson
Senior Contributing Editor, Stereophile