
Building a high performance compute server for HQPlayer



Are you planning to upgrade the power supply or stay with the present one? Once the NVIDIA card is installed, do you think the fan’s noise and the impact of EMI, etc. on the system can be managed?

 

Kudos on taking another way for the HQPlayer server.

Qnap NAS (LPS) >UA ETHER REGEN (BG7TBL Master Clock) > Grimm MU1 > Mola Mola Tambaqui /Meridian 808.3> Wavac EC300B >Tannoy Canterbury SE

 

HP Rig ++ >Woo WES/ > Stax SR-009, Audeze LCD2

Link to comment
3 hours ago, zerung said:

Are you planning to upgrade the power supply or stay with the present one? Once the NVIDIA card is installed, do you think the fan’s noise and the impact of EMI, etc. on the system can be managed?

 

Kudos on taking another way for the HQPlayer server.

 

I have a 950W PSU but powering the RTX 2080 Ti is always a challenge with cables.

 

EMI isn't an issue because the Mellanox NIC is designed to operate within extremely tight tolerances even in the presence of noise. The optical output will not transmit EMI, nor jitter for that matter, and the workstation remains far away from my audio system. I plan to use a Source Photonics 100GBASE-LR4 module, which uses duplex LC connectors and single-mode fiber.

Custom room treatments for headphone users.

Link to comment
6 hours ago, jabbr said:

2. Enable both AVX512 and CUDA coprocessing

 

I would try without CUDA.  At least in my setup, not enabling the GPU (CUDA) reaps rewards across the board.  Removing it again reaps another worthwhile improvement.  As always, YMMV depending on card itself (1080 GTX FE in this case).

Link to comment

CUDA only seems to add a little bit of performance, using an RTX 2080 Ti with 440 driver:

Neither can do ASDM7EC without stuttering - hqp goes to 100% CPU on two cores and only 7% CUDA per nvidia-smi.

 


 

When using AMSDM7 512+fs, the CPU utilization of HQP goes from 284% to 200%, with 14% CUDA.


 

 


@Miska does this seem about right? Thoughts?


Link to comment
On 4/5/2020 at 7:34 PM, Miska said:

Yeah, you get decently high GPU load on RTX when you do for example 48k to 44.1 x 512 with poly-sinc-xtr.

 

Alternatively, if you want to do 8 channels with a regular filter to DSD256 using the EC modulators, you end up with pretty high load on both GPU and CPU.

 

 

Yes, that's the eventual idea. Any consideration of providing for >1 DAC with multichannel? I am sure you are considering physically separate NAA/DACs.

 

I must say the Dell case is very well made, and quiet even with the RTX 2080 Ti running. 

 

I am waiting for my replacement NIC but I suspect that even with RDMA offload, the CPU load will not change appreciably - I think I'm very close to ASDM7EC / 512 but as they say, close only counts for ...

 

Interestingly, the CPU bursts ( $ lscpu ) to 4.5 GHz with AMSDM7 but only to 3.9 GHz with ASDM7EC, so I think there's hope of tweaking something.


Link to comment
32 minutes ago, jabbr said:

Yes that's the eventual idea. Any consideration of providing for >1 DAC with multichannel, I am sure you are considering physically separate NAA/DACs.

 

No, NAA is DAC-side clocked so only one DAC for multichannel. No synchronization for multiple DACs on purpose, because you would have multiple clocks in such case. Very few audiophile DACs support external clock synchronization anyway.

 

With exaSound you can get 8 channels of 384/32 PCM and DSD256. With Merging Hapi/Horus you can get 384 PCM and DSD256 multichannel as well.
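For a sense of scale, the raw payload of those multichannel formats can be worked out directly. This is only a sketch of the arithmetic: it assumes DSD256 means 256 x 44.1 kHz at 1 bit per sample, counts audio data only (no network or protocol overhead), and the function names are made up for illustration.

```python
DSD_BASE_HZ = 44_100  # assumption: DSD rates counted as multiples of 44.1 kHz


def pcm_bits_per_second(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Raw PCM payload in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


def dsd_bits_per_second(channels: int, multiple: int) -> int:
    """Raw DSD payload in bits per second (1 bit per sample per channel)."""
    return channels * DSD_BASE_HZ * multiple


pcm = pcm_bits_per_second(channels=8, sample_rate_hz=384_000, bits_per_sample=32)
dsd = dsd_bits_per_second(channels=8, multiple=256)
print(f"8 ch 384/32 PCM: {pcm / 1e6:.1f} Mbit/s")
print(f"8 ch DSD256:     {dsd / 1e6:.1f} Mbit/s")
```

Both come out a little under 100 Mbit/s of raw audio data.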

 

2 hours ago, jabbr said:

I am waiting for my replacement NIC but I suspect that even with RDMA offload, the CPU load will not change appreciably - I think I'm very close to ASDM7EC / 512 but as they say, close only counts for ...

 

Interestingly, the CPU bursts ( $ lscpu ) to 4.5 GHz with AMSDM7 but only to 3.9 GHz with ASDM7EC, so I think there's hope of tweaking something.

 

You could run null output test and see what kind of processing time you get for a track vs track length.
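One way to read the result of such a test is as a ratio of processing time to track length. The sketch below only shows that arithmetic; the function name and the sample numbers are hypothetical and are not HQPlayer output.

```python
def realtime_factor(processing_seconds: float, track_seconds: float) -> float:
    """Processing time divided by track length; below 1.0 means faster than real time."""
    if track_seconds <= 0:
        raise ValueError("track length must be positive")
    return processing_seconds / track_seconds


# Hypothetical measurement: a 240 s track that took 300 s to process.
factor = realtime_factor(processing_seconds=300.0, track_seconds=240.0)
print(f"real-time factor: {factor:.2f}")
print("keeps up with playback" if factor < 1.0 else "too slow for uninterrupted playback")
```

A factor comfortably below 1.0 leaves headroom; a factor at or above 1.0 means the configuration cannot sustain playback.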

 

So far the highest published boost is the new flagship laptop CPU i9-10980HK at 5.3 GHz, but it is only a single-core boost. For this case you would need at least two cores boosted, and I don't know how much boost that CPU can have with two cores.

 

i9-9900KS can do all-core boost at 5 GHz, but it is not enough for ASDM7EC at DSD512.

 

Will be interesting to see how high clocks there will be for the 10th gen desktop CPUs.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
5 minutes ago, Miska said:

 

You could run null output test and see what kind of processing time you get for a track vs track length.

 

So far the highest published boost is the new flagship laptop CPU i9-10980HK at 5.3 GHz, but it is only a single-core boost. For this case you would need at least two cores boosted, and I don't know how much boost that CPU can have with two cores.

 

i9-9900KS can do all-core boost at 5 GHz, but it is not enough for ASDM7EC at DSD512.

 

Will be interesting to see how high clocks there will be for the 10th gen desktop CPUs.

 

 

True, but the W-2245 has AVX512. It's curious that ASDM7EC is not able to boost compared with AMSDM7 (3.9 vs 4.5 GHz). The W-2245 should be able to multicore boost. I may need to look at each core separately -- htop shows % but not per-core clock rate.


Link to comment
27 minutes ago, jabbr said:

True, but the W-2245 has AVX512. It's curious that ASDM7EC is not able to boost compared with AMSDM7 (3.9 vs 4.5 GHz). The W-2245 should be able to multicore boost. I may need to look at each core separately -- htop shows % but not per-core clock rate.

 

Cores are boosted individually; you can check /proc/cpuinfo to see what clock each core has.

 

But only testing will show how each CPU model performs at such tasks.

 


Link to comment

I created an AWK script to capture the /proc/cpuinfo output.

In both cases the output is 44.1x512.

First with AMSDM7: note that 4 threads are boosting to 4.5 GHz

Processor:  0  Mhz:  2943.395
Processor:  1  Mhz:  1200.102
Processor:  2  Mhz:  2802.146
Processor:  3  Mhz:  1200.199
Processor:  4  Mhz:  4499.999
Processor:  5  Mhz:  1200.046
Processor:  6  Mhz:  1201.438
Processor:  7  Mhz:  4561.842
Processor:  8  Mhz:  2436.413
Processor:  9  Mhz:  1201.864
Processor:  10  Mhz:  2671.769
Processor:  11  Mhz:  1200.799
Processor:  12  Mhz:  4492.542
Processor:  13  Mhz:  1200.343
Processor:  14  Mhz:  1200.020
Processor:  15  Mhz:  4514.522
CPU model:  Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz
1 CPU,  8 physical cores per CPU, total 16 logical CPU units

Now with ASDM7EC: note that these 4 threads stay at 3.9 GHz

Processor:  0  Mhz:  3965.971
Processor:  1  Mhz:  1200.031
Processor:  2  Mhz:  1200.164
Processor:  3  Mhz:  1265.660
Processor:  4  Mhz:  3965.246
Processor:  5  Mhz:  1201.363
Processor:  6  Mhz:  1201.073
Processor:  7  Mhz:  3076.107
Processor:  8  Mhz:  3900.650
Processor:  9  Mhz:  1200.346
Processor:  10  Mhz:  1201.619
Processor:  11  Mhz:  1270.791
Processor:  12  Mhz:  3899.409
Processor:  13  Mhz:  1200.285
Processor:  14  Mhz:  1200.009
Processor:  15  Mhz:  3145.273
CPU model:  Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz
1 CPU,  8 physical cores per CPU, total 16 logical CPU units
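The AWK script itself was not posted. As a rough equivalent, here is a minimal Python sketch that pulls the per-thread clocks out of /proc/cpuinfo. It assumes the x86 Linux layout with `processor` and `cpu MHz` lines; the function name `parse_cpu_mhz` is made up for illustration.

```python
from pathlib import Path


def parse_cpu_mhz(cpuinfo_text: str) -> list[tuple[int, float]]:
    """Extract (processor index, clock in MHz) pairs from /proc/cpuinfo text."""
    result = []
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor blocks
        key = key.strip()
        if key == "processor":
            current = int(value)
        elif key == "cpu MHz" and current is not None:
            result.append((current, float(value)))
    return result


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():  # only present on Linux
        for cpu, mhz in parse_cpu_mhz(path.read_text()):
            print(f"Processor: {cpu:2d}  MHz: {mhz:.3f}")
```

Running it while HQPlayer is playing gives a snapshot like the listings above; the values change from moment to moment, so several samples are more telling than one.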

 


Link to comment
2 hours ago, jabbr said:

I created an AWK script to capture the /proc/cpuinfo output.

In both cases the output is 44.1x512.

First with AMSDM7: note that 4 threads are boosting to 4.5 GHz


Processor:  0  Mhz:  2943.395
Processor:  1  Mhz:  1200.102
Processor:  2  Mhz:  2802.146
Processor:  3  Mhz:  1200.199
Processor:  4  Mhz:  4499.999
Processor:  5  Mhz:  1200.046
Processor:  6  Mhz:  1201.438
Processor:  7  Mhz:  4561.842
Processor:  8  Mhz:  2436.413
Processor:  9  Mhz:  1201.864
Processor:  10  Mhz:  2671.769
Processor:  11  Mhz:  1200.799
Processor:  12  Mhz:  4492.542
Processor:  13  Mhz:  1200.343
Processor:  14  Mhz:  1200.020
Processor:  15  Mhz:  4514.522
CPU model:  Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz
1 CPU,  8 physical cores per CPU, total 16 logical CPU units

Now with ASDM7EC: note that these 4 threads stay at 3.9 GHz


Processor:  0  Mhz:  3965.971
Processor:  1  Mhz:  1200.031
Processor:  2  Mhz:  1200.164
Processor:  3  Mhz:  1265.660
Processor:  4  Mhz:  3965.246
Processor:  5  Mhz:  1201.363
Processor:  6  Mhz:  1201.073
Processor:  7  Mhz:  3076.107
Processor:  8  Mhz:  3900.650
Processor:  9  Mhz:  1200.346
Processor:  10  Mhz:  1201.619
Processor:  11  Mhz:  1270.791
Processor:  12  Mhz:  3899.409
Processor:  13  Mhz:  1200.285
Processor:  14  Mhz:  1200.009
Processor:  15  Mhz:  3145.273
CPU model:  Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz
1 CPU,  8 physical cores per CPU, total 16 logical CPU units

 

 

Hard to say, but it could be AVX512 capping the clocks, because ASDM7EC puts more load on it.

 


Link to comment
6 minutes ago, Miska said:

 

Hard to say, but it could be AVX512 capping the clocks, because ASDM7EC puts more load on it.

 

Yes as we've discussed. Is there a way to send more cycles to CUDA which isn't getting used too much? Perhaps a "prefer-cuda" flag? Or "enable-AVX512". That might just make 44.1x512 work?


Link to comment
1 minute ago, jabbr said:

Yes as we've discussed. Is there a way to send more cycles to CUDA which isn't getting used too much? Perhaps a "prefer-cuda" flag? That might just enable 44.1x512?

 

Modulators cannot be run on the GPU: because of their mathematical structure they would be badly sub-optimal there. You can only run filters and the convolution engine there.

 


Link to comment
8 hours ago, jabbr said:

 

Tricky part is that things depend on CPU model and workload. Newer CPUs likely throttle less than previous generations.

 

But I would conclude that likely the higher AVX-512 usage limits clocks to base frequency in this case.
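The two /proc/cpuinfo snapshots posted earlier in the thread can be compared numerically. The sketch below averages the four fastest threads in each listing and sets that against the 3.9 GHz base clock from the CPU model string. The clock values are copied from those listings; the helper name `mean_of_fastest` is made up, and a single snapshot per modulator is thin evidence, so treat the result as indicative only.

```python
# Per-thread clocks in MHz, copied from the two listings earlier in the thread.
amsdm7 = [2943.395, 1200.102, 2802.146, 1200.199, 4499.999, 1200.046, 1201.438, 4561.842,
          2436.413, 1201.864, 2671.769, 1200.799, 4492.542, 1200.343, 1200.020, 4514.522]
asdm7ec = [3965.971, 1200.031, 1200.164, 1265.660, 3965.246, 1201.363, 1201.073, 3076.107,
           3900.650, 1200.346, 1201.619, 1270.791, 3899.409, 1200.285, 1200.009, 3145.273]

BASE_MHZ = 3900.0  # from "Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz"


def mean_of_fastest(clocks_mhz: list[float], n: int = 4) -> float:
    """Average clock of the n fastest threads in one snapshot."""
    fastest = sorted(clocks_mhz, reverse=True)[:n]
    return sum(fastest) / len(fastest)


for name, clocks in (("AMSDM7", amsdm7), ("ASDM7EC", asdm7ec)):
    mean = mean_of_fastest(clocks)
    print(f"{name:8s} top-4 mean: {mean:8.1f} MHz ({mean / BASE_MHZ:.2f}x base)")
```

In these snapshots the busiest ASDM7EC threads sit within about 1% of base frequency, while the AMSDM7 threads run roughly 16% above it.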

 

On 4/4/2020 at 5:04 PM, jabbr said:

8GB 1x8GB DDR4 2933MHz RDIMM ECC Memory

 

Btw, note that for full memory speed on a quad-channel CPU you need four DIMMs...
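The arithmetic behind that remark is easy to sketch, using theoretical peak numbers only. A DDR4 channel is 64 bits (8 bytes) wide, and the "2933MHz" in the spec line is taken here as 2933 MT/s. The function name is hypothetical, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: int, populated_channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (transfers/s x bytes per transfer x channels)."""
    return transfer_rate_mt_s * 1e6 * bus_bytes * populated_channels / 1e9


for dimms in (1, 2, 4):
    # One DIMM per channel, up to the CPU's four channels.
    print(f"{dimms} DIMM(s): {peak_bandwidth_gb_s(2933, dimms):.1f} GB/s peak")
```

With a single 8 GB DIMM only one of the four channels is populated, so the theoretical peak is a quarter of what four DIMMs would give.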

 


Link to comment
4 hours ago, Miska said:

Btw, note that for full memory speed on a quad-channel CPU you need four DIMMs...


Yep, I’m not paying Dell for RAM, nor for a GPU or NIC ... going to install the NIC, then the RAM, and test after each.


Link to comment

Some more numbers:

looking at the temps - running ASDM7EC-DSD256:

dell_smm-virtual-0
Adapter: Virtual device
fan1:        1000 RPM
fan2:         723 RPM
fan3:         714 RPM

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +65.0°C  (high = +88.0°C, crit = +98.0°C)
Core 0:        +64.0°C  (high = +88.0°C, crit = +98.0°C)
Core 2:        +52.0°C  (high = +88.0°C, crit = +98.0°C)
Core 3:        +51.0°C  (high = +88.0°C, crit = +98.0°C)
Core 5:        +51.0°C  (high = +88.0°C, crit = +98.0°C)
Core 8:        +65.0°C  (high = +88.0°C, crit = +98.0°C)
Core 10:       +49.0°C  (high = +88.0°C, crit = +98.0°C)
Core 11:       +49.0°C  (high = +88.0°C, crit = +98.0°C)
Core 12:       +51.0°C  (high = +88.0°C, crit = +98.0°C)

When I try to run at DSD512, the sound is on for a second, then off for a second, and so on. The CPU utilization doesn't exceed 70%.

(32 GB RAM now)

dell_smm-virtual-0
Adapter: Virtual device
fan1:         991 RPM
fan2:         724 RPM
fan3:         680 RPM

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +44.0°C  (high = +88.0°C, crit = +98.0°C)
Core 0:        +44.0°C  (high = +88.0°C, crit = +98.0°C)
Core 2:        +41.0°C  (high = +88.0°C, crit = +98.0°C)
Core 3:        +42.0°C  (high = +88.0°C, crit = +98.0°C)
Core 5:        +41.0°C  (high = +88.0°C, crit = +98.0°C)
Core 8:        +43.0°C  (high = +88.0°C, crit = +98.0°C)
Core 10:       +39.0°C  (high = +88.0°C, crit = +98.0°C)
Core 11:       +40.0°C  (high = +88.0°C, crit = +98.0°C)
Core 12:       +41.0°C  (high = +88.0°C, crit = +98.0°C)

 


Link to comment
6 hours ago, jabbr said:

When I try to run at DSD512, the sound is on for a second, then off for a second, and so on. The CPU utilization doesn't exceed 70%.

 

When deadlines are systematically missed, the whole process goes into a spring-like motion, which you may know from rush-hour traffic jams, where a queue of cars ends up in such a cycle of acceleration and deceleration.
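The on/off pattern can be reproduced with a toy buffer model. This is purely illustrative and not how HQPlayer or an NAA actually buffers: it assumes processing delivers audio at a fixed fraction of real time, playback starts once a threshold amount is buffered, and playback stops when the buffer runs dry. All names and numbers are made up.

```python
def simulate(speed_permille: int, start_threshold_ms: int, duration_ms: int) -> list[tuple[int, str]]:
    """Return (time_ms, "play" | "stop") transitions for a toy playback buffer.

    speed_permille: audio produced per wall-clock second, in thousandths of real time
                    (1000 = exactly real time, 500 = half speed).
    """
    buffer_us = 0  # buffered audio, in microseconds
    playing = False
    events = []
    for t in range(1, duration_ms + 1):  # one iteration per millisecond of wall time
        buffer_us += speed_permille      # audio produced during this tick
        if playing:
            buffer_us -= 1000            # playback consumes 1 ms of audio per tick
            if buffer_us <= 0:
                buffer_us = 0
                playing = False
                events.append((t, "stop"))
        elif buffer_us >= start_threshold_ms * 1000:
            playing = True
            events.append((t, "play"))
    return events


# Half-speed processing with a 0.5 s start threshold: one second on, one second off.
for time_ms, state in simulate(speed_permille=500, start_threshold_ms=500, duration_ms=4000):
    print(f"{time_ms / 1000:.1f} s: {state}")
```

With processing at exactly real time (`speed_permille=1000`) the same model starts once and never stops, which is the point of the analogy: once deadlines are missed systematically, the stop-and-go cycle follows.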

 


Link to comment
