
Best CPU for HQPlayer


sbenyo


3 hours ago, sbenyo said:

It seems the L3 cache and number of cores are important. It seems there is proof that the 1950X works even without CUDA offload and that the 6950X works with CUDA. It also says that the cheaper 1920X may work, but there is no proof of it.

 

I have the i7-6950X and it doesn't need CUDA...

 

I just checked the Ryzen 7 1700X and it can do DSD256 with the xtr filter, but not DSD512 without stuttering. Maybe with overclocking or an 1800X, but not a 1700X at stock settings.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

  • 2 months later...
On 2/4/2018 at 8:57 PM, sbenyo said:

I also noticed that neither the multicore nor the CUDA offload option should be fully set. Both should be grayed.

 

If the CUDA box is grayed, it means only the convolution engine is offloaded. So if you don't use convolution, CUDA is not used either. Offload status is reported briefly in the main window status bar when playback starts.

 

A grayed Multicore box should automatically find the optimal configuration for any core count, so in most cases it is the best selection.

 

  • 4 weeks later...
23 hours ago, sbenyo said:

I wonder what is the CPU utilization and temperature people see when using XTR filters and DSD512.

Mine averages 30% total CPU utilization (all cores), and the temperature can be 60-70C (using the Core Temp utility).

 

Here are my figures... I've noticed that AMD Ryzens produce more heat than Intel CPUs of the same TDP specification...

 

This is with a Noctua NH-D15 cooler with one fan installed (the second one doesn't fit because of tall memory modules).

 

[Attached image: cpu-temps.png]


After about an hour of playing, the temps settled just below 40C because the fan kicked in and its speed settled to match the heat production. But the machine was still really quiet, without any clear fan noise.

 

This is with a room temperature of 21C. In summer, when indoor temps are higher, the CPU will of course run hotter too.

 

One critical thing to observe is the type and amount of thermal paste between the CPU and the cooler. The layer should be as thin as possible while covering all of the available contact area.


This AMD CoreBoost feature is similar to Intel's TurboBoost. It can boost the clock frequency of some cores when other cores are under lighter load. The amount of boost depends on the load of each core; you get the highest boost when only a single core is loaded. In practice it aims to keep the total CPU package within power limits while allowing a performance boost on some cores. So this feature is not considered overclocking, just a standard feature of the CPU (and it can be disabled if necessary).

 

1 hour ago, sbenyo said:

I also wonder how this works if memory boost is not possible. Can I get a 4.0 GHz clock speed on the CPU with 2400 MHz memory, or does the memory need to support it as well?

 

Memory runs at its own speed, and there are layers of cache between RAM and CPU. So this boost is independent of the memory bus, which runs at a constant speed.

 

5 hours ago, n2it said:

Agree it will create more heat (i.e. the memory will run at a higher voltage), but it shouldn't add (much) to the CPU heat - except a

 

Sure, but of course the RAM modules will run hotter and that adds to the total heat budget of the case (just like everything else). For example, my Xeon workstation manual says that if all 8 memory slots are populated, an extra RAM fan needs to be added to ensure enough ventilation to maintain stability under all conditions and to protect component lifetime. So observing RAM module temperatures is useful. They depend a lot on overall case airflow, because there's usually no dedicated fan for the RAM modules (but one can of course be added).

 

5 hours ago, n2it said:

Given that you can control the heat, faster memory will mean faster memory access (and less CPU wait time). On a Threadripper, which in reality is like running dual CPUs, the faster memory will help with latency when the L3 fills and the core needs to access the memory on either of the 2 far cores. The faster the memory, the faster the access to the far memory.

 

That being said, I don't know what exact constraints are, so you might experiment to see if this actually reduces CPU utilization by reducing the memory wait times after using up the L3 cache.

 

HQPlayer is quite heavy on memory access and memory speed is critical, but OTOH, if things are already fast enough, pumping more speed out of the CPU and RAM will usually just increase heat production. Wait times usually don't heat up the CPU, because it is idling.

 

When things are not quite running fast enough, getting faster RAM and running it at a higher speed may help. But compared to caches RAM is slow, so the amount of CPU cache is still very critical.

 

  • 3 weeks later...
6 hours ago, louisxiawei said:

The 14-core E5-2683 only has a 3.0 GHz max turbo frequency, so I have some doubts about its upsampling performance. If you have solid confirmation of such a setup, you will save the wallets of many people who wish to use the poly-sinc-xtr filter at DSD512.

 

Turbo frequencies don't matter much when you have all cores loaded, because in those cases turbo cannot really be used. Xeons have a smaller difference between max turbo and base frequency. The advantage of the 2683 is its 35 MB cache, 10 MB more than my 6950X.

 

But it's interesting that it works with just two memory modules, because the 2011-3 socket has four memory channels and in this case only half of them are populated, which halves the memory bandwidth to RAM. This is probably covered up by the much larger cache.
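To put a rough number on "halves the memory bandwidth", here is a minimal sketch of the usual theoretical-peak formula (channels x transfer rate x 8 bytes per 64-bit channel). The DDR4-2133 speed is only an assumption for illustration, since the memory speed of that E5-2683 system isn't stated, and real sustained bandwidth is lower than the theoretical peak.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    Each DDR4 channel is 64 bits (8 bytes) wide and moves one bus-width
    word per transfer, so peak = channels * MT/s * 8 bytes.
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


# Assumed memory speed (DDR4-2133); not taken from the system discussed above.
ASSUMED_MT_S = 2133

quad = peak_bandwidth_gb_s(4, ASSUMED_MT_S)  # all four channels populated
dual = peak_bandwidth_gb_s(2, ASSUMED_MT_S)  # only two modules installed

print(f"quad channel: {quad:.1f} GB/s")
print(f"dual channel: {dual:.1f} GB/s")
```

With these assumed values the peak drops from roughly 68 GB/s to roughly 34 GB/s, which is the gap the larger L3 cache would have to hide.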

 

1 hour ago, louisxiawei said:

But the 2683 has only a 2.00 GHz base clock, don't you think that's a bit low for a task like poly-sinc-xtr at DSD512? My 7980 never runs below 3.5 GHz with poly-sinc-xtr at DSD512. Or are you suggesting that the 10 MB of extra cache can compensate for the base clock frequency?

 

If the clock rate is fast enough to run the modulator then it's enough.

 

For the filter, clock frequency doesn't matter if there are more cores and enough memory bandwidth (cache). GPUs are one example, since they usually have clocks below 2 GHz but a lot of parallel cores.
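As a back-of-the-envelope illustration of trading clock for cores, the sketch below multiplies core count by base clock for the two CPUs mentioned in this thread (14 cores at 2.0 GHz for the E5-2683, 10 cores at 3.0 GHz for the i7-6950X). This "aggregate GHz" figure is a crude proxy I'm assuming for illustration only: it ignores IPC, cache size, and memory bandwidth, and it says nothing about whether a single core is fast enough to run the modulator.

```python
def aggregate_core_ghz(cores: int, base_ghz: float) -> float:
    """Crude parallel-throughput proxy: cores multiplied by base clock."""
    return cores * base_ghz


# Core counts and base clocks as quoted in this thread.
cpus = {
    "Xeon E5-2683": (14, 2.0),
    "i7-6950X": (10, 3.0),
}

for name, (cores, base_ghz) in cpus.items():
    total = aggregate_core_ghz(cores, base_ghz)
    print(f"{name}: {cores} cores x {base_ghz} GHz = {total} core-GHz")
```

By this rough measure the two land close to each other (28 vs 30), which fits the point that a lower clock can be offset by more cores for the parallel filter work.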

 

1 hour ago, louisxiawei said:

One thing I still don't understand: I tried to upsample a DXD file (352.8 kHz) to 48x512 with poly-sinc-xtr. The load should be significantly lower than upsampling from a Redbook file, so why does this setting still cause stutter? Which part causes the limitation: the cache of my 7980XE or the speed of the 64 GB of DDR4-3466 RAM?

 

It becomes memory bound, because the filter bank becomes massive when converting between rate families at a high ratio. Cache is always much faster than RAM, so you would need a huge cache.
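The arithmetic behind "between rate families" can be sketched with exact fractions. For the DXD case quoted above (352.8 kHz input), the sketch reduces the output/input ratio to lowest terms L/M. In a textbook polyphase rational resampler the interpolation factor L is also the number of sub-filters (phases); that is generic resampling theory used here as an assumption, not a description of how HQPlayer is actually implemented.

```python
from fractions import Fraction


def resampling_ratio(input_rate_hz: int, base_rate_hz: int, dsd_multiple: int) -> Fraction:
    """Output/input rate ratio in lowest terms (numerator L, denominator M)."""
    return Fraction(base_rate_hz * dsd_multiple, input_rate_hz)


DXD_RATE_HZ = 352_800  # 8 x 44.1 kHz

same_family = resampling_ratio(DXD_RATE_HZ, 44_100, 512)   # DXD -> 44.1k x 512
cross_family = resampling_ratio(DXD_RATE_HZ, 48_000, 512)  # DXD -> 48k x 512

for label, ratio in (("44.1k x 512", same_family), ("48k x 512", cross_family)):
    print(f"{label}: up by {ratio.numerator}, down by {ratio.denominator}")
```

Within the 44.1k family the ratio is a plain 64/1, while the 48k target reduces to 10240/147, so a generic polyphase design would need 160 times as many phases for the cross-family case.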

 

I haven't got a spare 3500€ yet to try out the Titan V, but with its high FP64 performance and HBM RAM it could potentially be good. But you never know without trying... Then one could use something like an i7-8700K, which has a high clock rate but not so many cores, for the rest of the work.

 

  • 3 months later...
  • 1 month later...
  • 2 months later...
5 hours ago, AudioXP said:

Thanks, salaryman, for your quick answer. @Miska, I'm running the latest HQPe (4.6.1). Tried on both Ubuntu 18.04 and Debian 9.5. Results were similar, in that I'm far away from full load on all/any cores.

 

What kind of system load figure does "top" show when trying to do DSD512 with non-2s filters? Are you trying within the same rate family or between rate families? I think the former should work; the latter likely doesn't.
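For anyone who wants that load figure without reading the "top" header by eye, here is a small sketch that reads the same 1/5/15-minute load averages through Python's standard library and normalises them by the logical CPU count. The helper names are my own, and this only works on Unix-like systems such as the Ubuntu and Debian installs mentioned above. Keep in mind that a low per-core average doesn't rule out one saturated thread being the bottleneck.

```python
import os


def load_per_core(load_average: float, logical_cpus: int) -> float:
    """Normalise a load average by the number of logical CPUs."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return load_average / logical_cpus


def current_load_per_core() -> tuple:
    """Return the (1 min, 5 min, 15 min) load averages divided by CPU count.

    os.getloadavg() is only available on Unix-like systems.
    """
    cpus = os.cpu_count() or 1
    return tuple(load_per_core(value, cpus) for value in os.getloadavg())


if __name__ == "__main__":
    one, five, fifteen = current_load_per_core()
    print(f"load per logical CPU: 1 min {one:.2f}, 5 min {five:.2f}, 15 min {fifteen:.2f}")
```

A value near 1.0 means roughly one runnable task per logical CPU on average; comparing that with the per-core view in "top" (press 1) shows whether the work is spread out or stuck on a few cores.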

 


I'm planning to find out sometime soon'ish, as my old Mac Mini is getting pretty old indeed.

 

The whole lineup seems like a good option for running HQPlayer, especially the 6-core models. The biggest question is how noisy the fan will get under load. An Nvidia eGPU could be an interesting option too.

 

In traditional Apple style, the RAM and SSD options are insanely expensive though. I would say 16 GB is the absolute minimum for RAM and 512 GB the minimum for SSD to keep the machine usable for a longer period...

 

2 hours ago, Audio-68 said:

According to Apple Insider earlier today, the 2018 Mac Mini will include user-expandable RAM. The slots are SO-DIMM slots, and are mostly accessible to the user. In conversations with Apple corporate employees, Apple Insider was told that users with a "modicum of skill" can get to the pair of RAM slots.

 

That would be awesome. I've upgraded my Mac Mini and iMac RAM with Kingston's Apple-specific parts that have been working perfectly for many years and cost less than half of Apple's pricing. I was disappointed when they removed that possibility...

 

3 hours ago, Jens_G said:

Miska, what do you think about the following configuration, would it be sufficient for DSD512 poly-sinc-xtr (non -2s)?

 

Intel Xeon E5-2680 v3, 12 x 2.5 GHz, turbo 3.3 GHz

64 GB RAM (8 x 8192 MB, 2133 MHz, DDR4)

Mainboard with Intel X99 (Gigabyte GA-X99-UD4P)

 

It's not clear-cut, at least. The i7-6950X I have can just do it (and can fit on the same motherboard). It has 10 cores with a 3.0 GHz base and 3.5 GHz turbo (and 4 GHz with Turbo Boost Max 3.0). Price should be about the same, given the same MSRP. That E5 is one generation older and has a lower base clock, although it has two more cores.

 

64 GB of RAM is maybe a bit overkill; I have 32 GB and there's plenty of free RAM left. But of course it shouldn't hurt either.

37 minutes ago, Jens_G said:

OK, so it would be worth a try? What do you think? I could get the above-mentioned bundle at an acceptable price. Or would you recommend that I go to a newer platform instead because it would be more future-proof, etc.? BTW: DSD256 poly-sinc-xtr (non -2s) should be possible in any case, shouldn't it?

 

Maybe... DSD256 is about half of the load, so I'd certainly expect it to work.

 

1 hour ago, ddetaey said:

In fact I doubt that there is even a CPU available today that can handle DSD512 poly-sinc-xtr (non -2s) all by itself, without CUDA support (not even the highly ranked Intel Core i9-7980XE @ 2.60GHz, score 27,761).

 

The i7-6950X I have can do it.

 

I think someone reported that it works on the i9-7980XE. I'd like to hear if it works on other 79xxX models, but I'd assume it does based on the 6950X.

 

And I believe someone reported that it works on the AMD Threadripper 1950X. Now there's a bunch of new second-generation Threadrippers and it would be nice to hear reports about those. But the 2950X at least should be OK, since it is a newer equivalent of the 1950X. I would really love to get my hands on a 2990WX and see how it performs.

 

8 hours ago, rando said:

 

@Miska What would prevent adding a benchmark feature to HQPlayer using a built in test file, liability?

 

Different settings generate different load patterns, although they have a lot in common too. So I would need to define a workload that would be useful as a benchmark.

 

I'll think about this, but it likely won't be implemented very soon.

 

31 minutes ago, bibo01 said:

If, as you say, it's not going to be implemented very soon, would it be OK for you to define, for now, the most suitable benchmark for HQP among those I listed before, so people can start to run some tests/comparisons?

 

You could check whether some FP64-heavy benchmark puts CPUs in a similar order to what has been discussed here?

 
