
Best CPU for hqplayer


sbenyo


44 minutes ago, Bertel said:

@Miska this is interesting - I have a spare XFX GeForce 6800 XTreme with 128MB DDR3 and PCIe - my CPU is a 12th Gen i9-12900K at 3.19 GHz as well - is this combination any good for CUDA / GPU offloading? I mean, does it bring the full 6800XT benefit, or is it more of a "light" version that doesn't make much sense?

 

Please note 6800XT as in AMD Radeon RX 6800 XT.

 

GeForce is Nvidia...

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
  • 2 weeks later...
On 7/21/2022 at 1:36 AM, IgorSki said:

OMG!!! Is this the first report of sustainable 192/24 @ 7ECv2@1024 ??
 

What was your filter choice? Although I remember you have a GPU, so it helps with the filter load.

 

Is playback constantly stable, or are there rare stutters, like once per 5-10 min?

 

Hey @IgorSki …. Sorry about this, just realised I didn’t reply!!

 

I use the following….

image.thumb.png.d8d474e4db57cff00775025cceb860e9.png

No issues with constant playback, though I don’t use it as much tbh, as temps run around 85+ °C and power consumption goes up close to 400 W.

 

 

Link to comment
On 7/17/2022 at 4:25 AM, Miska said:

OTOH, the benefit of having just two cores under higher load is that those two cores can reach higher turbo boost clocks. The more even the load distribution, the less boost you can gain (apart from the 9900KS and a few other special models).


@Miska I’ve managed to tweak my cooling a bit more…. Still trying to get to 192@7ECv2@1024 (48k family). No luck with all P-cores @ 55 either. As you indicated, I also managed to get P-cores 1 & 5 to 56, and still no luck!!!

 

I tried 57 on the two cores, but so far no luck with these two cores…. The questions I have are:
- Is it consistently cores 1 & 5 that are loaded for others?
- Is there any way to pick which cores get the load, i.e. selecting the “golden cores” instead?
- Will tweaking RAM have any impact?

Link to comment
2 hours ago, IgorSki said:

1st - no 1024x48 (yet) - a bit problematic with "integer" filters. But... to me, this is only a theoretical issue, as in practice my NO GPU set-up cannot give me even DSD512 with the most interesting Sinc- filters (Sinc-Mx, Sinc-Ll)! So - not an issue.

 

This is where GPU offload can help a lot, since those filters are the most "GPU friendly".

 

poly-sinc filters are relatively heavier on a GPU.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
9 hours ago, Miska said:

 

I think it depends more on Microsoft. AMD already performs very well on Linux. And Windows has certainly got much better.

 

I confess my complete ignorance on this subject !

Can you elaborate a little more about it, please?
The latest generations of Intel CPUs require Windows 11 with "Thread Director" to know how to deal with P-cores and E-cores, but this is not the case with AMD, where the cores are all the same type.

Thank you.

Link to comment
3 hours ago, hpsxrb said:

I confess my complete ignorance on this subject !

Can you elaborate a little more about

I believe what Miska is saying is that the ability to use the capabilities of any hardware (not only the CPU) depends on the OS (Windows, Linux, or what have you). The OS orchestrates CPU usage: core scheduling, task distribution and core affinity, maximum core usage, and the like.

 

Traditionally Microsoft focused only on certain hardware. When the first AMD CPUs were released in the 90s, along with the Cyrix ones, Microsoft support was lacking; they (logically) considered Intel the only CPU worth supporting. It wasn't only CPUs: for many years their support for Intel network adapters was top priority, while others, Broadcom for example, were supported but with buggy code, fewer features, and in some cases lower performance.

 

These days they have gotten better at supporting technologies evenly (overall I personally think they are in decline, but that's my own opinion). Linux, on the other hand, was always fairer in this respect.

 

Link to comment
14 hours ago, hpsxrb said:

I confess my complete ignorance on this subject !

Can you elaborate a little more about it, please?
The latest generations of Intel CPUs require Windows 11 with "Thread Director" to know how to deal with P-cores and E-cores, but this is not the case with AMD, where the cores are all the same type.

 

With AMD, cores are not all grouped the same way as on Intel, but instead in CCX groups. For proper performance, the OS needs to understand the core and core-cache relationships. The same applies if you spread the load over multiple sockets, where parts of RAM sit behind different physical sockets and processor groups (NUMA).

 

HQPlayer also has an understanding of these structures to manage workload distribution.
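
For anyone curious how these core and cache relationships look on their own Linux box, here is a minimal Python sketch. It reads the kernel's sysfs cache topology and groups logical CPUs by the L3 cache they share, which on AMD roughly corresponds to CCX groups. The function names `parse_cpu_list` and `l3_groups` are made up for illustration, and this is not how HQPlayer itself inspects the topology; it only shows the kind of information an OS or application has available.

```python
from collections import defaultdict
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def l3_groups(sysfs_root="/sys/devices/system/cpu"):
    """Map each distinct L3 sharing set to the logical CPUs that report it."""
    groups = defaultdict(list)
    for cpu_dir in Path(sysfs_root).glob("cpu[0-9]*"):
        for index_dir in (cpu_dir / "cache").glob("index*"):
            level_file = index_dir / "level"
            shared_file = index_dir / "shared_cpu_list"
            if not (level_file.is_file() and shared_file.is_file()):
                continue
            # Only level-3 caches are of interest here.
            if level_file.read_text().strip() != "3":
                continue
            key = tuple(parse_cpu_list(shared_file.read_text()))
            groups[key].append(int(cpu_dir.name[3:]))
    return {key: sorted(cpus) for key, cpus in groups.items()}
```

Calling `l3_groups()` on a CPU with a single shared L3 should give one group containing every logical CPU; a multi-CCX part should give several groups.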

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
On 9/2/2022 at 8:07 PM, Miska said:

 

This is where GPU offload can help a lot. Since those filters are the most "GPU friendly".

 

poly-sinc filters are relatively heavier on a GPU.

 

And this is where the biggest dilemma for this build comes in.

 

Strategy 1. NO GPU. Get everything possible from a CPU-only build.

  • Pros: Cost efficient build / Energy and Cost efficient run
  • Cons: No headroom for convolution, matrix / No access to some filters at high rates (Sinc-Mx, Sinc-Ll or Nx: gauss-xla or ext3 @ DSD 1024)

 

Strategy 2. YES GPU - A fairly widely used and tested setup for DSD512; I'd say a winner in this domain. But IMHO a bit of an unknown for DSD1024 (unless I missed something, I have only read about DSD1024 builds and 7ECv2 runs with CUDA from @El Guapo and @MJ1409), and I'm really wondering how far a GPU could push my builds. On one hand, it will most likely lag behind the CPU (so what?), but on the other hand, in theory it should definitely take load off heavies like XLA or EXT3 for Nx: @ SDM 1024, which could be very interesting. Looking at earlier @El Guapo tests with CUDA for 7ECv2@1024, it looks like most "Sinc-xx" filters cannot do 1024 anyway. And a GPU seems to increase power consumption noticeably.

 

In this regard, I have been reading this post from @Cornell77 with huge interest. Judging GPUs by "FP64 performance", the RTX A... series seems to kick! And looking at the prices around my place: a new RTX A4500 is less than $1500 and a new RTX A4000 is less than $1000! Really, really tempting. But is it needed @ SDM 1024? Oh, my [scratching head]

 

 

 

 

22 hours ago, luisma said:

Thank you @IgorSki for the post. I went the same route, except with my own loop, 1x 360 and 1x 280 (45mm), so yeah, not under $1000. I'm currently in the OC process; your notes could be very helpful.

 

Working around the "cost KPI" was another fun part of the project; it really felt like hunting :) I was in no particular rush to get the parts. I think it took me 1-1.5 months for all of them, but in the meantime I benefited hugely from a) discounts and b) the secondary market (like getting my tower for $35, etc...)

 

I still keep a watch list at one of the leading and very reasonably priced suppliers here. An all-new build would not break $1500. And I have put the "KS" in the list; with the "K" it would be almost $100 cheaper. Below is the picture in Swiss francs (CHF), but it is almost 1:1 with USD today.

 

image.thumb.png.ffc877ac16b25fea8ebf4f6c2783588e.png

 

 

 

 

Link to comment
2 hours ago, Miska said:

 

For the sinc-xx filters other than sinc-M and maybe sinc-S at DSD1024, the amount of available GPU RAM can become the limiting factor. For DSD512 with sinc-L, you need at least 12 GB. Thus for DSD1024 you need at least 24 GB. So an A5000 or A6000 could do. The A6000 can likely do a lot, but it is very expensive as well.
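
The VRAM rule of thumb quoted above can be turned into a quick sanity check. This is only a sketch: it assumes the requirement scales linearly with the DSD rate, which is an extrapolation from the two quoted data points (12 GB at DSD512, 24 GB at DSD1024 for sinc-L). The VRAM sizes are the published figures for the cards named in this thread, and the function names are made up. Having enough VRAM says nothing about whether the GPU is fast enough.

```python
# Data point quoted in the thread for sinc-L: at least 12 GB at DSD512.
BASELINE_RATE = 512
BASELINE_VRAM_GB = 12

# Published VRAM sizes of the cards mentioned in the thread.
CARD_VRAM_GB = {
    "RTX A4000": 16,
    "RTX A4500": 20,
    "RTX A5000": 24,
    "RTX A6000": 48,
    "RTX 3090 Ti": 24,
}


def min_vram_gb(dsd_rate):
    """Estimate the minimum VRAM for sinc-L, assuming linear scaling with rate."""
    return BASELINE_VRAM_GB * dsd_rate / BASELINE_RATE


def cards_with_enough_vram(dsd_rate):
    """List the cards whose VRAM meets the estimated minimum."""
    needed = min_vram_gb(dsd_rate)
    return [name for name, size in CARD_VRAM_GB.items() if size >= needed]


if __name__ == "__main__":
    for rate in (512, 1024):
        print(f"DSD{rate}: >= {min_vram_gb(rate):.0f} GB ->",
              ", ".join(cards_with_enough_vram(rate)))
```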

 

Jussi, does the new AMD CPU have the ability to support CUDA offload? The new 7950X looks better than my 12900KS, but I am not sure if CUDA works the same on an AMD CPU.

Link to comment
35 minutes ago, pis99 said:

Jussi, does the new AMD CPU have the ability to support CUDA offload? The new 7950X looks better than my 12900KS, but I am not sure if CUDA works the same on an AMD CPU.

 

CUDA is Nvidia's thing and works on Nvidia GPUs (graphics cards). CPU manufacturer is unrelated, and CUDA works the same on AMD and Intel CPUs.

 

Note that CUDA is not supported on AMD GPUs. AMD CPUs are fine though.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment

Now that the price of the 3090 Ti is at 1100 USD (Best Buy USA), I am tempted to buy two of them and get a total of 48 GB VRAM. I priced an RTX A6000 at 4600 USD, ouch!

My question: using two 3090 Ti cards in the upcoming AMD motherboards with two PCIe 5 slots is doable, but will it behave as a 48 GB equivalent (A6000) and be able to push the performance envelope?

Please advise. Thanks

Link to comment
6 hours ago, Mantheunknown said:

Now that the price of the 3090 Ti is at 1100 USD (Best Buy USA), I am tempted to buy two of them and get a total of 48 GB VRAM. I priced an RTX A6000 at 4600 USD, ouch!

My question: using two 3090 Ti cards in the upcoming AMD motherboards with two PCIe 5 slots is doable, but will it behave as a 48 GB equivalent (A6000) and be able to push the performance envelope?

 

It should be technically doable to support multiple GPUs, but at the moment I don't have such hardware to test with, and thus it is not implemented.

 

Signalyst - Developer of HQPlayer

Pulse & Fidelity - Software Defined Amplifiers

Link to comment
On 9/2/2022 at 10:58 PM, IgorSki said:

Stable ASDM7ECv2 @ 1024 

2ch playback of 44.1/16 through 192/24 and DSD sources

 

And with the above post from Jussi, we are clearly and surely stepping into the DSD 1024 era of fat EC modulators. Who could have thought of this only a year ago! When I started my 12900k project a couple of months ago, one of the main ideas was to squeeze as much HQP power as possible from a CPU-only build (no GPU). Another element was to try to keep the total cost of the build below $1000. The results were amazing, with the 5EC/7EC modulators not really breaking a sweat at DSD 1024x48. But at the same time it felt that 5ECv2/7ECv2 @ 1024 was not really far off either...

 

Today I'm happy to share with you the continuation of my research on the DSD 1024 topic.

 

Yes, it is possible to have ASDM7ECv2 @ 1024 !

 

Since my last posts I have, however, made one very important tweak: replacing the cooling system. Here is my present setup:

  • CPU: i9-12900k
  • MB: MSI Pro Z690-A (LGA 1700, Intel Z690 (DDR5), ATX)
  • COOLING: Arctic Liquid Freezer II 360
  • MEM: Kingston DDR5-RAM FURY Beast 6000 MHz 2x8 GB
  • SSD: Samsung SSD M2 250G
  • POWER: EVGA SuperNOVA 850 G3 80 Plus Gold 850W 
  • CASE: Fractal Define R5  

A few words on cooling. The Noctua NH-D15 did everything well. Well... almost. This very potent air cooler will surely keep a 12th-generation CPU running at DSD512 with a very reasonable amount of noise. However, the push to DSD 1024 brings the Noctua to the edge of its capacity.

The choice of the "Arctic Liquid Freezer II 360" AIO came with the help of a fairly scientific review from GamingNexus. Following their reviews, I was actually choosing between the "EK-AIO Basic 360" and the "Arctic Liquid Freezer II 360". The former had a bit of a waiting time to get to where I am, so I ended up with the latter. Make sure you have the "LGA1700 kit" handy (it may need to be purchased separately). Another thing to consider is the AIO size: before buying, I researched whether it would fit in my "tower". It does, obviously, but with small sacrifices: some of the internal brackets and the cable manager had to be removed. Easy, but worth considering.

From the price perspective, the Freezer costs the same as the Noctua, with an absolutely insane difference in performance. Sadly, the Noctua will have to go... And finally, as hinted by the Freezer manual and @El Guapo, it is better to adjust the fan curve in the BIOS settings. I have set 100% fan speed starting at 60 °C.

 

Software (very simple):

  • Ubuntu Server 22.04 LTS
  • libgmpris_2.2.1-10_amd64.deb
  • linux-headers-5.15.39-jl+_5.15.39-jl+-1_amd64.deb
  • linux-image-5.15.39-jl+_5.15.39-jl+-1_amd64.deb
  • hqplayerd_4.32.4-141avx2_amd64.deb (very important: set multicore="1" in the HQP xml config file; with the default "auto" it was stuttering on my system)
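
For those who prefer to script the multicore change rather than edit the file by hand, here is a hedged Python sketch. The post does not say which element carries the `multicore` attribute, so the helper (a made-up name, `set_multicore`) simply updates every element that already has one. Where the config file lives is also not stated here, so check your own install and keep a backup: `ElementTree` does not preserve comments or the XML declaration when writing back.

```python
import xml.etree.ElementTree as ET


def set_multicore(xml_text, value="1"):
    """Set every existing multicore="..." attribute to value.

    Returns the updated XML text and the number of attributes changed.
    Elements without a multicore attribute are left untouched.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for element in root.iter():
        if "multicore" in element.attrib:
            element.set("multicore", value)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Typical use would be to read the config file into a string, call `set_multicore(text)`, check that the returned count is what you expect, write the result back, and restart hqplayerd.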

For system monitoring I use a couple of well-known utilities:

  • htop - visual CPU load by threads, frequencies, temps
  • i7z - CPU load by cores, frequencies, temps, voltages
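
If you want the per-core clocks in a script or log instead of an interactive tool, the Linux kernel exposes them through sysfs. The sketch below is not a replacement for htop or i7z (it reads neither temperatures nor voltages) and the function name is made up; it assumes a cpufreq driver is loaded so that `scaling_cur_freq` exists.

```python
from pathlib import Path


def core_frequencies_mhz(sysfs_root="/sys/devices/system/cpu"):
    """Return {logical_cpu: current frequency in MHz} from cpufreq sysfs files."""
    freqs = {}
    for cpu_dir in Path(sysfs_root).glob("cpu[0-9]*"):
        freq_file = cpu_dir / "cpufreq" / "scaling_cur_freq"
        if freq_file.is_file():
            # The kernel reports the value in kHz.
            freqs[int(cpu_dir.name[3:])] = int(freq_file.read_text()) / 1000
    return dict(sorted(freqs.items()))
```

Polling `core_frequencies_mhz()` once a second during playback gives a rough picture of which cores actually hold their overclocked ratio.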


ASDM7ECv2 @ 1024

 

For the i9-12900k, ASDM7ECv2 @ 1024 comes at a hefty overclocking price. Which really raises the question: can this be achieved with any i9-12900k, or...? Contrary to 5EC/7EC@1024x48, which came with "easy overclocking" where most of the parameters were left at default (see the posts above), 7ECv2 required some tweaking. Well, what was that?

 

  1. Fastest Memory profile, in my case XMP-1 to have full DDR5-6000 potential
  2. P-Core ratios. The most stable build is: core [0]: 55x / core [1]: 55x / cores [2-7]: 53x
  3. E-Core ratio: 41x for all
  4. Ring ratio: 44x 
  5. Voltages (oh, my....): this is where most of my tweaking time was really spent. My overall observation is that 7ECv2@1024 can be achieved within a core voltage range of 1.40-1.45 V. I got good results with voltages around 1.41-1.43 V; anything above quickly brings a heat challenge. The most stable voltage setup came from tweaking the VF curve: a negative offset of -0.036 V at the 48x point and a positive offset of +0.010 V at the 55x point. Another option is to choose the adaptive voltage setting with 1.45 V (for starters), but then voltages would jump up to 1.5 V at times, with temperatures getting into the low and mid 90s °C. The Freezer can handle it, but still.
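
For readers less used to BIOS ratios: a ratio is a multiplier applied to the base clock (BCLK). The small sketch below converts the ratios listed above into clock speeds, assuming the stock 100 MHz BCLK (the post does not mention changing it). The names are illustrative only.

```python
BCLK_MHZ = 100.0  # assumption: stock base clock, not stated in the post

# Ratios from the list above.
RATIOS = {
    "P-core 0": 55,
    "P-core 1": 55,
    "P-cores 2-7": 53,
    "E-cores": 41,
    "Ring": 44,
}


def ratio_to_ghz(ratio, bclk_mhz=BCLK_MHZ):
    """Convert a multiplier ratio to a clock speed in GHz."""
    return ratio * bclk_mhz / 1000.0


if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        print(f"{name}: {ratio}x -> {ratio_to_ghz(ratio):.1f} GHz")
```

So the two 55x cores run at 5.5 GHz, the remaining P-cores at 5.3 GHz, the E-cores at 4.1 GHz and the ring at 4.4 GHz.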

This was the baseline; the highlights of the advanced settings are the following four:

  1. E-cores: 4 active 
  2. EIST (Enhanced Intel Speedstep Technology) - DISABLED
  3. TVB (Thermal Velocity Boost) settings - DISABLED
  4. Intel Turbo Boost - ENABLED

So, what can all this mumbo-jumbo bring to the table? Well, I have been getting hours and hours of 7ECv2@1024 playback with mixed sources, from 44.1/16 up to 192/24 and DSD64/128, with my favorite filters:

a) 1x gauss-xla                         Nx: hires-lp/mp

b) 1x gauss-xla / short-mp      Nx: short-mp

c) FIR2 and Xfi for DSD content (just WOW !!!)

 

ext3 stutters at Nx. And I seem to pick gauss-xla for 1x content more and more, where I used to love short-mp. However! With Prog, short-mp shines even more at ASDM7ECv2 @ 1024.

 

Here's a screen of the i7z utility. What you can see is that Core 1[0] and Core 5[4] have the most load. But it does not actually seem to matter which two cores you push to the 55x ratio in the BIOS. In my experience it needs to be two neighbouring cores for a stable setup.

 

image.thumb.png.49539e0a767ed0864b36be00127fe101.png

 

 

And here's a screen from htop, again from a very long-running session, listening to lounge jazz since morning. The main highlight is that the Freezer keeps the temperatures phenomenally under control, I repeat - phenomenally !!!

 

image.thumb.png.79e8dd5c1bbe30b234ecd24e373affa8.png


These are my HQP settings:

image.thumb.png.fa855198ef3bfda246bde1967fb64d57.png

 

Final thoughts:

 

1st - no 1024x48 (yet) - a bit problematic with "integer" filters. But... to me, this is only a theoretical issue, as in practice my NO GPU set-up cannot give me even DSD512 with the most interesting Sinc- filters (Sinc-Mx, Sinc-Ll)! So - not an issue.

 

2nd - HQP gives everything in this upsampling rodeo, and seemingly a system with just a CPU cannot manage additional DSP. I'm not yet using convolution or matrix for my main listening, but I was using speaker position correction to offset for my listening position, which is slightly off center. With this speaker option "On", HQP stutters. Is this a bug or a feature? I don't know. I found a nice workaround, however: ROON speaker positioning is a bit perfect feature, so I'm doing this part in ROON - corrected - no stutters.

 

3rd - the overclocked build is stable enough but not infinitely stable. I once celebrated a 24h+ endurance run with ROON radio play. But flipping the system on and off (when not in use) eventually resulted in a significant system crash that even required a CMOS reset (for no reason visible to me).

 

4th - actually not too bad - system power consumption does not break 250 W!

 

Conclusion:

 

Yes, it is possible to have ASDM7ECv2 @ 1024 ! You know how complex it is to describe subjective impressions, but I would like to mention a few things. On my system, increasing the output DSD rate from 256 to 512 to 1024 comes with a significant expansion of the soundstage. This is less noticeable when jumping from 256 to 512 (an 11 MHz jump), but it is hugely noticeable stepping from 512 to 1024 (a 22.5 MHz jump). At 1024 the soundstage fills the whole room, wall to wall. While my DAC, the HoloAudio MAY, already provides good channel separation, the imaging inside this 1024 soundstage also becomes amazing, often creating the effect of live presence at the show. A good visual example here is David Bowie's performance of "Wild Is the Wind" from the Station to Station album. If you remember how the camera floats around the musicians in the music video, you get the acoustic impression of being immersed in the performance, sitting right next to the stage. And there are many such examples. 5ECv2 sounds equally impressive; however, I find it a tiny bit "thinner" than the 7ECv2 presentation.
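
The MHz jumps mentioned above follow directly from how DSD rates are named: DSDn is n times the base sample rate (44.1 kHz, or 48 kHz for the "x48" family). A small sketch of the arithmetic, with a made-up function name:

```python
def dsd_rate_mhz(multiplier, base_hz=44_100):
    """Return the DSD bit rate in MHz for a given multiplier and base rate."""
    return multiplier * base_hz / 1e6


if __name__ == "__main__":
    for low, high in ((256, 512), (512, 1024)):
        jump = dsd_rate_mhz(high) - dsd_rate_mhz(low)
        print(f"DSD{low} -> DSD{high}: "
              f"{dsd_rate_mhz(low):.4f} -> {dsd_rate_mhz(high):.4f} MHz "
              f"(jump of {jump:.4f} MHz)")
    # The 48k family runs slightly faster at the same multiplier.
    print(f"DSD1024x48: {dsd_rate_mhz(1024, base_hz=48_000):.3f} MHz")
```

With the 44.1 kHz base, DSD256 is 11.2896 MHz, DSD512 is 22.5792 MHz and DSD1024 is 45.1584 MHz, so the two steps are roughly 11.3 MHz and 22.6 MHz. DSD1024x48 comes out at 49.152 MHz.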

 

Coming Next...

This post concludes my 12900K / NO GPU runs. I happen to have a 12900KS chip...

Absolutely inspiring post. I am now pursuing such a build as a result of your impressive description. One comment I would like to make concerns delidding and direct-die cooling of the 12900KS chip: apparently significant cooling benefits can be achieved with such a mod, which could result in further performance gains.

https://www.ekwb.com/shop/ek-quantum-velocity2-d-rgb-1700-nickel-plexi

Link to comment
