Poster | Thread | kolla
|  |
Re: Amiga SIMD unit Posted on 17-Oct-2022 18:43:51
| | [ #221 ] |
| |
 |
Elite Member  |
Joined: 21-Aug-2003 Posts: 2421
From: Trondheim, Norway | | |
|
| @Hammer
What unit do you use for keeping up with time and such? It is either broken or extremely slow… _________________ B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC |
| Status: Offline |
| | cdimauro
|  |
Re: Amiga SIMD unit Posted on 18-Oct-2022 5:56:32
| | [ #222 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 3097
From: Germany | | |
|
| @Hammer
Quote:
Do you understand that in the real world the integer SIMD instructions are executed as well on FP-intensive code? So this extra port in Haswell is used.
That's also the reason why Haswell still had much better performance than Zen1 on such workloads: Ryzen: Strictly technical

@Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
In fact, it's talking about 128-bit (AVX) instructions, and not about a non-existent AVX-128.
From the beginning of the same document: INTEL® ADVANCED VECTOR EXTENSIONS ARCHITECTURE OVERVIEW Intel AVX has many similarities to the SSE and double-precision floating-point portions of SSE2. However, Intel AVX introduces the following architectural enhancements: Support for 256-bit wide vectors and SIMD register set. [...] Intel AVX introduces support for 256-bit wide SIMD registers [...] Efficient encoding of instruction syntax operating on 128-bit and 256-bit register sets. And here we're talking about AVX, so NOT even AVX2.
This clearly shows that AVX introduced support for BOTH 128-bit and 256-bit instructions, on a 256-bit vector register file.
You can continue to spend time searching around, but it'll always be a waste of time, because you cannot change the reality: AVX is a 256-bit SIMD extension.
|
FACT: AVX-128 exists |
FACT: AVX-128 does NOT exist, as I've already proved several times here and especially in this post: https://amigaworld.net/modules/newbb/viewtopic.php?mode=viewtopic&topic_id=43882&forum=17&start=200&viewmode=flat&order=0#839444 Quote:
as a hardware implementation e.g. AMD Jaguar and it's the lowest common denominator for Xbox One and PS4 game consoles. |
This has NOTHING to do with AVX, AVX2 and AVX-512 which are precise ISA SIMD extensions.
You can implement such extensions as you want: it doesn't matter at all and can say NOTHING about the ISA design.
Do you understand the difference between an ISA = Instruction Set Architecture and one of its possible implementations (microarchitectures)?
Or, let me rephrase: is there any chance that you will learn those simple definitions? AND, possibly, NOT mix them up? Quote:
Jaguar's AVX-256 instruction set |
Which doesn't exist: see above. The instruction sets are ONLY AVX, AVX2, and AVX-512. Full stop.
Plus, AVX, so the FIRST one, was already 256-bit. Quote:
support is for forward compatibility with Zen 2-based game consoles with actual hardware AVX-256 implementation. Zen 2 has hardware AVX2-256 implementation. |
MUHAHAHAH Oh, yes!!!
AMD had a crystal ball and already knew in advance that Zen2, released SIX years after Jaguar, would have a 256-bit SIMD implementation.
Sure, I believe you!  Quote:
The only correct thing that you reported. Good! Quote:
Intel Gracemont's AVX2-256 instruction set |
Which does NOT exist. Instruction Set = ISA = AVX, AVX2, AVX-512. And AVX was already 256-bit since day one.
There's no AVX2-256: only in your dreamland. Quote:
support is with 128-bit AVX hardware implementation. |
Correct. Quote:
https://twitter.com/iancutress/status/1327358373373898752?lang=en Cinebench R23 only uses the AVX-128 subset from AVX-512. |
See above: there's no AVX-128.
And AVX-512 has no subset: every implementation supports all register sizes, from 128 to 512 bit. You cannot escape it: everything should be supported. The concrete implementation (microarchitecture) is a completely different thing and it doesn't matter at all here (it's an internal detail). Quote:
AVX-128 hardware implementation and AVX-128 subset usage are real. |
Yes, in your dreamland. Quote:
Cinebench R23 doesn't properly benchmark AVX-256/AVX2-256 |
Of course, because those ISAs don't exist. Like pink flying unicorns, god, etc. Quote:
and AVX-512 usage at their full width. |
From the link, it looks like it doesn't use AVX-512. I haven't read about any "subset" being used.
But I have no Twitter account, so the web site stops me from scrolling after a few posts. Quote:
AVX-512 is always full-width. Microarchitectures can differ, and in this case it looks like AMD's implements AVX-512 on 256-bit hardware.
Now, is there any chance that you will learn the correct definitions of ISA and microarchitecture? Because you're quite confused and continue to mix them up, which doesn't make any sense.
@kolla
Quote:
kolla wrote: @Hammer
What unit do you use for keeping up with time and such? It is either broken or extremely slow… |
Well, it's the proof that time is relative. 
I wonder if Hammer was able to sleep well during that long period, while thinking about how to reply to my post. |
| Status: Offline |
| | Hammer
 |  |
Re: Amiga SIMD unit Posted on 18-Oct-2022 8:19:36
| | [ #223 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @cdimauro
Quote:
cdimauro wrote:
Do you understand that in the real world the integer SIMD instructions are executed as well on FP-intensive code? So this extra port in Haswell is used.
That's also the reason why Haswell still had much better performance than Zen1 on such workloads:
|
Do you understand Intel Core i7-5960X is Haswell-E with 8 cores and 16 threads?
For the record, I have regenerated my Core "Haswell" i7-4790K gaming PC with cheapo Turing GTX 1660 Super for the ground floor PC.
I sold my Core i7-7820X gaming PC to fund Ryzen 9 3900X gaming PC.
Your cited "relative performance benchmark" obscures 256-bit AVX benchmarks e.g. Cinebench R15.
Also,

The Stilt's results have been disputed and have been banned from Anandtech's forum.
PS; To look into Anandtech's forum member details, I'm also an Anandtech forum member.
Last edited by Hammer on 18-Oct-2022 at 08:41 AM.
_________________ Ryzen 9 7900X, DDR5-6000 32 GB RAM, GeForce RTX 4080 Amiga 1200 (rev 1D1, KS 3.2, TF1260, 68060 @ 63 Mhz, 128 MB) Amiga 500 (rev 6A, KS 3.2, PiStorm/RPi3a/Emu68) |
| Status: Offline |
| | Hammer
 |  |
Re: Amiga SIMD unit Posted on 18-Oct-2022 8:38:23
| | [ #224 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @cdimauro
Quote:
FACT: AVX-128 subset exists as a lower-cost 128-bit hardware implementation.
You can't handle microarchitecture implementation that optimally implements the AVX-128 subset while offering less optimal AVX-256 compatibility. Both Intel (Gracemont, E-Core) and AMD (Jaguar) have implemented the AVX-128 subset via the 128-bit SIMD hardware with less optimal AVX-256 compatibility mode. This method is good for keeping software compatibility to be less fragmented when offering lower-cost SKUs.
https://www.intel.com/content/www/us/en/developer/articles/tool/software-development-emulator.html https://www.reddit.com/r/WindowsMR/comments/l4ytks/windows_mr_and_old_cpus_avx_it_works/
Any semi-modern X86 CPU can run AVX, but the issue is efficiency and performance. Intel Pentium III's 128bit SSE wasn't true 128-bit SIMD since the SIMD hardware is 64 bits wide, but it prepared the software ecosystem for 128-bit hardware implementation e.g. AMD K8 has 128-bit FADD while Intel Core 2 has 128-bit FADD, and 128bit FMUL.
PowerPC Altivec 128-bit SIMD has been implemented as 128-bit hardware from the start while lower-cost PowerPC's 64-bit SIMD is not Altivec compatible, hence fragmenting the software ecosystem.
Quote:
And AVX-512 has no subset: every implementation supports all register sizes, from 128 to 512 bit. You cannot escape it: everything should be supported. The concrete implementation (microarchitecture) is a completely different thing and it doesn't matter at all here (it's an internal detail).
|
You can't handle microarchitecture implementation that optimally implements the AVX-128 subset while offering less optimal AVX-256 compatibility.
You only look at the front-end instruction set support while not looking at microarchitecture implementation.
AVX-512 has various extensions, hence AVX-512F is the core instruction set.

https://www.tomshardware.com/news/ryzen-7000-zen-4-avx-512-y-cruncher-support Y-Cruncher app will arrive with full AVX-512 support for AMD's upcoming Ryzen 7000 processors.
AMD Zen 4's AVX-512 support is via the "double pump" 256-bit method i.e. multiple 256-bit hardware implementations. Zen 4's front end is twice as wide as Zen 3's.
This situation mirrors Zen 1.0's AVX2-256 being "double-pumped" with multiple 128-bit hardware implementations.
AVX mileage can vary since hardware implementation can be different.
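The "double pump" idea can be illustrated with a minimal sketch. This is a simplified model, not a description of how Zen 4 or any real core schedules its operations: the function names, the 32-bit lane width, and the pass counter are all assumptions made for illustration. The only point it shows is that a 512-bit vector add gives the same architectural result whether the datapath is 512 or 256 bits wide; only the number of passes changes.

```python
# Simplified model: one architectural vector add executed on datapaths of
# different widths. All names here are hypothetical, for illustration only.
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def add_lanes(a, b):
    """Lane-wise 32-bit wrapping add over two equal-length lists of lanes."""
    return [(x + y) & LANE_MASK for x, y in zip(a, b)]


def vector_add(a, b, vector_bits, datapath_bits):
    """Run one vector_bits-wide add on a datapath_bits-wide execution unit.

    Returns (result, passes). The result is what software observes; the
    number of passes is an internal detail of this toy model.
    """
    lanes = vector_bits // LANE_BITS
    if len(a) != lanes or len(b) != lanes:
        raise ValueError("operands must have vector_bits / 32 lanes")
    if vector_bits % datapath_bits != 0:
        raise ValueError("vector width must be a multiple of datapath width")
    step = datapath_bits // LANE_BITS
    result = []
    passes = 0
    # Feed the vector through the narrower unit one slice at a time.
    for start in range(0, lanes, step):
        result.extend(add_lanes(a[start:start + step], b[start:start + step]))
        passes += 1
    return result, passes


a = list(range(16))          # sixteen 32-bit lanes = 512 bits
b = [0xFFFFFFFF] * 16        # forces wrap-around in every non-zero lane
native, native_passes = vector_add(a, b, 512, 512)
pumped, pumped_passes = vector_add(a, b, 512, 256)
print(native == pumped, native_passes, pumped_passes)
```

In this model the wide and the double-pumped unit are indistinguishable to the program; the difference shows up only as throughput.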
Last edited by Hammer on 18-Oct-2022 at 09:12 AM. Last edited by Hammer on 18-Oct-2022 at 09:06 AM. Last edited by Hammer on 18-Oct-2022 at 09:04 AM.
_________________ Ryzen 9 7900X, DDR5-6000 32 GB RAM, GeForce RTX 4080 Amiga 1200 (rev 1D1, KS 3.2, TF1260, 68060 @ 63 Mhz, 128 MB) Amiga 500 (rev 6A, KS 3.2, PiStorm/RPi3a/Emu68) |
| Status: Offline |
| | Hammer
 |  |
Re: Amiga SIMD unit Posted on 18-Oct-2022 9:28:08
| | [ #225 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @cdimauro
Quote:
This isn't Intel's marketing, but clearly a guideline for developers to help them convert their code from SSE to AVX, starting by using 128-bit AVX instructions.
In fact, it's talking about 128-bit (AVX) instructions, and not about a non-existent AVX-128.
[quote] 128-bit AVX instructions exist that directly replace 128-bit SSE instructions.
From the beginning of the same document: INTEL® ADVANCED VECTOR EXTENSIONS ARCHITECTURE OVERVIEW Intel AVX has many similarities to the SSE and double-precision floating-point portions of SSE2. However, Intel AVX introduces the following architectural enhancements: Support for 256-bit wide vectors and SIMD register set. [...] Intel AVX introduces support for 256-bit wide SIMD registers [...] Efficient encoding of instruction syntax operating on 128-bit and 256-bit register sets. And here we're talking about AVX, so NOT even AVX2.
This clearly shows that AVX introduced support for BOTH 128-bit and 256-bit instructions, on a 256-bit vector register file.
You can continue to spend time searching around, but it'll always be a waste of time, because you cannot change the reality: AVX is a 256-bit SIMD extension.
|
 Notice AVX-128's smaller data types didn't extend to 256-bit wide types. There's inconsistency with AVX definition.
AVX512 VNNI extension enables 8-bit and 16-bit datatypes on wider AVX-512.
AVX256 VNNI extension enables 8-bit and 16-bit datatypes on wider AVX-256 e.g. Alder lake.
It's Intel's developer relation marketing push for AVX-128 against legacy 128bit SSE instructions.
From year 2012 https://www.naic.edu/~phil/software/intel/319433-014.pdf Quoting from Intel's Intel® Architecture Instruction Set Extensions Programming Reference (doc number: 319433-014, AUGUST 2012)
2.8.2 Using AVX 128-bit Instructions Instead of Legacy SSE instructions Applications using AVX and FMA should migrate legacy 128-bit SIMD instructions to their 128-bit AVX equivalents. AVX supplies the full complement of 128-bit SIMD instructions except for AES and PCLMULQDQ.
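As a rough sketch of what "128-bit AVX equivalents" means on a 256-bit register file: to my understanding of Intel's documentation, a legacy SSE instruction leaves bits 255:128 of the destination register untouched, while the VEX-encoded 128-bit form of the same operation zeroes them. The model below treats a YMM register as a plain Python integer; the function names are hypothetical and nothing here executes real SIMD instructions.

```python
# Toy model of a 256-bit YMM register as a Python int.
# Function names are hypothetical; this only mimics the write-back rule.
XMM_MASK = (1 << 128) - 1                    # bits 127:0
UPPER_MASK = ((1 << 256) - 1) ^ XMM_MASK     # bits 255:128


def legacy_sse_write(ymm_old, xmm_result):
    """Legacy SSE encoding: the upper 128 bits of the register are preserved."""
    return (ymm_old & UPPER_MASK) | (xmm_result & XMM_MASK)


def vex128_write(ymm_old, xmm_result):
    """VEX-encoded 128-bit AVX instruction: the upper 128 bits are zeroed."""
    return xmm_result & XMM_MASK


ymm_before = (0xAAAA << 128) | 0x1111   # stale data in the upper half
result = 0x2222                         # 128-bit result of some operation

after_sse = legacy_sse_write(ymm_before, result)
after_vex = vex128_write(ymm_before, result)
print(hex(after_sse))
print(hex(after_vex))
```

Both forms operate on 128 bits of data; what differs is how they treat the rest of the same 256-bit register.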
Last edited by Hammer on 18-Oct-2022 at 09:33 AM.
_________________ Ryzen 9 7900X, DDR5-6000 32 GB RAM, GeForce RTX 4080 Amiga 1200 (rev 1D1, KS 3.2, TF1260, 68060 @ 63 Mhz, 128 MB) Amiga 500 (rev 6A, KS 3.2, PiStorm/RPi3a/Emu68) |
| Status: Offline |
| | QBit
|  |
Re: Amiga SIMD unit Posted on 18-Oct-2022 22:31:20
| | [ #226 ] |
| |
 |
Regular Member  |
Joined: 15-Jun-2018 Posts: 290
From: Unknown | | |
|
| | Status: Offline |
| | cdimauro
|  |
Re: Amiga SIMD unit Posted on 19-Oct-2022 21:58:23
| | [ #227 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 3097
From: Germany | | |
|
| @Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
cdimauro wrote:
Do you understand that in the real world the integer SIMD instructions are executed as well on FP-intensive code? So this extra port in Haswell is used.
That's also the reason why Haswell still had much better performance than Zen1 on such workloads:
|
Do you understand Intel Core i7-5960X is Haswell-E with 8 cores and 16 threads? |
And what's the point? Wasn't the Ryzen 1800X a processor with 8 cores and 16 threads? Quote:
For the record, I have regenerated my Core "Haswell" i7-4790K gaming PC with cheapo Turing GTX 1660 Super for the ground floor PC.
I sold my Core i7-7820X gaming PC to fund Ryzen 9 3900X gaming PC. |
Useless padding. As usual, with you. Quote:
Your cited "relative performance benchmark" obscures 256-bit AVX benchmarks e.g. Cinebench R15. |
So what? Quote:
Also,
[BIG IMAGE REMOVED. YOU ARE STILL UNABLE TO USE AN IMAGE HOST SERVICE]
The Stilt's results have been disputed and have been banned from Anandtech's forum.
PS; To look into Anandtech's forum member details, I'm also an Anandtech forum member. |
I'm also a member and I know for sure that you're telling a lie about The Stilt. In fact, from the end of the thread: https://forums.anandtech.com/threads/ryzen-strictly-technical.2500572/page-86
the OP is no longer a member (he did it himself, not the mods)
So, a moderator said that The Stilt himself decided to disable his account. So he was NOT banned, as you falsely stated!
Quote:
Hammer wrote: @cdimauro
Quote:
FACT: AVX-128 subset exists as a lower-cost 128-bit hardware implementation. |
You still don't know what you're talking about: there's no such subset.
ALL processors that implement AVX (and AVX-2, AVX-512) implement ALL instructions for ALL supported vector register sizes. So, they implement the FULL set. Quote:
You can't handle microarchitecture implementation that optimally implements the AVX-128 subset while offering less optimal AVX-256 compatibility.
| It's complete nonsense: see above. Quote:
Both Intel (Gracemont, E-Core) and AMD (Jaguar) have implemented the AVX-128 subset via the 128-bit SIMD hardware with less optimal AVX-256 compatibility mode. This method is good for keeping software compatibility to be less fragmented when offering lower-cost SKUs. |
Same as above: they implement the FULL AVX and AVX-2 set of instructions (and vector register sizes). Quote:
You continue to have no clue at all about what an architecture = ISA = Instruction Set Architecture is, and what ONE of its various implementations = microarchitecture is.
What's even worse is that you desperately search the web for something that might help you in the discussion and post links whose content you don't even understand.
In fact, the links above were about Intel's SDE emulator, which gives you STATISTICS about the executed instructions. So, this is to help developers understand which kinds of instructions THE SPECIFIC APPLICATION was executing. That's why SDE reports information like:
Running the Histogram Tool To generate the instruction mix histograms by opcode (XED iclass, the default) or instruction form (iform). As of version 4.29, the instruction length and instruction category histograms are always included. [...] The ISA extension histogram is also always computed and printed as star-prefixed rows in the histograms. ISA extensions are things like (BASE, X87, MMX, SSE, SSE2, SSE3, etc.). This is useful to see which instruction set extensions are used in your application. [...] The rows in the mix output histograms come in two flavors. The rows that begin with "*" are meta-categories which sum up the data in different ways. Here are descriptions of some of the meta categories: [...] *avx128 Any AVX instruction with a 128b vector length but without the XED_ATTRIBUTE_SIMD_SCALAR *avx256 Any AVX instruction with a 256b vector length *avx512 Any AVX instruction with a 512b vector length.
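The meta-category rows described in the SDE text above can be sketched as a simple bucketing of executed instructions. This is a minimal illustration and not SDE's implementation or output format: the record type, its fields, and the rule that "AVX instruction" means any extension whose name starts with "AVX" are assumptions of this sketch. It only shows that *avx128 and *avx256 are counters over an application's instruction mix.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutedInstruction:
    """Hypothetical trace record; not SDE's real data format."""
    mnemonic: str
    extension: str             # ISA extension name, e.g. "SSE", "AVX", "AVX2"
    vector_bits: int           # vector length the instruction operated on
    simd_scalar: bool = False  # stands in for XED_ATTRIBUTE_SIMD_SCALAR


def meta_category(ins):
    """Return the star-prefixed bucket for one instruction, or None."""
    # Assumption: an "AVX instruction" is any extension named AVX*.
    if not ins.extension.startswith("AVX"):
        return None
    if ins.vector_bits == 128:
        # Scalar AVX instructions are excluded from *avx128.
        return None if ins.simd_scalar else "*avx128"
    if ins.vector_bits in (256, 512):
        return f"*avx{ins.vector_bits}"
    return None


def histogram(trace):
    """Count executed instructions per ISA extension and per meta-category."""
    by_extension = Counter(ins.extension for ins in trace)
    by_meta = Counter(c for c in map(meta_category, trace) if c is not None)
    return by_extension, by_meta


trace = [
    ExecutedInstruction("ADDPS", "SSE", 128),
    ExecutedInstruction("VADDPS", "AVX", 128),
    ExecutedInstruction("VADDSS", "AVX", 128, simd_scalar=True),
    ExecutedInstruction("VADDPS", "AVX", 256),
    ExecutedInstruction("VPADDD", "AVX2", 256),
]
by_extension, by_meta = histogram(trace)
print(dict(by_extension))
print(dict(by_meta))
```

Note that the ISA extension column only ever contains names like AVX or AVX2; the width-based buckets are derived sums over the same instructions.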
It's clearly reported above: those are META (ME-TA) CATEGORIES for the statistics. Quote:
Any semi-modern X86 CPU can run AVX, but the issue is efficiency and performance. Intel Pentium III's 128bit SSE wasn't true 128-bit SIMD since the SIMD hardware is 64 bits wide, but it prepared the software ecosystem for 128-bit hardware implementation e.g. AMD K8 has 128-bit FADD while Intel Core 2 has 128-bit FADD, and 128bit FMUL.
PowerPC Altivec 128-bit SIMD has been implemented as 128-bit hardware from the start while lower-cost PowerPC's 64-bit SIMD is not Altivec compatible, hence fragmenting the software ecosystem. |
Yes, and those ALL belong to the MICROARCHITECTURE. Understood?
The ISA is something different and defines the FULL set of instructions AND registers which ALL of the above microarchitectures MUST implement to be fully compliant. Quote:
Quote:
And AVX-512 has no subset: every implementation supports all register sizes, from 128 to 512 bit. You cannot escape it: everything should be supported. The concrete implementation (microarchitecture) is a completely different thing and it doesn't matter at all here (it's an internal detail).
|
You can't handle microarchitecture implementation that optimally implements the AVX-128 subset while offering less optimal AVX-256 compatibility. |
This is the same absolute nonsense as above: already replied. Quote:
You only look at the front-end instruction set support while not looking at microarchitecture implementation. |
Maybe because those are DIFFERENT things? And the main problem here is that you do NOT understand their respective roles and you mix them all the time! Quote:
AVX-512 has various extensions, hence AVX-512F is the core instruction set.
 |
Sure. And we're still in the architecture = ISA = Instruction Set Architecture domain. In fact the above chart is all about the ISA and it shows only INSTRUCTIONS. Quote:
https://www.tomshardware.com/news/ryzen-7000-zen-4-avx-512-y-cruncher-support Y-Cruncher app will arrive with full AVX-512 support for AMD's upcoming Ryzen 7000 processors.
AMD Zen 4's AVX-512 support is via the "double pump" 256-bit method i.e. multiple 256-bit hardware implementations. Zen 4's front end is twice as wide as Zen 3's.
This situation mirrors Zen 1.0's AVX2-256 being "double-pumped" with multiple 128-bit hardware implementations.
AVX mileage can vary since hardware implementation can be different. |
Yes, and? That's all about this specific MICROARCHITECTURE, which implements the FULL AVX-512 instruction set. So, NO subsets here: ALL of it must be, and is, implemented.
How it is implemented does NOT matter.
@Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
This isn't Intel's marketing, but clearly a guideline for developers to help them convert their code from SSE to AVX, starting by using 128-bit AVX instructions.
In fact, it's talking about 128-bit (AVX) instructions, and not about a non-existent AVX-128.
This clearly shows that AVX introduced support for BOTH 128-bit and 256-bit instructions, on a 256-bit vector register file.
You can continue to spend time searching around, but it'll always be a waste of time, because you cannot change the reality: AVX is a 256-bit SIMD extension.
|
[ANOTHER BIG IMAGE REMOVED BECAUSE YOU ARE SO LIMITED THAT YOU'RE NOT EVEN ABLE TO USE AN IMAGE HOST SERVICE] Notice AVX-128's smaller data types didn't extend to 256-bit wide types. There's inconsistency with AVX definition.
AVX512 VNNI extension enables 8-bit and 16-bit datatypes on wider AVX-512.
AVX256 VNNI extension enables 8-bit and 16-bit datatypes on wider AVX-256 e.g. Alder lake.
It's Intel's developer relation marketing push for AVX-128 against legacy 128bit SSE instructions. |
I've already replied here: https://amigaworld.net/modules/newbb/viewtopic.php?mode=viewtopic&topic_id=43882&forum=17&start=200&viewmode=flat&order=0#837337
Repeating exactly the same thing like a broken record doesn't make it true!
As I've said, this is NOT marketing: rather, it's a guideline for developers to help them port their SSE code to AVX. And the charts show which datatypes are available for the specific register sizes, so that they know how they can change their code to make better use of them.
But since you've no clue at all about programming, you failed to understand even this very basic information. Quote:
From year 2012 https://www.naic.edu/~phil/software/intel/319433-014.pdf Quoting from Intel's Intel® Architecture Instruction Set Extensions Programming Reference (doc number: 319433-014, AUGUST 2012)
2.8.2 Using AVX 128-bit Instructions Instead of Legacy SSE instructions Applications using AVX and FMA should migrate legacy 128-bit SIMD instructions to their 128-bit AVX equivalents. AVX supplies the full complement of 128-bit SIMD instructions except for AES and PCLMULQDQ. |
Yes, and? It's fully correct. Those are instructions using registers with 128-bit size.
In fact, the document clearly says "AVX 128-bit Instructions". There's no AVX-128. You cannot combine "AVX" and "128-bit" to define a non-existent AVX-128 (or AVX-256)!
But, as I've said, you cannot understand those basic concepts because you've no clue at all about those topics.
|
| Status: Offline |