bhabbott | Re: The (Microprocessors) Code Density Hangout | Posted on 11-Jul-2025 6:41:59 | [ #321 ]

@matthey
Quote:
matthey wrote:
Motorola throwing out their beautiful 68k baby with the bathwater was a politically motivated decision after being pushed into joining the AIM Alliance. |
It wasn't just politics. The 68040 was late and they had trouble keeping the power consumption down. The 68020 wasn't great either. The first batch failed with practically zero yield, and clock speed was limited by power consumption. The 68060 had a similar problem, with the original 0.6u version being too power hungry to go over 50MHz. Motorola's biggest problem wasn't CPU design, but manufacturing.
I'm guessing they figured the lower complexity of RISC would allow for higher clock speeds and better performance with less design effort. The 'political' side would be Apple wanting higher performance to beat Intel, and not being satisfied with the 68040. Losing Apple was the end for the 68k desktop. If Motorola had been more onto it then Apple might have stuck with the 68k and we would have seen 68k equivalents to the Pentium at up to several hundred MHz. But Motorola was not able to make CISC chips that fast.
Unfortunately RISC didn't solve the manufacturing problem. PPC Quote:
Toward the close of the decade, manufacturing issues began plaguing the AIM alliance in much the same way they did Motorola, which consistently pushed back deployments of new processors for Apple and other vendors: first from Motorola in the 1990s with the PowerPC 7xx and 74xx processors, and IBM with the 64-bit PowerPC 970 processor in 2003. In 2004, Motorola exited the chip manufacturing business by spinning off its semiconductor business as an independent company called Freescale Semiconductor...
In 2005, Apple announced they would no longer use PowerPC processors in their Apple Macintosh computers, favoring Intel-produced processors instead, citing the performance limitations of the chip for future personal computer hardware specifically related to heat generation and energy usage, as well as the inability of IBM to move the 970 processor to the 3 GHz range. |
Quote:
matthey wrote:
When a business sabotages their own products for technically inferior products, there is a higher likelihood that they are ignoring business technical factors and analysis too. We know the destiny of Motorola. |
This is not a fair analysis. If Motorola couldn't make CPUs matching Intel they were toast. But they never managed to get the 68040 past 40MHz, while Intel pushed the 486 up to 100MHz. By this time PCs were going mainstream and the market for them was exploding. Other platforms would struggle to survive, especially if they were slower.
The 68060 was final proof that Motorola was never going to catch Intel. They couldn't get it to go as fast as Intel's Pentium, and the gap kept widening. That combined with the much greater market for x86 meant Motorola would never be competitive. Less sales meant less profit and less money for R&D and process improvement, which meant they would continue to fall behind.
Quote:
matthey wrote:
The Hyperion A-EonKit syndicate is doing similar shenanigans in Amiga Neverland like producing noncompetitive hardware, continuing forever with broken business models, pushing big lies to cover their ever growing IP encroachments, coercing the competition to protect their products, etc. They are at least sabotaging the competition and appear to be colluding to keep them out of their manipulated market but the result is likely to be the same. Most intellectuals see the incompetence and corruption choosing to stay away from such businesses and markets. |
This is what happens when a market shrinks below the point of viability. The Amiga should have become retro in 1996, but some people couldn't let it go, imagining they could bring the Amiga up to modern standards so fans wouldn't have to turn to the dark side. But the tiny market meant high prices and shenanigans as the survivors fought over scraps. This was encouraged by a small cadre of Amiga zealots who embraced the 'NG' scene despite it having little future.
I looked at getting an NG 'Amiga' back in 2015, but decided it was a joke. I literally threw away all my Amiga stuff except for the A1200, thinking it was worthless. That was the effect it had on me. 2 years later I discovered that the Amiga retro scene was booming. Now I am back home with classic 68k Amiga, where I always wanted to be.
I don't think it's fair to call it 'incompetence and corruption'. More a case of fans wanting something that isn't practical. Without economy of scale you aren't going to get an NG Amiga at a competitive price. A different approach is needed. Instead of cutting off the existing user base, leverage it with cheap modern hardware that greatly improves performance without changing the architecture. We are now seeing this with PiStorm etc.
minator | Re: The (Microprocessors) Code Density Hangout | Posted on 11-Jul-2025 14:10:02 | [ #322 ]

This whole thread seems to be based on the idea that smaller code is better and faster. This was true in the 70s and much of the 80s because memory was slow and expensive and CPUs were operating directly from it. So, a smaller number of memory reads was a win for performance.
However, once cache was introduced, this was no longer true. Consider a loop. There is an advantage for smaller code when reading the data into cache, but for every subsequent iteration there is no advantage.
Now, what happens if I unroll the loop? I create larger code, but it'll run faster. Again, there is an advantage for smaller code when reading the data into cache, but for every subsequent iteration the advantage is to the bigger code. In fact, the cost to the silicon of implementing CISC-style smaller code becomes a disadvantage, limiting the clock speed and thus performance.
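To make the trade-off concrete, here is a minimal C sketch (a hypothetical illustration, not something from the thread): summing an array with a plain loop versus a 4x unrolled loop. The unrolled version occupies more instruction-cache space but exposes more independent additions per iteration, which is exactly the larger-but-faster code being described.

#include <stddef.h>

/* Plain loop: the smallest code, one load and one add per iteration. */
long sum_simple(const long *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by four: larger code, but four independent partial sums let the
 * CPU overlap work, which usually runs faster once the loop body is already
 * sitting in the instruction cache. */
long sum_unrolled(const long *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* handle the leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}

Whether the bigger version actually wins still depends on the core and on whether the loop stays resident in the instruction cache, which is the point being argued either way in this thread.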
The 68K line pretty much created the high end workstation market and owned it for much of the 80s. However, once RISC processors appeared, they were easier to design, cheaper to make and faster. The fact they used bigger code had no disadvantage to their performance whatsoever.
The 68K line went from owning the workstation market to having a hard time competing with PCs. The 060 was competitive with a Pentium at the same clock speed, but it was too late, the Pentium was 100MHz when the 060 launched at 50MHz. The RISC processors with their bigger code were not bothered.
It's clear to me the 060 was not designed as a workstation processor. It had a weak FPU, a 32 bit bus and no L2 cache. The workstation chips all had monster FPUs, 64 bit busses, and large L2 caches. Even the 486 used L2! That said, it was quite an upgrade from the 040 so probably made a very nice upgrade for an Amiga.
Motorola, having lost the workstation and desktop markets, evolved the 68K line into embedded CPU32, Coldfire and Dragonball chips. This makes a lot more sense as that's where small code size is more important and they were very successful, at least for a while.
Benchmarks 1994 SPEC Int/Float 92 (ordered by Int)
Source: https://www.cs.toronto.edu/pub/jdd/spectable

CPU__________MHz___Int___Float
68060_________50___49.0*___ ? **
486DX4_______100___51.4___26.6
MicroSPARC 2__70___57.0___47.3
Pentium_______66___67.4___61.5 (1993)
PowerPC 603___80___75.0___85.0 ***
PowerPC 601___80___78.8___90.4
R4400________150___91.7___97.5
Pentium______100___96.2___81.2
HyperSPARC___100__104.5__127.6
R8000_________75__108.7__310.6
POWER2________71__134.1__273.8
PA 7150______125__136.0__201.0
PA 7200______120__168.7__269.2 (1995)
Alpha 21064A_275__193.8__292.6

* Estimate from Motorola: https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf
** Unknown, but 68060 has half the FP throughput of the Pentium.
*** PowerPC 603 result: https://68kmla.org/bb/index.php?threads/how-did-the-powerpc-603-5200-at-75mhz-compare-to-pc-s-486-pentium.47493/ https://groups.google.com/g/comp.sys.mac.hardware.misc/c/cc_47GAHYb0?pli=1
Had the 060 been pushed to 100 or even 150MHz, by 1996 it would have been more competitive at Integer but even less competitive in FP. However, Intel's Pentium Pro (essentially a fast OoO RISC with CISC decoders) was way ahead in Integer.
Benchmarks 1996 SPEC Int/Float 95 (ordered by Int)
Source: https://www.cs.toronto.edu/pub/jdd/spectable

CPU___________MHz____Int_______Float
MicroSPARC 2__110____1.59______1.99
68060__________75____2.31*_____ ?
Pentium_______100____3.30______2.59
68060_________100____3.30**____ ?___(Hypothetical)
HyperSPARC____150____4.02______4.73
Pentium_______150____4.27______3.04
68060_________150____4.27**____ ?___(Very Hypothetical)
PA7200________120____4.61______8.24_(1995)
R5000_________180____4.82______5.42
POWER2SC______120____5.61_____16.60
Pentium Pro___150____6.08______5.42
UltraSPARC____167____6.6_______9.37
PowerPC 604e__200____7.22______6.91
PowerPC 603e__300____7.40***___6.10
Pentium Pro___200____8.09______6.75
R10000________195____9.48_____19.00
PA8000________180____11.8_____20.20
Alpha 21164___400____12.1_____17.20

* Estimated based on Pentium 75MHz
** Guesstimated based on Pentium 100/150MHz
*** PowerPC 603e number is a Motorola estimate: https://www.cpushack.com/CIC/announce/1997/MotorolaPowerPC603e-300.html
ppcamiga1 | Re: The (Microprocessors) Code Density Hangout | Posted on 11-Jul-2025 16:30:19 | [ #323 ]

@bhabbott
I bought an Amiga NG because I found it a better Amiga than those made after the Amiga 500. You should not compare Amiga NG to a PC; you should compare Amiga NG to, for example, the Amiga 4000 from Commodore. Amiga NG is cheaper, faster, simply better.
ppcamiga1 | Re: The (Microprocessors) Code Density Hangout | Posted on 11-Jul-2025 16:33:44 | [ #324 ]

@minator
17 pages of BS because one crazy person cannot accept that Motorola had to switch from the 68k to further compete with Intel. It happened almost 35 years ago, but matthey still has a problem with it.
cdimauro | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 4:33:10 | [ #325 ]

@Hammer
Quote:
Hammer wrote: @cdimauro
There are operations being applied on multiple data elements per instruction with SIMD, which is important for 3D games while the 68060 version is scalar. |
What's not clear to you about the fact that this thread is about CODE DENSITY? Memory footprint is also fine to discuss, because code density is a very important factor which heavily influences it. Quote:
The focus for Amiga Hombre is texture-mapped 3D within $50 BOM cost range. The primary Amiga Hombre's use case is CD3D game console and A1200 (desktop computer with game console bias) replacement. Unlike Apple's Mac, the Amiga didn't establish a large enough business customer base. |
Irrelevant. Off topic. And PA-RISC was a super crappy processor, as I've already reported on the other thread where I've shown you the reasons for that. Quote:
The code written for 32-bit MIPS-III is typically reduced by 40% in size when compiled for MIPS16 (Kissell 97). |
It was not enough to survive the competition, which did better. Quote:
Yet another US government-funded US academia created another RISC-V pro-density instruction set. For commercialized RISC-V, SiFive's 8-core P550 is available in laptops at the current time, |
Matt and I have shared several numbers about RISC-V code density, which isn't that good.
Only the PULP version gets good numbers, thanks to several other extensions (I would say obvious extensions, yet NOT obvious to the academics who designed RISC-V). Quote:
while it's missing in action with 68060. |
Of course! The 68k has one of the best code densities around and crushes RISC-V.
So, it does NOT need anything from this PoV (and even on the performance side, since it executes FEWER instructions ON AVERAGE to achieve the same tasks). Quote:
There's extra aid for RISC-V not including other governments' support for RISC-V.
MIPS was US government-funded via US academia and US DARPA before shifting into commercialization. China's state owned Loongson's MIPS architecture shifted to Loongson 3 5000 ISA, a MIPS and RISC-V blend. |
Irrelevant, off topic, PADDING. Quote:
There's always a cost vs performance. For this topic, cost vs performance vs code density, and this can't be avoided. |
What's not clear to you about the fact that this thread is about CODE DENSITY? Memory footprint is also fine to discuss, because code density is a very important factor which heavily influences it. Quote:
Without additional support, 68K is dead. |
It's so dead that it gained an LLVM backend recently...
Sure, it requires some modernization (a SIMD extension, a 64-bit version), but after so many years it still looks very good in all the fundamental metrics for an architecture. So, revamping it makes sense. |
cdimauro | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 5:02:31 | [ #326 ]

@matthey
Quote:
matthey wrote: cdimauro Quote:
I know Embench since long and it's certainly a valuable benchmark, but it's very limited to the embedded market, and actually only supports integers (no FP data). That's because the target is the very low-end embedded market (64kB Flash, 16kB RAM).
|
I had seen Embench mentioned before but never looked into it. There was supposed to be a floating-point version of the benchmark being worked on, but it should have been out by now if there were no problems. Unfortunately, it looks like support for Embench has stalled. |
Indeed. The test suite is the same and no FP benchmarks were added. It remains of very limited use. Quote:
Maybe they target too small of memory footprint to be popular even though it is useful for larger footprints too? The "64kB Flash" is NOR flash or ROM memory which could greatly supplement the 16kiB minimum RAM. I doubt anyone would write the Embench benchmark to NOR flash memory or ROM to perform the benchmark but maybe the idea is that support code resides there? |
There's no detail like that, but the target footprint isn't something that could help RISC-V under such very limited constraints: there are 8- and 16-bit legacy architectures (or new ones which can "narrow"/"shrink" down that much) which can do much better. Quote:
cdimauro Quote:
But for measuring the code density is good, albeit they decided to only measure the code size, ignoring the dataro segment, which is relevant because it takes space on the Flash and it's also influenced by the architectures (e.g.: constants which are on the dataro segment in some architectures are embedded in the code segment on others, and viceversa).
|
I agree about the data which is an important performance metric too but it is nice to see the man who coined the term RISC and made it popular recognize that code density is important after all. |
Which means that variable-length instructions are very important, since you can't achieve good results in this area with fixed-length encodings.
Yes, it's funny to think about it. 
And I have even more fun thinking about the super-complicated instructions that were added to RISC-V, which take several cycles to execute (against his "only simple instructions can be executed, in 1 clock cycle") and some of which do multiple loads or stores at the same time.
That's priceless.  Quote:
cdimauro Quote:
The paper is quite interesting, but it also shows the limits of such dual-ISA support, because the two ISAs should be similar for a good part. If there are too many differences, then it becomes not convenient anymore.
It might make sense if we look at the low-end embedded market, because the used architectures are simple and usually only at the "Machine" level (e.g.: only the user space execution mode is available). That's because supervisor/kernel mode and, in general, MMUs and other system level elements can diverge even a lot, up to the point that the common backend (and part of the frontend) becomes too fat and/or inefficient.
|
Modern CPU cores often do support multiple ISAs already. ARM had original 32-bit ARM, Thumb, Thumb-2 and AArch64 for 4 ISAs to support. Some older ARM cores supported the Jazelle ISA which allowed to execute Java byte code. ARM had a large variety of ISAs even though most had similarities. Maybe not that much different than 32-bit ARM and RISC-V though. |
Well, they are "RISCs" (SIC!) and for many basic things are very similar or even exactly the same. But the devil is in the details, as I've explained before. Quote:
The x86 ISA with segments and 16-bit modes is still supported on 64-bit x86-64 cores even though the additional logic paths may cause a slowdown of the max clock speed. |
It doesn't look like it, since they reached 6GHz (and "RISCs" are still far away). Quote:
The area is certainly increased too with the larger ISAs but most of it is the newer ISAs. |
And even more will come with APX. Quote:
ARM is jettisoning most of their baggage ISAs while Intel has been unsuccessful at jettisoning their ISA baggage. |
Intel tried, but it failed. Too many vendors criticized the X86S initiative, which could have removed part of this baggage (not that much, but... that's OK: something slimmer is good). Intel is bound hand and foot to its past and cannot break free of it, which will also mark its end.
ARM is in a much better position because it continues to offer its customers what they need. If they still want the old ARM32 license, they can have it. But there's no further development. Quote:
Multiple 32-bit ARM ISAs were useful but ARM went all in on one large standard all purpose 64-bit ISA instead. |
Which is very good: space opens up for the competition, on both 32 and 64 bits (since the latter suffers from poor code density). Quote:
cdimauro Quote:
In fact, the paper mentions also the Transmeta's Crusoe processors as a software-based translation platform, omitting key information about how it really worked (e.g.: the internal architecture). Which is basically what people usually find searching around: "software translation!". The reality is quite different, since those processors implements parts of the x86 system-level (PMMU and some other stuff) in hardware, which are fundamental things that HEAVILY influence the execution performance. Without that (e.g.: a completely different architecture which is purely relying on the software translation of x86 instructions), the performance would have been miserable. |
Code morphing sounds more like software even though it could be hardware as the paper demonstrates. Also, the VLIW hardware is much different than traditional instruction pipelines. Yes, extra hardware support was utilized by matching common hardware processing and control resources. The ~$1 billion Transmeta experiment did not provide enough info to stop the $10+ billion Itanic mistake. At least the combi pipeline experiment was cheap and the solution may be more practical than VLIW "general purpose" cores. |
Indeed, but it looks like nVidia did better with its Project Denver. Some Transmeta engineers joined nVidia years ago, and this is clearly visible in this project.
I know that it's still being used, but it's difficult to see in which products. In recent years I've not seen any news about it. |
cdimauro | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 5:39:26 | [ #327 ]

@bhabbott
Quote:
bhabbott wrote: @matthey
Quote:
matthey wrote:
Motorola throwing out their beautiful 68k baby with the bathwater was a politically motivated decision after being pushed into joining the AIM Alliance. |
It wasn't just politics. The 68040 was late and they had trouble keeping the power consumption down. The 68020 wasn't great either. The first batch failed with practically zero yield, and clock speed was limited by power consumption. The 68060 had a similar problem, with the original 0.6u version being too power hungry to go over 50MHz. Motorola's biggest problem wasn't CPU design, but manufacturing.
I'm guessing they figured the lower complexity of RISC would allow for higher clock speeds and better performance with less design effort. The 'political' side would be Apple wanting higher performance to beat Intel, and not being satisfied with the 68040. Losing Apple was the end for the 68k desktop. If Motorola had been more onto it then Apple might have stuck with the 68k and we would have seen 68k equivalents to the Pentium at up to several hundred MHz. But Motorola was not able to make CISC chips that fast.
Unfortunately RISC didn't solve the manufacturing problem. PPC Quote:
Toward the close of the decade, manufacturing issues began plaguing the AIM alliance in much the same way they did Motorola, which consistently pushed back deployments of new processors for Apple and other vendors: first from Motorola in the 1990s with the PowerPC 7xx and 74xx processors, and IBM with the 64-bit PowerPC 970 processor in 2003. In 2004, Motorola exited the chip manufacturing business by spinning off its semiconductor business as an independent company called Freescale Semiconductor...
In 2005, Apple announced they would no longer use PowerPC processors in their Apple Macintosh computers, favoring Intel-produced processors instead, citing the performance limitations of the chip for future personal computer hardware specifically related to heat generation and energy usage, as well as the inability of IBM to move the 970 processor to the 3 GHz range. |
|
As you can see, there were also other fundamental problems which badly influenced the 68k family's evolution. Quote:
Quote:
matthey wrote:
When a business sabotages their own products for technically inferior products, there is a higher likelihood that they are ignoring business technical factors and analysis too. We know the destiny of Motorola. |
This is not a fair analysis. If Motorola couldn't make CPUs matching Intel they were toast. But they never managed to get the 68040 past 40MHz, while Intel pushed the 486 up to 100MHz. |
Which were closer than they look, if you consider that the 68040 worked like a 486 DX2 processor, doubling the frequency of the master clock internally. So, a 40MHz 68040 was in reality running internally at 80MHz.
Intel's 486 DX2 reached 66MHz at 5V, and 100MHz at 3.3V. The DX4 (with triple the base clock speed) reached 100MHz at 3.3V.
I don't see why Motorola couldn't have done something similar with the 68040. Quote:
By this time PC were going mainstream and the market for them was exploding. Other platforms would struggle to survive, especially if they were slower. |
Indeed. ALL other platforms... Quote:
The 68060 was final proof that Motorola was never going to catch Intel. They couldn't get it to go as fast as Intel's Pentium, and the gap kept widening. That combined with the much greater market for x86 meant Motorola would never be competitive. Less sales meant less profit and less money for R&D and process improvement, which meant they would continue to fall behind. |
Consider that Motorola was already working in parallel on its new PowerPC processors. Likely, most of their resources went to that side and fewer to the 68060 side.
However, PowerPC was a complete failure, as we know.
Why did Motorola have to continue with the DragonBall family? And, more importantly, why with the ColdFire?
It's clearly evident that the PowerPCs were NOT able to entirely replace the 68k family, for the obvious reasons which we are discussing here (code density being the primary one, of course). |
cdimauro | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 6:16:28 | [ #328 ]

@minator
Quote:
minator wrote: This whole thread seems to be based on the idea that smaller code is better and faster. This was true in the 70s and much of the 80s because memory was slow and expensive and CPUs were operating directly from it. So, a smaller number of memory reads was a win for performance.
However, once cache was introduced, this was no longer true. Consider a loop, There is an advantage for smaller code when reading the data into cache, but for every subsequent iteration there is no advantage.
Now, what happens if I unroll the loop? I create larger code, but it'll run faster. Again, there is an advantage for smaller code when reading the data into cache, but for every subsequent iteration the advantage is to the bigger code. In fact, the cost to the silicon of implementing CISC style smaller code becomes a disadvantage, limiting the clock speed and thus performance. |
If code density wasn't important anymore, then why did almost all processor vendors care about it and introduce dedicated extensions, or even new architectures, with the sole purpose of supporting it? Caches had already been introduced and we had seen their benefits.
Nevertheless, and well AFTER their introduction, processor vendors introduced extensions and ISAs solely for code density. Even RISC-V, which was the last one (they started in 2011), has reserved 75% of the opcode space just for the 16-bit / "compact" C extension, and this from the very beginning.
Still today there's a lot of research on this topic. Processor vendors are still investing a lot in improving the code generated for their processors in this specific respect, not even counting the contributions of the existing compilers.
How do you explain ALL this?
The simple answer is no: code density still matters. A LOT. It also heavily affects performance and power consumption (the reasons were already reported in several other comments on this thread). Quote:
The 68K line pretty much created the high end workstation market and owned it for much of the 80s. However, once RISC processors appeared, they were easier to design, cheaper to make and faster. The fact they used bigger code had no disadvantage to their performance whatsoever.
The 68K line went from owning the workstation market to having a hard time competing with PCs. The 060 was competitive with a Pentium at the same clock speed, but it was too late, the Pentium was 100MHz when the 060 launched at 50MHz. The RISC processors with their bigger code were not bothered. |
Because the only thing they focused on at the time was performance. It was a battle for better performance, and that's the only thing they cared about.
The mantra was: simple instructions, executed in a single clock cycle --> high performance (compared to the CISCs which required more clock cycles).
Simple instructions also mean higher clock rates --> even more performance.
It was beautiful on paper, and it also worked out at the beginning, since CISCs weren't able to deliver similar performance, until Intel came along with its 80486 and Motorola followed with the 68040, which started showing that even CISC processors could achieve high performance.
When the RISC propaganda started, there were engineers and academics who did NOT believe that CISC processors could even be pipelined... Quote:
It's clear to me the 060 was not designed as a workstation processor. It had a weak FPU, a 32 bit bus and no L2 cache. The workstation chips all had monster FPUs, 64 bit busses, and large L2 caches. |
Indeed.
But RISCs required L2 caches because of their poor code density. And since their only focus was performance, that was the easy way to go faster (instead of an internal redesign of the chip). Quote:
Even the 486 used L2! |
Well, NO! 486s only had L1 caches. Exactly like the Pentiums. Quote:
That said, it was quite an upgrade from the 040 so probably made a very nice upgrade for an Amiga. |
Right. But Commodore was (also) too late... Quote:
Motorola, having lost the workstation and desktop markets, evolved the 68K line into embedded CPU32, Coldfire and Dragonball chips. This makes a lot more sense as that's where small code size is more important and they were very successful, at least for a while. |
Quote:
Had the 060 been pushed to 100 or even 150MHz, by 1996 it would have been more competitive at Integer but even less competitive in FP. |
It would have needed some uplift instead of just pushing up the clock.
But Motorola decided to focus on PowerPCs... Quote:
However, Intel's Pentium Pro (essentially a fast OoO RISC with CISC decoders) was way ahead in Integer. |
No, the P6 was NOT using a RISC processor inside. Some time ago there was a blog post by one of its key engineers which totally refuted this, explaining the reasons. Quote:
Benchmarks 1996 SPEC Int/Float 95 (ordered by Int)
Source: https://www.cs.toronto.edu/pub/jdd/spectable
CPU___________MHz____Int_______Float
MicroSPARC 2__110____1.59______1.99
68060__________75____2.31*_____ ?
Pentium_______100____3.30______2.59
68060_________100____3.30**____ ?___(Hypothetical)
HyperSPARC____150____4.02______4.73
Pentium_______150____4.27______3.04
68060_________150____4.27**____ ?___(Very Hypothetical)
PA7200________120____4.61______8.24_(1995)
R5000_________180____4.82______5.42
POWER2SC______120____5.61_____16.60
Pentium Pro___150____6.08______5.42
UltraSPARC____167____6.6_______9.37
PowerPC 604e__200____7.22______6.91
PowerPC 603e__300____7.40***___6.10
Pentium Pro___200____8.09______6.75
R10000________195____9.48_____19.00
PA8000________180____11.8_____20.20
Alpha 21164___400____12.1_____17.20

* Estimated based on Pentium 75MHz
** Guesstimated based on Pentium 100/150MHz
*** PowerPC 603e number is a Motorola estimate: https://www.cpushack.com/CIC/announce/1997/MotorolaPowerPC603e-300.html |
A better metric for checking how efficient the (micro)architectures are would have been SPEC/MHz.
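For what it's worth, that normalization is trivial to compute from the table above; here is a small, hypothetical C sketch that divides SPECint95 by clock frequency for a few of the quoted rows (values copied from the quote, nothing newly measured):

#include <stdio.h>

/* A few SPECint95 rows copied from the table quoted above. */
struct entry { const char *cpu; double mhz; double specint95; };

int main(void)
{
    const struct entry t[] = {
        { "68060 (estimate)", 75.0,  2.31 },
        { "Pentium",         100.0,  3.30 },
        { "Pentium",         150.0,  4.27 },
        { "Pentium Pro",     200.0,  8.09 },
        { "PowerPC 604e",    200.0,  7.22 },
        { "Alpha 21164",     400.0, 12.10 },
    };
    const int count = (int)(sizeof t / sizeof t[0]);

    printf("%-18s %6s %11s %13s\n", "CPU", "MHz", "SPECint95", "SPECint/MHz");
    for (int i = 0; i < count; i++)
        printf("%-18s %6.0f %11.2f %13.4f\n",
               t[i].cpu, t[i].mhz, t[i].specint95, t[i].specint95 / t[i].mhz);
    return 0;
}

On the quoted numbers the 68060 estimate and the 100MHz Pentium land in roughly the same per-MHz ballpark, while the Pentium Pro stays ahead even after normalizing for clock speed.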
cdimauro | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 6:22:25 | [ #329 ]

@ppcamiga1
Quote:
ppcamiga1 wrote: @bhabbott
I bought an Amiga NG because I found it a better Amiga than those made after the Amiga 500 |
There are no "NG" Amigas. They only exist in your narrow mind. Quote:
you should not compare Amiga NG to a PC |
It's NOT possible for the above reason: you can't compare a dream with a concrete product. Quote:
you should compare Amiga NG to, for example, the Amiga 4000 from Commodore |
Same as above. Quote:
Amiga NG is cheaper, faster, simply better |
Again, in your dreams, since no Amiga NG exists. Quote:
ppcamiga1 wrote: @minator
17 pages of bs |
Which you don't understand at all, evidently. Quote:
because one crazy person cannot accept that Motorola had to switch from the 68k to further compete with Intel. It happened almost 35 years ago, but matthey still has a problem with it |
He's not the only one, since I agree as well.
Now, care to show how your beloved CraPProcessors performed on code density?
If not, you can do one simple thing: AVOID dirtying this thread with your complete nonsense, HAMSTER! |
ppcamiga1 | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 15:23:38 | [ #330 ]

@cdimauro
Amiga NG is a better Amiga than those made by Commodore after the Amiga 500. It is your problem that you do not accept reality. Amiga NG is the Amiga that Commodore would have made if it had survived a few more years. It has everything that should be in an Amiga: RISC, 3D, FPU, MMU, etc.
matthey | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 19:47:01 | [ #331 ]

bhabbott Quote:
It wasn't just politics. The 68040 was late and they had trouble keeping the power consumption down. The 68020 wasn't great either. The first batch failed with practically zero yield, and clock speed was limited by power consumption. The 68060 had a similar problem, with the original 0.6u version being too power hungry to go over 50MHz. Motorola's biggest problem wasn't CPU design, but manufacturing.
I'm guessing they figured the lower complexity of RISC would allow for higher clock speeds and better performance with less design effort. The 'political' side would be Apple wanting higher performance to beat Intel, and not being satisfied with the 68040. Losing Apple was the end for the 68k desktop. If Motorola had been more onto it then Apple might have stuck with the 68k and we would have seen 68k equivalents to the Pentium at up to several hundred MHz. But Motorola was not able to make CISC chips that fast.
|
I agree that Motorola had chip fab problems, but they were unrelated to the 68k and to the politically rather than technically motivated decision to switch to PPC. Motorola also had management problems, all the way to the top, where there was a lack of technical understanding. The 68k remained vastly superior to PPC for the embedded market, and Motorola did not lose this market until they sabotaged products like the 68060 by not clocking it up, used 68k embedded profits on PPC products and shoved PPC down customers' throats.
bhabbott Quote:
Unfortunately RISC didn't solve the manufacturing problem. PPC Quote:
Toward the close of the decade, manufacturing issues began plaguing the AIM alliance in much the same way they did Motorola, which consistently pushed back deployments of new processors for Apple and other vendors: first from Motorola in the 1990s with the PowerPC 7xx and 74xx processors, and IBM with the 64-bit PowerPC 970 processor in 2003. In 2004, Motorola exited the chip manufacturing business by spinning off its semiconductor business as an independent company called Freescale Semiconductor...
In 2005, Apple announced they would no longer use PowerPC processors in their Apple Macintosh computers, favoring Intel-produced processors instead, citing the performance limitations of the chip for future personal computer hardware specifically related to heat generation and energy usage, as well as the inability of IBM to move the 970 processor to the 3 GHz range. |
|
Motorola production problems increased with PPC as they needed higher tech and more modern chip fab processes to clock up the shallow pipeline PPC CPUs. They also predictably required more caches and costs increased. The PPC601, PPC603 and PPC604 may have been financial failures as they were replaced with the PPC601+, PPC603e and PPC604e using larger caches and newer and more expensive chip fab processes to increase clock speeds. Motorola was replacing PPC CPUs that were not even out for a year and they were using similar processes to Intel. The RISC advantage was real and allowed them to produce higher clocked CPUs sooner. The PowerPC 603e was the first mainstream desktop processor to reach 300 MHz. Then Intel increased their pipeline depth, as Motorola already had in their 8-stage 68060 but did not leverage by clocking it up. CISC could not only be pipelined, but cache access instructions could be pipelined, CISC pipelines generally did not suffer load-to-use stalls and CISC ISAs generally had better code density for more efficient cache use. The primary ability of a pipeline to be clocked up, whether RISC or CISC, comes from the number of pipeline stages. The IBM PPC970 (G5) has a deep pipeline and could be clocked up but it was late and not efficient in many ways. CISC pipelines have more integer performance potential!
bhabbott Quote:
This is not a fair analysis. If Motorola couldn't make CPUs matching Intel they were toast. But they never managed to get the 68040 past 40MHz, while Intel pushed the 486 up to 100MHz. By this time PC were going mainstream and the market for them was exploding. Other platforms would struggle to survive, especially if they were slower.
|
Intel had the marketing advantage as cdimauro said with 486 frequency doubling. The 68040 had severe production problems and some questionable design decisions. However, the 68040 still had more performance/MHz than the 486 and it used less power when the problems were solved. The 3.3V 68040V@33MHz dissipates only 1.5W and LPSTOP sleep of the full static design was 0.00066W.
https://textfiles.meulie.net/computers/486vs040.txt Quote:
Dhrystone Benchmark Version 2.1 (Integer Performance Test -- ALU)
-----------------------------------------------------------------------------
System                         | Results - Kdhrystones/s | Relative
-----------------------------------------------------------------------------
VAX 11/780                              1.6                  1.0
Motorola MC68030 (50 Mhz, 1ws)         20.0                 12.5
Intel 80486 (25 Mhz)                   24.0                 15.0
SPARC (25 Mhz)                         27.0                 16.8
Motorola M88000 (20 Mhz)               33.3                 20.1
MIPS M/2000, R3000 (25 Mhz)            39.4                 23.8
Motorola MC68040 (25 Mhz)              40.0                 24.3
Intel 80860 (33.3 Mhz)                 67.3                 40.6
|
The 68040 had 3 MFLOPS compared to the 486's 1.0 MFLOPS in a double precision FP Linpack benchmark from the same link. It was really bad to be late with tech when Moore's Law kicked in hard, as Commodore found out too. The late 68040 likely set the 68060 back as well, but they were smart enough to increase the pipeline depth from 6 stages to 8 stages and add branch prediction and BTB/loop handling to more than compensate and compete strongly against shallow pipeline RISC and Pentium designs.
bhabbott Quote:
The 68060 was final proof that Motorola was never going to catch Intel. They couldn't get it to go as fast as Intel's Pentium, and the gap kept widening. That combined with the much greater market for x86 meant Motorola would never be competitive. Less sales meant less profit and less money for R&D and process improvement, which meant they would continue to fall behind.
|
That was the sabotage of the 68060. Motorola announced a 68060@66MHz to be available within 3 months of production start yet I have never seen a full 68060@66MHz.
Motorola Introduces Heir to 68000 (Microprocessor Report April 18, 1994) https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf Quote:
Price & Availability
The 50-MHz 68060, 68LC060, and 68EC060 are now sampling to selected beta sites. General samples will be available 3Q94, with production to follow late in that quarter. Prices are $263, $169, and $150, respectively, in 10,000-unit quantities. Production quantities of the 66-MHz 68060 will be available late in 4Q94, according to the company; no price has been announced. For further information, contact Motorola at 512.891.2917.
|
Motorola created several die shrinks over the long embedded life of the 68060, with the revision 6 68060 often able to be clocked to 100MHz, but they are still rated as 50MHz parts for the full CPU which includes the MMU and FPU. The 68060 had a long embedded life of 18 years, from 1994 until, apparently, 2012.
https://amitopia.com/68060-amiga-buyers-guide/ Quote:
68060 isn’t produced since 2012 by Freescale so be aware of this!
|
Sabotaging the 8-stage 68060 by limiting the clock speed rating to 50MHz decreased the price efficiency (performance/$) and performance yet it was still a successful CPU in the embedded market. Motorola wasted resources on newer chip fab processes to clock up shallow pipeline PPC CPUs that did not last a year in the market and could not compete with longer pipeline CPUs like the 68060 using a cheaper fab process. A Pentium killer was also a PPC killer necessitating sabotage.
bhabbott Quote:
This is what happens when a market shrinks below the point of viability. The Amiga should have become retro in 1996, but some people couldn't let it go, imagining they could bring the Amiga up to modern standards so fans wouldn't have to turn to the dark side. But the tiny market meant high prices and shenanigans as the survivors fought over scraps. This was encouraged by a small cadre of Amiga zealots who embraced the 'NG' scene despite it having little future.
I looked at getting an NG 'Amiga' back in 2015, but decided it was a joke. I literally threw away all my Amiga stuff except for the A1200, thinking it was worthless. That was the effect it had on me. 2 years later I discovered that the Amiga retro scene was booming. Now I am back home with classic 68k Amiga, where I always wanted to be.
I don't think it's fair to call it 'incompetence and corruption'. More a case of fans wanting something that isn't practical. Without economy of scale you aren't going to get an NG Amiga at a competitive price. A different approach is needed. Instead of cutting off the existing user base, leverage it with cheap modern hardware that greatly improves performance without changing the architecture. We are now seeing this with PiStorm etc.
|
The problem was not that the Amiga market shrank but that the Hyperion A-EonKit syndicate narrowed their market to a tiny fraction of the Amiga market. Hyperion released the 68k AmigaOS in 2018 and went from a loss of 44,197 Euros in 2017 to a profit of 150,561 Euros in 2018.
Hyperion financials by year in Euros

indicator    |    2024 |    2023 |    2022 |    2021 |    2020 |    2019 |     2018 |     2017 |     2016 |     2015 |     2014
profit/loss  |   2,408 | -10,102 |  -4,049 |   5,520 |  -6,561 |   1,375 |  150,561 |  -44,197 |  -40,024 |  -39,212 |  -68,267
debt         | 209,751 | 238,417 | 229,811 | 242,591 | 290,219 | 370,992 |  579,829 |  588,033 |  596,390 |  599,218 |  588,062
equity       |  13,766 |  11,359 |  21,461 |  25,510 |  19,989 |  26,551 | -373,794 | -524,354 | -480,158 | -440,133 | -400,921
gross-margin |  44,249 |  30,659 |  50,361 |  45,578 |  43,526 |  45,160 |  199,098 |   -3,372 |   -2,675 |       17 |  -18,521
The price of incompatible, unfaithful and noncompetitive hardware is AmigaNOne. The 68k Amiga market allowed this kind of a turnaround without competitive hardware!
matthey | Re: The (Microprocessors) Code Density Hangout | Posted on 12-Jul-2025 23:40:49 | [ #332 ]

minator Quote:
This whole thread seems to be based on the idea that smaller code is better and faster. This was true in the 70s and much of the 80s because memory was slow and expensive and CPUs were operating directly from it. So, a smaller number of memory reads was a win for performance.
However, once cache was introduced, this was no longer true. Consider a loop, There is an advantage for smaller code when reading the data into cache, but for every subsequent iteration there is no advantage.
Now, what happens if I unroll the loop? I create larger code, but it'll run faster. Again, there is an advantage for smaller code when reading the data into cache, but for every subsequent iteration the advantage is to the bigger code. In fact, the cost to the silicon of implementing CISC style smaller code becomes a disadvantage, limiting the clock speed and thus performance.
|
I believe you completely miss the scale of CPU cores today. Loop performance is very important but there are many thousands of loops in modern code. RISC cores often require more loop unrolling to avoid load-to-use stalls and maintain performance. Caches are not one huge cache but multilevel caches because larger caches have slower access times and further away caches can increase access times.
https://www.7-cpu.com/cpu/Ice_Lake.html Quote:
Intel i7-1065G7 (Ice Lake), 3.9 GHz (Turbo Boost), 10 nm. 122 mm2, RAM: 16 GB, dual LPDDR4-3733. Surface Pro 7
o L1 Data cache = 48 KB, 64 B/line, 12-WAY ?
o L1 Instruction cache = 32 KB, ? B/line, ?-WAY
o L2 cache = 512 KB, 64 B/line, 8-WAY
o L3 cache = 8 MB, 64 B/line, 16-WAY

o Data TLB L1 (stores): 16 entries, 16-way. Shared for all page sizes
o Instruction TLB 4KB: 128 entries (64 per thread MT), 8-way
o Instruction TLB 2MB: 16 entries (8 per thread MT), 8-way

o L1 Data Cache Latency = 5 cycles for simple access via pointer
o L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n])
o L2 Cache Latency = 13 cycles
o L3 Cache Latency = 42 cycles (core 0)
o L3 Cache Latency = 41 cycles (core 1)
o L3 Cache Latency = 41 cycles (core 2)
o L3 Cache Latency = 42 cycles (core 3)
o RAM Latency = 42 cycles + 87 ns (LPDDR4-3733)
o RAM Latency = 42 cycles + 64 ns (DDR4-2666 19-19-19)
|
An expensive OoO core can hide the L1 access latency and some of the L2 access latency but a RISC OoO pipeline is very complex, more complex than a CISC in-order pipeline, and much more expensive than a CISC in-order pipeline. OoO generates heat too, much more so than adding more caches. RISC OoO cores have the same reality as above. Their cache access timings are available on the 7-cpu.com website too and are more likely to have longer access times in cycles.
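For readers wondering where numbers like the "5 cycles for simple access via pointer" above come from, here is a rough, hypothetical C sketch of the classic pointer-chasing technique (assuming a POSIX environment with clock_gettime): every load depends on the previous one, so out-of-order execution cannot hide the latency, and the time per iteration approximates the load-to-use latency of whichever cache level the working set fits in.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N     4096                 /* 4096 * sizeof(size_t) fits comfortably in a typical L1 data cache */
#define ITERS 10000000L

int main(void)
{
    /* Build a cyclic chain of indices: each element stores the index of the
     * next one to visit, so every load depends on the previous load. */
    size_t *next = malloc(N * sizeof *next);
    if (next == NULL)
        return 1;
    for (size_t i = 0; i < N; i++)
        next[i] = (i + 1) % N;     /* sequential chain; a shuffled chain would also defeat prefetching */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    size_t p = 0;
    for (long i = 0; i < ITERS; i++)
        p = next[p];               /* dependent load: the loop is latency-bound */

    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9 + (double)(t1.tv_nsec - t0.tv_nsec);
    /* Printing p keeps the compiler from optimizing the chase away. */
    printf("final index %zu, ~%.2f ns per dependent load\n", p, ns / (double)ITERS);
    free(next);
    return 0;
}

With a working set that fits in L1 this lands near the L1 load-to-use latency; grow N past each cache level and the time per load steps up accordingly, which is essentially how latency tables like the one quoted above are measured.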
minator Quote:
The 68K line pretty much created the high end workstation market and owned it for much of the 80s. However, once RISC processors appeared, they were easier to design, cheaper to make and faster. The fact they used bigger code had no disadvantage to their performance whatsoever.
|
It is true that RISC CPUs were easier to design and cheaper. RISC ISAs are easier to pipeline, but the resulting pipelines are not as powerful as CISC pipelines. Max clock speeds are more related to the number of pipeline stages. RISC pipelines had an early advantage because the number of transistors on a chip was limited. Larger code is always a disadvantage, which early RISC architects did not appreciate but most understand now.
Alpha
PA-RISC
MIPS
SPARC
PPC
ARM
Do you honestly believe it is just a coincidence that all these RISC ISAs not only died but died roughly in the order of their code density, from worst at the top to merely bad at the bottom? They have been replaced by load/store architectures with better code density that abandon most of the original RISC philosophies. Some people would claim that RISC is dead, while others buy into the morphing definition of RISC.
minator Quote:
The 68K line went from owning the workstation market to having a hard time competing with PCs. The 060 was competitive with a Pentium at the same clock speed, but it was too late, the Pentium was 100MHz when the 060 launched at 50MHz. The RISC processors with their bigger code were not bothered.
|
I see the 8-stage 68060 as having a major advantage over the 5-stage P5 Pentium. It could be clocked up more and it was better in almost every way. The only Pentium advantages I can think of were the pipelined FPU, the 64-bit data bus (which was a liability for embedded use) and x86 compatibility. The in-order Pentium pipeline grew from 5 stages to 6 stages and received a chip fab process improvement about every year, but the 68060 could have competed with an older, cheaper process and was cheaper to begin with.
minator Quote:
It's clear to me the 060 was not designed as a workstation processor. It had a weak FPU, a 32 bit bus and no L2 cache. The workstation chips all had monster FPUs, 64 bit busses, and large L2 caches. Even the 486 used L2! That said, it was quite an upgrade from the 040 so probably made a very nice upgrade for an Amiga.
|
The 68060 definitely was not designed as a workstation CPU. It was a high end embedded and low end desktop CPU. L2 caches did not generally fit on die until the late 1990s. Some high end CPUs had L2 cache tags on chip and 64-bit data buses, but these were expensive. The 68060 has 32-bit non-multiplexed address and data buses which provide good performance with reduced 68k memory traffic, much of it coming from reduced instruction traffic due to good code density (many RISC CPUs had multiplexed buses and high memory traffic, including the ARM and MIPS cores that I am aware of). The 68060 FPU is spartan, but short FPU instruction timings and the good ISA make up for the lack of pipelining to provide mediocre performance. The focus of the 68060 was clearly integer performance. Many early RISC CPUs emphasized FPU performance because it was easier for a RISC pipeline, but integer performance is more important for general purpose CPUs.
minator Quote:
Motorola, having lost the workstation and desktop markets, evolved the 68K line into embedded CPU32, Coldfire and Dragonball chips. This makes a lot more sense as that's where small code size is more important and they were very successful, at least for a while.
|
CPU32 was a good embedded ISA with very good 68000 compatibility and good 68020 compatibility while providing some simplification over the 68020 ISA. Had Motorola standardized on it while adding improvements like the new ColdFire instructions, they likely could have retained the 68k embedded market. Instead, they castrated the 68k down to the ColdFire, losing most 68k compatibility, in order to make room for PPC embedded chips. The ARM Thumb-2 ISA is much better than PPC for embedded use, so ARM gained the embedded market even though the 68k/CPU32 is better. There was no choice for 68k customers when Motorola stopped producing new 68k chips, even though the 68k was #1 in 32-bit embedded CPU sales in 1997 and likely up to the early 2000s. Thumb-2, introduced in 2003, and the integration advantage of SoCs led by ARM eventually caused older 68k CPUs to be replaced.
minator Quote:
Benchmarks 1994 SPEC Int/Float 92 (ordered by Int)
Source: https://www.cs.toronto.edu/pub/jdd/spectable

CPU__________MHz___Int___Float
68060_________50___49.0*___ ? **
486DX4_______100___51.4___26.6
MicroSPARC 2__70___57.0___47.3
Pentium_______66___67.4___61.5 (1993)
PowerPC 603___80___75.0___85.0 ***
PowerPC 601___80___78.8___90.4
R4400________150___91.7___97.5
Pentium______100___96.2___81.2
HyperSPARC___100__104.5__127.6
R8000_________75__108.7__310.6
POWER2________71__134.1__273.8
PA 7150______125__136.0__201.0
PA 7200______120__168.7__269.2 (1995)
Alpha 21064A_275__193.8__292.6

* Estimate from Motorola: https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf
** Unknown, but 68060 has half the FP throughput of the Pentium.
|
I have Motorola documentation which claims, "Greater than 50 Integer SPECmarks at 50 MHz". There is no claim for SPEC fp and I have not seen it anywhere.
High-Performance Internal Product Portfolio Overview (Issue 10 Fourth Quarter, 1995) Quote:
Greater than 50 Integer SPECmarks at 50 MHz
|
Compiler support was a work in progress and the 68060 never received good compiler support from popular compilers. The Motorola Diab Data C compiler had an instruction scheduler at least. The in-order 68060 has good superscalar multi-issue rates even without an instruction scheduler, but better compiler support should offer moderate improvement (most low end superscalar RISC cores have bad performance without an instruction scheduler). The 68060 integer performance was better for the Dhrystone benchmark, where it reached 1.56 DMIPS/MHz as claimed by Motorola and 1.8 DMIPS/MHz as claimed by a Microprocessor Report. More recently, I found and compared the ByteMark benchmark, which is like an older SPEC benchmark with int and fp results from several programs. A quick compile with an older version of GCC showed the 68060 had 40% better integer performance at the same clock speed, and surprisingly VBCC was nearly as good at fp, compared to the Pentium at the same clock speed.
https://amigaworld.net/modules/newbb/viewtopic.php?topic_id=44391&forum=25&16#847418
The Pentium no doubt has higher limits with its pipelined FPU, but it requires very careful and likely human-optimized code, whereas the 68k FPU has a good ISA that is easy for compilers to use. I worked with Frank Wille on some 68k FPU peephole optimizations and updated the VBCC support code for the 68060, but there is room for performance improvements. A VBCC instruction scheduler would help both int and fp performance, but the integer 68k backend should be improved first.
The 68060 had a much more powerful integer design than the P5 Pentium, and the 8-stage pipeline vs the 5-stage pipeline would have increased the advantage had Motorola chosen to clock it up instead of sabotaging it. The 68060 was lower power and cheaper. Comparing my in-order 8-stage 68060@75MHz with a much more expensive 14-stage OoO Pentium Pro has my 68060 outperforming it in integer performance at the same clock speed, but the Pentium Pro could be clocked higher for an overall performance advantage (PPro instruction clock timings were often worse than those of the 68060 and P5 Pentium, offsetting some of the advantage).
ByteMark results https://www.macinfo.de/bench/bytemark.html
ByteMark benchmark
CPU           | int index | int index/MHz
68060@75MHz   |   1.20    |    0.016
PPro@200MHz   |   2.80    |    0.014
Pentium@90MHz |   1.00    |    0.011
The 68060 had a small enough die that the caches could be doubled. The site above has some RISC CPUs although many are later with larger caches, better fab processes and high end desktop features. Limited OoO PPC cores had good performance but they could not be clocked up with shallow pipelines without expensive fab processes and they were cache hogs. Clocking up a classic 5-stage RISC pipeline was not competitive (R3000) and neither was extending the 5-stage RISC pipeline to a 8-stage RISC pipeline (R4000) like MIPS. The scalar MIPS R4400@200MHz has an 8-stage pipeline which allows it to clock up and it was cheap but it has very poor performance/MHz. The DEC Alpha 21164@266MHz with 7-stage pipeline is much better with good performance but it was very expensive. DEC had some of the best architects in the world but bet too much on the Alpha ISA with poor code density, high clock speeds which generated too much heat and compilers which could not optimize as well as more complex cores and humans.