Poster | Thread | Karlos
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 19-Aug-2024 13:07:11
| | [ #361 ] |
| |
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4958
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Hammer
The 68K shares more in common with the z80 than the 6502 in terms of architecture, e.g. lots of registers and reliance on microcode. Right up to the point where the 68060 comes along and just dumps most (all?) of the microcode and statically wires everything, much like the 6502.
At the risk of what-ifery, it would be interesting to imagine what the performance of a statically wired 68000 would have been like. _________________ Doing stupid things for fun... |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 19-Aug-2024 20:07:20
| | [ #362 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| @Karlos: isn't a 68040 what you're looking for? 
BTW, regarding this:
Quote:
Karlos wrote: So, while the 68000 can add a pair of 32-bit integers in a single instruction, where the 6502 would need 5 (clear the carry followed by four add with carry). |
There is no way a 6502 needs only 5 instructions for that.
This is the reason why its performance drastically drops when executing some higher-level tasks.
For example: https://en.wikipedia.org/wiki/Zilog_Z8000#Limited_success
Comparing assembly language versions of the Byte Sieve, one sees that the 5.5 MHz Z8000's 1.1 seconds is impressive when compared to the 8-bit designs it replaced, including Zilog's 4 MHz Z80 at 6.8 seconds, and the popular 1 MHz MOS 6502 at 13.9. Even the newer 1 MHz Motorola 6809 was much slower, at 5.1 seconds.[49] It also fares well against the 8 MHz Intel 8086 which turned in a time of 1.9 seconds, or the less expensive 5 MHz Intel 8088 at 4 seconds.
For those looking for pure performance, the Z8000 was the fastest CPU available in early 1979. But this was true only for a period of a few months. The 16/32-bit 8 MHz Motorola 68000 came to market later the same year and turns in a time of 0.49 seconds on the same Sieve test, over twice as fast as the Z8000.
Clock for clock, the 68000 is more than 3.5 times faster than the 6502 on the Byte Sieve benchmark. |
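The clock-for-clock figure can be checked from the quoted timings alone. Here is a rough sketch in Python, assuming that run time scales linearly with clock speed (an assumption of this sketch, not something the benchmark states). The dictionary and function names are made up for illustration.

```python
# Byte Sieve timings as quoted above: (clock in MHz, run time in seconds).
TIMINGS = {
    "6502": (1.0, 13.9),
    "Z80": (4.0, 6.8),
    "6809": (1.0, 5.1),
    "8088": (5.0, 4.0),
    "8086": (8.0, 1.9),
    "Z8000": (5.5, 1.1),
    "68000": (8.0, 0.49),
}


def megacycles(cpu):
    """Millions of clock cycles spent on the benchmark (MHz * seconds)."""
    mhz, seconds = TIMINGS[cpu]
    return mhz * seconds


def clock_for_clock(cpu, baseline="6502"):
    """How many times fewer cycles `cpu` needs than `baseline`."""
    return megacycles(baseline) / megacycles(cpu)


if __name__ == "__main__":
    for name in TIMINGS:
        print(f"{name:>6}: {megacycles(name):6.2f} Mcycles, "
              f"{clock_for_clock(name):.2f}x the 6502 per clock")
```

With these numbers the 68000 needs about 3.92 million cycles against the 6502's 13.9 million, a ratio of roughly 3.55.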
| Status: Offline |
| | Karlos
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 19-Aug-2024 20:44:10
| | [ #363 ] |
| |
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4958
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @cdimauro
With the 6502 example I forgot to include the necessary load and store operations that would be needed when adding each successive byte so yeah, it's even worse.
The 68040 was still somewhat microcoded AFAIK. The 68060 is mostly static wired. What I've always wondered is what happened to the integer multiply instructions from 040 to 060. The improvement is staggering. _________________ Doing stupid things for fun... |
| Status: Offline |
| | Lou
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 0:44:03
| | [ #364 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4259
From: Rhode Island | | |
|
| @bhabbott
Quote:
bhabbott wrote: @Lou
Quote:
Lou wrote:
What a f!ng joke!
Pathetic troll!
Now I realize you're not too bright and all but... |
"Insults are the arguments employed by those who are in the wrong"
|
That makes sense as to why that moron started them.
Quote:
Since it was never released and only about 50 prototypes exist, I doubt you'll find any consistent documentation. 3 different video cards were offered, one had a blitter, also the 8563 has a burst fill/copy.
Quote:
Quote:
The VDC supports a double-pixel mode. This essentially lets you shrink the color cell size to 4x1 on a 320x200 display...it can also go down to 160x200 for a 2x1 cell. It's what makes the games I linked possible on the VDC, because reducing the column count increases its speed. This also means that the VDC can display 8x more colors within the typical 8x8 cell/tile of the C64's VIC-II in hi-res 320x200 mode... |
Interesting. The C128 Programmer's Reference Guide does not mention these modes.
Quote:
Apparently not comprehensive enough.
Quote:
C128 Resolutions VDC: 80x25 text 80x50 text 640x172 Hides color mode 640x200 monochrome mode Interlacing available, but not useful
Color cell size VDC: 8x8 to 8x32 |
|
You can see this demo mention it. https://csdb.dk/release/?id=206013&show=summary
you can follow the code here to see how it's done: https://gist.github.com/ytmytm/265d6ae1f5b1df7bd7ef0ba67bdaa816
Just because 'official documentation' doesn't mention it doesn't mean it isn't possible. |
| Status: Offline |
| | Lou
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 1:36:46
| | [ #365 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4259
From: Rhode Island | | |
|
| Byte Sieve. A worthless benchmark used to test compilers...
I can show useless benchmarks that make the 68000 look slow. https://www.youtube.com/watch?v=2k_jP73Ly7A
Funnily enough the code here didn't use zero-page addressing which would have sped up the writing to memory...
Which brings up another point. Even this benchmark's use-case favors the 68000 despite the 68000 losing. It's not writing to memory at all.
Normally, aka in the real world, you'd do something like this to an array, such as a color plane in memory. Once again the 6502 family would crush the 68000 as it can access memory much faster.
In the real world, the cpu doesn't do much math. It uses LUT to get the answer. It's what code for car engines do and video games. You can have your 140-152 cycle DIV opcode. It's a joke. BRA takes about 12 cycles. You can go down the line and cheer that you can do 16/32bit math faster but it doesn't matter. Your fancy 'programmer-friendly' addressing modes take up to 48 cycles. You NEED 8Mhz minimum just to feel fast.
Smarter engineers than us/you/all-of-us have already done the analysis. On average a 1Mhz 6502 is generally equal to a 2.47Mhz 68000...add 20% when using a 65C02...add another 25% when comparing to a 65CE02. Deny reality all you want. You can cherry pick tasks all you want, but it's still reality. This is why ARM won. 68K was inefficient. When it got efficient (040/060), it was too late and too expensive. ARM was superior...is superior.
Last edited by Lou on 20-Aug-2024 at 02:10 AM. Last edited by Lou on 20-Aug-2024 at 02:05 AM. Last edited by Lou on 20-Aug-2024 at 02:04 AM. Last edited by Lou on 20-Aug-2024 at 01:55 AM.
|
| Status: Offline |
| | Lou
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 1:40:27
| | [ #366 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4259
From: Rhode Island | | |
|
| @cdimauro
Quote:
cdimauro wrote: @Lou
...a bunch of horse manure...
|
This from the numbskull that takes wiki speculation as hard facts and still can't explain how a C64 demo can contain 8 bitplanes of color data to generate 256 colors and not run out of memory.
'oh but the unofficial wiki said so ... so it must be true!'
Still asking for proofs when the troll has provided none.
Last edited by Lou on 20-Aug-2024 at 01:41 AM.
|
| Status: Offline |
| | Hammer
 |  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 3:46:41
| | [ #367 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6503
From: Australia | | |
|
| @Karlos
Quote:
Karlos wrote: @Hammer
The 68K shares more in common with the z80 than the 6502 in terms of architecture, e.g. lots of registers and reliance on microcode. Right up to the point where the 68060 comes along and just dumps most (all?) of the microcode and statically wires everything, much like the 6502.
At the risk of what-ifery, it would be interesting to imagine what the performance of a statically wired 68000 would have been like.
|
68040 would be the first 68K CPU with mostly hardware implementation which consumed about 1.2 million transistors vs 68030's 273,000 transistors.
For the full hardware 68000 implementation, complex instructions would require additional transistors.
68060 has extra features such as two integer pipelines.
Not every 68K instruction would be used for a 3D use case, hence why I argued for a hybrid design instead of ColdFire's selective 68K instruction ejection.
Like Amiga Hombre's 1 million transistor budget, Sony's PSX has a 1 million transistor budget for CPU, GTE, and GPU.
I argued for 68030M i.e. MUL instruction optimized 68030 to counter MIPS R3000's stronger MUL argument. 68030's math instructions subset would be tuned for the DSP/3D role, while the rest of the instructions operate normally.
With Motorola's pricing policy, getting strong MUL performance in the Atari Falcon's price range meant Atari had to purchase a 56K DSP alongside the 68020/68030, i.e. a two-chip purchase regime replacing the two-chip 68020 and 68851. Without upgraded support chips, the 68EC040 is not usable on the Amiga and Falcon platforms. Motorola's pricing policy wouldn't survive against AMD's status-quo-breaking mindset. The PS1 and Saturn, from the "3 MB game consoles" group, target roughly 68040-level CPU performance in a 68030 price range, boosted by a matrix math co-processor or a second 68040-level CPU.
PS1's GTE ( matrix math co-processor @ 58 Mhz) is based on a cutdown MIPS CPU optimised for 3D matrix math operations.
Sega added the second SuperH-2 @ 28 MHz for Saturn.
Gaming PCs would need a Pentium class CPU for games ported from Saturn or PSX.
CD32's MPEG FMV workload was processed by a custom MIP-X CPU @ 40Mhz with strong DSP-like integer strength.
For math intensity processing for the dollar, SuperH2 and MIPS R3000A are hard to beat.
Motorola's 68K was pushed out of high-performance game consoles for performance and economic reasons.
Motorola would need new leadership with a status quo-breaker mindset to restore 68K performance vs value characteristics. Motorola's leadership runs like the establishment.
Last edited by Hammer on 20-Aug-2024 at 04:32 AM. Last edited by Hammer on 20-Aug-2024 at 04:29 AM. Last edited by Hammer on 20-Aug-2024 at 04:05 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 4:34:34
| | [ #368 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| @Karlos
Quote:
Karlos wrote: @cdimauro
With the 6502 example I forgot to include the necessary load and store operations that would be needed when adding each successive byte so yeah, it's even worse. |
It also depends on how / where those 32-bit data are stored.
If a pointer to them has to be passed as an argument to the function that adds them (so that the 32-bit addition is usable everywhere in the code), you can imagine how many more instructions are needed. Quote:
The 68040 was still somewhat microcoded AFAIK. The 68060 is mostly static wired. |
Yes, but you can't completely get rid of microcode if you want to simplify the PMMU page walk. Quote:
What I've always wondered is what happened to the integer multiply instructions from 040 to 060. The improvement is staggering. |
Eh. The usual Motorola policy: cutting stuff to save transistors... |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 4:50:55
| | [ #369 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| @Lou
Quote:
Lou wrote: @bhabbott
Quote:
bhabbott wrote: @Lou
"Insults are the arguments employed by those who are in the wrong"
|
That makes sense as to why that moron started them. |
Don't try to turn the tables: it's enough to read the comments in order to see who started the insults.
You're so childish that you aren't even able to take responsibility for your actions... Quote:
Quote:
Since it was never released and only about 50 prototypes exist, I doubt you'll find any consistent documentation. 3 different video cards were offered, one had a blitter, also the 8563 has a burst fill/copy. |
Don't worry: the video that you've already shared was enough to see how "fast" it was.  Quote:
There's also a video which shows it in action. Were you ashamed to share it? Here it is: https://www.youtube.com/watch?v=iYFQZyK3xSo
A nice and slow... slideshow.
That's the most that you can get, since from the code it's clearly visible why:
// copy bitmap (256x200=6400 bytes) from C128 RAM to VDC RAM
VDC_BlitBitmap:
[...]
loop: lda $0000,y
!: bit VDC_REG
bpl !-
sta VDC_DATA_REG
iny
bne loop
inc loop+2
inx
cpx #$19
bne loop
rts
The super slow copy operation from the CPU's RAM to the VDC's RAM.
That's for transferring ONE byte at a time, and before each byte you need to check the VDC's status bit, otherwise you interfere with it. In fact, you can only transfer data when it's NOT displaying something (e.g.: only during the vertical or horizontal blank period).
That's why you can do very little with the VDC and it's not suitable for games: its memory is too limited for storing both the screen and the graphics assets, so you need to use the CPU's memory for them, but through this very slow operation.
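To put a number on the copy loop above, here is a back-of-the-envelope sketch in Python. It uses the standard 6502 cycle counts for each instruction and assumes, optimistically, that the VDC status bit is already set on the first poll, so the default result is a lower bound only. The function names and the polls_per_byte parameter are illustrative, and the small per-page overhead (inc/inx/cpx/bne) is ignored.

```python
# Standard 6502 cycle counts for the inner loop of VDC_BlitBitmap.
LDA_ABS_Y = 4      # lda $0000,y (base low byte is $00, so no page-cross penalty)
BIT_ABS = 4        # bit VDC_REG
BPL_NOT_TAKEN = 2  # status bit set: fall through to the store
BPL_TAKEN = 3      # status bit clear: branch back and poll again
STA_ABS = 4        # sta VDC_DATA_REG
INY = 2
BNE_TAKEN = 3      # bne loop


def cycles_per_byte(polls_per_byte=1):
    """Cycles to copy one byte, given how often the status register is polled."""
    if polls_per_byte < 1:
        raise ValueError("the status register is polled at least once per byte")
    busy_polls = polls_per_byte - 1
    polling = busy_polls * (BIT_ABS + BPL_TAKEN) + (BIT_ABS + BPL_NOT_TAKEN)
    return LDA_ABS_Y + polling + STA_ABS + INY + BNE_TAKEN


def frame_cycles(total_bytes=6400, polls_per_byte=1):
    """Cycles for the whole bitmap, ignoring the outer-loop overhead."""
    return total_bytes * cycles_per_byte(polls_per_byte)


def frame_seconds(clock_hz, total_bytes=6400, polls_per_byte=1):
    """Time for one bitmap copy at a given CPU clock (caller supplies the clock)."""
    return frame_cycles(total_bytes, polls_per_byte) / clock_hz


if __name__ == "__main__":
    print(cycles_per_byte(), frame_cycles())
```

Even in the best case that is 19 cycles per byte, or 121,600 cycles for the 6,400-byte bitmap, before counting any time spent waiting on the VDC.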
In short: USELESS CRAP. |
| Status: Offline |
| | Hammer
 |  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 5:00:23
| | [ #370 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6503
From: Australia | | |
|
| @Lou
Quote:
Lou wrote: Byte Sieve. A worthless benchmark used to test compilers...
I can show useless benchmarks that make the 68000 look slow. https://www.youtube.com/watch?v=2k_jP73Ly7A
Funnily enough the code here didn't use zero-page addressing which would have sped up the writing to memory...
Which brings up another point. Even this benchmark's use-case favors the 68000 despite the 68000 losing. It's not writing to memory at all.
Normally, aka in the real world, you'd do something like this to an array, such as a color plane in memory. Once again the 6502 family would crush the 68000 as it can access memory much faster.
In the real world, the cpu doesn't do much math. It uses LUT to get the answer. It's what code for car engines do and video games. You can have your 140-152 cycle DIV opcode. It's a joke. BRA takes about 12 cycles. You can go down the line and cheer that you can do 16/32bit math faster but it doesn't matter. Your fancy 'programmer-friendly' addressing modes take up to 48 cycles. You NEED 8Mhz minimum just to feel fast.
Smarter engineers than us/you/all-of-us have already done the analysis. On average a 1Mhz 6502 is generally equal to a 2.47Mhz 68000...add 20% when using a 65C02...add another 25% when comparing to a 65CE02. Deny reality all you want. You can cherry pick tasks all you want, but it's still reality. This is why ARM won. 68K was inefficient. When it got efficient (040/060), it was too late and too expensive. ARM was superior...is superior.
|
In reality, 68000 reached 8Mhz and beyond e.g. 66 Mhz with DragonBall Super VZ (MC68SZ328) on a handheld mobile device.
In the handheld mobile market, DragonBall Super VZ was pushed out by ARMv4T-era CPUs after the initial success of DragonBall VZ.
Before the low-cost 68EC020 offer in early 1991, AA500 was to be paired with 68000 @ 14 Mhz.
The 1991 offer was the 68EC020-16 and 68EC020-25. The "A1000 Plus with AGA" would have had the 68EC020-25 for a 1991 retail release at $800, which is about $649 in 1987 dollars. The lower-cost 68EC020-16 was for the AA500, which is a PCMCIA-less AA600/A1200.
Commodore wanted to use the lower-cost 68EC040-25 (about $100), but the missing support chips ruled this option out.
Motorola was a major factor in Commodore's mass production release schedule.
For the 3DO M2, IBM's PowerPC 601 @ 66 MHz pushed out the ARM60 (ARMv3). ARM wasn't strong until DEC's StrongARM (ARMv4). StrongARM SA-110 parts operating at 100, 160, and 200 MHz were announced on 5 February 1996. ARM needed DEC's high-MHz expertise to become a strong contender in the triple-digit MHz race.
Last edited by Hammer on 21-Aug-2024 at 02:38 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 5:12:13
| | [ #371 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| @Lou
Quote:
Lou wrote: Byte Sieve. A worthless benchmark used to test compilers... |
You don't even read what people write. As usual. Here it is:
Comparing assembly language versions of the Byte Sieve
Hopeless! Quote:
You've already shared this "benchmark", which can easily be classified as the most stupid and useless benchmark ever written, since it just stresses the execution of ONE instruction. Quote:
Which brings up another point. Even this benchmark's use-case favors the 68000 despite the 68000 losing. It's not writing to memory at all.
Normally, aka in the real world, you'd do something like this to an array, such as a color plane in memory. Once again the 6502 family would crush the 68000 as it can access memory much faster. |
Again, you continue to show your big ignorance, since Byte's Sieve benchmark is exactly doing that: https://en.wikipedia.org/wiki/Byte_Sieve
Gilbreath felt the sieve would be an ideal benchmark as it avoided indirect tests on arithmetic performance, which varied widely between systems. The algorithm mostly stresses array lookup performance and basic logic and branching capabilities. Nor does it require any advanced language features like recursion or advanced collection types. The only modification from Knuth’s original version was to remove a multiplication by two and replace it with an addition instead. With the original version, machines with hardware multipliers would otherwise run so much faster that the rest of the performance would be hidden.
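For reference, the algorithm described in that quote looks roughly like this. It is a Python sketch of the commonly published Byte Sieve loop (flag array of size 8190, a single pass), not one of the assembly versions that were timed, and the function name is made up. Note that the inner loop is nothing but array writes, an addition, a compare and a branch.

```python
SIZE = 8190  # array size used by the commonly published Byte Sieve listing


def byte_sieve(size=SIZE):
    """One pass of the Byte Sieve; returns the number of primes found.

    flags[i] stands for the odd number 2*i + 3, so only odd candidates
    are stored and the prime 2 is never counted.
    """
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3      # addition instead of the multiply 2*i + 3
            k = i + prime          # index of 3 * prime
            while k <= size:
                flags[k] = False   # array write: strike out an odd multiple
                k += prime
            count += 1
    return count


if __name__ == "__main__":
    print(byte_sieve())
```

A single pass over the default array finds 1899 primes, which is the usual check that an implementation is correct.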
So, that's the IDEAL benchmark for processors like the 65xx.
Nevertheless, the 68000 simply destroys it clock-for-clock. Quote:
In the real world, the cpu doesn't do much math. It uses LUT to get the answer. |
Be specific: in the 65xx world, since those CPUs need slow and expensive LUTs to make up for the many things they lack. Quote:
It's what code for car engines do |
Really? Care to SHOW it? Just by coincidence, I work for a prestigious car vendor, so I'm super interested in checking what you've stated. Quote:
Depends on the video games. In mine they were very rarely used. Quote:
You can have your 140-152 cycle DIV opcode. It's a joke. BRA takes about 12 cycles. You can go down the line and cheer that you can do 16/32bit math faster but it doesn't matter. Your fancy 'programmer-friendly' addressing modes take up to 48 cycles. You NEED 8Mhz minimum just to feel fast. |
Ah, really. So Motorola's engineers were stupid to add those instructions, since they could easily be emulated with LUTs?
Care to share LUT-based routines equivalent to those instructions, so that we can benchmark them against the execution of the respective instructions? Quote:
Smarter engineers than us/you/all-of-us have already done the analysis. On average a 1Mhz 6502 is generally equal to a 2.47Mhz 68000...add 20% when using a 65C02...add another 25% when comparing to a 65CE02. Deny reality all you want. You can cherry pick tasks all you want, but it's still reality. |
No, that's stupidity: if an engineer takes IPC/MIPS for comparing processors, he's not smart, but a complete idiot. Quote:
This is why ARM won. 68K was inefficient. When it got efficient (040/060), it was too late and too expensive. ARM was superior...is superior. |
ROFL: you don't even know the basic history. 
ARM was NOT competitive against Motorola until the latter decided to exit the processor market with its 68k family.
The first ARM processors were very fast, no question, but limited to the Archimedes family.
Motorola introduced the 68040 in 1990, keeping up on performance and doing much better than ARM's processors.
After that, it introduced the 68060, which simply crushed ARM.
You don't know what you're talking about!  Quote:
Lou wrote: @cdimauro
Quote:
cdimauro wrote: @Lou
...a bunch of horse manure...
|
This from the numbskull that takes wiki speculation as hard facts and still can't explain how a C64 demo can contain 8 bitplanes of color data to generate 256 colors and not run out of memory. |
Again, the hopeless joke of nature doesn't understand that the discussion about the 8 bitplanes was NOT about the C64, but about the C65.
And yes: the C65 had 128kB of RAM, so it was perfectly capable of displaying two 320x200 screens at 256 colours. Quote:
'oh but the unofficial wiki said so ... so it must be true!'
Still asking for proofs when the troll has provided none. |
I've provided a dump of Commodore's C65 documentation.
I've also provided information from the same Wikipedia which YOU have also used.
That's enough for me.
Now it's YOUR turn: you can show the technical documentation about how the C65 worked in the new 256 colour mode, instead of wasting people's time with your load of b@alls. |
| Status: Offline |
| | kolla
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 8:44:23
| | [ #372 ] |
| |
 |
Elite Member  |
Joined: 20-Aug-2003 Posts: 3474
From: Trondheim, Norway | | |
|
| @kolla
Quote:
kolla wrote: @Hammer
I want to hear why anyone would buy A300, not the flaws of A600 - what were the specs?
Most importantly- would it feature a full keyboard? That’s really the one thing I kept hearing people whine about at the time. |
Should A300 have kept the A600 size, or should it have had a full size keyboard? Maybe it should have been a CD32-like “box” system, only with just a floppy drive instead of CD, and an external keyboard? What were the original plans here? What would be the specs of an A300 that would justify its existence? _________________ B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC |
| Status: Offline |
| | ppcamiga1
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 10:00:11
| | [ #373 ] |
| |
 |
Super Member  |
Joined: 23-Aug-2015 Posts: 1014
From: Unknown | | |
|
| C= should have stopped with the 6502 after the C64. The C128, C16, C116 and C65 were all worthless crap. C= should have just milked the C64 for as long as possible.
|
| Status: Offline |
| | mskov
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 16:17:31
| | [ #374 ] |
| |
 |
Regular Member  |
Joined: 22-May-2005 Posts: 252
From: Denmark | | |
|
| @kolla
Someone shared a downloadable manual for the A300 on Twitter recently. It looked like the A600 but I seem to recall that there was no possibility for an internal harddrive.
Here it is:
https://archive.org/details/introducing-the-amiga-300/mode/2up
_________________ Kind regards, Morten
Running OS4.1 on an AmigaONE |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 20-Aug-2024 20:04:18
| | [ #375 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| A note on this:
// copy bitmap (256x200=6400 bytes) from C128 RAM to VDC RAM
256x200 is NOT the real resolution. Since pixels are horizontally duplicated, it's 128x200: less than C64's multicolour mode... |
| Status: Offline |
| | matthey
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 21-Aug-2024 0:31:47
| | [ #376 ] |
| |
 |
Elite Member  |
Joined: 14-Mar-2007 Posts: 2747
From: Kansas | | |
|
| Karlos Quote:
With the 6502 example I forgot to include the necessary load and store operations that would be needed when adding each successive byte so yeah, it's even worse.
|
The 68k takes 7 instructions to add two 32-bit numbers in memory a byte at a time so 5 instructions would be impressive.
addq.l #4, a0 ; ptr to end of num +1B for predecrement addx.b
addq.l #4, a1 ; ptr to end of num +1B for predecrement addx.b
and.b #$00, ccr
addx.b -(a1), -(a0)
addx.b -(a1), -(a0)
addx.b -(a1), -(a0)
addx.b -(a1), -(a0)
ADDX could be more flexible if it was a 32-bit encoding but the 68000 keeping instructions (other than immediates and displacement data) 16-bit makes decoding easier. LE ordering has an advantage here potentially eliminating 2 instructions.
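What both the 6502 sequence and the ADDX.B chain above do is a ripple-carry addition, one byte at a time, from the least significant byte to the most significant. Here is a small Python sketch of the idea, assuming big-endian byte order as on the 68k (hence the walk from the end of the buffer, like the predecrement addressing). The function name is illustrative and not taken from any of the posts.

```python
def add32_bytewise(a, b):
    """Add two 4-byte big-endian numbers one byte at a time.

    Returns (sum_bytes, carry_out). The carry plays the role of the
    6502's C flag or the 68k's X flag between the byte additions.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both operands must be exactly 4 bytes long")
    result = bytearray(4)
    carry = 0                      # like CLC, or clearing X before the first ADDX
    for i in range(3, -1, -1):     # least significant byte is at the END (big-endian)
        total = a[i] + b[i] + carry
        result[i] = total & 0xFF   # keep the low 8 bits
        carry = total >> 8         # 0 or 1, fed into the next byte
    return bytes(result), carry


if __name__ == "__main__":
    x = (0x12345678).to_bytes(4, "big")
    y = (0x0FEDCBA9).to_bytes(4, "big")
    total, carry = add32_bytewise(x, y)
    print(total.hex(), carry)
```

With little-endian storage the least significant byte comes first, so the loop could start at the pointer as given, which is the saving of the two pointer adjustments mentioned above.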
Karlos Quote:
The 68040 was still somewhat microcoded AFAIK. The 68060 is mostly static wired. What I've always wondered is what happened to the integer multiply instructions from 040 to 060. The improvement is staggering.
|
There is probably a combination of reasons for the large improvement in MUL performance from the 68040 to 68060.
1. microcode elimination (as mentioned)
2. much larger transistor budget allows more optimized MUL
3. faster logic in better silicon (small advantage from 68040 to 68060)
4. more demand and competition for faster MUL from software like 3D and DSP applications
The 68060 32x32=32 MUL 2 cycle latency outperformed most CPUs.
year | CPU | 32x32 latency
1984 | 68020 | 41
1985 | 80386 | 12-41
1986 | Z80000 | 24
1986 | ARM2 | 16
1988 | R3000 | 12
1989 | 80486 | 13-42
1990 | 68040 | 20
1992 | 88110 | 3
1993 | Pentium | 9
1993 | PPC601 | 5
1993 | SH-2 | 4
1994 | 68060 | 2
1994 | PPC603 | 2-5
1994 | PPC604 | 3-4
1995 | 6x86 | 10
1996 | R5000 | 4
1996 | StrongARM | 1-4
1997 | SH-4 | 4
2002 | ARM11 | 2-5
Even CPUs after the 68060 tended to have longer MUL latencies. This also has multiple reasons.
1. deeper pipelines for higher clock speeds require longer MUL pipelines
2. multiply accumulate (MAC/MAD) sharing the same pipeline requires a deeper pipeline
3. integer, FPU and SIMD unit MUL hardware sharing sacrifices performance for flexibility
Modern CPU cores are usually 2-5 cycles for a MUL even though a single cycle 32x32 may be possible. Even if the 68060 32x32=32 could execute in a single cycle with modern silicon, adding back the 32x32=64 may take 2 cycles as the limitation may be that only one reg write can be performed per cycle (limited by available reg write ports).
Last edited by matthey on 21-Aug-2024 at 12:36 AM.
|
| Status: Offline |
| | Hammer
 |  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 21-Aug-2024 2:23:56
| | [ #377 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6503
From: Australia | | |
|
| @matthey
Quote:
@matthey
The 68k takes 7 instructions to add two 32-bit numbers in memory a byte at a time so 5 instructions would be impressive.
|
The actual cycle count depends on the chip's implementation.
For example, the 68000's ADDI.L EA takes 16 cycles.
Quote:
@matthey
The 68060 32x32=32 MUL 2 cycle latency outperformed most CPUs.
|
1. That's less useful when the CPU is used as a rendering device, hence memory bandwidth is an important factor.
2. Game consoles have a major price issue. Sega's Saturn project was looking at 68030's price range.
The 68060/68LC060/68EC060 price range is out of the A1200/CD32's price range!
For the PSX project, most of Ken Kutaragi's meetings were about chip prices and pros vs cons debates.
3. 68060's April 1994 release is late for platform system integrators.
A timeline example with the PowerPC 601:
Sep 1992: IBM delivers the first working prototypes of the PowerPC 601 processor.
October 1992: IBM and Motorola formally announce the beginning of production of PowerPC 601 microprocessors.
April 1993: Motorola begins shipping its PowerPC 601 processors, in 50 MHz and 66 MHz speeds.
March 1994: Apple ships PowerMac.
4. Instruction cycle timings:
68060 MUL.L takes 2 cycles.
68060 FMUL ranges from 3 to 5 cycles.
68060 FSGLMUL ranges from 3 to 5 cycles.
The 68060 FPU is pure FP.
Pentium FIMUL takes 6 cycles (integer MUL on the FPU pipeline).
Pentium IMUL takes 9 cycles (integer MUL on the integer pipelines).
Pentium FMUL takes 3 cycles.
The Pentium has 3 pipelines for integer MUL operations.
68060 FDIV ranges from 37 to 39 cycles.
Pentium FIDIV ranges from 22 to 42 cycles.
Pentium FDIV ranges from 19 to 39 cycles, with limited out-of-order processing during FDIV.
PC DOS's Tomb Raider used Pentium FPU which is different from PSX's integer version.
Quake, enough said.
Quote:
year | CPU | 32x32 latency
1984 | 68020 | 41
1985 | 80386 | 12-41
|
In 1992, AMD's Am386-40 was priced against Intel's 386DX-25 and Motorola's 68030-25.
Quote:
1986 | ARM2 | 16
1988 | R3000 | 12
1989 | 80486 | 13-42
1990 | 68040 | 20
|
ARM2 wasn't in game consoles.
The ARM6 variant ARM60 was in the 3DO. The 3DO team, led by ex-Amiga engineers, rejected the 68K. The ARM60 @ 12 to 20 MHz was boosted by a matrix math co-processor @ 25 MHz in MADAM (the Agnus/Alice counterpart).
LSI's MIPS R3000A CPU @ 33 MHz and a cut-down custom MIPS-based GTE @ 58 MHz were in the PSX.
Two SuperH-2 CPUs @ 28 MHz in a master and slave configuration were in the Sega Saturn.
The "3 MB game consoles" group got a poor man's dual integer pipeline through a 68LC040/68EC040-class CPU plus an integer math co-processor.
Your pro-68060 argument is useless for low-cost game consoles.
For the main chips, both the Amiga Hombre and the Sony PSX had 1 million transistor budgets, as they were specced in the early 1990s.
IBM PowerPC 602 was designed for "1 million transistors" budget for post-1995 3DO M2.
After 386SX, the X86 world wouldn't re-visit game consoles until Pentium III/Celeron Coppermine/K7 Duron-era original Xbox.
Last edited by Hammer on 21-Aug-2024 at 04:13 AM. Last edited by Hammer on 21-Aug-2024 at 03:37 AM. Last edited by Hammer on 21-Aug-2024 at 02:35 AM. Last edited by Hammer on 21-Aug-2024 at 02:32 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | Hammer
 |  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 21-Aug-2024 3:29:58
| | [ #378 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6503
From: Australia | | |
|
| @kolla
Quote:
kolla wrote:
Should A300 have kept the A600 size, or should it have had a full size keyboard? Maybe it should have been a CD32-like “box” system, only with just a floppy drive instead of CD, and an external keyboard? What were the original plans here? What would be the specs of an A300 that would justify its existence? |
The original intent for A300 was a cost-reduced A500.
From Feb 1992, Mehdi Ali ordered the A1200 and later the CD32 against Bill Sydnes' and Jeff Frank's product stack plans, which was too late, since the A600 and Commodore PC inventory stockpiles had consumed Commodore's remaining cash reserve and incurred $116 million of debt.
The CD32 was the cost-reduced A1200, which negated the ECS A300. Akiko combined Gayle, two CIAs, and Budgie in one chip. This is like combining Fat Gary, Ramsey and Bridgett in one chip. Akiko includes a DMA CD-ROM controller and C2P hardware.
The focus should have been 100% on AA and its variants, e.g. AA+ with chunky pixel display modes.
Pro-PC (Bill Sydnes) and Commodore PC (Jeff Frank) advocates killed Commodore, i.e. they made sure ECS Amigas (e.g. A300/A600, "A1000Jr"/A2200) weren't a threat to Commodore PCs with VGA/SVGA clones.
From Commodore - The Final Years Quote:
A300 Becomes A600
As George Robbins finished off the A300 design, it became clear the cost reduction goal for the system could not be met, largely because of the SMT form factor.
“I think George probably did a pretty good job and the surface mount screwed him,” says Joe Augenbraun.
Sydnes had set out to have two Amiga computers at two different price points: the A500 Plus and the less expensive A300. Instead, the opposite happened. “They took over what George [Robbins] was working on and said, ‘We have to change this.’ They wanted new features, but the A600 didn’t give anybody any new features that anybody would consider useful,” says Dave Haynie. “It didn’t work with the Amiga 500 peripherals. It took away the keypad. The result cost $50 more than the A500. There was this whole list of things that were wrong with it.”
The resulting product would appeal to almost no one. “Bill was out of his depth when it came to the Amiga,” says Jeff Porter. “The A300 was started because he promised Mehdi he could cost reduce the A500. It cost more. That'll teach you to try to out cost-reduce Porter. No one could touch me on that topic!”
Porter believes the project manager should have monitored the development of the computer more closely and have been prepared to abandon ideas. “Surface mount parts cost more. Four layer PCB costs more,” he says. “No one was watching the costs.”
“If I had been given the project to cost reduce the A500, I think I probably could've gotten $100 off of it, maybe $150,” says Joe Augenbraun. “It wasn't that cost reduced. The keyboard was a pretty nice keyboard. There's money to take out there. There's a lot of shielding. There was a lot of plastic. There were extra components on the board. The board was bigger than it needed to be.”
Augenbraun concedes he would have had to abandon SMT. “It was absolutely clear that SMT was more expensive,” he says. “It was just totally obvious. I can see my way to getting some money out of that thing but I wouldn't have been able to do it going to surface mount.”
The PCMCIA slot was also expensive. “That added some cost from a circuitry point of view because they were changing the architecture a little bit,” says Gerard Bucas of GVP. “The bottom line is, manufacturing cost was higher than the A500, I think, like thirty, forty dollars and of course that made it impossible.”
|
A600's build seems to be of higher quality than the C64c's. I'm okay with A600's build quality for AA600/A1200's slightly higher price segment.
Last edited by Hammer on 21-Aug-2024 at 03:34 AM. Last edited by Hammer on 21-Aug-2024 at 03:31 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | Hammer
 |  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 21-Aug-2024 3:51:47
| | [ #379 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6503
From: Australia | | |
|
| @cdimauro
Quote:
ARM was NOT competitive against Motorola until the latter decided to exit from the processor market with its 68k family.
The first processors were very fast, nothing to say, but limited to the Archimedes family.
Motorola introduced the 68040 in 1990, keeping up on performance and doing much better than ARM's processors.
After that, it introduced the 68060, which simply crushed ARM.
|
ARM60 @ 12 MHz to 20 MHz has the performance vs cost advantage against 68EC040 @ 25 MHz and 68030 @ 40 to 50 MHz.
3DO, led by ex-original Amiga engineers, rejected the full 32-bit 68K implementation.
This topic is about CD32, hence 68040 and 68060-based solutions are outside its price range.
Commodore's 68EC040-25 plans weren't in CD32's price range. It would be very difficult to fit 68EC040-25 or 68EC060-50 within PSX's or CD32's price range.
Commodore tolerated a $50 CL-450 SoC cost for the FMV module.
Jeff Porter wanted 8 MB RAM ($20 extra) and a CL-450 SoC @ 40 MHz integrated with CD32 as a math compute booster. It's effectively Commodore's PlayStation. CL-450 SoC's price would be less than $50 with 10,000+ units. That's more than enough compute power for Doom and Doom II.
Last edited by Hammer on 21-Aug-2024 at 04:04 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | cdimauro
|  |
Re: DoomAttack (Akiko C2P) on Amiga CD32 + Fast RAM (Wicher CD32) Posted on 21-Aug-2024 4:39:05
| | [ #380 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4432
From: Germany | | |
|
| @Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
ARM was NOT competitive against Motorola until the latter decided to exit from the processor market with its 68k family.
The first processors were very fast, nothing to say, but limited to the Archimedes family.
Motorola introduced the 68040 in 1990, keeping up on performance and doing much better than ARM's processors.
After that, it introduced the 68060, which simply crushed ARM.
|
ARM60 @ 12 MHz to 20 MHz has the performance vs cost advantage against 68EC040 @ 25 MHz and 68030 @ 40 to 50 MHz. |
Cost was NOT under discussion here: only performance was mentioned, and the ARM60 can't reach the 68040's performance. Quote:
3DO, led by ex-original Amiga engineers, rejected the full 32-bit 68K implementation. |
Irrelevant. See below. Quote:
This topic is about CD32, hence 68040 and 68060-based solutions are outside its price range. |
You should pay closer attention to how a discussion evolves and what people say.
Absolutely not: this part of the discussion was ALL ABOUT PERFORMANCE. PURE PERFORMANCE!
You don't need an Oxford PhD to understand it: it's simple English. Quote:
Commodore's 68EC040-25 plans weren't in CD32's price range. It would be very difficult to fit 68EC040-25 or 68EC060-50 within PSX's or CD32's price range.
Commodore tolerated a $50 CL-450 SoC cost for the FMV module.
Jeff Porter wanted 8 MB RAM ($20 extra) and a CL-450 SoC @ 40 MHz integrated with CD32 as a math compute booster. It's effectively Commodore's PlayStation. CL-450 SoC's price would be less than $50 with 10,000+ units. That's more than enough compute power for Doom and Doom II. |
Irrelevant + padding. |
| Status: Offline |