Poster | Thread |
cdimauro
| |
Re: New Classic Amiga market? Posted on 2-Aug-2024 5:08:57
| | [ #161 ] |
|
|
|
Elite Member |
Joined: 29-Oct-2012 Posts: 4127
From: Germany | | |
|
| @michalsc
Quote:
michalsc wrote: @matthey
Quote:
After pondering for awhile, my initial thought is that a DSB instruction alone is adequate for interpreted emulation where the result of each instruction is obtained before the next instruction is interpreted. This effectively serializes the execution of code as if the execution was not pipelined. This is similar to using an ISB instruction after every executed instruction. |
You are probably right and I might add ISB there. For me the DSB was much more important as memory barriers are essential when dealing with MMIO registers. But you are right, letting it empty the pipeline might be a good thing to do. |
Why is it needed? What issue could happen if you don't empty the pipeline from the code that you've generated?
To me it looks like DSB should be enough: data needs to be synchronized.
ISB, on the other hand, looks redundant. |
|
Status: Offline |
|
|
michalsc
| |
Re: New Classic Amiga market? Posted on 2-Aug-2024 6:36:29
| | [ #162 ] |
|
|
|
AROS Core Developer |
Joined: 14-Jun-2005 Posts: 403
From: Germany | | |
|
| @cdimauro
Quote:
Why is it needed? What issue could happen if you don't empty the pipeline from the code that you've generated? |
That depends on how I implement e.g. the MMU. There, an instruction barrier might be important, or not. It will not be important if I already issue an ISB in the translated code that turns the MMU on/off. |
|
Status: Offline |
|
|
matthey
| |
Re: New Classic Amiga market? Posted on 3-Aug-2024 2:18:40
| | [ #163 ] |
|
|
|
Elite Member |
Joined: 14-Mar-2007 Posts: 2387
From: Kansas | | |
|
| cdimauro Quote:
Why is it needed? What issue could happen if you don't empty the pipeline from the code that you've generated?
To me it looks like DSB should be enough: data needs to be synchronized.
ISB, on the other hand, looks redundant.
|
The NOP instruction may be used for the 68020 external coprocessor interface which was eliminated with the 68040+ where NOP is likely unnecessary for synchronization but may have other uses. It may be used for synchronization in hardware drivers.
MC68020 Microprocessors User's Manual Quote:
5.6 BUS SYNCHRONIZATION
The MC68020/EC020 overlaps instruction execution—that is, during bus activity for one instruction, instructions that do not use the external bus can be executed. Due to the independent operation of the on-chip cache relative to the operation of the bus controller, many subsequent instructions can be executed, resulting in seemingly nonsequential instruction execution. When this is not desired and the system depends on sequential execution following bus activity, the NOP instruction can be used. The NOP instruction forces instruction and bus synchronization by freezing instruction execution until all pending bus cycles have completed.
An example of the use of the NOP instruction for this purpose is the case of a write operation of control information to an external register in which the external hardware attempts to control program execution based on the data that is written with the conditional assertion of BERR. Since the MC68020/EC020 cannot process the bus error until the end of the bus cycle, the external hardware has not successfully interrupted program execution. To prevent a subsequent instruction from executing until the external cycle completes, the NOP instruction can be inserted after the instruction causing the write. In this case, bus error exception processing proceeds immediately after the write and before subsequent instructions are executed. This is an irregular situation, and the use of the NOP instruction for this purpose is not required by most systems.
...
8.1.4 Instruction Execution Overlap
Overlap is the time, measured in clocks, when two instructions execute concurrently. In Figure 8-1, instructions A and B execute concurrently, and the overlapped portion of instruction B is absorbed in the instruction execution time of A (the previous instruction). The overlap time is deducted from the execution time of instruction B. Similarly, there is an overlap period between instruction B and instruction C, which reduces the attributed execution time for C.
The execution time attributed to instructions A, B, and C (after considering the overlap) is depicted in Figure 8-2.
It is possible that the execution time of an instruction will be absorbed by the overlap with a previous instruction for a net execution time of zero clocks.
Because of this overlap, a NOP is required between a write to a peripheral to clear an interrupt request and a subsequent MOVE to SR instruction to lower the interrupt mask level. Otherwise, the MOVE to SR instruction may complete before the write is accomplished, and a new interrupt exception will be generated for an old interrupt request.
|
If NOP were just for the outdated 68020 coprocessor interface, the 68040 and 68060 could have done away with the synchronization for the NOP instruction. Since they did not, I assume there are uses for a synchronization instruction that apply to pipelined CPUs with instruction execution overlap and parallel instruction execution. I believe this applies to JIT execution as well. It is likely that 99% of the time a NOP without an ISB would be fine, but the same is likely true for a NOP without a DSB. What does the documentation lead programmers to expect, though, and what is robust?
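To make the interrupt example from section 8.1.4 concrete, here is a minimal toy model in C. It is only a sketch: the `toy_cpu` struct and all `toy_*` functions are hypothetical names invented for illustration, and the model reduces the bus to a single posted write. It does not describe real 68020 timing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not real hardware): one posted write that
 * clears a device interrupt request, plus a CPU interrupt mask. */
typedef struct {
    bool irq_line;       /* device still asserts its interrupt request */
    bool write_pending;  /* the clearing write has not completed on the bus */
    bool irq_masked;     /* CPU interrupt mask is raised */
    int  spurious_irqs;  /* interrupts taken for an already-serviced request */
} toy_cpu;

/* State on entry to the interrupt handler: request active, mask raised. */
static void toy_enter_handler(toy_cpu *c)
{
    c->irq_line = true;
    c->write_pending = false;
    c->irq_masked = true;
    c->spurious_irqs = 0;
}

/* Write to the peripheral: in this model the write is only queued. */
static void toy_write_clear_irq(toy_cpu *c) { c->write_pending = true; }

/* The bus cycle completes and the device drops its request. */
static void toy_drain_bus(toy_cpu *c)
{
    if (c->write_pending) {
        c->write_pending = false;
        c->irq_line = false;
    }
}

/* NOP as described in the manual: wait for all pending bus cycles. */
static void toy_nop(toy_cpu *c) { toy_drain_bus(c); }

/* MOVE to SR lowering the mask: a still-asserted request fires again. */
static void toy_lower_mask(toy_cpu *c)
{
    c->irq_masked = false;
    if (c->irq_line)
        c->spurious_irqs++;
    toy_drain_bus(c);    /* the write lands eventually, but too late */
}

/* Run the handler tail with or without the NOP; return spurious IRQ count. */
int toy_run(bool use_nop)
{
    toy_cpu c;
    toy_enter_handler(&c);
    toy_write_clear_irq(&c);
    if (use_nop)
        toy_nop(&c);
    toy_lower_mask(&c);
    assert(!c.write_pending);
    return c.spurious_irqs;
}
```

In this model `toy_run(false)` takes one interrupt for the old request, while `toy_run(true)` takes none, which is the behaviour the manual warns about.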
Last edited by matthey on 03-Aug-2024 at 02:20 AM.
|
|
Status: Offline |
|
|
cdimauro
| |
Re: New Classic Amiga market? Posted on 3-Aug-2024 13:55:46
| | [ #164 ] |
|
|
|
Elite Member |
Joined: 29-Oct-2012 Posts: 4127
From: Germany | | |
|
| @michalsc
Quote:
michalsc wrote: @cdimauro
Quote:
Why is it needed? What issue could happen if you don't empty the pipeline from the code that you've generated? |
That depends on how I implement e.g. the MMU. There, an instruction barrier might be important, or not. It will not be important if I already issue an ISB in the translated code that turns the MMU on/off. |
The good thing is that a JIT can have different code for different conditions (MMU on or off).
@matthey
Quote:
matthey wrote: cdimauro Quote:
Why is it needed? What issue could happen if you don't empty the pipeline from the code that you've generated?
To me it looks like DSB should be enough: data needs to be synchronized.
ISB, on the other hand, looks redundant.
|
The NOP instruction may be used for the 68020 external coprocessor interface which was eliminated with the 68040+ where NOP is likely unnecessary for synchronization but may have other uses. It may be used for synchronization in hardware drivers.
MC68020 Microprocessors User's Manual Quote:
5.6 BUS SYNCHRONIZATION
The MC68020/EC020 overlaps instruction execution—that is, during bus activity for one instruction, instructions that do not use the external bus can be executed. Due to the independent operation of the on-chip cache relative to the operation of the bus controller, many subsequent instructions can be executed, resulting in seemingly nonsequential instruction execution. When this is not desired and the system depends on sequential execution following bus activity, the NOP instruction can be used. The NOP instruction forces instruction and bus synchronization by freezing instruction execution until all pending bus cycles have completed.
An example of the use of the NOP instruction for this purpose is the case of a write operation of control information to an external register in which the external hardware attempts to control program execution based on the data that is written with the conditional assertion of BERR. Since the MC68020/EC020 cannot process the bus error until the end of the bus cycle, the external hardware has not successfully interrupted program execution. To prevent a subsequent instruction from executing until the external cycle completes, the NOP instruction can be inserted after the instruction causing the write. In this case, bus error exception processing proceeds immediately after the write and before subsequent instructions are executed. This is an irregular situation, and the use of the NOP instruction for this purpose is not required by most systems.
...
8.1.4 Instruction Execution Overlap
Overlap is the time, measured in clocks, when two instructions execute concurrently. In Figure 8-1, instructions A and B execute concurrently, and the overlapped portion of instruction B is absorbed in the instruction execution time of A (the previous instruction). The overlap time is deducted from the execution time of instruction B. Similarly, there is an overlap period between instruction B and instruction C, which reduces the attributed execution time for C.
The execution time attributed to instructions A, B, and C (after considering the overlap) is depicted in Figure 8-2.
It is possible that the execution time of an instruction will be absorbed by the overlap with a previous instruction for a net execution time of zero clocks.
Because of this overlap, a NOP is required between a write to a peripheral to clear an interrupt request and a subsequent MOVE to SR instruction to lower the interrupt mask level. Otherwise, the MOVE to SR instruction may complete before the write is accomplished, and a new interrupt exception will be generated for an old interrupt request.
|
If NOP were just for the outdated 68020 coprocessor interface, the 68040 and 68060 could have done away with the synchronization for the NOP instruction. Since they did not, I assume there are uses for a synchronization instruction that apply to pipelined CPUs with instruction execution overlap and parallel instruction execution. I believe this applies to JIT execution as well. It is likely that 99% of the time a NOP without an ISB would be fine, but the same is likely true for a NOP without a DSB. What does the documentation lead programmers to expect, though, and what is robust? |
I think that the MOVE to SR case described in the documentation above makes sense and should be considered.
However, a JIT can produce different code depending on the instruction flow. A NOP shouldn't always be translated to DSB + ISB if there's no subsequent MOVE to SR, for example. |
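The idea above, choosing barriers by looking at the instruction that follows the NOP, could be sketched roughly as follows. This is not Emu68's actual code: `translate_nop` and `is_move_to_sr` are hypothetical helpers, and emitting the ISB when the next instruction is not visible is an assumption made here to stay conservative. The opcode constants are the standard 68k encodings for NOP and MOVE to SR and the A64 encodings for `dsb sy` and `isb`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A64 encodings of the two barriers discussed in the thread. */
#define A64_DSB_SY 0xD5033F9Fu
#define A64_ISB    0xD5033FDFu

/* 68k NOP opcode. */
#define M68K_NOP 0x4E71u

/* MOVE <ea>,SR is 0100 0110 11 mmm rrr: the low 6 bits select the source. */
static int is_move_to_sr(uint16_t op)
{
    return (op & 0xFFC0u) == 0x46C0u;
}

/* Hypothetical translator for one 68k NOP at m68k[0].
 * remaining: number of 68k opcode words visible in the current block.
 * out: must have room for two A64 instruction words.
 * Returns the number of A64 words written. */
size_t translate_nop(const uint16_t *m68k, size_t remaining, uint32_t *out)
{
    assert(m68k != NULL && out != NULL);
    assert(remaining >= 1 && m68k[0] == M68K_NOP);

    size_t n = 0;
    out[n++] = A64_DSB_SY;            /* always wait for outstanding accesses */

    /* Assumption: add the ISB only before MOVE to SR, or when the next
     * instruction is outside the block and cannot be inspected. */
    if (remaining < 2 || is_move_to_sr(m68k[1]))
        out[n++] = A64_ISB;

    return n;
}
```

With this policy a NOP followed by, say, MOVEQ costs a single DSB, while NOP followed by MOVE #imm,SR gets the full DSB + ISB pair.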
|
Status: Offline |
|
|
Hammer
| |
Re: New Classic Amiga market? Posted on 8-Aug-2024 2:59:58
| | [ #165 ] |
|
|
|
Elite Member |
Joined: 9-Mar-2003 Posts: 6039
From: Australia | | |
|
| @matthey
Quote:
Now let's consider the AC68080. I believe it has a 64 bit data bus making it a 64 bit CPU.
1. data bus width - AC68080 CPU is 64 bit, AC68080 ISA is undefined
2. max int datatype width - AC68080 CPU is 64 bit, AC68080 ISA is 64 bit
3. max pointer width - AC68080 CPU is 64 bit, AC68080 ISA is 64 bit (lacks 64 bit addressing modes?)
4. register width - AC68080 CPU is 64 bit, AC68080 ISA is 64 bit
5. ALU width - AC68080 CPU is 64 bit, AC68080 ISA is undefined
|
P55 Pentium MMX has an external 64-bit bus and it's not considered to be a "64-bit" CPU. MMX supports 64-bit scalar integer datatype.
A mainstream "64-bit" CPU allows user applications to access beyond 4GB of RAM in a linear memory model which is backed by 64-bit GPRs and 64-bit general-purpose ALU.
K8 Athlon supports a dual-channel double data rate 64-bit memory bus, i.e. a 128-bit bus, and 128-bit FADD vector units, and is still considered a "64-bit" CPU.
AmigaOS is the limiting factor for PiStorm setups with a 4GB or 8GB RPi 4B/CM4. There's no PAE support on AmigaOS.
IA-32 Windows 2003 Server allows the "32-bit" OS to access memory beyond 4GB via a memory segmentation model. Under PAE with 16 GB RAM, each 32-bit user app can have its maximum memory allocation.
Last edited by Hammer on 08-Aug-2024 at 03:09 AM. Last edited by Hammer on 08-Aug-2024 at 03:08 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
|
Status: Offline |
|
|
MagicSN
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 11:08:55
| | [ #166 ] |
|
|
|
Hyperion |
Joined: 10-Mar-2003 Posts: 711
From: Unknown | | |
|
| @MagicSN
BTW, to complete the list I posted earlier on speed performance:
Heretic2
Pi4 2.2 GHz - 25 fps 640x480
Pi3 - 10 fps 640x480 (19 fps 320x240)
Vampire V4 - 8 fps 320x240
Pi5 (Amikit+os3.2) - 49 fps 640x480
Pi5 overclocked - 56 fps 640x480
X5000 with ppc version in sw renderer - 63 fps
060 100 MHz with internal chunky hw - 7 fps in 320x240
Secret Project #1 (the codename for a commercial game port I am currently working on; sorry, I cannot reveal the name of the game yet, and the release version might be faster. It is interesting because this is an integer math game, so the difference between Pi and Vampire is smaller)
Pi4 - 40-42 fps in 1024x768
Vampire V2 - 34 fps in 800x600, 14 fps 1024x768
Pi5 - 50 fps
A1222 with ppc version - 37 fps 1024x768
x1000 with ppc version - around 60 fps 1024x768
060 with cvppc and similar system - 14-17 fps in 320x240
060 with aga - currently no fps counter in the aga version, but from the look and feel probably similar
40 MHz 040 - no fps counter but probably not much slower than 060 (seems memory speed is the limiting factor?)
Still optimizing that game… (Note that the nature of the game makes it a bit hard to estimate the fps; even with a counter it goes up and down a bit.)
And yes, it runs on all Vampires (not tested on ApolloOS, only on os3.2 and caffeineOS) and also on A1222 ;)
Last edited by MagicSN on 28-Sep-2024 at 02:16 PM. Last edited by MagicSN on 28-Sep-2024 at 11:12 AM. Last edited by MagicSN on 28-Sep-2024 at 11:11 AM.
|
|
Status: Offline |
|
|
vox
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 12:53:12
| | [ #167 ] |
|
|
|
Elite Member |
Joined: 12-Jun-2005 Posts: 3957
From: Belgrade, Serbia | | |
|
| @MagicSN
Only the x5000 and V4 cards are missing, as well as the Tabor/A1222 and G4 and G5 Macs running MorphOS, for a decent CPU and GPU comparison of all the powerful Classic and NG Amigas. Thank you, this gives people a good idea of what to expect in 3D games.
_________________ OS 3.x AROS and MOS supporter, fi di good, nothing fi di unprofessionalism. Learn it harder way! SinclairQL and WII U lover :D YT http://www.youtube.com/user/rasvoja |
|
Status: Offline |
|
|
MagicSN
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 14:30:57
| | [ #168 ] |
|
|
|
Hyperion |
Joined: 10-Mar-2003 Posts: 711
From: Unknown | | |
|
| @vox
I gave numbers for the Tabor A1222 for "Secret Project #1". And for the x5000 I gave numbers for Heretic 2 (with 3D hardware H2 reaches 180 fps on the x5000 BTW, and 100 fps on the x1000, but for fairness I did all the comparisons with the software renderer). As to the new game, the x5000 is probably the same as the x1000 or slightly higher, since due to the nature of the game there is a maximum fps.
As to Heretic 2 no data for A1222 has been given as it requires a real FPU, does not work with SPE and despite what was told it is not so easy to make a SPE version in this case.
As to MorphOS - well, it cannot run the OS 4 version, and running the 68k version would be a bit unfair as this would run in emulation only. Of course you could run the old WarpOS version on it. As to the new game, the WarpOS version is in the works but not yet working.
As to Vampire, I gave V4 values for Heretic 2 and V2 values for the new game. I have a V4 here but have not had it properly set up yet (the V4 I originally had needed to be returned to Apollo Computer). Some of my beta testers have a V2, so I could test the new game with that.
Best regards, Steffen
Last edited by MagicSN on 28-Sep-2024 at 02:32 PM.
|
|
Status: Offline |
|
|
matthey
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 18:13:51
| | [ #169 ] |
|
|
|
Elite Member |
Joined: 14-Mar-2007 Posts: 2387
From: Kansas | | |
|
| MagicSN Quote:
As to Heretic 2 no data for A1222 has been given as it requires a real FPU, does not work with SPE and despite what was told it is not so easy to make a SPE version in this case.
|
It would be interesting to see what a version compiled with GCC -msoft-float could manage on the A1222 (it may need -mfpu=none too?). The CPU clock speed and integer performance are high enough that it may be playable at lower resolutions. It should be easier than compiling an SPE version and faster than trapping the standard FPU instructions in PPC code.
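As a rough illustration of what a soft-float build changes, consider a small floating-point routine like the one below. The function `dot3` is a hypothetical example, not code from Heretic 2. In a hard-float build each multiply and add is an FPU instruction; with -msoft-float GCC instead emits calls to integer-only helper routines in libgcc (such as `__mulsf3` and `__addsf3`), so no FPU instruction is executed and nothing has to be trapped. Soft-float code also passes floating-point values differently from hard-float code, so libraries linked into the program generally need to be built the same way.

```c
#include <assert.h>

/* Dot product of two 3-component vectors, typical 3D game math.
 * Hard-float build: three FPU multiplies and two FPU adds.
 * Soft-float build (assumption: GCC with -msoft-float): the same source
 * compiles to calls into libgcc's software floating-point helpers. */
float dot3(const float a[3], const float b[3])
{
    assert(a != 0 && b != 0);
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

The source stays identical in both builds; only the compiler flags and the libraries it links against differ.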
|
|
Status: Offline |
|
|
MagicSN
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 20:26:02
| | [ #170 ] |
|
|
|
Hyperion |
Joined: 10-Mar-2003 Posts: 711
From: Unknown | | |
|
| @matthey
If I remember right, when I tried what you suggested some weeks ago, it required some other linker library to also be soft float or it would not work. If someone provided me with all the needed libs in soft float I could try again, of course. |
|
Status: Offline |
|
|
matthey
| |
Re: New Classic Amiga market? Posted on 28-Sep-2024 23:28:11
| | [ #171 ] |
|
|
|
Elite Member |
Joined: 14-Mar-2007 Posts: 2387
From: Kansas | | |
|
| |
Status: Offline |
|
|