Poster | Thread |
NovaCoder
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 5-Apr-2024 11:26:42
| | [ #21 ] |
|
|
|
Regular Member |
Joined: 16-Apr-2008 Posts: 490
From: Melbourne (Australia) | | |
|
| @saimo
This is pretty cool. If I get the time, I'd like to play around with it and see if I can use it for a game.
|
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 5-Apr-2024 11:40:06
| | [ #22 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @NovaCoder
That would be really nice :) _________________ RETREAM - retro dreams for Amiga, Commodore 64 and PC |
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 12:37:46
| | [ #23 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| By chance I discovered that Zoomaniac could crash on my real A1200. After some investigation, it turned out to be a stack issue that occurred when execution dropped below 50 fps on 68020, 68030 and 68040 (an instruction was executed before a branch instead of after it). That's fixed now. While searching for the problem, I found a way to make the solid scaling routine a bit faster - so the bug, although finding it required some effort, was actually a good thing!
The new download is available on the PED81C page: https://www.retream.com/PED81C |
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 12:46:26
| | [ #24 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @saimo
Does this still give any benefit over "copyspeed" C2P when you have a chunky buffer in fast ram (and due to cache w/r access patterns you don't want to move it)? _________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 13:09:29
| | [ #25 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
I don't know how classic C2P works, but I can say that with this system updating the CHIP RAM buffer is just a matter of copying data from FAST RAM - i.e., it can't get faster than that.
Here's the routine in Zoomaniac that does just that (it's a picture as the forum doesn't have [CODE]):
(Incidentally, this is exactly where the bug was: the move.w #$3700,sr was the first instruction of the routine.) |
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 14:19:27
| | [ #26 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @saimo
I don't know that there are any examples of copy speed C2P for 030, but on 060, for example, the cost of doing the transformation, in the CPU cache, is hidden behind the cost of chip ram writes to the extent that the conversion approaches the same speed as a straight copy.
In your examples here, if I understand correctly, the chip ram is more contended than usual due to SHRES display modes. What I would like to know is: for an 030 target, what's the copy throughput from fast to chip here?
Last edited by Karlos on 06-Dec-2024 at 02:24 PM.
|
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 14:42:37
| | [ #27 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
EDIT: I have run new tests and redone the calculations for the FAST->CHIP copy routine, with these results:
* within Zoomaniac, bitplanes DMA on, staggered lines on: about 90 rasterlines;
* within Zoomaniac, bitplanes DMA off, staggered lines off: about 89 rasterlines;
* within Zoomaniac, bitplanes DMA off: about 89 rasterlines;
* in a separate test made with TestCode, all DMA off: about 89.15 rasterlines.
I made the separate test with TestCode as I was surprised that turning the bitplanes DMA off made almost no difference. The source code of the test blob is available here (it will be removed after some days/weeks).
I have corrected the calculations below accordingly.
Quote:
In your examples here, if I understand correctly, the chip ram is more contended than usual due to SHRES display modes. |
That's correct.
Quote:
What I would like to know is, for an 030 target what's copy throughput from fast to chip here? |
It's in the comments (EDIT: which were outdated and 1 rasterline off). Copying the whole buffer takes 89.15 rasterlines. It must be noted that the buffer is 128x256 (visually it doubles horizontally due to how PED81C works). To help make a proper comparison against a standard 320x256 screen (calculations are rounded):
* the amount of data copied in those 89.15 rasterlines is 128x256 = 32768 bytes;
* if the screen were 320x256, the amount of data to copy would be 81920 bytes;
* copying that data would take 81920*89.15/32768 = 222.87 rasterlines = 0.71 frames.
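The proportion used in this comparison can be checked with a few lines of Python. This is only a rough sketch: the function and constant names are made up for illustration, the 313 rasterlines per frame is the PAL figure implied by the numbers in this thread, and it assumes that the copy time grows linearly with the buffer size.

```python
# Assumption: 313 rasterlines per PAL frame, as implied by the thread's figures.
PAL_LINES_PER_FRAME = 313


def scale_copy_time(measured_lines, measured_bytes, target_bytes):
    """Scale a measured copy time (in rasterlines) linearly to another buffer size."""
    return measured_lines * target_bytes / measured_bytes


measured_bytes = 128 * 256  # PED81C buffer used in the test (32768 bytes)
target_bytes = 320 * 256    # standard 320x256 8-bit chunky buffer (81920 bytes)

lines = scale_copy_time(89.15, measured_bytes, target_bytes)
frames = lines / PAL_LINES_PER_FRAME

print(f"{lines:.3f} rasterlines = {frames:.2f} frames")
```

The linear scaling is the same assumption the calculation in the post makes; a real 320x256 copy could behave differently depending on DMA contention.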
Do you have figures relative to the best C2P routines? I'd be curious.
By the way, I think I've already written this somewhere: PED81C is meant especially for low-end machines - its compromises don't make sense on machines that are powerful enough to handle C2P sufficiently well; and, of course, it's also meant to enable Amiga users to say "AGA can do chunky".
EDIT: regarding low-end machines, an interesting comparison would be how fast the initial "infinite" zoomer of Zoomaniac would run on a stock A1200 with a classic 2x1 C2P; with PED81C it runs at 26 fps, but I guess that figure couldn't be reached with C2P, as that involves an extra read and write of the whole screen from and to CHIP RAM, which costs more than PED81C's DMA cost.
Last edited by saimo on 06-Dec-2024 at 06:33 PM.
|
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 15:43:59
| | [ #28 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| |
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 17:13:50
| | [ #29 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
EDIT: I have updated the figures below according to the new calculations in my previous post.
Ah, AB3D2... what else
Quote:
From the comments: Quote:
That's impressive!
The C2P routine is 320x256 2x1, so I guess that the source buffer is 160x256. To make a proper comparison, it is first necessary to scale the result reported in the previous post (89.15 rasterlines for 128x256 with all the DMA off) to see how long it would take to copy a buffer of the same size: 160*89.15/128 = 111.44 rasterlines = 0.36 frames (EDIT: afterwards I made a real-world test and the exact figure was 111.427).
But there's a big "but": while the C2P routine supports 8 bitplanes, PED81C allows at most 81 different colors, i.e. log2(81) = 6.34 bits' worth of color per pixel. So, for a more balanced comparison, the straight copy cost would be: 0.36*8/log2(81) = 0.45 frames. Performance-wise, PED81C is more convenient, but, again, that has to be weighed against the output quality.
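For anyone who wants to replay the two steps of this comparison (scaling to a 160-pixel-wide buffer, then weighting by color depth), here is a small Python sketch. The function names are hypothetical, the 313 rasterlines per frame is the PAL value implied by the thread's figures, and weighting by 8/log2(81) is just the rough fairness heuristic used in the post, not an established metric.

```python
import math

# Assumption: 313 rasterlines per PAL frame, as implied by the thread's figures.
PAL_LINES_PER_FRAME = 313


def copy_frames(measured_lines, measured_width, target_width):
    """Scale a measured copy time linearly with buffer width and convert it to frames."""
    return measured_lines * target_width / measured_width / PAL_LINES_PER_FRAME


def depth_normalized(frames, reference_bits, colors):
    """Weight a cost by the ratio between a reference bit depth and log2(colors)."""
    return frames * reference_bits / math.log2(colors)


raw = copy_frames(89.15, 128, 160)      # straight copy of a 160x256 buffer
adjusted = depth_normalized(raw, 8, 81)  # weighted against 8 bitplanes

print(f"raw copy cost:      {raw:.2f} frames")
print(f"depth-weighted cost: {adjusted:.2f} frames")
```

The weighted figure is only meaningful as a way to put 81 colors and 256 colors on the same scale; the actual bus time is the raw figure.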
By the way, I had forgotten to mention an important detail - I'll just copy&paste it from the manual: Quote:
Given that the DMA load caused by PED81C is "double" (see its documentation for the details), a version that uses only half the number (2) of bitplanes has been made to check the performance as if the Amiga had a native chunky video mode. Surprisingly, the performance did not improve at all: relatively to the CHIP bus access, the scaling code must interleave so nicely with the bitplane data fetches that having more bus cycles available does not make any/much difference. |
Last edited by saimo on 07-Dec-2024 at 09:05 AM.
|
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 18:17:23
| | [ #30 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @saimo
Today, we have a degree (not perfect) of separation between the software framebuffer and the eventual display. At some point I'd like to get 2x1 and 2x2 modes working, which, if done properly, should be a big boost for 030-class machines.
Your approach might save some extra CPU time for the rendering. If I were to try and include it, it would be up to the user to choose it. |
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 18:27:35
| | [ #31 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
Giving the option would be good. I'd be very curious to see the improvement in terms of fps!
As far as PED81C goes, the implementation is dead easy: just open a screen as per the documentation (it even contains the specific register settings for a 320x256 screen) and replace the C2P routine with a straight-copy one (feel free to steal the one I provided). You'd need a bit of work for remapping the graphics, though: the graphics palette must match the mode used for the PED81C screen.
Note: I have redone the calculations and updated my previous posts. |
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 18:32:05
| | [ #32 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @saimo
I'm definitely going to add it to the to-do list. There are hopefully some performance wins still to be had by fixing overdraw issues that I want to deal with before coming back to lower-level things.
Is the display method system-friendly, or does it require taking over anything? |
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 18:35:28
| | [ #33 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
It all boils down to opening a SHRES screen, so it can perfectly well be done in a system-friendly way ;) |
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 19:17:21
| | [ #34 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| |
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 20:25:46
| | [ #35 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
EDIT: originally I had used 89.15 rasterlines as that value had stuck in my mind, but the right value was 111.427; the figures below have been corrected accordingly.
Quote:
The colour scheme could be a problem. The game makes heavy use of gradient shading. |
Depending on how much time you'll be able/willing to invest, you might also define a color mode that suits the graphics better (even on a per-level basis - I guess the game has levels? Excuse me, but I've never seen the game).
However, it might be worth doing some rough estimates first. Assuming that one day rendering at 160x256 will work, it is possible to calculate the gain provided by PED81C in some hypothetical scenarios.
We know that, with all DMA channels off:
* C2P takes 0.83 frames = 259.79 rasterlines;
* copying the buffer to a PED81C screen takes 0.36 frames = 111.427 rasterlines (note: as per real-world tests, with the screen and the Copper on, the actual figure hardly changes).
Then, let's calculate the overall performance for a given time taken by rendering the frame in FAST RAM alone.
If the rendering time were 5 frames (i.e. 10 fps) = 1565 rasterlines:
* overall time with C2P: 1565+259.79 = 1824.79 rasterlines = 5.83 frames = 8.58 fps;
* overall time with PED81C: 1565+111.427 = 1676.427 rasterlines = 5.36 frames = 9.34 fps.
If the rendering time were 2.5 frames (i.e. 20 fps) = 782.5 rasterlines:
* overall time with C2P: 782.5+259.79 = 1042.29 rasterlines = 3.33 frames = 15.02 fps;
* overall time with PED81C: 782.5+111.427 = 893.927 rasterlines = 2.86 frames = 17.51 fps.
Notes:
* of course, the longer it takes to render the frame, the less relevant the C2P time;
* the performance difference might/will be bigger with the DMA channels on.
Whether improvements of such magnitude would be worth the hassle or not is up to you to decide.
Last edited by saimo on 07-Dec-2024 at 09:09 AM.
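These scenarios are easy to recompute for other rendering times. The following Python sketch does so from the two per-frame transfer costs quoted in this post; the function and constant names are made up for illustration, and it assumes 313 rasterlines per frame at 50 frames per second (the PAL values implied by "5 frames = 1565 rasterlines" and "5 frames = 10 fps") and that the rendering and transfer times simply add up.

```python
# Assumptions: PAL timing as implied by the thread's figures.
PAL_LINES_PER_FRAME = 313
FRAMES_PER_SECOND = 50

# Per-frame transfer costs quoted in the thread (rasterlines, all DMA off).
C2P_LINES = 259.79
PED81C_LINES = 111.427


def overall_fps(render_lines, transfer_lines):
    """Return (total rasterlines, total frames, resulting fps) for one rendered frame."""
    total_lines = render_lines + transfer_lines
    total_frames = total_lines / PAL_LINES_PER_FRAME
    return total_lines, total_frames, FRAMES_PER_SECOND / total_frames


for render_frames in (5, 2.5):
    render_lines = render_frames * PAL_LINES_PER_FRAME
    print(f"rendering time: {render_frames} frames = {render_lines} rasterlines")
    for name, cost in (("C2P", C2P_LINES), ("PED81C", PED81C_LINES)):
        lines, frames, fps = overall_fps(render_lines, cost)
        print(f"  {name}: {lines:.3f} rasterlines = {frames:.2f} frames = {fps:.2f} fps")
```

Changing the tuple `(5, 2.5)` gives a quick feel for how the gain shrinks as the rendering time grows.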
|
|
Status: Offline |
|
|
Karlos
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 21:07:06
| | [ #36 ] |
|
|
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4841
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @saimo
The original game isn't very colourful, so that might work in our favour. However, my modification tries to be a bit less grey, so there is a potential challenge.
As for copper, we might want it on in the 2x2 mode to double the vertical height (which is how it currently works). |
|
Status: Offline |
|
|
saimo
| |
Re: PED81C - pseudo-native, no C2P chunky screens for AGA Posted on 6-Dec-2024 22:21:01
| | [ #37 ] |
|
|
|
Elite Member |
Joined: 11-Mar-2003 Posts: 2484
From: Unknown | | |
|
| @Karlos
Quote:
As for copper, we might want it on in the 2x2 mode to double the vertical height (which is how it currently works). |
Given that you need to keep the game OS-friendly, does the OS support hardware scandoubling (i.e. FMODE/BSCAN2)? (EDIT: answering myself: I guess it does, given that it provides modes like DBLPAL and DBLNTSC.) That would avoid the extra Copper DMA load. Come to think of it, does the OS allow setting the fine scroll value differently for even and odd bitplanes? If not, a little bit of custom magic will be needed.
Last edited by saimo on 07-Dec-2024 at 09:11 AM.
|
|
Status: Offline |
|
|