Amigaworld.net - The Amiga Computer Community Portal Website

Interesting memory allocation benchmark

bernd_afa

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 17:19:00

[ #41 ]

Cult Member

Joined: 14-Apr-2006
Posts: 829
From: Unknown

@fishy_fis

Of course an x86 with dual-channel RAM is a lot faster, but OS4 doesn't run on a system with dual-channel DDR RAM. The test should only show that TLSF is the fastest allocator and that the allocation sizes don't matter.

On my system tlsfmem always runs at the same speed, whether I use the best-case allocation sizes (the values I posted) or the worst-case values from Fab.

@ChrisH
>Now that we have more reasonable sounding results,

These values are best-case values: the overall allocation is only about 8 KB, so only 2 MMU pages need to change during this constant free and alloc.

Maybe you could change the test to be more realistic, so that about 1 MB is allocated and there are 100 allocations instead of 20, two of them 300 KB and 700 KB in size.

But all in all you can see that tlsfmem is a lot faster than slab, even when OS4 doesn't need to change many MMU pages.
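For readers without the source at hand, the kind of loop being discussed can be sketched roughly as below. This is not Fab's membench.c: the size table, the names `small_sizes`, `total_bytes` and `run_bench`, and the use of malloc()/free() and clock() are all assumptions chosen so that the sketch builds with any hosted C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical size table: 20 small blocks, 8096 bytes (about 8 KB) in total. */
static const size_t small_sizes[20] = {
    16, 32, 48, 64, 96, 128, 160, 192, 256, 320,
    384, 448, 512, 576, 640, 704, 768, 832, 896, 1024
};

/* Sum of all block sizes allocated in one iteration. */
static size_t total_bytes(const size_t *sizes, size_t count)
{
    size_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += sizes[i];
    return sum;
}

/* Allocate every block in the table, then free them all, `iterations` times.
 * Returns the CPU time used in microseconds, or -1 if an allocation fails. */
static long run_bench(const size_t *sizes, size_t count, long iterations)
{
    void *blocks[64];
    assert(count <= 64);

    clock_t start = clock();
    for (long it = 0; it < iterations; it++) {
        for (size_t i = 0; i < count; i++) {
            blocks[i] = malloc(sizes[i]);
            if (blocks[i] == NULL) {
                while (i > 0)
                    free(blocks[--i]);
                return -1;
            }
            /* Touch the block so the compiler is less likely to drop the allocation. */
            *(volatile unsigned char *)blocks[i] = 0;
        }
        for (size_t i = 0; i < count; i++)
            free(blocks[i]);
    }
    clock_t end = clock();
    return (long)((double)(end - start) * 1000000.0 / CLOCKS_PER_SEC);
}
```

Calling `run_bench(small_sizes, 20, 10000)` corresponds to the "10000 iterations" rows in the results. A more realistic variant along the lines suggested above would only need a different table, for example 100 entries including one of 300 KB and one of 700 KB (and a larger `blocks` array).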

@umisef
>Using Bernd's size values, a million iterations take 15.4 seconds, 21 allocations per iteration --- so about 750ns.
>Here is the output...
>1000 iterations: Elapsed time: 13393 us (0.013393 s)
>2000 iterations: Elapsed time: 26583 us (0.026583 s)
>10000 iterations: Elapsed time: 130578 us (0.130578 s)
>50000 iterations: Elapsed time: 855696 us (0.855696 s)
>100000 iterations: Elapsed time: 1506454 us (1.506454 s)
>1000000 iterations: Elapsed time: 15482070 us (15.482070 s)
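The "about 750ns" figure in the quote follows from dividing the elapsed time by the number of allocations. A small sketch of that arithmetic, assuming 21 allocations per iteration as stated in the quote (`ns_per_alloc` is a name made up for this illustration):

```c
#include <assert.h>

/* Average cost of one allocation (including its matching free) in nanoseconds. */
static double ns_per_alloc(double elapsed_us, double iterations,
                           double allocs_per_iteration)
{
    assert(iterations > 0 && allocs_per_iteration > 0);
    return elapsed_us * 1000.0 / (iterations * allocs_per_iteration);
}
```

For the last row, `ns_per_alloc(15482070.0, 1000000.0, 21.0)` gives roughly 737 ns, which matches the rounded figure quoted.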

For the Unix-with-MMU test, it would also be useful to see the results with the values from Fab.

Last edited by bernd_afa on 22-Sep-2009 at 04:05 PM.
Last edited by bernd_afa on 19-Sep-2009 at 05:25 PM.
Last edited by bernd_afa on 19-Sep-2009 at 05:22 PM.
Last edited by bernd_afa on 19-Sep-2009 at 05:21 PM.

Status: Offline

pixie

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 17:30:05

[ #42 ]

Elite Member

Joined: 10-Mar-2003
Posts: 3474
From: Figueira da Foz - Portugal

@Samwel

It was not me putting an if in his sentence...

_________________
Indigo 3D Lounge, my second home.
The Illusion of Choice | Am*ga

Status: Offline

fishy_fis

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 18:04:10

[ #43 ]

Elite Member

Joined: 29-Mar-2004
Posts: 2170
From: Australia

@bernd_afa

Sure, of course modern hardware will be a lot faster than Peg/Peg2/A1/Sam/etc. I only posted the results in case anyone was interested to see them, and also to compare different AROS setups (not hardware, but how it's running; in a VM, for example, RAM latency seems to suffer greatly, although this is no real surprise). I'd be interested to see results from other AROS setups too (Linux-hosted, QEMU with virtualiser, etc.). Hopefully some other AROS users who run it in a different way will also post their results.
On a slightly different note, and it's not really important, just a thought: setting ideal values seems a little redundant to me. The original tests by Fab seem a little more valid than your idealised version. Writing a test to perform as optimally and cleanly as possible doesn't really tell you a lot.

Status: Offline

number6

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 19:42:22

[ #44 ]

Elite Member

Joined: 25-Mar-2005
Posts: 11878
From: In the village

@Fab

Quote:
OS4.1 (advanced SLAB allocator) with membench_amigaos4:
iterations - result
1000 : ~5060000 µs (5 s)
2000 : ~10120000 µs (10 s)
10000 : ~50600000 µs (50 s)
50000 : ~256000000 µs (256 s)
100000 : ~514000000 µs (514 s)
1000000 : N/A... I hadn't enough time for that. :)

For whatever it's worth:

YOUR tests, as opposed to Bernd's, for reasons I explained in prior post:

Micro GX - OS4.0 final+July update (representing the complete final package)

membench_amigaos4

1000: Elapsed time: 3779427 µs
2000: Elapsed time: 8000613 µs
10000: Elapsed time: 39974912 µs
50000: Elapsed time: 201559684 µs

membench_68k

1000: Elapsed time: 4084742 µs
2000: Elapsed time: 8640998 µs
10000: Elapsed time: 43231655 µs
50000: Elapsed time: 216280114 µs

#6

Last edited by number6 on 19-Sep-2009 at 07:48 PM.

_________________
This posting, in its entirety, represents solely the perspective of the author.
*Secrecy has served us so well*

Status: Offline

Karlos

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 19:52:53

[ #45 ]

Elite Member

Joined: 24-Aug-2003
Posts: 4958
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

Where I come from, resource allocation is always considered to be slow and expensive compared to most operations, therefore you never, ever do it in time critical code and no application should be frequently allocating and releasing resources if it can be avoided.

_________________
Doing stupid things for fun...

Status: Offline

paolone

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 20:01:24

[ #46 ]

Super Member

Joined: 24-Sep-2007
Posts: 1145
From: Unknown

I've run the test on my Icaros machine (Athlon64 X2 5200+ 2.6 GHz, 1 GB DDR2-800 RAM) and it is actually slower than the virtual machine I tested before. That's odd, since my Core2 Quad machine also uses DDR2-800 modules, and there AROS is running inside a VM. Anyway, I noticed that

1. running the test repeatedly always gives worse results (speed decreases over time)

2. having another application running, like OWB, even when it is idle, slows down the test

Anyway here are the results for the AROS real machine

1000 >>>> 31103 µs
1000000 > 31757832 µs

I used Fab's original sources, compiled with gcc -o membench membench.c

Status: Offline

number6

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 20:52:34

[ #47 ]

Elite Member

Joined: 25-Mar-2005
Posts: 11878
From: In the village

@Cyborg

Quote:
For the heck of it here the results with bernd_afa changes on OS4:
1000: 113044 µs (0.113044 s)
2000: 225120 µs (0.225120 s)
10000: 1155693 µs (1.155693 s)
50000: 5807400 µs (5.807400 s)
100000: 11615699 µs (11.615699 s)
1000000: 116656246 µs (116.656250 s)

Amended results including the OS4.x version after Chris H's recent upload fixing an earlier issue with the OS4.x version running under OS4.0:

Micro GX - OS4.0 final+July update (representing the complete final package)

membench-bernd_afa_OS3

1000: Elapsed time: 97459 µs = 97 ms
2000: Elapsed time: 194081 µs = 194 ms
10000: Elapsed time: 1016048 µs = 1016 ms
50000: Elapsed time: 5074529 µs = 5074 ms
100000: Elapsed time: 10162837 µs = 10162 ms

membench-bernd_afa_OS4

1000: Elapsed time: 64124 µs = 64 ms
2000: Elapsed time: 128945 µs = 128 ms
10000: Elapsed time: 637945 µs = 637 ms
50000: Elapsed time: 3193822 µs = 3193 ms
100000: Elapsed time: 6400816 µs = 6400 ms

#6

Last edited by number6 on 20-Sep-2009 at 05:24 PM.

_________________
This posting, in its entirety, represents solely the perspective of the author.
*Secrecy has served us so well*

Status: Offline

Fab

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 21:36:26

[ #48 ]

Super Member

Joined: 17-Mar-2004
Posts: 1178
From: Unknown

@Karlos

Sure, especially in realtime applications, it's recommended to avoid dynamic allocations as much as possible.

But with an allocator like TLSF, allocation and the other operations have a bounded execution time, which makes it deterministic and so suitable for realtime usage.

Now, about desktop applications: even if you try to avoid allocations in critical code (which is not often the case in complex and badly designed C++ apps :)), you must also consider that the system can run for several days/weeks/months/years (OK, unlikely for an AmigaOS-like :)). With a linked-list structure, and even if we exclude the fragmentation problem, the allocation time could get really slow in the end, as opposed to TLSF (or any other O(1) allocator).
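The difference between the two approaches can be sketched as follows. This is an illustration only, not code from MorphOS, AmigaOS or the TLSF reference implementation: `free_node`, `first_fit`, `highest_bit`, `tlsf_mapping` and the choice of 16 second-level classes (`SL_INDEX_LOG2` = 4) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Linked-list free list: first fit walks nodes until one is big enough,
 * so the cost grows with the number of free chunks in front of the match. */
struct free_node {
    size_t size;
    struct free_node *next;
};

/* Returns the first node with size >= want and reports how many nodes were visited. */
static struct free_node *first_fit(struct free_node *head, size_t want,
                                   size_t *visited)
{
    *visited = 0;
    for (struct free_node *n = head; n != NULL; n = n->next) {
        (*visited)++;
        if (n->size >= want)
            return n;
    }
    return NULL;
}

/* TLSF-style mapping: the size alone selects a (first-level, second-level)
 * class, independent of how many free chunks exist. */
#define SL_INDEX_LOG2 4   /* assumed: 16 second-level classes per power of two */

/* floor(log2(x)) for x > 0; real implementations use a single CPU instruction. */
static int highest_bit(size_t x)
{
    int pos = -1;
    while (x) {
        x >>= 1;
        pos++;
    }
    return pos;
}

static void tlsf_mapping(size_t size, int *fl, int *sl)
{
    assert(size >= ((size_t)1 << SL_INDEX_LOG2));
    *fl = highest_bit(size);
    *sl = (int)((size >> (*fl - SL_INDEX_LOG2)) - ((size_t)1 << SL_INDEX_LOG2));
}
```

With a list holding chunks of 16, 32 and 4096 bytes, `first_fit` for 1000 bytes visits three nodes, and that count keeps growing as small fragments accumulate. `tlsf_mapping(1000, ...)` always yields class (9, 15) in the same number of steps; TLSF then consults bitmaps of non-empty classes to pick a suitable list, which is also a bounded operation.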

Status: Offline

marko

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 1:50:56

[ #49 ]

Super Member

Joined: 17-Dec-2007
Posts: 1816
From: Gothenburg, THE front side of Sweden ;), (via Finland), EU

Hmm, this is interesting... and worrying :(

Here's some more numbers...

OS4.1 Workbench (with Quick-Fix) on Sam440ep-flex 800MHz

power2:
1000: 5773 ms (5.773 s)
2000: 11628 ms (11.628 s)
10000: 59183 ms (59.183 s)
50000: --
100000: --
1000000: --

bernd_afa:
1000: 94 ms (0.094 s)
2000: 190 ms (0.190 s)
10000: 955 ms (0.955 s)
50000: 4804 ms (4.804 s)
100000: 9825 ms (9.825 s)
1000000: 95997 ms (95.997 s)

-- --

OS4.1 without startup-sequence on Sam440ep-flex 800MHz

power2:
1000: 5585 ms (5.585 s)
2000: 11178 ms (11.178 s)
10000: 55919 ms (55.919 s)
50000: --
100000: --
1000000: --

bernd_afa:
1000: 84 ms (0.084 s)
2000: 169 ms (0.169 s)
10000: 845 ms (0.845 s)
50000: 4230 ms (4.230 s)
100000: 8461 ms (8.461 s)
1000000: 84616 ms (84.616 s)

-- --

OS3.x WinUAE/AmigaForever on Vista (with tons of background processes), AMD Athlon 64 X2 Dual Core 4200+, 2.2 GHz

power2:
1000: 20 ms (0.020 s)
2000: 59 ms (0.059 s)
10000: 280 ms (0.280 s)
50000: 1419 ms (1.419 s)
100000: 2880 ms (2.880 s)
1000000: 28639 ms (28.639 s)

bernd_afa:
1000: 19 ms (0.019 s)
2000: 39 ms (0.039 s)
10000: 199 ms (0.199 s)
50000: 1039 ms (1.039 s)
100000: 2080 ms (2.080 s)
1000000: 20739 ms (20.739 s)

_________________
AmigaOS 4.1 FEu2 on Sam440ep-flex 800MHz 1GB RAM
C128, A500+, A1200, A1200/40, AmigaForever 2008+09+16, 5 x86/x64 boxes
Still waiting (or dreaming) for the Amiga revolution...
m4rko.com/AMIGA

Status: Offline

fishy_fis

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 5:57:45

[ #50 ]

Elite Member

Joined: 29-Mar-2004
Posts: 2170
From: Australia

oops, accidental reposting of an earlier post.

Oh well, seeing as I made a post I needed to edit,....

@paolone

Those results are unusual... a VM should be significantly slower than a native set-up, but more than that, there seems to be a huge difference between our two results: a factor of 5-10x. Granted, a Core2Duo @ 3.6 GHz is probably 150-200 percent the speed of an Athlon64 X2 5200+, but there's still a huge discrepancy. The only thing I can think of is that maybe Icaros is using resources in the background that it shouldn't? (I don't use Icaros.) I'd be interested to find out what's going on here.

Last edited by fishy_fis on 20-Sep-2009 at 06:10 AM.
Last edited by fishy_fis on 20-Sep-2009 at 05:59 AM.

Status: Offline

Hans

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 6:13:08

[ #51 ]

Elite Member

Joined: 27-Dec-2003
Posts: 5123
From: New Zealand

@Fab

Quote:

Fab wrote:
Now, about desktop applications: even if you try to avoid allocations in critical code (which is not often the case in complex and badly designed C++ apps :)), you must also consider that the system can run for several days/weeks/months/years (OK, unlikely for an AmigaOS-like :)). With a linked-list structure, and even if we exclude the fragmentation problem, the allocation time could get really slow in the end, as opposed to TLSF (or any other O(1) allocator).

Considering that SLAB allocators are used on Unix systems that are kept running continually, I doubt that SLAB allocators suffer from memory allocation slowing down over time. I've never actually tested this with Amiga OS 4.x though (my machine is switched off when I'm not using it), so it remains to be seen what happens. Amiga OS 3 probably does have this problem.

Hans
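A minimal sketch of the slab idea may help show why allocation time need not depend on uptime. It assumes a single fixed object size and uses malloc() for the backing memory; it is not the AmigaOS 4 or any Unix implementation, and `slab_cache`, `slab_init`, `slab_alloc`, `slab_free` and `slab_destroy` are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct slab_cache {
    void  *memory;      /* one contiguous slab obtained from malloc() */
    void  *free_list;   /* singly linked list threaded through the free objects */
    size_t obj_size;    /* rounded-up size of each object */
    size_t free_count;  /* number of objects currently available */
};

/* Carve one slab into `count` equal objects. Returns 0 on success, -1 on failure. */
static int slab_init(struct slab_cache *c, size_t obj_size, size_t count)
{
    size_t align = sizeof(void *);
    assert(count > 0);

    /* A free object stores the pointer to the next free object, so each
     * object must be at least pointer-sized and pointer-aligned. */
    if (obj_size < align)
        obj_size = align;
    obj_size = (obj_size + align - 1) / align * align;

    c->memory = malloc(obj_size * count);
    if (c->memory == NULL)
        return -1;
    c->obj_size = obj_size;
    c->free_list = NULL;
    c->free_count = 0;
    for (size_t i = 0; i < count; i++) {
        void *obj = (char *)c->memory + i * obj_size;
        *(void **)obj = c->free_list;
        c->free_list = obj;
        c->free_count++;
    }
    return 0;
}

/* Pop one object off the free list; NULL when the slab is exhausted. */
static void *slab_alloc(struct slab_cache *c)
{
    void *obj = c->free_list;
    if (obj != NULL) {
        c->free_list = *(void **)obj;
        c->free_count--;
    }
    return obj;
}

/* Push an object back onto the free list. */
static void slab_free(struct slab_cache *c, void *obj)
{
    *(void **)obj = c->free_list;
    c->free_list = obj;
    c->free_count++;
}

static void slab_destroy(struct slab_cache *c)
{
    free(c->memory);
    c->memory = NULL;
    c->free_list = NULL;
    c->free_count = 0;
}
```

Both `slab_alloc` and `slab_free` are a single list push or pop, whatever the allocation history. A full slab allocator adds one such cache per size class and grows or releases slabs on demand, which this sketch leaves out.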

_________________
Join the Kea Campus - upgrade your skills; support my work; enjoy the Amiga corner.
https://keasigmadelta.com/ - see more of my work

Status: Offline

umisef

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 8:24:39

[ #52 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@umisef

From my earlier posting on the SheevaPlug, using the small allocations:
Quote:
100000 iterations: Elapsed time: 1506454 us (1.506454 s)

I have now put the code on my iPhone 3GS, and its performance is 3.3s for 100,000 iterations with small allocations, and 9.8s for 100,000 iterations for the power-of-two allocations.

So it appears the SheevaPlug isn't quite as slow as it feels. It's twice the speed (in this test) of an actual mobile phone CPU :)

Status: Offline

paolone

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 9:50:51

[ #53 ]

Super Member

Joined: 24-Sep-2007
Posts: 1145
From: Unknown

@fishy_fis

I use Icaros in the VM test as well...

Status: Offline

itix

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 10:13:52

[ #54 ]

Elite Member

Joined: 22-Dec-2004
Posts: 3398
From: Freedom world

@umisef

Quote:

Anyway --- if the relative merits (rather than failings) of allocators are what you want to look at, this benchmark is not particularly interesting. The allocation/deallocation patterns are very regular, and very friendly.
If you want to look at these things in the scenarios their complexity is meant to tackle, you'd need something like the source from here, which will happily fragment the memory map :)

Here is the result from the Sheevaplug:
Quote:

kittycam@ubuntu:~$ ./amemtest2 40960 1 3000000
3000000 iterations: Elapsed time: 1215286 us (1.215286 s), 0.405095 us per
kittycam@ubuntu:~$ ./amemtest2 40960 10 3000000
3000000 iterations: Elapsed time: 5795645 us (5.795645 s), 1.931882 us per
kittycam@ubuntu:~$ ./amemtest2 40960 100 3000000
3000000 iterations: Elapsed time: 3527694 us (3.527694 s), 1.175898 us per
kittycam@ubuntu:~$ ./amemtest2 40960 1000 3000000
3000000 iterations: Elapsed time: 4560545 us (4.560545 s), 1.520182 us per
kittycam@ubuntu:~$ ./amemtest2 40960 10000 3000000
3000000 iterations: Elapsed time: 7855803 us (7.855803 s), 2.618601 us per

Here are my results from Pegasos II G4 (1GHz).

MorphOS 2.3 with TLSF allocator:
Quote:

Varasto:Lähdekoodit/membench> amemtest2 40960 1 3000000
3000000 iterations: Elapsed time: 1653965 us (1.653965 s), 0.551322 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 10 3000000
3000000 iterations: Elapsed time: 1676783 us (1.676783 s), 0.558928 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 100 3000000
3000000 iterations: Elapsed time: 1985348 us (1.985348 s), 0.661783 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 1000 3000000
3000000 iterations: Elapsed time: 2747902 us (2.747902 s), 0.915967 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 10000 3000000
3000000 iterations: Elapsed time: 5915103 us (5.915103 s), 1.971701 us per

MorphOS 2.3 with SafeMemLists (MorphOS 1.x style memory system):
Quote:

Varasto:Lähdekoodit/membench> amemtest2 40960 1 3000000
3000000 iterations: Elapsed time: 16400619 us (16.400619 s), 5.466873 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 10 3000000
3000000 iterations: Elapsed time: 16686613 us (16.686613 s), 5.562204 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 100 3000000
3000000 iterations: Elapsed time: 18620755 us (18.620755 s), 6.206918 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 1000 3000000
3000000 iterations: Elapsed time: 36703252 us (36.703252 s), 12.234417 us per
Varasto:Lähdekoodit/membench> amemtest2 40960 10000 3000000
3000000 iterations: Elapsed time: 255323008 us (255.323008 s), 85.107669 us per

When interpreting the results, readers should note that in MorphOS malloc()/free() are mapped to AllocPooled()/FreePooled() calls, while Fab's benchmark used AllocMem()/FreeMem().

_________________
Amiga Developer
Amiga 500, Efika, Mac Mini and PowerBook

Status: Offline

bernd_afa

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 10:18:42

[ #55 ]

Cult Member

Joined: 14-Apr-2006
Posts: 829
From: Unknown

>Now, about desktop applications, even if you try to avoid allocations in critical code (which is not often the case in complex and badly designed c++ apps :)),

Right, and C++ programs are a real problem to develop with the memory-tracking tools that are recommended for verifying that programs do not trash memory or overflow buffers.

Please try the 8 KB memalloc version, run it with Wipeout for MOS 2.0, and post the time for 10000 iterations. Wipeout is here:

http://aminet.net/package/dev/debug/Wipeout-morphos

OS4 users can do this too; the program there is called memguard.

When I use the best-case version, which only allocates 8 KB, and run it with Wipeout, 10000 iterations take 11 s. Without Wipeout they take 0.6 s on my WinUAE system.

With tlsfmem it takes 0.2 s without Wipeout and 7 s with Wipeout.

This slowness is the reason that C++ programs run extremely slowly under Wipeout, so you can't keep Wipeout running all the time for testing. That is not good for program quality during development. I always want to run Wipeout when I develop.

I notice this large slowdown in all C++ programs, and it is extreme in OWB. Also libxml, which NetSurf needs, does lots of memory allocations: with Wipeout running, NetSurf takes several minutes to show large pages that appear in a few seconds without Wipeout.

OpenRedAlert, for example, needs over 3 minutes to start with Wipeout.

I don't understand why Wipeout causes so much slowdown. I have dual-channel DDR memory, and SysSpeed shows a fast2fast memory transfer rate of over 800 MB/s.

And 10000 iterations of 8 KB allocations in 11 s means less than 10 MB/s of memory is being checked.

I think it's very important to get a faster Wipeout. I don't understand why it's so slow; maybe it needs to keep the memory list in a hash table or something similar to speed it up.

Last edited by bernd_afa on 20-Sep-2009 at 10:28 AM.
Last edited by bernd_afa on 20-Sep-2009 at 10:23 AM.
Last edited by bernd_afa on 20-Sep-2009 at 10:22 AM.
Last edited by bernd_afa on 20-Sep-2009 at 10:21 AM.

Status: Offline

NutsAboutAmiga

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 10:31:47

[ #56 ]

Elite Member

Joined: 9-Jun-2004
Posts: 12993
From: Norway

@bernd_afa

wipeout2097 is too fast on AmigaOS4.1, so it can't have anything to do with memory allocations then.

(Screen resolution, however, does have a great impact on speed.)

Last edited by NutsAboutAmiga on 20-Sep-2009 at 10:35 AM.
Last edited by NutsAboutAmiga on 20-Sep-2009 at 10:33 AM.
Last edited by NutsAboutAmiga on 20-Sep-2009 at 10:32 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Status: Offline

bernd_afa

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 10:35:28

[ #57 ]

Cult Member

Joined: 14-Apr-2006
Posts: 829
From: Unknown

@NutsAboutAmiga

I don't mean the game Wipeout; please read at the link what Wipeout is.
Here is the link to the OS4 version:

http://www.os4depot.net/index.php?function=showfile&file=development/debug/memguard.lha

Status: Offline

wawa

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 12:03:54

[ #58 ]

Elite Member

Joined: 21-Jan-2008
Posts: 6259
From: Unknown

lol

Status: Offline

itix

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 16:03:58

[ #59 ]

Elite Member

Joined: 22-Dec-2004
Posts: 3398
From: Freedom world

@bernd_afa

Quote:

I don't understand why Wipeout causes so much slowdown. I have dual-channel DDR memory, and SysSpeed shows a fast2fast memory transfer rate of over 800 MB/s.

Each time you allocate or deallocate memory, Wipeout fills the memory block with a 0xDEADBEEF (or similar) pattern and checks the tracked memory.
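As a rough sketch of why this costs so much (an illustration under assumptions, not Wipeout's or memguard's actual code): a debugging allocator of this kind typically fills every block, adds guard bytes, and re-checks all tracked blocks on each call, so every allocation becomes a walk over the whole list. The names `debug_alloc`, `debug_free`, `check_all` and the byte values used here are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FILL_BYTE  0xDE   /* stand-in for a 0xDEADBEEF-style fill pattern */
#define GUARD_BYTE 0xEF
#define GUARD_SIZE 16

/* Header placed in front of each payload; guard bytes follow the payload.
 * (A production tool would also pad the header for stricter alignment.) */
struct tracked {
    struct tracked *next;
    size_t size;
};

static struct tracked *tracked_head = NULL;
static size_t last_check_damaged = 0;

static unsigned char *payload_of(struct tracked *t)
{
    return (unsigned char *)(t + 1);
}

/* Walk every tracked block and count those whose guard area was overwritten.
 * This runs on every alloc and free, so its cost grows with the live block count. */
static size_t check_all(void)
{
    size_t damaged = 0;
    for (struct tracked *t = tracked_head; t != NULL; t = t->next) {
        const unsigned char *guard = payload_of(t) + t->size;
        for (size_t i = 0; i < GUARD_SIZE; i++) {
            if (guard[i] != GUARD_BYTE) {
                damaged++;
                break;
            }
        }
    }
    return damaged;
}

static void *debug_alloc(size_t size)
{
    last_check_damaged = check_all();
    struct tracked *t = malloc(sizeof *t + size + GUARD_SIZE);
    if (t == NULL)
        return NULL;
    t->size = size;
    t->next = tracked_head;
    tracked_head = t;
    memset(payload_of(t), FILL_BYTE, size);                /* fill fresh memory */
    memset(payload_of(t) + size, GUARD_BYTE, GUARD_SIZE);  /* arm the guard area */
    return payload_of(t);
}

static void debug_free(void *ptr)
{
    if (ptr == NULL)
        return;
    last_check_damaged = check_all();
    struct tracked *t = (struct tracked *)ptr - 1;
    for (struct tracked **pp = &tracked_head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;   /* unlink from the tracking list */
            break;
        }
    }
    memset(ptr, FILL_BYTE, t->size);   /* poison freed memory */
    free(t);
}
```

With n live blocks, each call does work proportional to n on top of two fills of the block itself, which is consistent with the slowdown growing in allocation-heavy programs. Keeping the tracked blocks in a hash table, as suggested earlier in the thread, speeds up lookup of a single block but not a full check of every guard area.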

_________________
Amiga Developer
Amiga 500, Efika, Mac Mini and PowerBook

Status: Offline

ChrisH

Re: Interesting memory allocation benchmark
Posted on 20-Sep-2009 17:03:07

[ #60 ]

Elite Member

Joined: 30-Jan-2005
Posts: 6679
From: Unknown

@number6 & others
I have recompiled & uploaded the OS4 versions, without any SObj dependencies (although anyone could have done the same by installing the last public version of PortablE, which does not have that problem).

So if any OS4.0 users want to run those test executables, they can now.

_________________
Author of the PortablE programming language.
It is pitch black. You are likely to be eaten by a grue...

Status: Offline

Amigaworld.net was originally founded by David Doyle