Found: Dreamcast Tower of Babel Demo

Discussion in 'Sega Dreamcast Development and Research' started by Dreamcast, Feb 8, 2014.

  1. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    but let's return to @TriMesh Naomi2 dispute.
    I've looked at the Ninja lib source code (the library from the Katana SDK which does all the 3D math) and found the spot light routine there; it is ~130 lines of SH4 assembly code. How many CPU clocks does it take? I'm too lazy to calculate it precisely: the SH4 is a superscalar CPU and may execute 2 simple instructions at once, but there are also "heavy" instructions like division or square root which may take 10-20 or more clocks, so let's assume this routine's execution time is the same 130 SH4 clocks.

    then do some math: 200'000'000 clocks/sec / 130 = 1'538'461 -- this is the number of vertices per second to which the SH4 may apply a single spot light (and nothing more).
    okay, but how about per frame @60fps? 1'538'461 / 60 = 25'641
    great, but what about 6 light sources (as is usual in Naomi2 games)? 25'641 / 6 = 4'273
    yes, only 4K vertices, not 40K lol
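
    The same estimate can be written as a small Python sketch. The 200 MHz clock and the 130-clock cost per spot light are the figures assumed in this post (the real routine was not cycle-counted), and the function name is made up for illustration.

```python
SH4_CLOCK_HZ = 200_000_000    # SH4 clock rate used in the post
CYCLES_PER_SPOT_LIGHT = 130   # assumed: ~1 clock per line of assembly


def lit_vertices_per_frame(lights, fps=60,
                           clock_hz=SH4_CLOCK_HZ,
                           cycles_per_light=CYCLES_PER_SPOT_LIGHT):
    """Vertices per frame the SH4 could light if it did nothing else.

    Assumes the cost scales linearly with the number of spot lights.
    """
    cycles_per_vertex = cycles_per_light * lights
    return clock_hz // (cycles_per_vertex * fps)
```

    With these assumptions, lit_vertices_per_frame(1) gives 25641 and lit_vertices_per_frame(6) gives 4273, matching the numbers above.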
     
  2. TriMesh

    TriMesh Site Supporter 2013-2017

    Joined:
    Jul 3, 2008
    Messages:
    2,373
    Likes Received:
    785
    Yeah, I think we were talking past each other here, since in the end we both seem to be saying the same thing - which is basically that they had to add the T&L because the SH4 wouldn't have been anywhere near able to handle the sort of scenes you can render on the Naomi2 without it.
     
  3. accel99

    accel99 Spirited Member

    Joined:
    May 27, 2008
    Messages:
    193
    Likes Received:
    19
    Wow. That is definitely a step down. Though a counter-argument would be that in the old days games would be cut down to run on consoles, like, say, Virtua Fighter arcade vs Virtua Fighter 32X, and it's mind-boggling why Sega wouldn't consider that for the Dreamcast. The end results would still have been good.

    On the other forum, the guy who mentioned Wacky Races also mentioned how a PSP port of the Naomi 2 game Initial D was like 50k tris @ 30fps and still manages to resemble the original. A far cry from the 60k to 100k you mentioned for Naomi 2. I mean, you yourself said Wacky Races goes up to 70k per frame; they could have made something very faithful to the original despite the simplistic lighting it would require.

    Oh well.
     
  4. TapamN

    TapamN Active Member

    Joined:
    Sep 16, 2005
    Messages:
    32
    Likes Received:
    13
    It's been a while since I've done these load stress experiments, but from what I remember, the PVR2 didn't seem to reliably run at 3M poly/sec on real workloads. I've reached a bit over 4.1M poly/sec drawing tori (a lot of tiny, consistently sized, near 1:1 aspect ratio polygons that almost never cross tile boundaries), which maxed out the PVR (the SH4 still had spare cycles; I think I had around 34-36 cycles per vertex on my best general purpose routine*), but when I tried to draw real world models (I used this model, with larger polygons that cross multiple tiles, with a variety of aspect ratios), the PVR seemed to start struggling at less than 2M poly/sec (that model isn't optimized very well for the PVR, though. The modelers liked to use trifans over tristrips, and a real game would have character models with smaller triangles to help inflate the polygon count, so you could probably still do better than what I got if properly optimized). If the Naomi 2 does a stable ~3M+, I'd say it's improved polygon throughput a bit. Not 2x, but maybe 1.25x-1.5x.

    * This was with textures, with one parallel light source and ambient. It's possible to do a bit faster. If the model is being drawn multiple times and fits in cache, a modified vertex format (larger, 40 bytes instead of 32) would shave (IIRC) a cycle off each vertex at the cost of worse performance on models larger than cache or only drawn once, and using an approximation for perspective would save a cycle, I think, possibly two. Getting rid of lighting would not be a huge improvement, as lighting was done in parallel with the perspective divide. Maybe a one cycle gain with a rearranged vertex format, thanks to no longer needing to store the vertex normal.
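
    As a rough cross-check of the "SH4 still had spare cycles" remark, here is a small Python sketch. It assumes the 200 MHz SH4 clock used earlier in the thread and, as an idealised limit for long triangle strips, about one submitted vertex per polygon; both are assumptions, not measurements, and the function names are made up for illustration.

```python
SH4_CLOCK_HZ = 200_000_000  # assumption: 200 MHz, the figure used earlier in the thread


def vertices_per_second(cycles_per_vertex, clock_hz=SH4_CLOCK_HZ):
    """Upper bound on vertices/sec if the SH4 only ran the vertex routine."""
    return clock_hz / cycles_per_vertex


def spare_cpu_fraction(vertices_submitted_per_sec, cycles_per_vertex,
                       clock_hz=SH4_CLOCK_HZ):
    """Fraction of SH4 cycles left over after processing the given vertex rate."""
    used = vertices_submitted_per_sec * cycles_per_vertex / clock_hz
    return 1.0 - used
```

    At 34-36 cycles per vertex this works out to roughly 5.6M-5.9M vertices/sec, so submitting ~4.1M vertices/sec would leave about a quarter of the CPU unused under these assumptions.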

    Wacky Races had a very inconsistent frame rate on real hardware. I'm not sure if it would actually be hitting 30 FPS on real HW in high-poly scenes. I remember it had an option to turn off outlines to reduce load, which helped stabilize the frame rate quite a bit.

    I think the SH4 can do better lighting than what has been demonstrated so far, but I don't really have time to type up my thoughts on it right now.
     
  5. Esppiral

    Esppiral Enthusiastic Member

    Joined:
    Oct 3, 2012
    Messages:
    550
    Likes Received:
    1,133
    Wacky Races has an uncapped frame rate; it tops out at 60 fps and goes down from there...

    On the world hub it maintains 60 fps, but during races it struggles.
     
  6. Esppiral

    Esppiral Enthusiastic Member

    Joined:
    Oct 3, 2012
    Messages:
    550
    Likes Received:
    1,133
    @TapamN have you ever released your work? I've seen that demo and I'd like to run it on my Dreamcast
     
  7. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    @TapamN thanks for the writeup, nice to see competent people here

    Sega's specs say Naomi2 can render something more than 5M/sec (opaque, textured, gouraud shaded, single volume). there are 2x PVR2s, each rendering half of the screen, so it's expected to be close to 2x I think?

    Naomi2 ELAN math - 13M vertex/sec; fixed-function lighting - ambient, point, directional, spot/angle, with or without distance falloff, diffuse and/or specular, may take into account front/back face; up to 6 light sources are "free" (no performance impact); if using the maximum of 16 sources, performance drops 2.5x.
    ELAN fetches display lists and model data from dedicated RAM, does the math and then pushes the result to the 2x PVR2s' Tile Accelerators.
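
    To put the quoted ELAN figures on a per-frame footing, here is a small Python sketch. It only uses the two data points given here (full rate up to 6 lights, 2.5x slower at 16); the cost for 7-15 lights is not stated, so the sketch refuses to guess. The constant and function names are made up for illustration.

```python
ELAN_VERTS_PER_SEC = 13_000_000  # quoted peak vertex rate
FREE_LIGHTS = 6                  # up to 6 lights: no performance impact
MAX_LIGHTS = 16                  # hardware maximum per polygon
MAX_LIGHTS_SLOWDOWN = 2.5        # quoted slowdown with all 16 lights


def elan_vertices_per_frame(lights, fps=60):
    """Peak vertices per frame for the light counts the post gives figures for."""
    if not 0 <= lights <= MAX_LIGHTS:
        raise ValueError("ELAN supports at most 16 lights per polygon")
    if lights <= FREE_LIGHTS:
        rate = ELAN_VERTS_PER_SEC
    elif lights == MAX_LIGHTS:
        rate = ELAN_VERTS_PER_SEC / MAX_LIGHTS_SLOWDOWN
    else:
        # No figure is quoted for 7-15 lights, so do not interpolate.
        raise ValueError("no quoted figure for 7-15 lights")
    return int(rate // fps)
```

    At 60 fps this gives 216666 vertices per frame with 6 lights and 86666 with 16, against the ~4K estimated above for the SH4 doing 6 spot lights in software.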

    @accel99 there is one more interesting thing about Wacky Races - it is one of very few Dreamcast games which are rendered at 1280x480 (and later scaled down to 640 with interpolation, to make the image smooth and reduce aliasing)
     
  8. Esppiral

    Esppiral Enthusiastic Member

    Joined:
    Oct 3, 2012
    Messages:
    550
    Likes Received:
    1,133
    @MetalliC Wait What?! Wacky Races renders natively in 1280x480?!
     
  9. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    @Esppiral right, there are 2 more games which do the same - Omikron: The Nomad Soul and Ready 2 Rumble
     
  10. -=FamilyGuy=-

    -=FamilyGuy=- Site Supporter 2049

    Joined:
    Mar 3, 2007
    Messages:
    3,097
    Likes Received:
    1,046
    It cannot output it though. It's a kind of horizontal 2xMSAA.
     
  11. TapamN

    TapamN Active Member

    Joined:
    Sep 16, 2005
    Messages:
    32
    Likes Received:
    13
    @Esppiral
    Oh, I didn't remember that it targeted 60. Just that it dropped a lot when too much was happening.

    I haven't released much. I did release the high performance rendering code I mentioned, here, but most of what I've done is just prototypes and not ready for release. The Hyrule Castle engine demo isn't running at the moment for some reason; some of the resources seem to have been overwritten, so I'd have to track down which ones are bad and regenerate them (plus it currently only loads its resources over dcload, so it only runs on a real DC with coder's cable/BBA. IIRC, the clipping code also wouldn't run on an emulator without cache emulation).

    @MetalliC
    Are there any places with documentation on ELAN? I'm curious what the HW is like. Stuff like what its registers and vertex formats are. MAME's source didn't have anything when I last looked at it.

    I wouldn't expect a 2x performance improvement from two PVRs. Things like that always work out to be WAY less than you'd hope for. Gotta be pessimistic about that stuff. :p

    @-=FamilyGuy=-
    It's supersampling, not multisampling, actually. It'd be nice if the DC had multisampling, it would probably be really cheap with the way the PVR works.
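
    The distinction is that supersampling fully shades every sample (here, all 1280 per scanline) and then filters down, while multisampling shades once per output pixel and only tests coverage/depth per sample. A minimal Python sketch of the downscale step, assuming a plain average of each horizontal pair (the exact filter the hardware applies is not stated in this thread, and the function name is made up):

```python
def downsample_2x_horizontal(row):
    """Average each adjacent pair of samples in one scanline.

    `row` is a list of integer intensities for a single colour channel,
    e.g. 1280 samples in, 640 pixels out.
    """
    if len(row) % 2 != 0:
        raise ValueError("scanline width must be even")
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row), 2)]
```

    A hard edge that falls inside a pair, such as the samples 0 and 255, comes out as the mid grey 127, which is where the smoothing comes from.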

    Someone needs to make a PAR code to disable the vertical blur that Sega's 2x SSAA register config does on VGA.

    Any ideas on where to continue this conversation if the site goes down in the middle of it? I want to explain my ideas on how to get better effects and lighting on the SH4 than what's been seen, but I don't want to work on a long post if the site dies before I can finish it.
     
  12. -=FamilyGuy=-

    -=FamilyGuy=- Site Supporter 2049

    Joined:
    Mar 3, 2007
    Messages:
    3,097
    Likes Received:
    1,046
    I didn't know the nuances between MS and pure SS, so thanks for giving me reading material ;)

    Most people from AG are migrating to obscuregamers.com but this technical discussion about Dreamcast would probably be welcome on DCEmulation.org too.
     
  13. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    sadly no.

    why? the PVR's rendering is like 400 standalone renders of 32x32 pixel tiles, and it is expected to scale pretty well.
    I can only imagine a lower performance boost because of the ISP/TSP caches, which store the parameters precalculated by the "triangle setup FPU"s: if the next tile is not adjacent but in chequered order, there might be more "cache misses" and a higher load on the triangle setup FPUs.
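
    The locality concern can be illustrated with a small Python sketch that assigns tiles to two chips in two hypothetical ways and counts how many horizontally adjacent tile pairs stay on the same chip. How Naomi2 actually divides the screen is not documented in this thread; the left/right split and the chequered split below are both assumptions, as are the function names. At 640x480 the grid is 20x15, i.e. 300 tiles, the same order as the rough figure quoted above.

```python
TILE_SIZE = 32  # PVR2 tile size in pixels


def tile_grid(width=640, height=480, tile=TILE_SIZE):
    """Return (columns, rows) of tiles covering the screen."""
    return width // tile, height // tile


def assign_halves(cols, rows):
    """Hypothetical split: chip 0 takes the left half, chip 1 the right half."""
    return [[0 if x < cols // 2 else 1 for x in range(cols)] for _ in range(rows)]


def assign_checkerboard(cols, rows):
    """Hypothetical split: chips alternate in a chequered pattern."""
    return [[(x + y) % 2 for x in range(cols)] for y in range(rows)]


def same_chip_neighbour_pairs(assignment):
    """Count horizontally adjacent tile pairs handled by the same chip."""
    return sum(1 for row in assignment
               for a, b in zip(row, row[1:]) if a == b)
```

    With a 20x15 grid, the half split keeps 270 of the 285 horizontal neighbour pairs on one chip, while the chequered split keeps none, which is the situation where setup data for a triangle spanning adjacent tiles could not be reused by the same chip.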
     
  14. accel99

    accel99 Spirited Member

    Joined:
    May 27, 2008
    Messages:
    193
    Likes Received:
    19
    That's an interesting fact, thank you. I wanted to ask one more question despite not being competent, as you put it. It's not exactly Dreamcast but it's related: the Sega Hikaru.

    The setup for the Sega Hikaru is two SH4s while the GPU takes care of transformation and lighting / Phong shading / clipping, right? Was the second SH4 necessary? Do they both take care of game code / physics, or does one assist the GPU? How does it work?

    This came to mind because a while back someone posted a snippet of a Katana SDK comment about bump mapping. Something like: if the bumpmap function is submitted/updated once per frame, you get correct Phong shading. I imagine that must be taxing on the Dreamcast, but it could have been interesting to see a real use case.
     
  15. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    right
    this is a question for Sega; if they put a 2nd SH4 there, then they had in mind some usage scenarios where the 2nd CPU would be necessary.
    in most games the 2nd SH4 is almost idle; I recall only the AirTrix game, where it does model animation / physics.
     
  16. accel99

    accel99 Spirited Member

    Joined:
    May 27, 2008
    Messages:
    193
    Likes Received:
    19
    Huh, almost idle. I guess they could have scraped by with just one SH4 and one AICA then.

    Was there any merit to 1024 lights per scene, or was that just hype? I remember there was a short-lived Hikaru emulator. It didn't get far, but I remember the author mentioning that Hikaru used multiple lights in clusters. Is any of that true?
     
  17. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    yes.
    there are 1024 light records, which are grouped into 256 "Light Sets" - 4x lights and an enable/disable mask. but a polygon may use only one light set, i.e. 4x lights max.

    in general, "per scene" light limits usually make no sense, but there is a strict per-polygon limit.
    so, for example, Naomi2 ELAN can do 6/16 lights per polygon, but you may use whatever number of light sources you want per scene.
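
    The light-set scheme described above can be sketched in Python as follows. The real Hikaru record layout is not documented here, so this assumes the simplest arrangement consistent with 1024 = 256 x 4: set n owns light records 4n..4n+3, and bit i of the mask enables the i-th light of the set. The names and the bit order are made up for illustration.

```python
NUM_LIGHTS = 1024       # total light records
LIGHTS_PER_SET = 4      # lights grouped into each "Light Set"
NUM_LIGHT_SETS = 256    # NUM_LIGHTS / LIGHTS_PER_SET


def lights_for_polygon(light_set_index, enable_mask):
    """Return the light record indices that affect a polygon.

    A polygon selects exactly one light set, so at most 4 lights apply.
    Assumed layout: set n owns records 4n..4n+3; mask bit i enables light i.
    """
    if not 0 <= light_set_index < NUM_LIGHT_SETS:
        raise ValueError("light set index out of range")
    if not 0 <= enable_mask < (1 << LIGHTS_PER_SET):
        raise ValueError("enable mask must fit in 4 bits")
    base = light_set_index * LIGHTS_PER_SET
    return [base + i for i in range(LIGHTS_PER_SET) if enable_mask & (1 << i)]
```

    Different polygons can pick different sets, which is why the scene as a whole can reference all 1024 records even though any single polygon sees at most four.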
     
    Last edited: Jul 2, 2019
  18. accel99

    accel99 Spirited Member

    Joined:
    May 27, 2008
    Messages:
    193
    Likes Received:
    19
    So does the set have to have identical properties (color / type of light)? Can it be any type of light or only point lights? 4 per polygon, huh. So if I make a set of infinite/directional lights that affects the whole scene, would any other set of lights be ignored because I am already using 4 lights to light the entire scene's polygons?

    I notice that people seem to talk about Hikaru as the superior machine when compared to Naomi 2; graphically it doesn't seem so, in my opinion. What say you?
     
  19. MetalliC

    MetalliC Spirited Member

    Joined:
    Apr 23, 2014
    Messages:
    180
    Likes Received:
    133
    lights can be different types of course.
    so far the following light attributes were identified: color, position, direction, distance falloff coefficients/type.
    no angle falloff; it might not exist in this hardware, or we missed it, or it is not used in games.
    not sure if I understood the question.
    basically, you may set up to 1024 light sources, and select up to 4 of them for a specific polygon/model.
    I'm not sure, there is too little known about Hikaru specs/capabilities (I mean 100% confirmed information, not suggestions or rumors).
    if it really does per-pixel lighting - it is damn cool for its time (Hikaru PCBs are dated 1998, which also means Hikaru is ~2 years older than Naomi2).

    I personally tend to think Hikaru was not much more powerful than N2, if at all.
    they are too different: Hikaru is like a successor of Sega Model3, while N2 is an enhanced Dreamcast/NAOMI.
     
    Last edited: Jul 4, 2019
