Discussion in 'Sega Dreamcast Development and Research' started by Mrneo240, Mar 8, 2019.
can you explain what you mean by this? and in what iteration of the engine?
What I mean is that in the very original, software rendered Quake 1 and 2 engine, the base texture maps and the lightmaps were not blended at runtime every time the polygons were drawn. Rather, if I remember correctly from back in the day, each visible surface (polygon) would get its base texture map and lightmap pre-blended at a given resolution and stored in a "flat" surface cache.
The actual polygon rendering pass would then draw the screen one scanline at a time using an edge table, and using the surface cache as texture to draw each polygon's portion of a scanline in a single pass.
So technically one could do the same: pre-blend textures and lightmaps, either in software or using render to texture, into an atlas for each visible polygon, and then have each final polygon take the atlas' UV coordinates instead of its original UVs. This way the per-frame rendering pass could be completely opaque, and only the polygons that become visible between frames would need to be rendered into the atlas.
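To make the pre-blending idea concrete, here is a minimal software sketch in C, not the actual Quake code. It assumes RGB565 texels, an 8-bit monochrome lightmap, and nearest-sample lightmap lookup (the original engine interpolated between samples); the names `modulate_rgb565` and `build_cached_surface` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scale one RGB565 texel by an 8-bit light level
   (0 = black, 255 = unchanged). */
static uint16_t modulate_rgb565(uint16_t texel, uint8_t light)
{
    uint32_t r = (texel >> 11) & 0x1F;
    uint32_t g = (texel >> 5) & 0x3F;
    uint32_t b = texel & 0x1F;

    r = (r * light) / 255;
    g = (g * light) / 255;
    b = (b * light) / 255;

    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Hypothetical helper: pre-blend a w*h block of base texels with a lightmap
   into one "flat" cached surface. Each lightmap sample covers an
   lm_scale x lm_scale block of texels (nearest sample, no filtering). */
static void build_cached_surface(const uint16_t *base, const uint8_t *lightmap,
                                 uint16_t *out, size_t w, size_t h,
                                 size_t lm_scale)
{
    size_t lm_w = (w + lm_scale - 1) / lm_scale; /* lightmap row length */

    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            uint8_t light = lightmap[(y / lm_scale) * lm_w + (x / lm_scale)];
            out[y * w + x] = modulate_rgb565(base[y * w + x], light);
        }
    }
}
```

The output block is what would be copied into the atlas slot for that polygon, so the per-frame pass only ever samples one opaque texture.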
There's nothing special to be done specifically for maximizing fillrate for blended polygons; just do the same things that should be done for all pixels: twiddle all textures, use mipmaps on anything minified, VQ compress if possible, avoid very thin/narrow triangles (especially ones that are diagonal), and avoid having a lot of overlap if transparency sorting is on.
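For anyone unfamiliar with twiddling: it reorders texels so that their memory index interleaves the bits of x and y (a Morton/Z-order layout). Below is a rough C sketch for square, power-of-two, 16bpp textures. The bit assignment (y in the even bits, x in the odd bits) is my understanding of the PVR layout and should be treated as an assumption; the function names are invented for this example.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that bit i lands at bit 2*i. */
static uint32_t spread_bits(uint32_t v)
{
    uint32_t out = 0;
    for (int i = 0; i < 16; i++)
        out |= ((v >> i) & 1u) << (2 * i);
    return out;
}

/* Twiddled index of texel (x, y): y bits in even positions, x bits in odd
   positions (assumed PVR convention). */
static uint32_t twiddle_index(uint32_t x, uint32_t y)
{
    return spread_bits(y) | (spread_bits(x) << 1);
}

/* Copy a linear size*size 16bpp texture into twiddled order.
   size must be a power of two; src and dst must not overlap. */
static void twiddle_square_16bpp(const uint16_t *src, uint16_t *dst,
                                 uint32_t size)
{
    for (uint32_t y = 0; y < size; y++)
        for (uint32_t x = 0; x < size; x++)
            dst[twiddle_index(x, y)] = src[y * size + x];
}
```

In practice this would be done offline or once at load time, not per frame.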
Maybe. I've done my own mini-GL implementation before (it was as part of someone else's project, so I don't think I can release any of it without their permission), and I'm currently working on a DC-specialized rendering library. It's designed to be GL-like in usage, but with various changes to make it match the hardware closer, less error checking, and hooks to make it easy to mix with custom external rendering code.
Some things I've noticed about the GLdc implementation at first glance:
pvr_poly_compile is used to generate PVR headers. It's really slow and should be avoided. Changing one setting causes everything to be re-evaluated, and its large size blows out the instruction cache. I made a replacement library for it (which I actually quietly posted with other code a long time ago). An up-to-date version is posted here. The library is designed to allow the compiler to optimize function calls to it really well.
When using the FTRV instruction in mat_trans_single3_nodiv and mat_transform3, the vector register fv12 is used for the calculation. That register is a saved register, which means GCC has to preserve and restore its value for the caller, which probably wastes around 8 cycles per mat_transform3 call. Changing it to use any other vector register should give a small speed increase to this function.
Another thing to consider is using the floating point color vertex formats instead of packed color. The TA will clamp the floats to [0, 1] and convert and pack them for you. There's no wasted memory on the PVR side using floats versus using packed color (although if you buffer the vertex data in RAM before sending it to the TA, the buffer will be twice the size).
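For comparison, this is roughly the per-vertex CPU work that packed color requires and that the float formats hand off to the TA. It is an illustrative C sketch only: the function names are made up, and truncating instead of rounding is an assumption, not a statement about what the hardware does.

```c
#include <stdint.h>

/* Clamp a color component to the [0, 1] range. */
static float clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Hypothetical helper: convert float ARGB to a packed 32-bit ARGB8888 word.
   Components are clamped, scaled to 0..255, and truncated. */
static uint32_t pack_argb8888(float a, float r, float g, float b)
{
    uint32_t ia = (uint32_t)(clamp01(a) * 255.0f);
    uint32_t ir = (uint32_t)(clamp01(r) * 255.0f);
    uint32_t ig = (uint32_t)(clamp01(g) * 255.0f);
    uint32_t ib = (uint32_t)(clamp01(b) * 255.0f);

    return (ia << 24) | (ir << 16) | (ig << 8) | ib;
}
```

With the float vertex formats, lighting code can write its results straight out as floats and skip this clamp-and-pack step on the SH4.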
So are 100% of all polygons (besides punch-through) submitted in the transparent list, or am I misunderstanding? If they are all submitted as transparent, why are they being drawn like that? I would modify the renderer so that opaque polygons are used where possible.
The PVR's rendering is pipelined by a couple frames, so modifying textures between frames is more complicated than other hardware. (You don't want to modify a texture while it's being rendered from.) You'd have to either double buffer all surface textures, or insert big rendering stalls between each frame. You'd also have to twiddle all the updated textures each frame, which would add extra CPU overhead. Multitexturing is probably the better option for the DC.
Ok, so I started to try and switch completely over to your alternative method.
So far it hasn't gone well. I seem to be getting only single tris and not strips, and every tri is a wildly different rainbow color.
Definitely a feature and not a bug ;-)
latest performance changes are on gitlab
DOES NOT INCLUDE THE RAINBOW SOUP ABOVE
Here's a scrambled binary.
It's way faster now, and should run just about most mods, but slowly.
I, for one, would like an option for the LSD DISCO mode...
Any current video of the performance?
Great job as always!
Near the end of the dev stream video I showed demos of a couple things. Quake is there.
I wonder how it would run two of my favorite Dreamcast mods: the fun Codename: Corporal and Codename: Envenom. I hope it's easy to use with bootdreams if I ever get to it.
Idk what those are, but sounds cool
Should be very easy with cdi4dc.
Ugh sorry double post.
Oh, they are. You should try 'em. Codename: Corporal is a 1-player, crazy fast-paced deathmatch with some nice music, and Envenom is like a Resident Evil type game. I am a dumbass though when it comes to selfboot, so who knows if I'll figure it out. I only know where to get Codename: Corporal from: http://quakedev.dcemulation.org/downloads/codename
What is the actual alternative method you implemented?
Very neat! Great quality lighting! Wow! I'm too caught up with Mars but great stuff.
they both load but are dark.
I added gamma control, so thanks! It was a feature I didn't know we needed until just now.
Also can go darker if you so choose?
I'm using tapanms method for generating poly headers. It seems better on paper and in the compiler output.
I just haven't been able to figure it all out yet, and there's some stuff I need that's missing, but that's minor for now.