Discussion in 'Sega Saturn Programming and Development' started by Esppiral, Aug 30, 2016.
That's an excellent mod as it does not distort the 2D elements.
These codes all should be tested on real hardware for two reasons:
1) Hacks that won't run on real hardware are kinda lame
2) Hacks that run only on an emulator are likely to break on a different emulator than the one they were found on, and there's a solid chance they will break when the emulators they do work on are updated.
See the many Super Mario 64 hacks that won't run on anything except specific versions of a specific emulator. I'm sure a lot of these that don't run on real hardware correctly can be made to work better by adjusting other values at other addresses. The emulator simply isn't accurate enough to reflect the problem. To be fair, beyond emulators I'm not aware of any convenient way to poke around the Saturn's RAM for addresses.
Though I am happy to see many of these codes working out of the box. Going to buy an AR if I can find one with the DB25 port and play around with these.
I do not have a real console. I do not need it, because it's easier and more comfortable for me to run the game on the emulator. If you are a true fan or adherent of real hardware, then use these hacks, as they say, at your own peril and risk.
First of all, thank you and the whole community for your incredible contributions to this great machine that is dear to all of us.
I was astonished to discover that Tomb Raider on the Sega Saturn does not use the VDP1 High Speed Shrink register (aka HSS).
You can modify the "distorted sprite" and "scaled sprite" primitives to use this register.
In the best case it would double the fill rate and hopefully reduce the slowdowns of the original game.
I still cannot get over my astonishment that they did not use it back in the day.
Thank you, I look forward to your news about this!
David Gámiz Jiménez
If this were so, why do you think the developers did not take this opportunity?
Half-transparency (meshes) and HSS code for shadow (06066368 15C0):
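For readers wondering what a value like 15C0 might mean, here is a minimal sketch. It assumes the 16-bit value is a VDP1 CMDPMOD (draw mode) word and that the eight leading hex digits are simply the target address; real Action Replay codes may also encode a command type in the leading digit, which is not modelled here. The helper names are hypothetical, and the bit layout is my reading of the VDP1 command table and should be checked against the VDP1 manual.

```python
# Sketch only: hypothetical helpers, bit layout assumed from the VDP1
# CMDPMOD description (verify against the official manual).
CMDPMOD_FLAGS = {
    15: "MON (MSB on)",
    12: "HSS (high speed shrink)",
    11: "Pclp (pre-clipping disable)",
    10: "Clip (user clipping enable)",
    9: "Cmod (user clipping mode)",
    8: "Mesh",
    7: "ECD (end code disable)",
    6: "SPD (transparent pixel disable)",
}


def parse_ar_code(code):
    """Split 'AAAAAAAA VVVV' into (address, 16-bit value).

    Assumption: the first field is used as a plain address.
    """
    addr_text, value_text = code.split()
    return int(addr_text, 16), int(value_text, 16)


def decode_cmdpmod(value):
    """List the flag bits that are set, plus the two 3-bit mode fields."""
    flags = [
        name
        for bit, name in sorted(CMDPMOD_FLAGS.items(), reverse=True)
        if value & (1 << bit)
    ]
    return {
        "flags": flags,
        "color_mode": (value >> 3) & 0x7,  # bits 5-3
        "color_calc": value & 0x7,         # bits 2-0
    }


address, value = parse_ar_code("06066368 15C0")
decoded = decode_cmdpmod(value)
print(hex(address), decoded)
```

Under those assumptions, 15C0 has both the HSS and the Mesh bits set, which matches the description of the code as "meshes and HSS".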
They worked on Tomb Raider 2 and I remember reading somewhere that now that they knew the hardware better, the performance was much improved.
I guess Sega just didn't help third party devs that much.
TR1 ran on PS1 at something like 10 fps when the game was almost ready to ship, until one of the devs went to see Sony and their performance analyser hardware. Just by making some small modifications, in a single day, the framerate was steady at 30 fps, thanks to Sony's good support.
In other words, I too think that adding HSS would improve the in-game framerate quite a bit, and it's likely they didn't know much about it. The first version of the VDP1 didn't even support HSS, so maybe they only had old technical documentation?
Thank you! @paul_met
Well, I do not know the real reason. I can think of three scenarios:
1) They did not know about the register and its advantages. That would be very strange.
2) They tried it and saw no improvement. Strange, but not impossible. They already use some form of tessellation or LOD (I do not know exactly which) with two mip-map levels, and they may have thought that was enough, or the most fill rate they could gain.
3) They tried it and did not like the visual result, and figured that with the measures above they had already saved enough cycles, or could not save any more.
I rule out that they left it out to deliberately harm the game or the platform. That would be scandalous.
On the other hand, it's funny that you activated it on a "polygon", because according to the documentation that should not be possible.
Where it really gets interesting is with the "distorted sprite" or "scaled sprite", since they are textured quads. They make up the stage geometry, about 95% of the graphics in a frame.
Fetching and drawing textures is what consumes the most resources. If you halve the reads and writes by skipping alternate lines, which is what HSS does, drawing time should in theory be gained, and recovered in the places where the game currently falls short.
UPDATE 1: I'm sorry, I was wrong; my memory failed me. HSS does not skip alternate lines. It skips alternate pixels within a line that has been shrunk. It is still a gain, but it will not double the fill rate as estimated, because it only affects quads that are drawn shrunk, such as a "distorted sprite" or a "scaled sprite". I apologize again.
UPDATE 2: Thinking it over and estimating roughly: if the redraw factor of VDP1 is between x0.3 and x0.5, the gain from HSS could be between x1.3 and x1.5.
That is, unless the slowdowns in Tomb Raider on the SS come from the CPU or from how Bus-B is managed.
Finally, it is a common mistake to call the mesh effect semi-transparency. The correct name is "mesh effect" or "mesh", because semi-transparency is the actual color calculation mode that VDP1 performs, a real 50% transparency.
Thank you anyway! Greetings!
David Gámiz Jiménez
The console was developed in the era of old tube TVs and low-quality composite video signals. On such TVs the mesh looked like real half-transparency. So I think that, in Saturn terms, the mesh can be equated to translucency for that period.
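To make the distinction between mesh and real half-transparency concrete, here is a small sketch. It assumes a checkerboard mesh (the sprite pixel is written only where x + y is even), a 50% per-channel average for half-transparency, and a very crude stand-in for composite blur that averages each horizontal pair of pixels. The function names and colour values are invented for illustration.

```python
# Sketch only: toy models of mesh, 50% blending and composite blur.

def half_transparent(src, dst):
    """Real half-transparency: 50% blend per RGB channel (integer average)."""
    return tuple((s + d) // 2 for s, d in zip(src, dst))


def mesh_row(src, dst, width, y=0):
    """Mesh: write the sprite pixel only on a checkerboard, keep dst elsewhere."""
    return [src if (x + y) % 2 == 0 else dst for x in range(width)]


def blur_pairs(row):
    """Crude composite-blur model: average each horizontal pair of pixels."""
    return [half_transparent(row[i], row[i + 1]) for i in range(0, len(row) - 1, 2)]


shadow = (0, 0, 0)      # sprite colour, 5 bits per channel (invented value)
floor = (20, 16, 12)    # background colour (invented value)

row = mesh_row(shadow, floor, 8)
blurred = blur_pairs(row)
print(blurred[0], half_transparent(shadow, floor))
```

In this toy model the blurred mesh and the true 50% blend come out identical, which is the effect described above for old composite TVs. On a sharp display the checkerboard stays visible instead.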
Well, I activated HSS for all other scaled / distorted sprites, but I did not notice a performance difference. I tested it on the emulator, but maybe on the console there will be differences (USA version of the game).
Hex codes (File "0MAIN.BIN"):
Thank you very much! @paul_met
I will try it as soon as I can on my PAL SS. In theory this register matters on real hardware, because it is part of the VDP1's own way of "rasterizing". Emulators, although they do this in software and have more power, are not able to reflect that detail.
I do not expect big changes, maybe a small improvement.
Thanks again!!! Great job!!
If the game slows down in emulators, that means it's cpu related and using hss might not lead to huge improvements.
I did tests in Z-Treme with gouraud shading, 64x64 textures and semi transparency on everything and it wouldn't even slow down on emulator (Yabause and SSF), while it would run in some areas at 3-6 fps on real hardware...
It turns out that you can achieve a better version of the translucency of the shadow. Here are the two versions:
Gouraud shading + half-transparent
And Lara made completely half-transparent)
Yesterday I tried the AR codes on both Yaba Sanshiro and my PAL SS.
In the emulator I verified that the HSS register is indeed activated for the distorted sprites and polygons of the scenery and of Lara, but not for scaled sprites.
On the console I could perceive a slight but clear improvement: the sections that already ran smoothly at 30 FPS are even more stable, and the sections with small dips are clearly better. But where there are very big drops, practically nothing changes.
I have been able to draw several conclusions:
1) @XL2 may be right, and the real cause of the slowdowns in open areas does not come from VDP1.
2) That may be why they did not use HSS, because you can indeed perceive some degradation in the textures: a slight distortion from misaligned texels.
3) These slowdowns come from a lack of optimization in the DSP stage, or from generic code shared with the PC / PSX versions, which are much more similar to each other. The TR engine is exactly the same in all 3 versions: drawing distance, depth cueing, LOD or tessellation of the stage, and lighting of Lara.
4) Finally, it is known, because the developer confirmed it in an interview, that the game shipped with bugs. One of them, which I think hurts performance since it does not occur on PSX and PC, is the missing back-face culling on elements with masks (wooden door bars, harp, bridge planks, hanging plants...). That seems logical to me, and it is in these places that the FPS of the SS version drops the most. Another known bug is sound. At this stage of the SS's life the sound driver was known to have problems, on top of the fact that developers tended to keep PCM data in RAM and pass it to the sound system, either to avoid the official sound SDK or to reuse code from other versions more easily. All these problems involve Bus-B, which like any bus has a limit. My suspicions go in this direction.
All in all, I think it's a great version, and I very much respect the work Core Design did given the time and tools they had.
Thanks for the hacks !!!
The last hack you could try to improve performance is to set the resolution to 352x224. Horizontal resolutions of 352 and 704 put the CPU and VDP1 at 28 MHz instead of 26.
The only problem is that during initialization it might all be hardcoded, so you have to make sure it's really at 28 MHz, else it's pointless and might be super glitchy.
It shouldn't make a huge difference, but with both HSS and higher clock it might feel a bit smoother.
I have not seen any other scaled sprites in-game, except for a vertical black bar. But I removed this strip in my first hack.
The game has an original resolution of 352x224. So the CPU frequency is already raised to 28 MHz.
See the attached image. It does not matter much, though; they are only a few assets in the game.
[Image: Selección_051 by corvusd, posted Sep 13, 2018 at 6:41 PM]
Yes, the NTSC version has this resolution. For PAL it is 352x256.
Well, I have done some more checks. My conclusions:
1) The transform and lighting code is not optimized. The SS can process this kind of data much better; there are plenty of examples.
2) Maybe the compilers of the time were not well optimized, or the code was not optimized for the SH-2's DSP capabilities. For sure, it does not use the SCU-DSP at all.
3) Maybe Bus-B traffic is not well optimized.
Why? My clues:
1) Changing from NTSC (fewer pixels) to PAL (more pixels) changes nothing; neither the number of pixels nor the extra MHz makes a difference.
2) HSS changes nothing, really. Only in areas without slowdown.
3) The bug with some masked distorted sprites and no back-face culling does not help the fill rate. But I do not think fill rate is the REAL problem. That is why they did not use HSS at all. With the 2 mip-map levels, very aggressive in some places, the fill rate is under control.
4) Finally, I checked the wireframe mode hack, and the problem is the same. Exactly the same.
It is sad to see this with more technical insight, all these years later. This game could clearly have been better on the SS.
In response to the last few posts, I was always under the impression that only VDP1 and VDP2 experience a speed increase when the horizontal resolution is raised from 320/640 to 352/704 - not the dual CPUs as well. My understanding is that both SH-2s are already clocked at 28 MHz, while the VDP chips have a variable range of 26-28 MHz, the exact figures differing slightly between NTSC and PAL systems.
No, the frequency is synched between the vdp1 and both cpus. So really 352x224 is the fastest resolution, but the ratio can be a bit weirder than 320x240.
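As a rough sense of scale for that clock difference, here is a minimal sketch using only the nominal 26 and 28 MHz figures quoted in this thread (not measured values; the exact frequencies differ slightly and also vary between NTSC and PAL). The table and function names are hypothetical.

```python
# Sketch only: nominal clock figures as quoted in this thread.
NOMINAL_CLOCK_MHZ = {320: 26, 640: 26, 352: 28, 704: 28}


def clock_gain(from_width, to_width):
    """Relative clock increase when switching horizontal resolution."""
    return NOMINAL_CLOCK_MHZ[to_width] / NOMINAL_CLOCK_MHZ[from_width] - 1.0


gain = clock_gain(320, 352)
print(f"320 -> 352: {gain:.1%} faster clock")
```

With these nominal numbers the difference is under 8%, which fits the earlier remark that it "shouldn't make a huge difference".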
Quite a convenient resolution, I think (aspect ratio close to 16:10). That aspect is preferable to 4:3 on modern displays.
Yes, but I meant in 1995-1996 when we all had 4:3 CRT televisions
DOS PC games also had that issue with 320x200 resolution that many disliked for the pixels' shape.
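The pixel-shape issue can be put into numbers with a short sketch. It assumes the whole frame is stretched to fill a 4:3 screen exactly (ignoring overscan and borders), so the pixel aspect ratio is the display ratio divided by the storage ratio. The function name is hypothetical.

```python
# Sketch only: assumes the frame fills a 4:3 display exactly.
from fractions import Fraction


def pixel_aspect_ratio(width, height, display=Fraction(4, 3)):
    """Width/height of one pixel when the frame is stretched to `display`."""
    return display / Fraction(width, height)


for w, h in [(320, 240), (320, 200), (352, 224)]:
    storage = Fraction(w, h)
    par = pixel_aspect_ratio(w, h)
    print(f"{w}x{h}: storage ratio {storage}, pixel aspect ratio {par}")
```

Under that assumption only 320x240 gives square pixels. 320x200 and 352x224 both end up with pixels that are taller than they are wide, and 352x224 stores an 11:7 frame, which is close to but not exactly 16:10.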
Transparency shadow code for Fatal Fury 3:
Hex code (Files: GARO3.BIN, GARO32.BIN, GARO33.BIN)