Tuesday, February 26, 2019

Fast Approximate Anti-Aliasing aka FXAA

According to multiple user requests, GLideN64 now supports Fast Approximate Anti-Aliasing aka FXAA. FXAA is post-processing filter, which eliminates aliasing. Check the image to see how it works with Super Mario 64 (click for full-size image).


 As you may notice, traditional Multisample anti-aliasing aka MSAA gives the best result. Result of FXAA is decent, but not as good. So, why use it? As you may know, N64 depth compare is hard to emulate properly on PC hardware. I made shader based emulation of N64 depth compare. Currently it can be enabled without sacrifice in performance, but it is not compatible with MSAA. That is if you enabled N64 depth compare you lost anti-aliasing. Now you may enable FXAA and N64 depth compare and play without major sacrifice in graphics quality. Also, FXAA is less demanding to hardware power than MSAA. Anti-aliasing techniques as FXAA are the only way to get AA  on mobile devices with GL ES 2.0. FXAA is currently available in WIP builds and of course it will be included into upcoming public release.

Monday, February 18, 2019

HLE implementation of BOSS ZSort microcode.

Hello!

One of the most exciting features of the upcoming public release is long awaited
HLE implementation of BOSS ZSort microcode. This microcode is used in two great racing games: World Driver Championship and Stunt Racer. Both games were unsupported by emulators for a long time due to core issues. Several years ago these issues were resolved, and the games became playable with graphics plugins, which support LLE mode. GLideN64 supports LLE mode and it can run the games in LLE, but performance is far from good.

HLE mode is usually much faster and has less glitches. The problem is that BOSS developed custom microcode for these games and no information about this microcode leaked. Thus, it had to be decoded from asm code. This work requires skills and patience. GLideN64 obtained the first results in microcode decoding yet in 2016, when Gilles Siberlin aka Gillou68310 decoded microcode for Kuiki Uhabi Suigo. Then me and olivieryuyu completed several decoding projects. Gilles started to work on decoding of BOSS ZSort microcode at the end of 2017.

This microcode is deep modification of ZSort microcode released by Nintendo in 1999. ZSort intended to solve problem with N64 low fill-rate. While N64 supports Hi-Res graphics, not so many games could use hi resolutions without sacrifices in speed or quality because of bottlenecks in the RDP. ZSort sorts objects by depth value before rendering and draws them in order from far to near thus eliminating the use of depth buffer. It not only saves memory but also frees RDP from rendering the depth buffer, which almost doubles the fill-rate for color buffer.

Unfortunately, ZSort was released too late in the system's life and was used only in the engine for series of soccer games, for example Mia Hamm Soccer 64. ZSort is very specific microcode; it has almost nothing in common with F3D/F3DEX series of microcodes which are used in most of N64 games. I had documentation for it and sources of two demos which use it, but nevertheless I spend weeks to implement and debug it.

It seems that ZSort also had some performance flaws, so when BOSS decided to use it for new games, BOSS programmers re-worked it a lot. The resulted ucode differs from original ZSort so much that it was clear - it must be decoded from asm to be HLEed. So, Gilles Siberlin took that task and successfully completed it. Since I did not participate in this task, I don't know all the details. Cite from Gilles: "Wow this is one hell of an optimized ucode!!! The ucode is doing both audio and graphic processing, plus everything fits in less then 1024 instructions so no overlay needed." It is unique case when ucode processes both audio and graphic, no other ucode does it. So, now GLideN64 has a bit of audio processing code. Emulation of this microcode requires modification of N64 SP_STATUS register. That is not allowed by original zilmar's specs for graphics plugin. mupen64plus implemented necessary changes in core to fix that issue and now both World Driver Championship and Stunt Racer work fine with mupen64plus. Work with Project64 is not guaranteed.