Emulation of CPU based frame buffer effects
HWFBE is a great tool for frame buffer emulation. Unfortunately, in the case of the N64 it is not universal. It works only when all manipulations with the frame buffer are performed by the graphics co-processor on the original hardware. The graphics plugin emulates the graphics co-processor, so the plugin performs all of those buffer manipulations itself and can perform them as efficiently as possible. The situation changes when the frame buffer is modified without the co-processor.
The central processor and the co-processor share the same memory (RDRAM) in the N64. Thus, the CPU can easily read and modify the content of frame buffers stored in RDRAM. I have deduced one empirical rule: if it is possible to do something with the N64 hardware, then there is a game which does it. Frame buffer modification on the CPU side is widely used in N64 games. Sometimes the CPU applies a complex post-processing effect, e.g. blur, to the original image. The resulting blurred image is stored in another area of RDRAM and is used as the background for a pause screen:
Often the CPU just copies part of the color buffer into another area for various uses, e.g. a TV effect:
or an environmental reflection:
Banjo Kazooie uses a copy of the color buffer to create a jigsaw puzzle effect:
All these effects have one thing in common: the graphics plugin does not know about them, as they are performed on the CPU (emulator) side. Thus, in general it’s impossible to emulate them with the HWFBE approach. Some hacks and tricks may help to apply HWFBE to a few of these effects, but only to a few. In the general case, the graphics plugin must fill the color buffers in RDRAM with correct data to emulate CPU based frame buffer effects. The CPU (emulator core) processes that data and produces the resulting image, which becomes available to the graphics plugin as texture data.
There are two ways to fill color buffers in RDRAM:
- software rendering
- reading the content of the video card’s color buffer and converting it to the N64 color buffer format.
Software rendering is very CPU intensive. PSX graphics plugins used it successfully for frame buffer emulation, but I don’t know of a good and fast software renderer for the N64. Thus, the only practical way is to read data from the video card’s frame buffer. This is also the most traditional way of frame buffer emulation. Its implementation is easy, but it had one serious problem: reading from video card memory to main memory was very slow. Since the graphics plugin does not know when the CPU will need color buffer data, the plugin must read every rendered frame from the video card and copy it to RDRAM.
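To make the "convert it to N64 color buffer" step concrete, here is a minimal sketch of packing 8-bit-per-channel pixels into the N64 16-bit RGBA5551 layout. The function names are hypothetical, the input is assumed to be a tightly packed RGBA8 image, and the rule "any non-zero alpha becomes 1" is an illustrative choice. Byte ordering inside an emulator's RDRAM array is not handled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one 8-bit-per-channel pixel into the 16-bit RGBA5551 layout:
// bits 15..11 red, 10..6 green, 5..1 blue, bit 0 alpha.
// Assumption: any non-zero 8-bit alpha maps to the single alpha bit.
inline std::uint16_t packRGBA5551(std::uint8_t r, std::uint8_t g,
                                  std::uint8_t b, std::uint8_t a) {
    return static_cast<std::uint16_t>(((r >> 3) << 11) |
                                      ((g >> 3) << 6) |
                                      ((b >> 3) << 1) |
                                      (a != 0 ? 1 : 0));
}

// Convert a tightly packed RGBA8 image (for example, pixels read back from
// the video card) into a 16-bit N64-style color buffer.
// Trailing bytes that do not form a whole pixel are ignored.
std::vector<std::uint16_t> toN64ColorBuffer(const std::vector<std::uint8_t>& rgba8) {
    std::vector<std::uint16_t> out(rgba8.size() / 4);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::uint8_t* p = &rgba8[i * 4];
        out[i] = packRGBA5551(p[0], p[1], p[2], p[3]);
    }
    return out;
}
```

The conversion drops the three low bits of each channel, so a buffer that goes to RDRAM and back loses some precision.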
Enabling that emulation method on AGP cards made emulation too slow to be playable. The authors of the 1964 emulator proposed an extension to Zilmar’s graphics plugin specification to solve that problem. That extension allows the graphics plugin to provide the emulator with information about allocated color buffers. The emulator traces commands which read from and write to the memory areas corresponding to the color buffers and passes information about the read/write addresses back to the plugin. Having this information, the plugin can decide when a copy of the video card buffer is necessary. That is the theory. In practice, when I implemented this extension in Glide64, its use only crashed the emulator. Then Hacktarux added support for that extension to his Mupen64, and this time it was more successful. Several CPU based fb effects started to work without the need to read every frame from the video card, and thus without slowdown. However, some effects still did not work. It’s hard to trace all possible reads and writes to a particular area, and sometimes the emulator missed such changes.
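The idea behind that extension can be sketched as a small bookkeeping structure. This is not the actual interface from the plugin specification: the class, its method names and the "stale" flag are all hypothetical, and only the read path is shown.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record for one color buffer the plugin has allocated.
struct ColorBufferInfo {
    std::uint32_t address;  // start address in RDRAM
    std::uint32_t size;     // size in bytes
    bool stale;             // true when the RDRAM copy is older than the video card copy
};

class ColorBufferTracker {
public:
    // Plugin side: announce an allocated color buffer.
    void addBuffer(std::uint32_t address, std::uint32_t size) {
        buffers_.push_back({address, size, false});
    }

    // Plugin side: a new frame was rendered into this buffer on the video card.
    void markRendered(std::uint32_t address) {
        if (ColorBufferInfo* b = find(address)) b->stale = true;
    }

    // Emulator side: the CPU is about to read `address`.
    // Returns true when the plugin has to copy the buffer from the video card first.
    bool onCpuRead(std::uint32_t address) {
        ColorBufferInfo* b = find(address);
        if (b == nullptr || !b->stale) return false;
        b->stale = false;  // the caller is expected to perform the copy now
        return true;
    }

private:
    // Find the registered buffer that contains `address`, if any.
    ColorBufferInfo* find(std::uint32_t address) {
        for (ColorBufferInfo& b : buffers_)
            if (address >= b.address && address - b.address < b.size) return &b;
        return nullptr;
    }

    std::vector<ColorBufferInfo> buffers_;
};
```

With such a scheme the copy happens only on the first CPU read after a frame was rendered. The weak point is the one described above: every read the emulator fails to report is a copy that never happens.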
When I started to emulate CPU based frame buffer effects in my new plugin, I decided to rely on the good old “read every frame” method. Modern video cards have much better read speed. Plus, modern OpenGL has tools which allow me to simplify and speed up the process greatly. The first is the glBlitFramebuffer command, which copies one frame buffer to another. If the sizes of the buffers differ, the command can do linear interpolation. Thus, if you need to copy your hi-res color buffer into the lo-res N64 one, you don’t need to read the whole hi-res buffer. Just allocate an auxiliary buffer with the size of the N64 color buffer and blit the main color buffer into it. Reading that small auxiliary buffer into main memory will be much faster. And that’s not all. Reading can be twice as fast if you read not into a plain buffer in memory but into a Pixel Buffer Object, which allows asynchronous reads. As a result, my tests show no slowdown with the “read every frame” option turned on.
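A quick calculation shows why reading the small auxiliary buffer helps. The resolutions below are illustrative assumptions (a 1920x1080 render target and a 320x240 N64 frame, both read back as 4 bytes per pixel), not measurements from the plugin.

```cpp
#include <cstddef>

// Number of bytes transferred when reading back a width x height buffer.
constexpr std::size_t readbackBytes(std::size_t width, std::size_t height,
                                    std::size_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Assumed hi-res render target: 1920 * 1080 * 4 = 8,294,400 bytes per frame.
constexpr std::size_t kHiResBytes = readbackBytes(1920, 1080, 4);

// Assumed N64-sized auxiliary buffer: 320 * 240 * 4 = 307,200 bytes per frame.
constexpr std::size_t kN64Bytes = readbackBytes(320, 240, 4);

// The blit to the small buffer cuts the amount of data read back by this factor.
constexpr std::size_t kRatio = kHiResBytes / kN64Bytes;  // 27
```

Under these assumptions each frame moves 27 times less data from video memory to main memory, before any gain from asynchronous reads.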
Emulation of direct CPU rendering
Since the CPU can directly access the frame buffer area in RDRAM, it can also render frames directly, without the help of the co-processor. Many games use CPU rendering for simple 2D images displayed on game start: logos, warnings, etc. Even some movies are displayed this way, for example the Pokemon Puzzle League intro movie:
Technically, emulation of CPU rendering is easy. The image to display is stored in RDRAM, and its address is known. The image is simply loaded to the video card as a texture and rendered as a full screen rectangle. The problem is the same as with CPU based fb effects: the graphics plugin does not know that the CPU has started rendering by itself, because the emulator does not notify the plugin about it. To be honest, the plugin specification has a function for such notification, but it almost never works. Thus, the plugin can’t rely on the emulator’s help.
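The "load the image as a texture" step can be sketched as follows for a 16-bit RGBA5551 image. The function names are hypothetical, and RDRAM is modeled as a plain array of 16-bit pixels in natural order, which is a simplifying assumption: a real emulator may store RDRAM with a different byte order. Uploading the result to the video card is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a 5-bit channel to 8 bits by replicating its high bits.
inline std::uint8_t expand5to8(std::uint32_t c) {
    return static_cast<std::uint8_t>((c << 3) | (c >> 2));
}

// Build an RGBA8 texture from a 16-bit image that starts at byte offset
// `address` in `rdram`. Returns an empty vector if the image does not fit.
std::vector<std::uint8_t> rdramImageToRGBA8(const std::vector<std::uint16_t>& rdram,
                                            std::size_t address,
                                            std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> tex;
    const std::size_t first = address / 2;  // index of the first 16-bit pixel
    const std::size_t count = width * height;
    if (first > rdram.size() || count > rdram.size() - first) return tex;

    tex.reserve(count * 4);
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint16_t p = rdram[first + i];
        tex.push_back(expand5to8((p >> 11) & 0x1F));  // red
        tex.push_back(expand5to8((p >> 6) & 0x1F));   // green
        tex.push_back(expand5to8((p >> 1) & 0x1F));   // blue
        tex.push_back(0xFF);  // a full screen image is drawn opaque
    }
    return tex;
}
```

The resulting array can be uploaded as an ordinary RGBA texture and drawn on a rectangle that covers the whole screen.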
It is safe to assume that the CPU does all the rendering work until the first display list is loaded to the co-processor, and to show on screen the image in RDRAM pointed to by the address in the Video Interface. The ProcessDisplayList command switches the plugin to normal mode. However, the game may switch back to CPU rendering at any time. For example, Rayman 2 switches to CPU rendering in demo mode after each demo is shown:
This happens without any notification from the emulator’s side. The plugin will wait for a display list to process while the user stares at a blank screen. I found only one way to bypass that problem: make the plugin force a switch back to CPU rendering emulation when the frame buffer has been swapped several times without a new display list. This solution was already implemented in Glide64, and I still have not found anything better.
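The mode switching described above fits in a tiny state machine. This is a sketch, not the Glide64 code: the class, its method names and the threshold of three swaps are hypothetical, since the text only says "several times".

```cpp
// Hypothetical threshold: number of buffer swaps without a display list
// after which the plugin assumes the CPU is rendering again.
constexpr int kSwapsBeforeCpuMode = 3;

class RenderModeTracker {
public:
    // Called when the emulator passes a display list to the plugin:
    // the co-processor is rendering, so switch to normal mode.
    void onDisplayList() {
        cpuRendering_ = false;
        swapsWithoutList_ = 0;
    }

    // Called every time the frame buffer is swapped.
    void onBufferSwap() {
        if (cpuRendering_) return;  // already showing the image from RDRAM
        if (++swapsWithoutList_ >= kSwapsBeforeCpuMode) cpuRendering_ = true;
    }

    // True when the plugin should display the image stored in RDRAM.
    bool cpuRendering() const { return cpuRendering_; }

private:
    bool cpuRendering_ = true;  // CPU rendering is assumed until the first display list
    int swapsWithoutList_ = 0;
};
```

A lower threshold reacts faster but risks false switches in games that skip display lists for a frame or two. A higher one leaves the blank screen visible for longer.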
There is another very specific use of CPU rendering, in which it is combined with normal co-processor rendering. The co-processor processes a display list and then the CPU draws an image over the just rendered frame. Examples: pills in Dr. Mario:
rain in Jet Force Gemini:
Emulation of such effects is almost the same as with usual CPU rendering, but the image must be applied with alpha blending. Since this must be done every frame, the color buffer area in RDRAM must be cleared to avoid garbage on screen.
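One way to build such an overlay is sketched below. It rests on an assumption that is not stated above: a pixel equal to zero is treated as "not drawn by the CPU" and becomes fully transparent, while any other pixel becomes opaque. The function name is hypothetical, and the buffer is again modeled as a plain array of 16-bit RGBA5551 pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Turn the CPU-drawn pixels of a 16-bit color buffer into an RGBA8 overlay
// and clear the buffer, so the same pixels are not drawn again next frame.
// Assumption: a zero pixel means "nothing drawn here" and stays transparent.
std::vector<std::uint8_t> takeOverlayRGBA8(std::vector<std::uint16_t>& colorBuffer) {
    std::vector<std::uint8_t> tex(colorBuffer.size() * 4, 0);  // all transparent
    for (std::size_t i = 0; i < colorBuffer.size(); ++i) {
        const std::uint16_t p = colorBuffer[i];
        if (p == 0) continue;  // untouched pixel: keep alpha at 0
        const std::uint32_t r = (p >> 11) & 0x1F;
        const std::uint32_t g = (p >> 6) & 0x1F;
        const std::uint32_t b = (p >> 1) & 0x1F;
        tex[i * 4 + 0] = static_cast<std::uint8_t>((r << 3) | (r >> 2));
        tex[i * 4 + 1] = static_cast<std::uint8_t>((g << 3) | (g >> 2));
        tex[i * 4 + 2] = static_cast<std::uint8_t>((b << 3) | (b >> 2));
        tex[i * 4 + 3] = 0xFF;  // drawn pixel: opaque
        colorBuffer[i] = 0;     // clear the source pixel
    }
    return tex;
}
```

The overlay is then drawn over the co-processor's frame with alpha blending enabled, so only the pixels the CPU actually wrote cover the picture.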