Friday, January 24, 2014

Frame buffer emulation. Part III. Emulation of Video Interface

Why emulate Video Interface

What is a Video Interface (VI)?
 Let me cite the N64 programming manual:
”The video interface performs the following functions.
  • Creates signal timing when outputting to television monitor
  • Specifies resolution and depth of color frame buffer.
  • Transfers from color frame buffer to video DAC (D-A converter) and specifies filters used during transfers.
  • Provides current display position information.
Note: The video DAC creates analog data from digital data and outputs said data as a television video image. “

Thus, the VI transforms the digital frame buffer into an analog TV signal. A PC video card has its own video interface which does the same job: it transfers the frame buffer from video memory to the PC monitor. Complete emulation of the N64 VI is present only in emulators/plugins with software rendering. Hardware-accelerated graphics plugins usually emulate the VI only to a very small degree. Plugins take information about frame buffer parameters from the VI registers and use the UpdateScreen command as a signal that the color buffers must be swapped. Some plugins emulate gamma correction, which the N64 also performs on the VI side. In most cases that is all. The plugin renders into the video card's color buffer, and the video card's own video interface outputs the buffer to the monitor as is.

Which problems can incomplete VI emulation cause? Let's see. The N64 VI does not just put the whole color buffer on the whole TV screen. It can also take any rectangular part of the color buffer and place it on any part of the screen; it can move the buffer in any direction, stretch it or shrink it. All these manipulations need no work on the RDP side. The color image stays the same, but the picture on your TV moves. While this possibility is rarely used, some games look incorrect without its emulation. The best-known example is the slide-film effect in the Beetle Adventure Racing menus. Here the VI slides vertically over two successive color buffers.

About stretching and shrinking: the standard N64 color buffer resolution is just a quarter of the maximal TV screen resolution. A TV signal in the NTSC standard has 483 visible scan lines; the N64 VI uses 480 of them. The maximal resolution is 640x480 interlaced. The standard color buffer resolution is 320x240, and it is the VI's job to stretch it over the whole TV screen. A buffer of that size can be displayed in non-interlaced mode to avoid screen flickering. The N64 also supports high-resolution display at 640x480 dots. Such a buffer can be displayed only in interlaced mode, thus with flickering. Some games use a compromise: a 640x240 color buffer. It has high horizontal resolution, but can be displayed non-interlaced. On the other hand, there are games which use a 320x480 color buffer in hi-res mode, with only the vertical resolution increased.

Such buffer sizes can be a problem for graphics plugins which expect a 4:3 screen aspect ratio. A common problem with such games was that they were either displayed only on the top half of the screen, or half of the image was stretched over the whole screen. PAL games made especially for Europe could use a 320x288 color buffer, which is also not 4:3. A common problem with emulation of such games was that the image was stretched vertically too much and its bottom part was cut off. The root of these problems is that the graphics plugin tries to do the work of the RDP and the VI in one step. It has to calculate scale factors to stretch the N64 color image to the selected PC resolution. The scale factors are applied to all rendered primitives, and the finished color buffer in the video card must have the same width-to-height proportions as the picture on the TV screen. The scale factor calculation is based on the values in the VI registers and is performed before rendering of the current frame starts. If the real size of the color buffer differs from the values set in the VI registers, the scaling can be wrong. Actually, the VI has the information about how the color image will be scaled to the TV picture, and careful use of that information allows correct calculation of the scales for the PC resolution in most cases. However, I remember various weird cases which required special hacks in the Glide64 code.
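The register math behind that calculation can be sketched like this. It is a simplification: the field packing follows the programming manual (VI_H_START and VI_V_START hold (start << 16) | end, the scale registers hold a 2.10 fixed-point factor in the low bits), but the function names are mine.

```cpp
#include <cassert>
#include <cstdint>

// Sketch: size of the VI output image in frame-buffer pixels,
// computed from the VI registers.
struct ViRegs {
    uint32_t hStart;  // VI_H_START: (first dot << 16) | last dot
    uint32_t vStart;  // VI_V_START: (first half-line << 16) | last half-line
    uint32_t xScale;  // VI_X_SCALE: 2.10 fixed-point horizontal scale
    uint32_t yScale;  // VI_Y_SCALE: 2.10 fixed-point vertical scale
};

// Width of the displayed image in frame-buffer pixels.
int viWidth(const ViRegs& vi) {
    int h0 = (vi.hStart >> 16) & 0x3FF;
    int h1 = vi.hStart & 0x3FF;
    int scale = vi.xScale & 0xFFF;          // 2.10 fixed point
    return ((h1 - h0) * scale) / 1024;
}

// Height of the displayed image in frame-buffer lines.
// VI_V_START counts half-lines, hence the division by 2.
int viHeight(const ViRegs& vi) {
    int v0 = (vi.vStart >> 16) & 0x3FF;
    int v1 = vi.vStart & 0x3FF;
    int scale = vi.yScale & 0xFFF;
    return (((v1 - v0) / 2) * scale) / 1024;
}
```

The plugin's scale factors are then simply the selected PC resolution divided by these values. With typical NTSC register values the functions return 320 and 237, which leads straight to the height problem described below.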

The N64 uses 480 half-lines, 240 for each half-frame. A 320x240 color buffer can be displayed using only one half-frame, in non-interlaced mode. This is how it is described in the N64 programming manual. The VI_ORIGIN register holds the start address inside the displayed color buffer; the transfer to the DAC starts from that address. It is logical to expect that the start address corresponds to the origin of the color buffer, but it is not true. The start address is usually shifted vertically by one line from the origin. That is, the first line of the color buffer is not visible on the TV screen. And that is not the only surprise. The VI_V_START register holds the start and end scan lines used for the output image. Usually, (End_Scan_Line - First_Scan_Line) == 474, not 480 as the manual says. This gives us 237 half-lines in non-interlaced mode. That means not only is the first line of the color buffer invisible on screen, the last two lines of a standard 320x240 color buffer are ignored too. As I said, the graphics plugin's calculation of scale factors is based on the values set in the VI registers. The value for the width of the output image usually corresponds to the width of the color buffer, but that 237 makes the vertical scale factor wrong, because the actual height of the color buffer is 240. Wrong scaling leads to a wrong output image in the case of traditional rendering to the main color buffer. Thus, many plugins apply a special correction to the VI image height to get the desired 240. However, 320x240 is not a fixed size for the color buffer. The color buffer can be of any size, and the VI will scale it to the screen as necessary. Some games use a color buffer a bit smaller than 320x240, and the height correction for such games may produce an incorrect result.
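A typical height correction could be sketched like this. This is a hypothetical illustration: the list of standard heights and the tolerance of a few lines are my guesses, not values taken from any real plugin.

```cpp
#include <cassert>

// Sketch: snap a height computed from the VI registers (e.g. 237)
// to the nearest standard buffer height (e.g. 240). A game using a
// genuinely non-standard buffer of, say, 236 lines would wrongly be
// snapped to 240 too -- exactly the problem described above.
int correctedHeight(int viHeight) {
    const int standard[] = { 240, 288, 480 };
    for (int h : standard) {
        if (viHeight > h - 6 && viHeight < h + 6)
            return h;
    }
    return viHeight;  // sizes far from any standard pass through unchanged
}
```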

Note that I said "usually" about the actual values. While VI_ORIGIN usually points to the second line of the color buffer, it may point to any address inside the color buffer. The VI usually uses 474 scan lines for NTSC, but it can also use fewer lines. An image shrinking effect can be obtained by increasing the first scan line and decreasing the last one at the same time in the VI_V_START register. Shifting in the horizontal direction can be obtained by playing with the VI_H_START register. All these manipulations can be used for special effects. And if they can be used, they surely have been used.

VI Emulation

The new frame buffer emulation mechanism opened a new way of VI emulation for me. Without frame buffer emulation (FBE), the plugin renders directly into the video card's color buffer, which is displayed on screen as is. With FBE I have separated the rendering of a frame buffer from its displaying. I have a bunch of Frame Buffer Objects (FBO), and I have to copy one of them into the video card's color buffer to display it on screen. That is, I have the same pair of entities as the N64: the color buffer in an FBO corresponds to the N64 color buffer in RDRAM, and the video card's color buffer corresponds to the TV screen. The task of VI emulation is to copy the color buffer from the FBO into the main color buffer the same way the real VI transfers the color buffer to the video DAC. Since the color buffer in an FBO is a usual RGBA texture, the task is reduced to rendering a textured rectangle on screen. I just need to correctly take the values of the VI registers into account. Incidentally, this way of VI emulation is not my invention. A similar mechanism was implemented by Vincent 'Ziggy' Penne in his LLE graphics plugin Z64 years ago.
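Mapping the VI registers to the destination rectangle of that textured quad can be sketched like this. The nominal NTSC offsets (first dot 108, first half-line 0x25) come from typical register values; the names and the exact formula are my simplification of what the plugin has to do.

```cpp
#include <cassert>
#include <cstdint>

// Sketch: destination rectangle (in window coordinates) for the quad
// textured with the FBO color buffer, derived from VI_H_START and
// VI_V_START. The 640x480-dot VI display area is scaled to the window.
struct Rect { int x, y, w, h; };

Rect viDestRect(uint32_t hStartReg, uint32_t vStartReg,
                int winWidth, int winHeight) {
    const int H_OFFSET = 108;   // nominal NTSC first dot
    const int V_OFFSET = 0x25;  // nominal NTSC first half-line
    int h0 = (hStartReg >> 16) & 0x3FF, h1 = hStartReg & 0x3FF;
    int v0 = (vStartReg >> 16) & 0x3FF, v1 = vStartReg & 0x3FF;
    Rect r;
    r.x = (h0 - H_OFFSET) * winWidth / 640;
    r.w = (h1 - h0) * winWidth / 640;
    r.y = ((v0 - V_OFFSET) / 2) * winHeight / 240;
    r.h = ((v1 - v0) / 2) * winHeight / 240;
    return r;
}
```

When a game plays with VI_H_START or VI_V_START, the rectangle moves or shrinks on its own, with no special handling in the plugin. With the usual NTSC values and a 640x480 window, the rectangle is 640x474: the missing 6 lines are the narrow black area mentioned below.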

What is the impact of the new VI emulation on the resulting image? In most cases it can be considered zero or even negative. Since the plugin now displays only the part of the color image which is visible on the real console, the lines ignored by the N64 VI are not rendered by the plugin either. This adds narrow black areas at the top and bottom of the screen, and users don't like black areas. On the other hand, the plugin now emulates manipulations with scan lines and the buffer origin. An example of scan-line manipulation is the Jet Force Gemini intro movie. With the old VI emulation it looks like this:

With new VI emulation:

I implemented a special hack in Glide64 to make it work right. With the new VI emulation I got it for free.
The best-known effect based on manipulation of the output image origin is the slide film in the Beetle Adventure Racing menus. Two color buffers are stored in RDRAM one after another. The game just moves the origin from the top to the bottom of the first buffer. Now this effect is emulated too:

The video is a bit jerky, I don't know why. The actual emulation runs pretty smoothly.

VI-based vertical image shrink is used in several soccer games, e.g. Mia Hamm Soccer. The Top Gear Overdrive menu uses VI-based horizontal image shrink. These games are currently not supported by my plugin, so I can't show you screenshots or videos of these effects.

Sunday, January 5, 2014

Frame buffer emulation. Part II.

Emulation of CPU based frame buffer effects

HWFBE is a great tool for frame buffer emulation. Unfortunately, in the case of the N64 it is not universal. It works only when all manipulations with the frame buffer are performed by the graphics co-processor on the original hardware. The graphics plugin emulates the graphics co-processor; thus the plugin performs all the manipulations with the buffers and can perform them as efficiently as possible. The situation changes when the frame buffer is modified without use of the co-processor.

The central processor and the co-processor share the same memory (RDRAM) in the N64. Thus, the CPU can easily read and modify the content of frame buffers stored in RDRAM. I deduced one empirical rule: if there is a possibility to do something with the N64 hardware, then there is a game which uses that possibility. Frame buffer modification on the CPU side is widely used in N64 games. Sometimes the CPU applies a complex post-processing effect to the original image, e.g. blur. The resulting blurred image is stored in another area of RDRAM and is used as the background for a pause screen:

Often the CPU just copies a part of the color buffer into another area for various uses, e.g. a TV effect:

or environmental reflection:

Banjo-Kazooie uses a copy of the color buffer to create a jig-saw (puzzle) effect:

All this variety of effects has one thing in common: the graphics plugin does not know about them, as they are performed on the CPU (emulator) side. Thus, in general it's impossible to emulate them with the HWFBE approach. Some hacks and tricks may help to apply HWFBE to a few of these effects, but only to a few. In the general case, the graphics plugin must fill the color buffers in RDRAM with correct data to emulate CPU-based frame buffer effects. The CPU (emulator core) processes that data and produces the result image, which becomes available to the graphics plugin as texture data.
There are two ways to fill color buffers in RDRAM:
  •  software rendering
  •  read content of video card’s color buffer and convert it to N64 color buffer.
Software rendering is very CPU-intensive. PSX graphics plugins used it successfully for frame buffer emulation, but I don't know of a good and fast software renderer for the N64. Thus, the only practical way is to read data from the video card's frame buffer. This is also the most traditional way of frame buffer emulation. Its implementation is easy, but it had one serious problem: reading from video card memory to main memory was very slow. Since the graphics plugin does not know when the CPU will need the color buffer data, the plugin must read every rendered frame from the video card and copy it to RDRAM.
Enabling that emulation method on AGP cards made emulation too slow to be playable. The authors of the 1964 emulator proposed an extension to Zilmar's graphics plugin specifications to solve that problem. That extension allows the graphics plugin to provide the emulator with information about allocated color buffers. The emulator traces commands which read and write the memory area corresponding to the color buffers and passes information about the read/write addresses back to the plugin. With this information the plugin can decide when a copy of the video card buffer is necessary. That is the theory. In practice, when I implemented this extension in Glide64, its use only crashed the emulator. Then Hacktarux added support for that extension to his Mupen64, and this time it was more successful. Several CPU-based fb effects started to work without the need to read every frame from the video card, and thus without slowdown. However, some effects still did not work. It's hard to trace all possible reads/writes to a particular area, and sometimes the emulator missed such changes.
When I started to emulate CPU-based frame buffer effects in my new plugin, I decided to rely on the good old read-every-frame method. Modern video cards have much better read speed. Plus, modern OpenGL has tools which allow me to simplify and speed up the process greatly. First, the glBlitFramebuffer command, which copies one frame buffer into another. If the sizes of the buffers differ, the command does linear interpolation. Thus, if you need to copy your hi-res color buffer into a lo-res N64 one, you don't need to read the whole hi-res buffer. Just allocate an auxiliary buffer with the size of the N64 color buffer and blit the main color buffer into it. Reading that small auxiliary buffer into main memory is much faster. And that's not all. The reading speed can be roughly doubled if you read not into a plain buffer in memory but use a Pixel Buffer Object, which allows asynchronous reads. As a result, my tests show no slowdown with the "read every frame" option turned on.
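Put together, the read-back path looks roughly like this. This is only a sketch: it assumes an OpenGL 3.x context and a loader like GLEW, all objects are assumed to be created already, and error handling is omitted.

```cpp
#include <GL/glew.h>  // assumes a GL 3.x function loader

// Sketch: blit the hi-res FBO down to an auxiliary FBO of N64 size,
// then start an asynchronous read through a Pixel Buffer Object.
void startFrameReadback(GLuint mainFbo, int mainW, int mainH,
                        GLuint auxFbo, int n64W, int n64H, GLuint pbo) {
    // Downscale with linear filtering: hi-res buffer -> N64-sized buffer.
    glBindFramebuffer(GL_READ_FRAMEBUFFER, mainFbo);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, auxFbo);
    glBlitFramebuffer(0, 0, mainW, mainH, 0, 0, n64W, n64H,
                      GL_COLOR_BUFFER_BIT, GL_LINEAR);

    // With a buffer bound to GL_PIXEL_PACK_BUFFER, glReadPixels
    // returns immediately; the transfer happens asynchronously.
    glBindFramebuffer(GL_READ_FRAMEBUFFER, auxFbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glReadPixels(0, 0, n64W, n64H, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

// Later (ideally a frame later, so the transfer has finished), map the
// PBO and convert the pixels into the N64 color buffer in RDRAM.
void finishFrameReadback(GLuint pbo) {
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    const void* pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
    if (pixels != nullptr) {
        // ... convert RGBA8888 pixels to the N64 buffer format here ...
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}
```

Mapping the PBO one frame after issuing the read is what makes the read asynchronous: the plugin never stalls waiting for the current frame's pixels.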

Emulation of direct CPU rendering.

Since the CPU can directly access the frame buffer area in RDRAM, it can also render frames directly, without the help of the co-processor. Many games use CPU rendering for simple 2D images displayed on game start: logos, warnings etc. Even some movies are displayed this way, for example the Pokemon Puzzle League intro movie:

Technically, emulation of CPU rendering is easy. The image to display is stored in RDRAM, and its address is known. The image is just loaded into the video card as a texture and rendered as a full-screen rectangle. The problem is the same as with CPU-based fb effects: the graphics plugin does not know that the CPU has started rendering by itself, and the emulator does not notify the plugin about it. To be honest, the plugin specification has a function for such notification, but it almost never works. Thus, the plugin can't rely on the emulator's help.
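Loading the image as a texture requires a format conversion: a 16-bit N64 color buffer stores pixels as RGBA 5551. Expanding one such pixel to the RGBA8888 a texture upload expects can be sketched like this (the output byte order assumes a little-endian host and a GL_RGBA/GL_UNSIGNED_BYTE upload):

```cpp
#include <cassert>
#include <cstdint>

// Sketch: expand one 16-bit N64 frame-buffer pixel (RGBA 5551:
// 5 bits each of R, G, B in the high bits, 1 alpha/coverage bit
// in bit 0) to 32-bit RGBA.
uint32_t rgba5551to8888(uint16_t px) {
    uint32_t r = (px >> 11) & 0x1F;
    uint32_t g = (px >> 6) & 0x1F;
    uint32_t b = (px >> 1) & 0x1F;
    uint32_t a = px & 1;
    // Replicate the high bits to fill 8 bits per channel,
    // so pure white maps to 255, not 248.
    r = (r << 3) | (r >> 2);
    g = (g << 3) | (g >> 2);
    b = (b << 3) | (b >> 2);
    return (a ? 0xFF000000u : 0u) | (b << 16) | (g << 8) | r;
}
```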

It is safe to assume that the CPU does all the rendering work until the first display list is loaded to the co-processor, so the plugin can show on screen the content of the image in RDRAM pointed to by the address in the Video Interface. The ProcessDisplayList command switches the plugin to the normal mode. However, the game may switch back to CPU rendering at any time. For example, Rayman 2 switches to CPU rendering in demo mode after each shown demo:

This happens without any notification from the emulator's side. The plugin will wait for a display list to process while the user stares at a blank screen. I found only one way to bypass that problem: make the plugin switch back to CPU rendering emulation when the frame buffer has been swapped several times without a new display list. This solution was already implemented in Glide64, and I still have not found anything better.
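The heuristic can be sketched as a simple counter. The threshold of 5 swaps is my guess for illustration, not the value any real plugin uses.

```cpp
#include <cassert>

// Sketch: detect that a game has gone back to CPU rendering by counting
// buffer swaps that arrive without a new display list in between.
struct CpuRenderDetector {
    int swapsWithoutDList = 0;
    bool cpuRendering = false;

    // Called when ProcessDisplayList arrives: normal RDP rendering.
    void onDisplayList() {
        swapsWithoutDList = 0;
        cpuRendering = false;
    }
    // Called on each UpdateScreen / buffer swap.
    void onBufferSwap() {
        if (++swapsWithoutDList > 5)  // threshold is a guess
            cpuRendering = true;      // fall back to showing RDRAM image
    }
};
```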

There is another very specific use of CPU rendering, where it works together with normal co-processor rendering. The co-processor processes the display list, and then the CPU draws an image over the just-rendered frame. Examples: pills in Dr. Mario:

rain in Jet Force Gemini:

Emulation of such effects is almost the same as with usual CPU rendering, but the image must be applied with alpha blending. Since this must be done every frame, the color buffer area in RDRAM must be cleared to avoid garbage on the screen.