Graphics API and Driver Design

Graphics Design Goals

To encourage the use of hardware features, as well as to provide a framework for expansive features.

Given the requirement that the design should promote the use of hardware, the user needs to be able to ask for very specific functionality without having to guess which driver provides which features. Additionally, the user needs to be able to ask if the current driver can perform a certain operation in hardware.

Also, the design should be written against a potential OpenGL-based driver. This would provide for a wealth of graphics features and unrivaled speed. Such wealth should be exposed to the user to a reasonable extent. Other drivers should be able to be written as well, so the design should not preclude less well-featured hardware. The minimum functionality should cover as much as possible, while the maximum functionality should be OpenGL or something pretty close to it.

Graphics Design

The graphics API and driver model provides the ability to manage one or more drivers. Each driver represents an API into a particular rendering surface.

A graphics driver, when created, operates in one of the following modes:

Fullscreen: The driver represents the entire screen of the display device.
Windowed: Requires a windowing-OS. The driver can only render to a specific window of a fixed size.
Windowed-Tool: Requires a windowing-OS. The driver can render to a specific window, but the size can change.
Null: The driver's framebuffer does not represent any visible surface. It has no update rates. (questionable)
Generic: Only one driver operates in this mode. This is a special driver detailed below. This driver has no framebuffer.

Bitmaps

The primary object of the graphics API is the 2D surface, or bitmap. A bitmap represents a two-dimensional array of memory along with whatever other state information is necessary to provide the functionality that the driver requires.

This information includes, but is not limited to:

Shared Data:
- The actual bitmap itself (the block of bits in a given format)
- The format of the bitmap
- The actual size of the bitmap
- Perhaps the mask color of a masked bitmap
Instanced data
- Clip Rectangle: No drawing operation will set a pixel outside of the clip rectangle.
- Coordinate Offset: Redefines the origin of the bitmap. Affects all functions that refer to this bitmap.
- Sub-Bitmap Rectangle: For sub-bitmaps, the maximum position of the clip rectangle. It cannot exceed this boundary.

Allocation

Using an OpenGL texture-like model, a bitmap is created with a given size and a request for a specific format. If resources are insufficient for allocating any particular bitmap, then the allocation mechanism will fail.

However, the driver is not required to fully honor portions of this request. Specifically, the driver may freely change the actual internal format used. The driver will attempt to fulfill the request as best it can, though this is not guaranteed. If a driver supports RGBA and alpha blending, then the driver can support all other formats by emulating them. If a driver cannot even remotely support a given format, it will fail and add an error to the error mechanism.

Color formats include:

RGB
RGB-Masked: Allows for masked blitting.
RGBA
Intensity (1-color replicated over RGBA)
Luminance (1-color replicated over RGB)
Luminance-Alpha (Luminance with a separate alpha channel)
YUV

With the exception of the YUV color format, one can request these formats in integer components of 4-bit, 8-bit, 16-bit or 32-bit per component, though the latter ones will likely be truncated down to 8-bit on many driver. One can also request floating-point storage formats of 16-bit and 32-bit per component. Many drivers will use integer components, but more and more hardware supports using floating-point color buffers.

The Intensity format specifies a single value that is replicated over RGBA. As such, it is a format with an alpha value, so alpha-based blending will work. Luminance is similar to Intensity, except it is only an RGB format; it lacks an alpha channel. And Luminance-Alpha is a format that uses Luminance and a separate alpha channel.

YUV is a special case, as the storage for it is different for the different components. <flesh this out>

The bitmap in question may also have additional attributes associated with it. Here is a list of attributes that a bitmap can be created with:

Trilinear
Mask Color (used for RGB-masked bitmaps)

Trilinear means that, after any operation that changes the pixel data in the bitmap, a sequence of bitmaps will be internally created. These mipmaps are scaled-down versions of the original bitmap. They are used to interpolate as one scales down an image. Only bitmaps that have been created with this attribute may be used in trilinear blit operations. A driver can specifically disallow the creation of trilinear bitmaps, and such drivers will fail to create a bitmap if the user sets this attribute. If the driver fails, it will expose an error explaining the problem.

The mask color tells an RGB-Mask bitmap which color should not be draw when using the bitmap as source data.

The driver is given a usage hint. This hint gives the driver some knowledge about how the BITMAP will be used. This allows the driver to optimize the location of memory and other factors for that particular use. Note that these hints are part of the shared data, and therefore cannot be changed in a BITMAP instance.

The usage hints are as follows:

Static Source
Stream Source
Static Destination
Stream Destination

Stream means that the bitmap will have data frequently uploaded to it, while Static means that these actions will be relatively infrequent (preferably, only the initial load). Source means that the bitmap will only be used as the source bitmap in blitting operations, while Destination means that the bitmap can be used as a render target in addition to being a source of render data.

Color Format Conversion

During the specification of the behavior of a driver, the conversion between multiple color formats will be discussed. All color conversion will be done in the following way.

All color formats can be converted to RGBA:

RGB: Goes to RGB with an alpha of 1.0.
RGB-Masked: Goes to RGBA with an alpha of 1.0 for non-masked pixels and an alpha of 0.0 for masked ones.
RGBA: No conversion.
Intensity: Replicate the color for all 4 components
Luminance: Replicate the color for RGB, alpha is 1.0
Luminance-Alpha: Replicate the luminance for RGB, alpha comes from alpha channel
YUV <to be determined>

Also, all color formats can be converted from RGBA:

RGB: Drop the alpha
RGB-Masked: Drop the incoming alpha. RGB colors that match the mask color have an alpha of 0.0, those that do not have an alpha of 1.
RGBA: No conversion
Intensity: Take the R component as the intensity
Luminance: Take the R component as the luminance
Luminance-Alpha: Take the R component as the luminance. Keep the alpha.
YUV <to be determined>

These two operations allow colors to go from or to any format.

Data Updating

Allocated bitmaps, like most allocated objects in other systems, contain arbitrary data. Filling data in a bitmap can be handled in several ways. First, it can be updated by giving the API a pointer to memory formatted in a very specific way. When this function returns, the bitmap will be updated with the new data. Note that this can cause format conversions which can take some time. Portions of a bitmap can be updated in this fashion. To some degree, this is like a plain blit operation without effects.

Another method is to blit from another bitmap. Depending on how fast non-framebuffer rendering is in the driver, this can be a reasonable option. This uploading, like the previous kind, can transform one format into another. If the format goes from one with fewer components to one with more, such as from RGB to RGBA, then the new component(s) will be set to the highest value.

Lastly, the bitmap can be cleared to a specific color.

Coordinate Spaces

A bitmap instance can have a coordinate offset that bifurcates the coordinate space into 2 separate spaces. Bitmap coordinates refers to coordinates relative to the upper-left of the bitmap. The upper-left corner is the origin of the coordinate space. Positive x extends to the right, and positive y extends downwards.

Offset coordinates are relative to a user-specified value. This specified value is the origin, and x,y coordinates in this space are relative to that origin. Positive x extends to the right, and positive y extends downwards.

All drawing functions take values for the source and destination in offset coordinates. Only a few functions specify a position on a bitmap in bitmap coordinates rather than offset coordinates, such as the function for setting the offset origin.

Newly created bitmaps, and instances of bitmaps, have their offset origin set to the bitmap space origin.

Clip Rectangle

The clip rectangle specifies a region of the bitmap outside of which any drawing function will no longer work. No rendering function will set pixels outside of the clipping rectangle.

Sub Bitmaps and Bitmap Instancing

Bitmap objects are instanced. The actual driver resource that represents the bitmap's internal data can be shared among any number of bitmap instances. The bitmap data itself, the uploaded graphics data, is part of the internal data, as is the format and actual size of the buffer. However, the clip rectangle and the coordinate offset values are both part of the instanced data. Two different bitmaps can exist that point to the same internal data, but offer different clip rectangles and/or coordinate offset values.

Also, there can be another difference in two instances of a bitmap. One can make the new bitmap a "sub-bitmap" of the original. Effectively, a sub-bitmap cannot access portions of the original surface outside of a specified rectangle, for either read or write. All coordinates, including the coordinate offset, of the sub-bitmap are relative to the sub-rectangle. The sub-rectangle is specified in bitmap-relative coordinates.

One can make a sub-bitmap of a sub-bitmap. However, this does not nest. Also, the sub-rectangle, for these cases, is specified relative to the sub-rectangle of the sub-bitmap.

The sub-bitmap rectangle, in bitmap coordinates, can be modified after creation. This modification happens in the coordinate space of the shared bitmap data, using its width and height.

There is one issue that instanced bitmaps. This is memory management. The driver and AllegroPro will use a system of reference counting to determine when the shared data is released. Each time a bitmap is instanced, an internal reference count is incremented in the shared data. After a bitmap is released, the internal reference count is decreased. If that reference count reaches 0, then the internal bitmap data can be destroyed.

Perhaps there should be an API for destroying all bitmaps that share data with the given bitmap. However, it is dangerous to use this API, as lingering pointers to deallocated bitmaps can cause all kinds of problems. This is something that should be reserved for a resource management system or other high-level construct that can handle lingering pointers.

Blit-param

The most frequent operation performed on a bitmap is blitting. This takes a rectangular region of a source bitmap and copies it, with some function applied to it, to the destination bitmap. Blitting operations, also, tend to be the most costly operation in 2D games.

There are any number of parameters and ways to blit a bitmap. To place them all in one function call would create a truly massive function prototype. The alternative would be to create a lot of functions for each individual type of blitting. This would, not only make the blitting API huge, but also decrease blitting functionality. In theory, any of the blitting parameters could be used in any combination.

Also, there is the concept of a "type" of blitting operation. When creating a layered tilemap, for example, all tiles of a given layer are blitted in a certain way. Indeed, it would be nice to provide some kind of object that could be defined and modified at runtime that encompasses a type of blitting.

As such, all blitting operations will involve an object known as the blit-param. This is a large structure that contains all of the particular parameters for a blitting operation, except for those that require specific knowledge of the two bitmap surfaces. In particular, the blit-param will store the following:

Scale factor
Rotation angle
Rotation type (source-relative or destination relative)
Rotation point (useful only if rotation and destination relative)
Filtering (useful only when rotating or scaling)
Vertical flip
Horizontal flip
Pre-Blend Function (and Pre-Blend Color)
- Modulate Function
Modulate color
Blend equation type (along with separate flag)
Blend function parameters (4 parameters)

This object is passed by reference to the blit functions. These are not driver-specific objects; they can be created and managed by the client, though some API functions should exist to set commonly used values, or to act as interfaces for setting parameters.

Blitting Equation

By far, the most common operation performed on bitmaps is blitting. While blit once meant "fast-block copy", it now refers to a whole host of possible operations, from rotation and scaling to blending. The blitting operation takes place in the following sequence.

Coordinate Offset, Rotation, Scaling, and Flipping

All coordinates are converted into bitmap coordinates. This creates a source rectangle and a destination coordinate. The source rectangle is clamped to the sub-bitmap region.

The scale, rotation, vertical and horizontal flip blit-param fields create a mapping from the destination bitmap (relative to the destination coordinate) and the source rectangle.

When considered from source to destination, the scale is applied first, followed by the horizontal and vertical flips. After these operations, the rotation is performed relative to the coordinate offset. This creates the destination rectangle, which need not be axis aligned.

The source bitmap, at this point, is effectively scan converted as though it were mapped to a rectangle being scan converted. The rest of the pipeline deals with how each sample from the source bitmap is dealt with. No destination pixels outside of the destination clipping rectangle are ever set.

Source Sampling

Using the mapping from the destination to the source rect, a floating-point coordinate on the source bitmap is generated.

If filtering is not specified, then the floating-point coordinate is clamped to the nearest coordinate and sampled.

If the blit-params use filtering, then the type of filtering is determined by the attributes of the source bitmap. If it does not have mipmaps, then bilinear filtering is used.

For bilinear filtering, the floating-point source coordinate is used directly. The 4 nearest colors to this in the source are sampled and blended in a bilinear fashion to compute the final sample color. Note that bilinear filtering is allowed to pick up colors from neighboring pixels that are outside of the sub-bitmap range.

If the source has mipmaps, and filtering is specified, then the mip-map pyramid is used as well. If the image is scaled down, then the two mip levels that lie between the actual scale are detected. The two bilinear samples from each mip level are linearly blended with each other to compute the final sample. The precise method of mip-map selection is implementation defined.

The sample is converted into the RGBA colorspace, though this conversion can happen before filtering.

Note: Any and all linear blend operations happen on all components of the sample. Also, all linear blend operations happen in RGBA space.

Masking

Bitmaps that use the RGB-Masked format have a mask color set into them. If the sample at this point equals the mask color, then the pixel is not written to the frame buffer.

Pre-Blend Operation

This sample color goes through the pre-blend operation. The pre-blend operation simply performs some computation on the source color before reaching the blend stage.

Current pre-blend functions include:

Modulate

A pre-blend function is provided with 1 color parameter. The Modulate function multiplies the source color by the pre-blend parameter color.

Blend Equations

During a blit operation, the user may request to perform "blending" operations. A blending operation is any operation where the final color set into the destination bitmap is some function of the source and destination colors. Technically, the non-blending case is a blending op, as the function is simply 1.0 * source RGB.

Notation will be important for understanding this section. All blending operations are specified in the RGBA color space format.

The source color is specified as S(rgba), where rgba represents the component(s) in question. The source represents the sample from the source bitmap as computed before the blending stage. The destination color is specified as D(rgba), where rgba represents the component(s) in question. The output color is represented as O(rgba) in the same way as S and D.

For example, the syntax:

O(rgb) = S(rgb) * D(rgb)

means to do a component wise multiply between the source and destination RGB componentsand store that in the corresponding RGB output. Effectively doing:

O(r) = S(r) * D(r)

O(g) = S(g) * D(g)

O(b) = S(b) * D(b)

The 4 blend function parameters can specify a number of things. A blend parameter may be one of the following:

Zero
One
Source Color
Source Alpha
Inverse Source Color
Inverse Source Alpha
Destination Color
Destination Alpha
Inverse Destination Color
Inverse Destination Alpha

Each parameter resolves to a full RGBA color. The scalar parameters, like One or Source Alpha, are replicated across the 4 components of RGBA. Inverse, in this instance, means 1.0 - the value.

Note that a blend equation may specifically disallow the use of certain parameters in certain situations. For example, it may disallow the use of the Source Color parameter as the multiplicative operand to S(rgba).

Parameters, using the notation above, are specified as P0(rgba) through P3(rgba). If P0(rgb) is specified, then this means the RGB components of the first parameter, as it resolves to color. If it is set to One, then it resolves to (1.0, 1.0, 1.0). If it is set to Source Color, it resolves to S(rgb). If it is set to Source Alpha, it resolves to S(aaa), thus replicating the alpha over 3 components.

The blend equation specifies the overall equation being used in the blending operation. It, also, specifies how the parameter values are used and which one are or are not allowed.

The following specifies the potentially-available blending equations:

Additive
Subtractive
Reverse-Subtractive

Additive blending specifies a blend equation of the following type:

O(rgba) = (S(rgba) * P0(rgba)) + (D(rgba) * P1(rgba))

The restrictions on P0 are that it may not be Source Color or Inverse Source Color. The restrictions on P1 are that it may not be Destination Color or Inverse Destination Color.

Subtractive blending provides a blend equation of this type:

O(rgba) = (S(rgba) * P0(rgba)) - (D(rgba) * P1(rgba))

The restrictions on P0 and P1 are the same as for Additive.

Reverse subtractive blending provides a blend equation of this type:

O(rgba) = (D(rgba) * P1(rgba)) - (S(rgba) * P0(rgba))

Thus, it reverses the order of the subtraction. The restrictions on P0 and P1 are the same as for Additive.

Blend Function Separation

Note, given the above, that the RGB and Alpha blend parameters are the same. This is usually what one wants, but not always. Sometimes, one really does want to have the RGB blend function use different parameters from the Alpha.

A blit-param can have a separation flag set. If it does so, and the driver supports it, then the following will happen.

What was one equation becomes two. The additive equation, for example, becomes:

O(rgb) = (S(rgb) * P0(rgb)) + (D(rgb) * P1(rgb))

O(a) = (S(a) * P2(a)) + (D(a) * P3(a))

A similar operation occurs for the other blend equations.

P2 and P3 have the same restrictions as P0 and P1, respectively. The *_Color parameters specified above, when applied to the alpha P2 and P3 parameters, mirror the operation of their *_Alpha equivalents.

Floating-point blend operations

Blending to a floating-point target may be possible, but it may impose additional limitations to the parameters. This will be determined once OpenGL figures it out.

Post-Blend Operations

Lastly, the color is converted into the format of the destination bitmap, and set into that bitmap.

This assumes a driver that exposes all of the functionality in question.

Color Operation Notes

All color operations are performed as though the colors were in a floating-point format. When converting from integer to float, the conversion is done by dividing by the largest integer of that format (15 for 4-bit components, 255 for 8-bit components, etc). Conversion back is done by multiplying the floating-point number by that largest value.

Normally, for integer formats, the color is clamped on the range 0.0 to 1.0. However, if rendering to a floating-point format, the color is not clamped.

Miscellaneous Blitting Notes

Drawing from and to the same bitmap is guaranteed to work only if the source and destination rectangles do not overlap. If they do overlap, then the results are undefined.

Framebuffers

When a driver is created, it creates one or more framebuffers. This represents the output surface being displayed. To render to the display surface, one may request the bitmap that represents this display and render to it as normal. However, the display bitmap is different from a normal bitmap. In many drivers, operations that draw directly to the framebuffer bitmap are significantly faster than operations directed towards other bitmaps.

The framebuffer in the driver has a concept of the update mechanism. If one draws directly to the memory that represents the screen being displayed, then tearing can result. This is where the display shows a frame in progress. As such, a number of buffering methods can be employed. It is the driver's responsibility to select one of them. In single buffer mode, the user is drawing directly to the visible screen. In any other buffered mode, the bitmap for the framebuffer represents an appropriate buffer. This buffer is copied or swapped in with a driver command. The bitmap that represents the framebuffer always points to the surface that is most reasonable to be rendering to, even after a buffer swap operation. As such, the user need not constantly ask for the framebuffer bitmap.

After doing all of the rendering, the user should swap, even if the framebuffer type is not one that requires a swap. Such an operation on a non-swapping framebuffer will not generate an error; it is merely a no-op.

The status of the framebuffer contents after a swap are defined by the type of framebuffer:

Single Buffer:
- The framebuffer is not buffered at all. Rendering draws directly to the viewable screen. The framebuffer contents are preserved after a swap.
Double Buffer Swap:
- The framebuffer consists of two buffers. Rendering draws into the unseen buffer, while the other buffer is displayed. A swap operation flips changes the two buffers. The contents of each buffer is preserved in a swap. A swap should block on vsync to avoid tearing.
Double Buffer Copy:
- The framebuffer consists of two buffers. Rendering draws into the unseen buffer, while the other buffer is displayed. A swap operation copies the data in the render buffer into the displayed buffer. The contents of the render buffer are preserved on a swap operation. A swap should block on vsync to avoid tearing.
Triple Buffer:
- The framebuffer consists of 3 buffers. One buffer is the render target. Another is being displayed. And a third an extra buffer. The extra buffer is either waiting to be filled or waiting to be displayed. On each vsync, if the extra buffer is waiting to be displayed, then it is swapped with the display buffer.
- On a user-requested swap operation, the current render target is swapped with the extra buffer, if it is waiting to be filled. This queues the buffer up to be displayed on the next vsync. However, if the user requests a swap, and the extra buffer is waiting to be displayed, then the swap operation will block until the extra buffer is swapped with the display buffer.
- The purpose of this mode is to keep a program from blocking on a framebuffer swap.
- The user should not depend on the contents of the buffer after a swap operation.
Undefined Buffer:
- This method, which all drivers must provide, is defined by being undefined. All that it promises is that it is buffered, so that rendering to the framebuffer will be displayed only after a swap. There is no guarantee as to the contents of the framebuffer after a swap. There is no guarantee as to whether the swap blocks on vsync or not. This method is guaranteed not to tear, however. Any of the buffered methods above can satisfy these conditions, but this mechanism allows an implementation to choose one arbitrarily, or to allow an underlying API to have an arbitrary buffering mechanism.

For drivers that operate in full-screen mode, they may choose to have multiple full-screen framebuffers. Each one represents a specific screen on a multi-monitor systems. The framebuffers are accessed and flipped independently of each other. A driver that can operate in fullscreen mode may choose not to support multiple monitors.

The framebuffer can be instanced and sub-bitmapped like a normal bitmap.

Graphics API Functionality

The following is a list of features that all graphics drivers must provide, with the idea that these operations should be as fast as possible.

Blitting
- Rectangular Clipping: prevents writing pixel data during any operation on the given surface.
- Coordinate Offset: Applies a given offset to any rendering commands to the given surface.
- Vertical Flip:
- Horizontal Flip:
Primitive Drawing
- Pixel
- Line
- Rect, with or without fill
Bitmaps
- SetPixelData
Framebuffer
- Buffered Undefined: Render to a non-screen buffer, but the update command leaves the current buffer undefined (could flip or copy).
Framebuffer formats
- RGB
- RGBA
Bitmap formats
- RGB
Bitmap Storage Formats
- 16-bit total (565, RGB only)
- Integer, 8-bit per component

The following is a list of the features that a graphics driver may optionally provide, if it can implement them reasonably fast in hardware.

Blitting (using only native or shared bitmaps)
- Fast Non-Framebuffer Rendering (of any kind)
- Scaling
  - Bilinear filtering
  - Trilinear filtering for scaling objects down. Mipmaps are autogenerated.
- Rotation
  - Bilinear filtering (same as with scale)
- Masked Blitting
- Pre-blend function
  - Modulate (pre-blend)
- Blend functions
  - Add
  - Subtract
  - Reverse-Subtract
  - Allow for separate RGB and Alpha params.
Generic Blitting: blits from generic bitmaps. These flags tell whether the operation forces an implicit conversion to driver command.
- Same as above, except for the following.
- Any blitting at all
Framebuffer
- Update Mechanism (request only number of buffers)
  - Double Buffer Copy
  - Double Buffer Flip: (halts on v-sync) Front and back buffers are swapped.
  - Triple Buffer Flip: (can block on v-sync if rendering too fast) 3-way swap operation, contingent on a swap-at-vsync hardware system.
  - Single Buffer (rendering directly to the screen)
- YUV Colorspace Overlays (to be defined)
- Gamma ramp (to be defined)
- Can blit from the framebuffer fast
Bitmap formats
- RGB-Masked
- RGBA
- RGB
- Intensity
- Luminance
- Luminance-Alpha
- YUV
Bitmap Storage Formats
- Float 16, 32-bit per component
- Integer 4-bit per component
Driver mode: Driver must pick at least one.
- Fullscreen
- Windowed
- Windowed-Tool
- Null
The possibility of multiple framebuffers for multiple monitors.
Heavy Resource use (the driver is a resource hog. Don't make too many of them)
Is a "framebuffer is different" driver.¹

Graphics Drivers

The driver will provide functionality to determine which optional features are available. For framebuffer and driver-specific features, the driver will simply expose them directly as flags. However, for blitting, thanks to the blit-param interface, the API will be as follows. A function will exist that takes a blit-param and tells if the operation will work as specified. Also, it will return whether or not the operation will be reasonably "fast". Generally, fast means hardware support. This allows a driver to disallow certain blend mode combinations, as well as certain combinations of features. For instance, a driver might allow for modulation and some limited blending, but not both at once.

Because the types of the bitmaps can be a factor in determining whether the blitting operation will go through, the user can pass in a source and destination bitmap type. The types are as follows:

Framebuffer
Driver Bitmap
Generic Bitmap

In addition to the type of bitmap, the format of the bitmap is passed in. This should be sufficient information to determine whether the blitting operation will succeed.

When creating a graphics driver, the user can give a list of driver/framebuffer functionality, as well as a prioritized list of blit-param objects that the user would like to work in hardware.

¹ For many drivers, the framebuffer is no different from regular bitmaps. They are allocated from the same place and operate in pretty much the same way. However, for some drivers, the framebuffer is a very different object from a bitmap. While it is still a bitmap, using it as a source for a blitting operation is not recommended. Indeed, in many these drivers, rendering operations are designed to go only to the framebuffer. If this flag is set, it is a warning against using any blits to non-framebuffer bitmap objects. Such blits can be very slow.

If a driver does not support some blitting functionality, but is given a blit-param object that specifies that functionality, then the driver will do one of two things. It will either continue to blit except without the requested feature. Or, it will fail and add an error message to the core error system. However, the driver is not allowed to chose which to do; the behavior is specified below:

Scale: Error out
Rotation: Error out
Modulate: Blit
Blend functions/unsupported params: Blit

The idea is that modulate and blending affect the visual look of the output. But, rotation and scale affect different destination elements. It is one thing to blit a bitmap wrong-looking. It is quite another to blit to unexpected portions of the screen, or to not blit to expected portions. To simply shut off rotation or scale would produce results that are always wrong; in these cases, it is better to error out.

Autodetection

There are two mechanisms of autodetection: simple and advanced.

Simple Autodetection

Simple autodetection selects drivers based on a very parameter list. These are the parameters:

Driver mode (Fullscreen, windowed, etc)
Framebuffer size
Framebuffer color format (RGB, RGBA, or don't care)
Framebuffer colordepth hints

This autodetection mechanism will return the highest performing registered driver that fulfills the requested parameters. Hints may or may not be honored. If no driver exists that can satisfy the requested parameters, then the mechanism will not return a driver.

Advanced Autodetection

The purpose of this method of autodetection is to allow the user to request a driver that fulfills very specific needs. The user can ask for detailed driver features, as well as even request that certain modes of blitting be supported by hardware.

The most important thing to note about this method of autodetection is that it can return a driver that does not fulfill all of the requested functionality. Instead, it returns the driver that is "close" to what the user asks for. This "closeness" is as defined below. Using the usual mechanisms, the user can ask the driver for its various capabilities.

The user passes two large structures, in addition to the parameters for Simple Autodetection. Note that, like Simple Autodetection, the non-hint parameters are guaranteed to be fulfilled. One of them is the list of required driver features. The other is an array of blit style objects.

The ultimate idea is that these two parameters are used to compute a score for each driver based on how closely it implements the requested features. Features are given different weights when computing this score.

Driver features are the following, with the score in parenthesis:

YUV Colorspace Overlays (2000)
YUV bitmap color format (2000)
Gamma ramp (2000)
Fast non-framebuffer blits (1000)
Can blit from the framebuffer fast (1000)
Bitmap Storage Formats
- Float 16, 32-bit per component (1000)
- Integer 4-bit per component (500)
Update Mechanism (request only number of buffers)
- Triple Buffer Flip (250)
- Single Buffer (500)
- Double Buffer Flip (500)
- Double Buffer Copy (500)
Bitmap formats
- RGB-Masked (1000)
- RGBA (1000)
- Intensity (500)
- Luminance (500)
- Luminance-Alpha (500)

After computing the score for the driver features, the score for the blitting features is computed. The feature list is assumed to be prioritized, with the first features being more important than the later ones.

The score for the features is as follows:

Each additional supported feature gives 50 points.

The best way to use this autodetection mechanism is to only ask for important features. If you ask for too many blitting features, they can override your driver feature request.

Windowing and Drivers

There will be the ability to create graphics drivers that are bound to a window that already exists. If the user does not provide a window, then the graphics driver will create an appropriate window. However, if the user does provide a window, via a platform-specific API, the driver may decide that the window is, for whatever reason, inappropriate for rendering into and will fail through the error mechanism. If the current operation system environment does not provide for windows, then this API will always fail.

When this API is used, it can be used in one of 2 ways. One way is to let the user create the window, then allow the driver to fully own the window. In this case, the driver handles all messages associated with the window. The other is to have the user do all message processing on the window, and then use the API to pass various information about the nature of the window to the API. This second API can only be used in windowed-tool mode, as discussed below.

Windowed-Tool Mode

It is often useful to have the same technology that runs the core game be used in game tools as well. For instance, if one has a tilemap system, it could be used in both the actual game and the tilemap editor tool that runs in a particular OS's GUI.

To facilitate this, AllegroPro graphics drivers can, if they so choose and the user requests, operate in windowed-tool mode.

In normal windowed mode, the graphics driver has total control over the size and shape of a window. In particular, the window's size cannot be changed. This is useful for running a game at a fixed resolution in a window, but this is not useful for writing tools in an OS native GUI. In such an environment, it is important for windows to be able to change shape at the behest of the user.

In windowed-tool mode, the window that a graphics driver uses can change size. Indeed, the user can set a specific size for a window by using an API function to change the resolution of the framebuffer. As mentioned in the above section, the user can also have full control over the window.

Windowed-tool and framebuffer

The framebuffer behavior in windowed-tool mode needs to be specified. The way it works is simple. To provide for the maximum flexibility, the entire contents (not just the recently revealed portion) of the framebuffer after a resize operation are undefined. The driver is free to completely destroy them at will.

Instances of the framebuffer that may be sub-bitmaps of the framebuffer may find themselves outside of the visible region of the framebuffer. In this case, blitting operations to these bitmaps produce undefined results; they will need to be recreated, modified, or the framebuffer needs to be resized to put them back inside of the buffer.

As a note to driver developers wanting to expose a windowed-tool mode, the expectation is that resize operations are fast.

User-defined Windows

Windowed-Tool mode has a special feature: the ability to allow the user to specify a window for the driver to be created in. If a driver supports Windowed-Tool mode, then the driver must support this as well. The API for supplying the window is platform neutral, but it is also platform specific. The object defining the window is defined by each platform, but the user is required to change objects based on the platform. Usually, using Windowed-Tool mode like this is already in platform-specific code, so this is not very restrictive.

However, the user must notify the driver when the window changes size; it must tell the driver what the new size of the windows is.

Driver Cloning and Object Sharing

A driver can allow itself to be cloned, though drivers can choose not to allow for cloning. A cloned driver is similar to creating a driver of the same type and mode. However, there are some significant differences.

When you create a cloned driver, the original driver becomes the primary driver, and the newly created driver becomes the secondary one. The driver can place restrictions on the number of secondary drivers, by simply refusing to allow for additional cloning. Cloning from a secondary driver creates a clone from the primary one; the hierarchy is only one deep.

Secondary drivers do not create objects of their own. Instead, calls to create objects from them are forwarded to the primary driver. However, secondary drivers are required to use the objects created by the primary driver. This is the principle purpose of driver cloning; it allows multiple drivers to share objects.

Each driver has a separate framebuffer object.

Cloning is only available if the primary driver is not in Fullscreen mode. Each driver in Windowed and Windowed-Tool mode will spawn a new window. In Windowed-Tool mode, each driver can have a separate user-defined window.

Generic Graphics Driver

A driver's primary purpose is to act as a mediator between the needs of the user application and the internal hardware. Though, in many cases, these drivers will call other API's, they will insulate the graphics programmer from changes in those API's, as well as providing a cross-platform interface to hardware.

However, because drivers are an abstraction of hardware, certain knowledge of what is going on inside of hardware is lost. Also, because one is relying on hardware, there are problems with limitations of hardware. In some cases, getting specific functionality is more important than performance.

The generic graphics driver is designed to provide all of the functionality that any driver could provide. It allows for any and all blend modes, while simultaneously rotating and scaling bitmap blitting. Additionally, there will be extra API functions that will only work on generic bitmaps.

The drawback to this is that it is all done in software. Generic bitmaps are stored in system RAM, and no operations on generic bitmaps are hardware accelerated.

One of the primary benefits of generic bitmaps is that the format of the data in a generic bitmap is known exactly. The user can gain access to the bitmap's storage itself and change it as needed.

The generic driver is never created or destroyed. It is part of AllegroPro itself, and lives as long as AllegroPro lives.. Its behavior should be consistent across all platforms.

It is expected that most non-generic drivers can use generic bitmaps as source data, though at reduced performance. Testing whether or not the drivers can share objects is all that is necessary. Worst-case, it is no slower than creating a new bitmap of the appropriate size, uploading the generic bitmap, and performing the blit operation.

Generic bitmaps provide for additional API functions that only work on generic bitmaps. These include:

Primitives
- Circle
- Arc
- Ellipse
- FloodFill
Bitmap formats
- Paletted (each bitmap stores a palette. They do not share them)
Direct access to bitmap data in a well-defined format (depending on the specified internal format).

Generic bitmaps are more functional, but are software-only.

AllegroPro Blitting Emulation Mode

AllegroPro lives between the user and the driver. As such, it is possible for AllegroPro to add functionality to a driver. When a particular hardware driver is created, it may or may not provide for fast non-framebuffer rendering. If it does not, then AllegroPro can create the driver in either pure-hardware mode or emulation mode.

Pure-hardware mode requires that the driver satisfy requests to do cross-blitting. This typically means a download/software blit/upload process. On some implementations, this is very slow.

However, AllegroPro can interpose it's will between the driver and the user. In emulation mode, AllegroPro assumes that every bitmap object allocated will be involved in cross-blitting software operations, and therefore allocates generic bitmaps for the job. In this case, AllegroPro chooses to do software blitting for cross-blitting cases, but uploads these generic bitmaps into hardware memory as needed. If a lot of these operations are done, the time saved by not doing so many upload/download operations will easily outpace any poor performance due to software rendering; as such, it would be a performance win.

Note that if the user asks for this mode, and the driver specifically advertises that it performs fast cross-bitmap drawing operations, then this mode will not be activated; the driver will do what the user needs faster than we can. As such, a driver should advertise this fact only when it is (usually. It is possible to build degenerate systems that have modern CPUs but old video cards, and therefore makes software faster than hardware. But this case should be ignored) faster than AllegroPro's functions.

Outstanding issues

Issue 1: OpenGL drivers cannot be queried for extensions until after the rendering context has been created. This requires a window. Indeed, even getting the possible pixel formats that OpenGL supports requires a HDC which requires a window (could possibly enumerate it from the desktop DC). How do we deal with this issue? Should the OpenGL driver create a small window that it destroys solely to test what features are available? Extensions can/will be used to improve performance.

Solution: Have the querying function create a small window and initialize appropriate GL drivers in it. The window should be reasonably sized so that the GL implementation doesn't get too confused.

Issue 2: Should we expose more per "fragment" functions like modulate? These could be useful. And most hardware that can handle modulate can handle more complicated things as well.

Solution: Expose only modulate for now, but separate the color from the modulate option. Make the modulate option a "pre-blend" stage that currently only has a modulate function. Later, we can expose more interesting functions simply by adding more per-blend function types.

Issue 3: Mouse drawing. The user can draw the mouse by blitting to a location. But, how does the user get the OS's mouse to stop being drawn? And, for windowed apps, the OS mouse's location could stray outside the window and deactivate it.

Solution: A driver has the ability to turn on or off the OS's mouse. Because a driver encapsulates a window, this should also capture the mouse, thus preventing it from leaving the window's control.

Issue 4: Threading and drivers. In some drivers, attempting to render while creating a bitmap or other such operations can cause problems.

Details: Each driver could get a semaphore that prevents threads from simultaneously accessing driver resources. However, the user should be able to create a driver without a semaphore, since he/she may guarantee that the driver is not used from multiple threads. Also, the semaphore mechanism seems to push for a "begin/end" mechanism for drawing, since the user may not want to lock a semaphore hundreds of times per frame, but still need the locking ability.

Solution: A driver can only be used in the thread that created it. This is the behavior that Win32 OpenGL enforces, and it is the lowest-common denominator behavior. Forcing such implementations to have other behavior could present a heavy code-burden upon the system, as well as create a host of potential problems and a loss of performance.

Issue 5: What happens to the framebuffer when a window covers our window?

Details: The problem is with OpenGL's specification. The expected assumption is that the framebuffer is just fine. However, OpenGL specifies that what exists underneath another window is undefined. We cannot change this behavior.

Solution: Define that the contents of any buffer of the framebuffer are driver-defined. A driver is free to leave them undefined as well, so the user is appropriately warned about the problem, but drivers can actually specify the behavior if they so choose.

Graphics Design Justifications