Deriving a modern abstraction layer on top of OpenGL and Vulkan

This is a writeup on how I tried to design a renderer targeting OpenGL and Vulkan, mostly documenting the differences and pitfalls encountered when abstracting both APIs. There are already a lot of good graphics API abstractions such as bgfx and DiligentEngine, so my main reason for reinventing the wheel is to learn as much as I can.

Anyway, here is my take on Vulkan and OpenGL.

The Render Pass

Back in the old days of OpenGL, we would set the clear colors and bind render targets (FBOs) just in time before the draw calls, but in Vulkan every single graphics operation is performed within a pre-defined render pass that describes the color and depth-stencil state of each attachment (I am simplifying a bit; a Vulkan render pass carries much more information).

The render pass is so fundamental in Vulkan that it is created first and referenced in pipeline creation (VkGraphicsPipelineCreateInfo::renderPass) and framebuffer creation (VkFramebufferCreateInfo::renderPass)!

In my abstraction, the beginning of the render pass scope is where the OpenGL backend issues state switches such as glClearDepth, glClearColor, glBindFramebuffer, etc., setting up a context for the following draw calls.
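As a sketch of what that scope boundary might translate to on the OpenGL backend, here is a hypothetical load-op-to-glClear mapping. All names (RenderPassBeginInfo, computeClearMask) are illustrative, not from my actual code; only the GLenum bit values come from the GL headers:

```cpp
#include <cstdint>

enum class LoadOp { Load, Clear, DontCare };

struct RenderPassBeginInfo {
    LoadOp colorLoadOp   = LoadOp::Clear;
    LoadOp depthLoadOp   = LoadOp::Clear;
    LoadOp stencilLoadOp = LoadOp::DontCare;
    float  clearColor[4] = {0.f, 0.f, 0.f, 1.f};
    float  clearDepth    = 1.f;
};

// Bitmask values from the GL headers, spelled out so the sketch stays
// self-contained.
constexpr uint32_t COLOR_BIT   = 0x00004000; // GL_COLOR_BUFFER_BIT
constexpr uint32_t DEPTH_BIT   = 0x00000100; // GL_DEPTH_BUFFER_BIT
constexpr uint32_t STENCIL_BIT = 0x00000400; // GL_STENCIL_BUFFER_BIT

// Translate Vulkan-style load ops into a glClear bitmask; the real backend
// would first call glBindFramebuffer/glClearColor/glClearDepth, then issue
// a single glClear with this mask at render pass begin.
uint32_t computeClearMask(const RenderPassBeginInfo& info) {
    uint32_t mask = 0;
    if (info.colorLoadOp   == LoadOp::Clear) mask |= COLOR_BIT;
    if (info.depthLoadOp   == LoadOp::Clear) mask |= DEPTH_BIT;
    if (info.stencilLoadOp == LoadOp::Clear) mask |= STENCIL_BIT;
    return mask;
}
```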

Command Buffer Recording and Execution

In OpenGL, the driver is responsible for accumulating commands and ensuring the results look like they happened in sequence, whereas in Vulkan we are responsible for first recording command buffers (while inserting proper barriers) and later submitting them to a queue with proper synchronization via VkFence and VkSemaphore objects.

Note that OpenGL also has fences and barriers, but they usually only appear in advanced use cases such as persistent mapping.

In my abstraction, I am following the Vulkan record-and-submit usage. One of the noticeable gains in Vulkan is host-thread scalability, where we record commands across multiple threads (each thread has its own VkCommandPool), so I tried to preserve this key feature. In OpenGL mode the fences and semaphores are only a facade; they are completely CPU-sided.
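To make the facade idea concrete, here is a minimal sketch of how an OpenGL backend could fake record-and-submit entirely on the CPU. The names (CpuFence, CommandBuffer, submit) are hypothetical; the point is that "submission" just replays recorded closures on the context thread and signals the fence immediately, since there is no device-side object behind it:

```cpp
#include <functional>
#include <vector>

// CPU-side stand-in for VkFence: no driver object, just a flag.
struct CpuFence {
    bool signaled = false;
};

// Commands are recorded as closures; recording can happen on any thread
// since nothing touches the GL context yet.
struct CommandBuffer {
    std::vector<std::function<void()>> commands;
    void record(std::function<void()> cmd) { commands.push_back(std::move(cmd)); }
};

// "Submitting" on the OpenGL backend replays the recorded commands in order
// on the thread that owns the GL context, then signals the fence right away.
void submit(CommandBuffer& cb, CpuFence* fence) {
    for (auto& cmd : cb.commands) cmd();
    if (fence) fence->signaled = true;
}
```

The real backend would of course replay actual GL calls instead of arbitrary closures, but the synchronization story is the same: waiting on such a fence after submit is always a no-op.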

The Swapchain Pass and Swapchain Framebuffer

Now, let's try to get a triangle on the screen. What is the lowest common denominator for drawing the same triangle with both backends? In OpenGL we have the default framebuffer that is already created along with the OpenGL context, whereas in Vulkan we usually do the following:

  • choose a present mode based on surface and device capabilities
  • swapchain creation
  • swapchain image retrieval
  • choose a depth stencil format based on device capabilities (optional)
  • create one framebuffer for each swapchain image

Here I propose the Swapchain Framebuffer and Swapchain Pass.

The swapchain framebuffer encapsulates either the OpenGL default framebuffer or the Vulkan framebuffers with swapchain images attached. The swapchain pass is therefore the render pass that is compatible with the swapchain framebuffer (recall that a render pass is required to create a framebuffer).
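A rough sketch of what such a wrapper could look like, assuming Vulkan handles are kept as opaque 64-bit values to stay header-free (the struct and method names are made up for illustration):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encapsulates "whatever we render to for presentation":
// - OpenGL: always the default framebuffer, object name 0
// - Vulkan: one VkFramebuffer per swapchain image
struct SwapchainFramebuffer {
    static constexpr std::uint64_t kGLDefaultFramebuffer = 0;
    std::vector<std::uint64_t> vkFramebuffers; // empty on the GL backend

    // Which framebuffer to bind for the image acquired this frame.
    std::uint64_t handleForImage(std::size_t imageIndex, bool vulkan) const {
        return vulkan ? vkFramebuffers[imageIndex] : kGLDefaultFramebuffer;
    }
};
```

On the OpenGL side the image index is irrelevant, which is exactly the asymmetry the abstraction hides.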

At first it feels weird that the abstraction defines an additional render pass just for the final rendering, but in practice there are quite a few operations that fit into the final swapchain pass, such as tone-mapping from HDR, gamma correction, or any other post-processing.

NDC Differences

Next, we move on to examine some key differences between Vulkan and OpenGL on the microscopic level. Let's first talk about the Normalized Device Coordinates.

  1. Vulkan NDC is right handed, positive Y axis pointing downwards, positive Z axis into the screen.
  2. OpenGL NDC is left handed, positive Y axis pointing upwards, positive Z axis into the screen.
  3. Vulkan Z axis clip depth ranges from 0 to 1.
  4. OpenGL Z axis clip depth ranges from -1 to 1.
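Differences 3 and 4 amount to a simple affine remap of the depth coordinate. A one-liner captures it (and the same remap can equivalently be baked into the projection matrix by scaling and biasing its third row):

```cpp
// Map an OpenGL-convention clip depth in [-1, 1] to the Vulkan-style
// [0, 1] range: z' = 0.5 * z + 0.5.
float remapDepthGLtoZeroOne(float zGl) {
    return 0.5f * zGl + 0.5f;
}
```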

The Khronos Group is aware of the NDC differences across graphics APIs and has addressed this in VK_KHR_maintenance1. I am using this to flip the Vulkan viewports, solving the handedness problem (see this post by Sascha Willems).
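The flip itself is mechanical: with VK_KHR_maintenance1 (core since Vulkan 1.1), a negative viewport height inverts the Y axis, provided the viewport origin is moved to the bottom edge of the original rectangle. A small sketch (struct name illustrative, mirroring VkViewport's layout):

```cpp
struct Viewport {
    float x, y, width, height;
};

// With VK_KHR_maintenance1, a negative height flips the Y axis.
// The origin must move down by the original height so the flipped
// viewport still covers the same screen rectangle.
Viewport flipViewportY(Viewport vp) {
    vp.y      = vp.y + vp.height;
    vp.height = -vp.height;
    return vp;
}
```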

Now, what about the clip depth? Currently I am using glClipControl on OpenGL to enforce a depth range of GL_ZERO_TO_ONE.

Note that glClipControl directly modifies the clip-space transformation policy, so we don't have to further mess around with the vertex winding order / front face in OpenGL. However, there is no official support for glClipControl on ES, so this limits the solution to desktop only.

I am not targeting mobile anytime soon, and this solution is much cleaner than hacking each vertex shader for OpenGL in my opinion.

Texture Origin Differences

Now let's move on to the next difference (we are almost done, I promise).

OpenGL texture memory storage starts from the texture's bottom-left corner, whereas Vulkan and most other graphics APIs use the texture's top-left corner as the origin.

The craziest part about this is that we now have to differentiate between the following usages:

  1. rendering to the swapchain framebuffer
  2. offscreen rendering (deferred rendering, post-processing, etc.)

Because OpenGL offscreen-framebuffer-attachments will appear vertically flipped!

Vulkan does not suffer from this insanity since Vulkan framebuffer attachments are simply image views that reference a regular VkImage, regardless of whether it is a swapchain image or some offscreen framebuffer attachment.

So how do we address this difference?

One solution is to apply a flip (pipeline front-face + viewport Y axis) on Vulkan during offscreen rendering, as described in this post. This results in offscreen-framebuffer-attachments being upside-down in both OpenGL and Vulkan, which is consistent.

In my abstraction I chose to use the first parameter of glClipControl to flip the clip-space Y axis in OpenGL during offscreen rendering. This results in OpenGL logically flipping the offscreen framebuffer attachment twice (once by the clip-space flip, once by the bottom-left storage convention), so sampling it later matches Vulkan. This solution does not require any changes on the Vulkan side.
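Put together, the OpenGL backend picks its glClipControl arguments per pass. A hypothetical selection function (the GLenum values are from the GL headers; the function and struct names are made up):

```cpp
#include <cstdint>

// GLenum values spelled out so the sketch stays self-contained.
constexpr uint32_t LOWER_LEFT  = 0x8CA1; // GL_LOWER_LEFT
constexpr uint32_t UPPER_LEFT  = 0x8CA2; // GL_UPPER_LEFT
constexpr uint32_t ZERO_TO_ONE = 0x935F; // GL_ZERO_TO_ONE

struct ClipControl {
    uint32_t origin;
    uint32_t depth;
};

// Swapchain passes keep GL's native lower-left clip origin; offscreen
// passes flip clip-space Y so the attachment reads back with Vulkan's
// top-left convention. Depth is always forced to [0, 1].
ClipControl clipControlForPass(bool renderingToSwapchain) {
    return { renderingToSwapchain ? LOWER_LEFT : UPPER_LEFT, ZERO_TO_ONE };
}
```

The backend would then call glClipControl(cc.origin, cc.depth) whenever the pass type changes.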

GLSL Shader Differences

This post is getting lengthy, but luckily, for handling GLSL differences, SPIRV-Cross can be used to transpile the Vulkan GLSL dialect back into OpenGL GLSL. Please refer to this presentation.

The only work I have to do is to remap the Vulkan GLSL set + binding qualifiers into the single OpenGL GLSL binding qualifier (both APIs use the keyword binding, but with different semantics!). One compromise I have to make on Vulkan is that my abstraction bundles VkImage and VkSampler into a single binding of type VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER. This is mainly for compatibility with the OpenGL backend, where the sampler/texture state is tightly coupled.
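One straightforward remapping scheme is to flatten each (set, binding) pair into a single OpenGL binding index. This sketch assumes a fixed per-set binding budget; the constant and function names are illustrative, not from my actual code:

```cpp
#include <cstdint>

// Assumption for this sketch: no descriptor set uses more than 16 bindings.
constexpr uint32_t kMaxBindingsPerSet = 16;

// Collapse a Vulkan (set, binding) pair into a flat OpenGL binding index,
// giving each set its own contiguous range of binding slots.
uint32_t flattenBinding(uint32_t set, uint32_t binding) {
    return set * kMaxBindingsPerSet + binding;
}
```

SPIRV-Cross lets you override each resource's output binding during cross-compilation, so a scheme like this can be applied while transpiling to OpenGL GLSL.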

SPIRV-Cross even handles the infamous case where the OpenGL builtin gl_InstanceID does not include the base instance parameter issued in draw calls.

Summary

My abstraction is mostly centered around Vulkan, having OpenGL emulate Vulkan behavior. The more graphics APIs an abstraction supports, the less room there is to optimize one particular backend. Since I am currently only covering desktop OpenGL and Vulkan, I tried to preserve as much room for Vulkan optimization as I can.

Hopefully this post helps other graphics programmers who are also abstracting OpenGL and Vulkan.

© 2024 x2w-soda. All Rights Reserved.