Long Live WebGL
I’ll explain why WebGL is what finally made OpenGL click for me.
Old vs. New
I may get a few details wrong, but here’s the gist of it.
Currently there are basically two ways to use OpenGL: the old way (compatibility profile, fixed-function pipeline) and the new way (core profile, programmable pipeline). The new API came about because of a specific new capability that graphics cards gained years after the original OpenGL specification was written: modern graphics cards are fully programmable. Programs can be compiled with the GPU hardware as the target, allowing them to run directly on the graphics card. The new API is oriented around running these programs on the graphics card.
Before the programmable pipeline, graphics cards had a fixed set of functionality for rendering 3D graphics. You tell the card what functionality you want to use, then hand it data a little bit at a time. Any functionality not provided by the GPU had to be done on the CPU. The CPU ends up doing a lot of work better suited for the GPU, in addition to spoon-feeding data to the GPU during rendering.
With the programmable pipeline, you start by sending a program, called a shader, to the GPU. The shader is written in the OpenGL Shading Language (GLSL), and the graphics driver takes care of compiling it at the application’s run-time. When it comes time to render a frame, you prepare all the shader’s inputs in memory buffers on the GPU, then issue a draw command. The program’s output goes into another buffer, probably to be treated as pixels for the screen. On its own, the GPU processes the inputs in parallel much faster than a CPU could ever do sequentially.
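Here’s roughly what that flow looks like in WebGL. This is a minimal sketch, not a complete program: it assumes a `canvas` element and two GLSL source strings, `vertexSource` and `fragmentSource`, defined elsewhere, and the error handling is the bare minimum.

```javascript
const gl = canvas.getContext('webgl');

// 1. Hand the GLSL source to the driver, which compiles it at run-time.
function compile(type, source) {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS))
        throw new Error(gl.getShaderInfoLog(shader));
    return shader;
}

const program = gl.createProgram();
gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSource));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSource));
gl.linkProgram(program);
gl.useProgram(program);

// 2. Prepare the shader's inputs in a memory buffer on the GPU.
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER,
              new Float32Array([0, 1, -1, -1, 1, -1]),
              gl.STATIC_DRAW);
const position = gl.getAttribLocation(program, 'position');
gl.enableVertexAttribArray(position);
gl.vertexAttribPointer(position, 2, gl.FLOAT, false, 0, 0);

// 3. Issue a draw command; the GPU does the rest in parallel.
gl.drawArrays(gl.TRIANGLES, 0, 3);
```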
A very important detail to notice here is that, at a high level, this process is almost orthogonal to the concept of rendering graphics. The inputs to a shader are arbitrary data. The final output is arbitrary data. The process is structured so that it’s easily used to render graphics, but that’s not strictly required. It can be used to perform arbitrary computations.
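As a sketch of that idea, the same draw call can be treated purely as computation. This assumes a linked `program` whose fragment shader encodes its result in gl_FragColor, with a full-screen pair of triangles already bound as input:

```javascript
// Run the pipeline as a computation engine, not a renderer.
gl.useProgram(program);
gl.drawArrays(gl.TRIANGLES, 0, 6);  // "drawing" = running the computation

// Read the raw output bytes back instead of displaying them.
const out = new Uint8Array(gl.drawingBufferWidth * gl.drawingBufferHeight * 4);
gl.readPixels(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight,
              gl.RGBA, gl.UNSIGNED_BYTE, out);
// `out` now holds arbitrary data computed on the GPU.
```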
This paradigm shift in GPU architecture is the biggest barrier to learning OpenGL. The apparent surface area of the API is doubled because it includes the irrelevant, outdated parts. Sure, recent versions of OpenGL (3.1 and later) eschew the fixed-function API, but all of that mess still shows up when browsing and searching documentation. Worse, there are still many tutorials that teach the outdated API. In fact, as of this writing the first Google result for “opengl tutorial” turns up one of these outdated tutorials.
OpenGL ES and WebGL
OpenGL for Embedded Systems (OpenGL ES) is a subset of OpenGL specifically designed for devices like smartphones and tablet computers. The OpenGL ES 2.0 specification removes the old fixed-function APIs. What’s significant about this is that WebGL is based on OpenGL ES 2.0. If the context of a discussion is WebGL, you’re guaranteed not to be talking about an outdated API. This indicator has been a really handy way to filter out a lot of bad information.
In fact, I think the WebGL specification is probably the best documentation root for exploring OpenGL. None of the outdated functions are listed, most of the descriptions are written in plain English, and they all link out to the official documentation if clarification or elaboration is needed. As I was learning WebGL it was easy to jump around this document to find what I needed.
This is also a reason to completely avoid spending time learning the fixed-function pipeline. It’s incompatible with WebGL and many modern platforms. Learning it would be about as useful as learning Latin when your goal is to communicate with people from other parts of the world.
Now that WebGL let me focus on the relevant parts of OpenGL, I could put effort into figuring out the important stuff that the tutorials skip over. You see, even the tutorials that use the right pipeline still do a poor job. They skip over the fundamentals and dive right into 3D graphics. This is a mistake.
I’m a firm believer that mastery lies in having a solid grip on the fundamentals. The programmable pipeline has little built-in support for 3D graphics, because OpenGL is at its essence a 2D API. The vertex shader accepts something as input and produces 2D vertices in normalized device coordinates (-1 to 1) as output. Projecting that something down to 2D is functionality you have to provide yourself; OpenGL won’t be doing it for you. Realizing this one fact was what really made everything click for me.
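To make that concrete, here’s a sketch of a vertex shader doing its own projection. The `scale` uniform and the simple divide-by-depth formula are illustrative, not anything OpenGL provides:

```javascript
// GLSL vertex shader carried in a JavaScript string, as WebGL expects.
// The projection math is hand-rolled; `scale` is a made-up uniform.
const vertexSource = `
attribute vec3 position;   // arbitrary 3D input, meaningless to OpenGL

uniform float scale;

void main() {
    // Project to 2D ourselves: all OpenGL wants is the final
    // (-1, 1) coordinates in gl_Position.
    float perspective = scale / (scale + position.z);
    gl_Position = vec4(position.xy * perspective, 0.0, 1.0);
}`;
```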
Many of the tutorials try to handwave this part. “Just use this library and this boilerplate so you can ignore this part,” they say, quickly moving on to spinning a cube. This is sort of like using an IDE for programming and having no idea how a build system works. This works if you’re in a hurry to accomplish a specific task, but it’s no way to achieve mastery.
What’s more, for me the step being skipped is perhaps the most interesting part of it all! For example, after getting a handle on how things worked, without copy-pasting any boilerplate, I ported my OpenCL 3D Perlin noise generator to GLSL.
Instead of saving off each frame as an image, this just displays it in real-time. The CPU’s only job is to ask the GPU to render a new frame at a regular interval. Other than this, it’s entirely idle. All the computation is being done by the GPU, and at speeds far greater than a CPU could achieve.
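The CPU side of that loop is tiny. A sketch, where `draw` stands in for whatever draw call was set up during initialization:

```javascript
// The CPU's entire job: schedule the next frame and issue one draw
// call. Everything else happens on the GPU.
function frame() {
    draw();                        // GPU does all the real work
    requestAnimationFrame(frame);  // ask for another frame later
}
requestAnimationFrame(frame);
```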
Side note: you may notice some patterns in the noise. This is because, as of this writing, I’m still working out decent random number generation in the fragment shader.
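For the curious, one widely circulated (and notoriously imperfect) fragment-shader hash looks like the following. I’m not saying it’s what this demo uses, but it produces exactly this sort of visible patterning:

```javascript
// A common sine-based hash seen in many GLSL fragment shaders. It is
// not a real PRNG, which is why patterns show up.
const randSource = `
float rand(vec2 co) {
    return fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453);
}`;
```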
If your computer is struggling to display that page, it’s because the WebGL context is demanding more from your GPU than it can deliver. All this GPU power is being put to use for something other than 3D graphics! I think that’s far more interesting than a spinning 3D cube.
Spinning 3D Sphere
However, speaking of 3D cubes, this sort of thing was actually my very first WebGL project. To demonstrate the biased-random-point-on-a-sphere thing to a co-worker (outside of work), I wrote a 3D HTML5 canvas plotter. I didn’t know WebGL yet.
On a typical computer this can only handle about 4,000 points before the framerate drops. In my effort to finally learn WebGL, I ported the display to WebGL and GLSL. Remember that you have to bring your own 3D projection to OpenGL? Since I had already worked all of that out for the 2D canvas, this was just a straightforward port to GLSL. Except for the colored axes, this looks identical to the 2D canvas version.
- WebGL version (a red warning means it’s not working right!)
This version can literally handle millions of points without breaking a sweat. The difference is dramatic. Here’s 100,000 points in each (any more points and it’s just a black sphere).
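The reason point count stops mattering is that every point lives in GPU memory and each frame is a single draw call. A sketch, with made-up names and a random cloud standing in for the sphere data:

```javascript
// Upload a large point cloud once, then draw it with one call per
// frame. N, `position`, and the random data are all illustrative.
const N = 1000000;
const points = new Float32Array(N * 3);
for (let i = 0; i < points.length; i++)
    points[i] = Math.random() * 2 - 1;

gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
gl.bufferData(gl.ARRAY_BUFFER, points, gl.STATIC_DRAW);
gl.enableVertexAttribArray(position);
gl.vertexAttribPointer(position, 3, gl.FLOAT, false, 0, 0);

gl.drawArrays(gl.POINTS, 0, N);  // one call, a million points
```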
A Friendly API
WebGL still has three major advantages over other OpenGL bindings, all of which make it a real joy to use.
In the C/C++ world, where the OpenGL specification lives, any function that accepts an arbitrary-length buffer must also take a parameter for the buffer’s size. Because of this, these functions tend to have a lot of parameters: in addition to OpenGL’s existing clunkiness, there are all these length arguments to worry about. In WebGL, a typed array knows its own length, so those arguments simply disappear.
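Compare the two calls side by side; `vertices` is a hypothetical array used only for illustration:

```javascript
// C: glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
// WebGL: the typed array carries its own length, so the size
// parameter vanishes.
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(vertices), gl.STATIC_DRAW);
```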
Any time a shader, program, buffer, etc. is created, resources are claimed on the GPU. Long-running programs need to manage these properly, destroying them before losing the handle on them; otherwise it’s a GPU memory leak. In WebGL, these resources are wrapped in ordinary JavaScript objects, so losing the last reference just means the garbage collector reclaims them.
Unlike the buffer-length advantage, lwjgl doesn’t provide this one. You still need to manage GPU resources manually in Java, just as you would in C.
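In code, the WebGL version of cleanup is optional rather than mandatory. A sketch:

```javascript
// A WebGL buffer is an ordinary JavaScript object. Freeing it eagerly
// still works, like in C, but merely dropping the reference is safe:
// the garbage collector will reclaim the GPU resource eventually.
let buffer = gl.createBuffer();
// ... use the buffer ...
gl.deleteBuffer(buffer);  // explicit, C-style cleanup
buffer = null;            // or just let the GC handle it
```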
Perhaps most significantly of all, I can drive WebGL interactively with Skewer. If I expose shader initialization properly, I can even update the shaders while the display is running. Before WebGL, live OpenGL interaction was something that could only be achieved with the Common Lisp OpenGL bindings (as far as I know).
It’s really cool to be able to manipulate an OpenGL context from Emacs.
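The trick that makes this work is that a program can be recompiled and swapped in at any time, even mid-animation. A sketch, reusing the `compile` helper from the earlier example; the function name here is mine:

```javascript
// Hypothetical hot-swap helper: build a fresh program from new GLSL
// source while the requestAnimationFrame loop keeps running. The next
// frame drawn will use the new shader.
function swapFragmentShader(newSource) {
    const program = gl.createProgram();
    gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSource));
    gl.attachShader(program, compile(gl.FRAGMENT_SHADER, newSource));
    gl.linkProgram(program);
    gl.useProgram(program);
}
```

Evaluate a call like that from Emacs via Skewer and the running display picks up the new shader on its next frame.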
I’m expecting to do a lot more with WebGL in the future. I’m really keeping my eye out for an opportunity to combine it with distributed web computing, but using the GPU instead of the CPU. If I find a problem that fits this infrastructure well, this system may be the first of its kind: visit a web page and let it use your GPU to help solve some distributed computing problem!