Archive for 3D programming

Two months later…

I was working on something else (and took a holiday), so I didn’t have time to get back to the renderer until three weeks ago.
At first I didn’t consider these three weeks of work part of the SM3 Renderer, so I didn’t want to update this page.

But well, even if it’s not about a cool rendering technique, it’s still part of this project, and it’s something I’d like to share too.

Here we go, let’s catch up with my new in-viewport GUI.

Windowing System and redraw.
There were three criteria to pay attention to: displaying windows fast, making good use of alpha, and keeping the whole system as flexible as possible.

The GUI system is like the others: windows are organized hierarchically. There are notions of active window, focus window and "hover" window. You can capture mouse events (and stack the captures). There’s a global alpha constant for the GUI and one for each top-level window, which is used by all the low-level drawing methods (DrawRect, FillRect, DrawLineList, DrawMesh, DrawTexture, etc.) for fading effects. Redraw had to be optimal, so I’m also using clipping regions (via the D3D scissor rect).

Rendering the windows’ content was obviously a big concern, and not an easy task when you want things to be fast, flexible and transparent. The most important feature of this system is that you can decide for EACH window (even child ones) whether it should be cached in a texture. This way, if the window’s content doesn’t change, the cached texture is used instead of redrawing everything. A cached window also renders into its cache the content of its child windows, as long as a given child is not itself a cached window. You can imagine the benefits of having a hierarchical caching system.
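The caching rule described above can be sketched in a few lines of C++. This is only an illustration under assumptions: the `Wnd` struct, the `dirty` flag, `invalidate` and the `drawCalls` counter are hypothetical names, not the GUI's real classes, and the actual texture rendering is replaced by a counter.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical window node; all names here are illustrative only.
struct Wnd {
    bool cached = false;    // does this window own a cache texture?
    bool dirty = true;      // is the cache content out of date?
    int drawCalls = 0;      // how many times the content was really redrawn
    Wnd* parent = nullptr;
    std::vector<std::unique_ptr<Wnd>> children;

    Wnd* addChild(bool childCached) {
        children.push_back(std::make_unique<Wnd>());
        Wnd* c = children.back().get();
        c->cached = childCached;
        c->parent = this;
        return c;
    }
};

// Redraws w and its non-cached descendants into the current target
// (w's cache texture if it has one, the screen otherwise).
void drawUncachedSubtree(Wnd& w) {
    ++w.drawCalls;
    for (auto& c : w.children)
        if (!c->cached) drawUncachedSubtree(*c);
}

void present(Wnd& w);

// Cached descendants keep their own texture and are composited separately.
void presentCachedDescendants(Wnd& w) {
    for (auto& c : w.children) {
        if (c->cached) present(*c);
        else presentCachedDescendants(*c);
    }
}

void present(Wnd& w) {
    if (!w.cached || w.dirty) {
        drawUncachedSubtree(w);   // refill the cache, or draw directly
        w.dirty = false;
    }
    // A real implementation would blit w's cache texture here.
    presentCachedDescendants(w);
}

// A change in a window invalidates the nearest cache that contains it.
void invalidate(Wnd& w) {
    Wnd* p = &w;
    while (p && !p->cached) p = p->parent;
    if (p) p->dirty = true;
}
```

With a cached root holding a plain label and a cached panel, changing the label refills only the root's cache; the panel's texture is reused as is.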

Redraw requests for transparent windows are kept minimal by computing the transparent regions across the hierarchy.

Drawing text, fonts and stuff.
I had to extend the font system I had (which was fairly simple). More font styles are supported, and there’s now a font pool used to avoid redundant creation of identical font pages. I also record properties such as font ascenders, descenders, spacing, etc., and added methods to compute the size taken by a given letter, word, line or text (bound by a logical area).

Text drawing is now more complete: you can specify a bounding zone, an alignment, auto wrapping, and automatic display of an ellipsis if the line is truncated.
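As a rough illustration of the ellipsis behaviour, here is a minimal C++ sketch. It assumes a fixed-width font (every character is `charWidth` pixels wide), which is a simplification: the real font system measures each glyph. The function name `fitWithEllipsis` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Truncates `text` so it fits in `maxWidth` pixels, appending "..." when
// something had to be cut. Assumes every character is `charWidth` wide.
std::string fitWithEllipsis(const std::string& text, int maxWidth, int charWidth) {
    const std::string ellipsis = "...";
    std::size_t capacity = maxWidth > 0 ? std::size_t(maxWidth / charWidth) : 0;
    if (text.size() <= capacity) return text;            // fits as is
    if (capacity <= ellipsis.size())                      // no room for any text
        return ellipsis.substr(0, capacity);
    return text.substr(0, capacity - ellipsis.size()) + ellipsis;
}
```

For example, with 10-pixel characters and an 80-pixel zone, "Hello world" becomes "Hello...".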

HTML Document display.
OK, I have to admit it was not a necessary thing, but I thought it would be a good test for the GUI (and also a challenge for me). At first I only wanted to do a multi-line edit control, but then I wanted to display more complex text formatting (color, underline, bold, font change), so I looked at the Rich Text format. When I realized it was messier than HTML, I chose HTML (also because it’s way more popular now).
I won’t explain the structure in detail, but I’m kind of proud of it. It’s very efficient and flexible (and I’ll be able to upgrade it later to edit HTML content). Of course I don’t support all the HTML tokens (far from it), but the structure is open and it’s easy to add support for new ones.

Other controls
So I have now the following high level classes:

  • BaseWnd: the abstract class that all other windows are derived from.
  • Window: a top level window.
  • Control: abstract class for control typed windows.
  • Menu: to display a menu.
  • ObjIcon: displays an icon of a given object. The icon is a 3D render of a mesh lit with a global light; the mesh is chosen according to the type of the object being viewed.
  • ObjExplorer: a little object browser to walk through a given object database (a 3D scene, the SM3 rendering architecture or the IML Framework, for instance).
  • EditCtrl: single/multi line display, HTML display, encoding from raw or C style text, raw text editing (stored in HTML).  

    The ObjExplorer class is still not finished, I’m currently working on the generic Drag n’ Drop system (which will be heavily used).

    Screenshots:

     
    This is what it looks like when I start my Test3DE.exe now.


    The Object explorer with a nice tool-tip that displays the content of a DirectX Texture.


    The object explorer displays the content of a Resource Pack (the main one of the scene)


    The tool-tip displays the content of a DirectX texture which is… the IML Console’s one.


    Just to show what a menu looks like


    New rendering features!

    I added Gamma Correction, bump/normal mapping, and Depth of Field.

    I also fixed a few bugs.

    ScreenShots of Gamma Correction


    No correction


    Gamma corrected

    It’s brighter where it should be, and still dark where it should be too.

    The picture was taken from ATI’s sRGB sample.
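The post does not say which correction curve is applied. As a point of reference only, here is a C++ sketch of the standard sRGB encoding of a linear value, which is an assumption suggested by the mention of the sRGB sample. The function name `linearToSrgb` is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Encodes a linear colour channel in [0,1] with the standard sRGB curve.
// Assumption: the renderer may use a different gamma (e.g. a plain 2.2 power).
float linearToSrgb(float c) {
    c = std::min(std::max(c, 0.0f), 1.0f);
    return c <= 0.0031308f
        ? 12.92f * c                                  // linear toe near black
        : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f; // power segment
}
```

Mid-tones come out brighter (linear 0.5 encodes to roughly 0.735) while black stays black and white stays white, which matches the screenshots' description.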

    ScreenShots of Normal Mapping


    The left sphere is the high poly one (40K faces). The right is the low poly version (960 faces) with the normal map applied.
    The normal map was created with our 3D Studio Max Bump-o-matic plugin.


    Wire version of the first screenshot.


    Rendering of the normals.

    ScreenShots of Depth of Field


    The white AABBs symbolize the Plane in Focus. Check their intersection with the scene to get a better idea of their position.

    More about depth-of-field:

    I read many things about Depth of Field (the article in GPU Gems, for instance) and saw many formulas without really knowing how to implement them in practice.

    So I came up with an in-house one, really simple:
     Df = DP * abs(PosZ - PiF) / PosZ
      DP is the Depth of Field Power: 0 to disable it, 1 for a standard result, >1 to get something really blurry.
      PosZ is the position in camera space of the pixel to compute.
      PiF is the Plane in Focus position in camera space.
      Df is the result. I clamp it to [0,1] and use it to lerp between the accumulation buffer and the blurred one during the ToneMapping.
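A CPU-side C++ sketch of that factor, reading the formula as DP * abs(PosZ - PiF) / PosZ (the subtraction is an assumption about the intended expression). The function names are illustrative; in the renderer this would live in the tone-mapping pixel shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// dofPower = DP, posZ = camera-space depth of the pixel (must be > 0),
// planeInFocus = PiF. Returns Df clamped to [0,1].
float depthOfFieldFactor(float dofPower, float posZ, float planeInFocus) {
    float df = dofPower * std::fabs(posZ - planeInFocus) / posZ;
    return std::min(std::max(df, 0.0f), 1.0f);
}

// Blends the sharp (accumulation) value with the blurred one using Df.
float blendSharpAndBlurred(float sharp, float blurred, float df) {
    return sharp + (blurred - sharp) * df;
}
```

A pixel on the plane in focus gets Df = 0 (fully sharp); a pixel at twice the focus distance with DP = 1 gets Df = 0.5.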


    Parallax mapping, more ambient occlusion n’ stuff

    Parallax mapping is finished.
    The whole production pipeline is now ready for that technique. The 3D Studio MAX plugin now computes the correct scale/bias and can also display the result in a custom view.
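For readers unfamiliar with the scale/bias pair, here is a C++ sketch of the usual parallax texture-coordinate offset. It follows the common textbook formulation (offset limiting, no division by the view vector's z) and is not necessarily what the renderer's shader does; the types and the name `parallaxUV` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// height: value read from the height map in [0,1].
// viewTS: normalized view vector expressed in tangent space.
// scale/bias: the per-material constants the plugin is said to compute.
Vec2 parallaxUV(Vec2 uv, float height, float scale, float bias, Vec3 viewTS) {
    float h = height * scale + bias;   // remap the height to a signed offset
    return { uv.x + h * viewTS.x, uv.y + h * viewTS.y };
}
```

With scale = 0.04 and bias = -0.02, a mid-grey height (0.5) leaves the coordinates untouched, while higher and lower texels shift in opposite directions along the view vector.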

    Screenshots


    As you can see, the specular highlight is not ‘real’ for that kind of material (supposed to be rocks…)


    Wireframe mode!

    I added a new parameter to the Ambient Occlusion Map creation: the length of the rays used to perform the occlusion test. This way the occlusion map builder can now produce maps for indoor meshes.

    Screenshots

     
    Ambient occlusion off


    Ambient occlusion on


    Ambient occlusion off


    Ambient occlusion on


    Ambient occlusion map


    3DS Max UVW unwrap modifier


    The original mesh of the room wasn’t mapped, so I used the flatten mapping of the UVW Unwrap modifier of 3DS MAX to generate mapping coordinates, then used the Bump-o-matic plugin to generate the Ambient Occlusion Map.

    The result speaks for itself.

    Light volume rendering.
    Before, each light was lighting every pixel of the viewport, which was quite slow and wasteful. Now, for point and spot lights, only the light’s bounding volume is rendered to perform the lighting. As you can guess, this is much faster for lights covering a small area.
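The post does not say how the bounding volume of a light is sized. One common way, shown here purely as an assumption, is to take the distance at which the classic constant/linear/quadratic attenuation 1 / (c + l*d + q*d^2) falls below a small cutoff and use it as the radius of a point light's bounding sphere. The function name `lightRange` is made up.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Distance beyond which 1 / (c + l*d + q*d*d) drops below `cutoff`.
// c, l, q are the constant, linear and quadratic attenuation factors.
double lightRange(double c, double l, double q, double cutoff) {
    double k = c - 1.0 / cutoff;                 // solve q*d^2 + l*d + k = 0
    if (k >= 0.0) return 0.0;                    // below the cutoff already at d = 0
    if (q > 0.0) return (-l + std::sqrt(l * l - 4.0 * q * k)) / (2.0 * q);
    if (l > 0.0) return -k / l;                  // purely linear falloff
    return std::numeric_limits<double>::infinity();  // no falloff at all
}
```

Pixels outside that radius receive a negligible contribution, so rendering only the sphere skips them entirely.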

    Screenshots


    Without


    With


    Without


    With

    I added an IML Console right in the viewport.
    With more and more rendering parameters I’d like to tweak in real time, I’ve decided to take advantage of the whole IML architecture to interact with the renderer (and the 3D scene) at run-time.

    Screenshots

    More about Ambient Occlusion builder:

    For each pixel of the map being created, its position on the mesh is located, and a series of rays is cast to perform occlusion tests (intersections) against the other parts of the mesh itself. The problem with indoor environments is that an intersection is always found (because the mesh is closed), making it impossible to produce an accurate map. By letting the artist set a length for the rays that are cast, the occlusion test can be performed over a limited area, which produces the expected result.
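The effect of the ray length limit can be shown with a toy C++ sketch. Assumptions: the scene is made of spheres instead of a triangle mesh, the sample point's normal is +Z, and rays are spread on a simple unweighted grid over the hemisphere. None of this is the builder's real code; it only shows why a closed room needs the limit.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { double x, y, z; };
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

struct Sphere { Vec3 center; double radius; };

// Distance along the ray (origin o, unit direction d) to the nearest hit
// in front of the origin, or -1 when the ray misses the sphere.
double hitDistance(Vec3 o, Vec3 d, const Sphere& s) {
    Vec3 oc = sub(o, s.center);
    double b = dot(oc, d);
    double c = dot(oc, oc) - s.radius * s.radius;
    double disc = b * b - c;
    if (disc < 0.0) return -1.0;
    double root = std::sqrt(disc);
    double t = -b - root;              // entry point
    if (t <= 1e-9) t = -b + root;      // origin inside the sphere: exit point
    return t > 1e-9 ? t : -1.0;
}

// Fraction of hemisphere rays (around +Z) blocked within maxLen.
double occlusion(Vec3 p, const std::vector<Sphere>& scene, double maxLen) {
    const int rings = 8, segments = 16;
    const double pi = 3.14159265358979323846;
    int blocked = 0;
    for (int i = 0; i < rings; ++i) {
        double theta = (i + 0.5) / rings * (pi / 2.0);
        for (int j = 0; j < segments; ++j) {
            double phi = (j + 0.5) / segments * (2.0 * pi);
            Vec3 d = { std::sin(theta) * std::cos(phi),
                       std::sin(theta) * std::sin(phi),
                       std::cos(theta) };
            for (const Sphere& s : scene) {
                double t = hitDistance(p, d, s);
                if (t > 0.0 && t <= maxLen) { ++blocked; break; }
            }
        }
    }
    return double(blocked) / (rings * segments);
}
```

Placing the sample point inside a large sphere (the "room") with a small sphere nearby, unlimited rays always hit the room and report full occlusion, while rays limited to a length shorter than the distance to the walls only pick up the nearby object.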

    More about IML:

    IML stands for Irion Micro Language. It’s a run-time wrapper around the C++ components: for each Irion component you develop, you can create an IML Class that exposes the component to the IML Framework. Using IML via an IML Console, you can create new components and edit or delete existing ones. For instance, I developed an IML Class to wrap the SM3Viewport C++ class and exposed a set of properties (rendering modes, rendering attributes, stats display, etc.) that can later be modified via an IML Console or script.
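The general idea of exposing a component's properties by name can be sketched as follows. This is not IML's actual API, which the post does not show: `Viewport`, `ViewportClass`, the property names and the string-based `set` call are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Stand-in for a C++ component (hypothetical fields).
struct Viewport {
    bool wireframe = false;
    float dofPower = 1.0f;
};

// Hypothetical wrapper class: maps property names to setters so that a
// console or script can change the component at run-time.
class ViewportClass {
public:
    explicit ViewportClass(Viewport& v) {
        setters_["wireframe"] = [&v](const std::string& s) {
            v.wireframe = (s == "1" || s == "true");
        };
        setters_["dofPower"] = [&v](const std::string& s) {
            v.dofPower = std::stof(s);
        };
    }

    // Returns false when the property is not exposed.
    bool set(const std::string& name, const std::string& value) {
        auto it = setters_.find(name);
        if (it == setters_.end()) return false;
        it->second(value);
        return true;
    }

private:
    std::map<std::string, std::function<void(const std::string&)>> setters_;
};
```

A console command such as "set dofPower 2.5" would then boil down to one `set("dofPower", "2.5")` call on the wrapper.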


    More improvements…

    Added projector texture for Point and Spot lights.
    A cube map is used for the point light and a 2D texture for the spot; both are almost free in terms of rendering time.

    Parallax mapping is almost done.
    The technique itself is quite simple, but it takes many little things to make it “practical” and to be able to produce graphic content with it.

    Every effect/technique implemented so far is “practical”.
    That means you can produce 3D content with them for games or other kinds of real-time applications, it’s not just for demos/screenshots! :)

    Improved the compatibility of the renderer with the logical 3D engine.

    I’ve made some tests of Sub-surface scattering.
    (the light ray going through a given object and lighting it on the other side).

    And lastly, I did a bit of performance tuning/optimisation and rearranged the main fx file, which is starting to get big! :)

    Ok some random screenshots, not sphere/cube/coder art this time…


    If you look closely, the shadows are not accurate in some places. This was a minor bug that has been fixed, but I was too lazy to redo the screenshots. Maybe later!


    50K faces, 4 point lights


    50K faces, 4 spotlights


    400K faces, 1 direct light


    400K faces, 1 direct and 1 point light

    More about the renderer architecture:

    The 3D Engine is totally logical: it doesn’t depend on any given platform or hardware.
    There is an abstract renderer interface which can be used to develop new renderers (XBox, OpenGL, DX7, DX8, DX9SM3 were tested/implemented so far).
    If you wish to build your own renderer from scratch, no big deal: you don’t have to use this abstract interface if you don’t want to. The main reason is that the rendering pipeline is not processed in a straightforward way but in a somewhat reversed one: the 3D Engine doesn’t feed the renderer with 3D data (meshes, lights, etc.); the renderer fetches the data itself. Computation/update of the data is kept optimal: only what the renderer needs is computed.

    More about sub-surface scattering:

    The technique can easily be implemented in the renderer and the production pipeline (one global density factor, and a texture for per-texel info), but I’m afraid it’s not worth it. The main issue is that I have to read the light Z-Buffer, and I can’t do that for direct and spot lights when using nVidia’s UltraShadow. Concrete applications of such an effect are rare, I guess, which is why I’m putting it aside for now.


    Weird things and improvements

    OK, for some mysterious reason, using four MRTs can cause a big slowdown on the 6800.
    So I split the rendering of the MRTs into two passes: the first one renders the Z-Buffer and the Z-MRT, the second one renders the three other MRTs (albedo, normal, material settings).
    This way the second pass takes advantage of Z culling. Pixel shaders can sometimes be heavy when funky stuff is done to compute the albedo, so this should be faster in that case.
    On the performance side, it’s always faster, regardless of the vertex count of the meshes.

    I’ve finalized the soft shadows on point lights.
    I’m using only two samples; the vector used to address the cube map is slightly offset from the position of the pixel being rendered. I can’t say it’s perfect or nice, but well, it’s fast. Four samples instead of one make the whole lighting pass 50% slower!

    I also fixed a few bugs.

    Screenshots:

     


    A 256*256 Cube map is used. The render time of the shadow map is not bad, about 10% of the VBL.


    Shadow mapping improvements

    I implemented point light shadows and soft shadows on spot lights.
    Soft shadows on point lights are still in progress; the result is not great so far.

    I also implemented tone mapping to a back buffer of a different size than the deferred buffers, which was not hard to do.
    Doing all the deferred stuff at a 400*400 resolution for a final back buffer of 600*600 saves a lot of rendering time (it’s actually almost twice as fast), and the final result is not that bad. Maybe the final quality can be improved with a few pixel shader instructions during the tone mapping.
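The "almost twice as fast" figure is consistent with the pixel counts. A small C++ check, assuming the deferred passes cost roughly in proportion to the number of pixels shaded (helper names are made up):

```cpp
#include <cassert>

// Number of pixels in a buffer of the given size.
constexpr long pixelCount(int width, int height) {
    return long(width) * height;
}

// How many times fewer pixels the reduced buffer shades than the full one.
constexpr double shadingCostRatio(int fullW, int fullH, int reducedW, int reducedH) {
    return double(pixelCount(fullW, fullH)) / double(pixelCount(reducedW, reducedH));
}
```

A 600*600 buffer holds 360 000 pixels and a 400*400 one 160 000, so the deferred passes touch 2.25 times fewer pixels; the fixed costs (geometry, shadow maps, the final tone-mapping pass) explain why the overall gain stays a bit below that.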

    Screenshots time:
    600*600 render target, using a deferred/back buffer ratio of 1, with 5 spot lights each lighting the whole buffer.

    I know, rendering time is quite awful, but:

    · Spot light lighting is currently NOT optimized and is doing a loooot of stuff (const/linear/quad attenuation, penumbra, emissive, ambient, diffuse and specular computation, soft shadows, etc.).

    · Each spot light is lighting the whole screen: 600*600 = 360 000 pixels (times 5 lights = 1 800 000 lit pixels).

    · Geometry doesn’t matter here (nor does texturing): MRT render time is about 0.15 VBL, and putting in 500 times more faces would push it to 0.4, no big deal.

    Shadow map rendering is about 0.15 VBL for five renders into a 512*512 D24X8 texture, yummy!

    More about point light shadows:

    Point light shadows are rendered using an R32F cube map (256*256 pixels per face). The rendering time is bad compared to a spot light, which uses a 512*512 D24X8 shadow map (using nVidia’s UltraShadow). But later I’ll be able to compute shadow casters and receivers efficiently, which will certainly cut 1-3 face renders out of the cube map.

    As usual the code is not fully optimized, but the design/architecture is. The rendering time may look scary, but keep in mind things like the fact that each light is lighting the whole screen (I don’t compute the bounding volume for optimized lighting yet). After a few major optimizations it should be better.


    Perspective Shadow Mapping

    Perspective Shadow mapping is a real pain…
    I can’t get it to work correctly, so I’m putting the code on hold and I’ll get back to it later.


    Craaaaaaaaash!

    Lost three days to a partition crash!
    Almost lost the 150 gigs of data stored there, and it took more than a day to recover everything.


    The beginning of shadow mapping

    Implemented the spot light rendering.
    I now have the three basic types of light: directional, point and spot.

    Added a Gaussian filter after the creation of the Occlusion Map.
    The results speak for themselves: it is definitely a must-have!
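A Gaussian filter over a single-channel map is usually done as two 1D passes. The C++ sketch below shows that approach on a plain float image; the kernel radius, sigma, edge clamping and function names are illustrative choices, not the builder's actual settings.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian weights for offsets -radius..radius.
std::vector<float> gaussianKernel(int radius, float sigma) {
    std::vector<float> k(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        k[i + radius] = std::exp(-float(i * i) / (2.0f * sigma * sigma));
        sum += k[i + radius];
    }
    for (float& v : k) v /= sum;
    return k;
}

// One 1D pass along (dx, dy); samples outside the image are clamped to the edge.
static std::vector<float> blurPass(const std::vector<float>& src, int w, int h,
                                   const std::vector<float>& k, int dx, int dy) {
    int radius = int(k.size()) / 2;
    std::vector<float> dst(src.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int i = -radius; i <= radius; ++i) {
                int sx = std::min(std::max(x + i * dx, 0), w - 1);
                int sy = std::min(std::max(y + i * dy, 0), h - 1);
                acc += k[i + radius] * src[sy * w + sx];
            }
            dst[y * w + x] = acc;
        }
    return dst;
}

// Separable blur: horizontal pass followed by vertical pass.
std::vector<float> gaussianBlur(const std::vector<float>& img, int w, int h,
                                int radius, float sigma) {
    std::vector<float> k = gaussianKernel(radius, sigma);
    return blurPass(blurPass(img, w, h, k, 1, 0), w, h, k, 0, 1);
}
```

A flat map is left unchanged, while isolated noisy texels (typical when few rays are cast) get spread over their neighbours.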

    The nightmare has begun: shadows…
    I knew that would be one of the hardest parts of the rendering (if not the hardest), and it is…
    As usual, I started by reading many papers and slideshows. I also looked through the archives of the GD-Algorithms mailing list, and brought the topic back up because people there had apparently been silent about it for a year.

    Between the two families (shadow volumes and shadow maps), as I rely on pixel power, my choice naturally leans toward shadow maps. In the land of shadow maps, people have many different opinions about what is best to use, and as time passes, it doesn’t seem to converge on one particular technique.
    Single buffer, multiple buffers, post perspective or not, trapezoidal, done in light space, oh my god…

    Before starting to ask for people’s opinions, I had faith in Perspective Shadow Mapping (aka PSM), after reading Simon Kozlov’s revision of it in GPU Gems. Many people still say it’s not a viable technique because of its numerous special cases, which are really hard to solve.

    The only thing everybody agrees on is to make every effort to limit the viewing frustum of the light, so I guess I’m starting from this point and will head toward PSM later.
    For now, I’ve implemented the basics of shadow mapping. Precision and point lights are coming next…

    Trying to gain more precision, I first tried to make the depth comparison using an R32F buffer instead of the Depth Buffer. So I had to render my scene into that buffer, which is 60% slower than rendering into the Depth Buffer only (from my tests).
    The whole thing isn’t worth it: it’s slower, and as it’s not using the Z-Bias, the result is pretty bad (and I really don’t want to write a pixel shader to emulate it). So I’m sticking with the fast-rendering D24X8 surface.
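To show why the bias matters, here is a CPU-side C++ sketch of the basic shadow map test. It is only an illustration: in the renderer the comparison and the Z-Bias are handled by the hardware on the D24X8 surface, and the `ShadowMap` struct and `isLit` function are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Square depth map rendered from the light; depths are in [0,1].
struct ShadowMap {
    int size;
    std::vector<float> depth;   // size*size values, row-major

    // Nearest-texel lookup with (u, v) in [0,1], clamped to the map.
    float sample(float u, float v) const {
        int x = std::min(std::max(int(u * size), 0), size - 1);
        int y = std::min(std::max(int(v * size), 0), size - 1);
        return depth[y * size + x];
    }
};

// A point is lit when it is not farther from the light than the closest
// occluder stored in the map. The bias absorbs precision errors that would
// otherwise make a surface shadow itself.
bool isLit(const ShadowMap& map, float u, float v, float depthFromLight, float bias) {
    return depthFromLight - bias <= map.sample(u, v);
}
```

Without a bias, a surface whose depth is stored as 0.5 but recomputed as 0.5005 would wrongly shadow itself; a bias of 0.001 fixes that case.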

    I would like to thank Mark Harris from nVidia for his advice and support.

    Screenshots:


    Basic shadow mapping


    Ambient occlusion: done!

    I’ve programmed the routine to compute a texture storing the ambient occlusion of a given mesh. The result is as expected: great! (see the screenshots).

    The computing process can take a while, as I had to cast a lot of rays per texel to get an accurate result. 512 is a good number. So for a 256*256 texture, at least 33 million rays are cast. I say at least because if a given texel is shared by two faces of the mesh (across an edge), twice as many are cast.

    Thanks to the OPCODE library, it doesn’t take forever…
    I’ll certainly add an option to filter the produced picture (useful when the ray count is low).
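The 33 million figure is simply texels times rays per texel, as this small C++ check shows (the helper name is made up):

```cpp
#include <cassert>
#include <cstdint>

// Lower bound on the number of rays: one batch per texel of the map.
// Texels shared by two faces across an edge add more on top of this.
constexpr std::int64_t minimumRayCount(int mapWidth, int mapHeight, int raysPerTexel) {
    return std::int64_t(mapWidth) * mapHeight * raysPerTexel;
}
```

For a 256*256 map at 512 rays per texel this gives 33 554 432 rays; the 1024-ray setting used for the screenshots doubles it to 67 108 864.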

    Screenshots:

    The mesh (courtesy of Bruno Dosso) is 12592 triangles and 6424 vertices.

    Computed on an Athlon XP 2800+. You can see the render time in the texture’s window caption.


    Basic rendering, no diffuse texture, no ambient occlusion texture.


    Diffuse Texture, without Ambient Occlusion



    256*256 occlusion map, 16bits, 1024 rays per texel.


    Same as left, with a diffuse texture.


    The computed Ambient Occlusion Texture, done in 2min27sec.

