Monday, 6 August 2012

Playing with JavaScript - Three.js & Physijs

Web technologies are hot. Everyone is talking about HTML5 even without a clear definition of what HTML5 is, especially after the latest announcements from the WHATWG and the W3C and their drive in different directions.
So lately I have also been playing around with a couple of JavaScript libraries, especially ones touching the areas of graphics and physics.

Getting started is dead easy, especially if you pick Three.js as your 3D library and Physijs as your physics lib. Three.js is a 3D library for dummies: it abstracts things like the camera, scene and materials, and it also offers features like shadow mapping through a simple definition. It also lets you choose which technology is used to render your scene. But most importantly it ships with some really nice examples - its examples folder is a real treasure trove.
Physijs is a plugin for this library that provides some basic physics - it is built on top of ammo.js, a direct port of the Bullet physics engine.


As a starter, I used these two libraries to build a simple application where cubes drop onto a heightmap and bounce off it. The problem was the physics, which didn't react correctly to my heightmap. Nicely enough, Physijs itself just fixed this problem by introducing HeightfieldMesh, which lets the physics follow my simple heightmap correctly (vs. the PlaneMesh I was originally using).
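A minimal sketch of this kind of setup - assuming the usual Three.js boilerplate and the Physijs.HeightfieldMesh constructor provided by the plugin. All script paths, sizes and material parameters here are illustrative and not taken from the actual demo, and the exact HeightfieldMesh arguments may vary between Physijs versions:

// Minimal sketch: cubes dropping onto a heightmap with Physijs on top of Three.js.
// Script paths, sizes and material parameters are illustrative only.
Physijs.scripts.worker = 'physijs_worker.js';
Physijs.scripts.ammo = 'ammo.js';

var scene = new Physijs.Scene();
scene.setGravity(new THREE.Vector3(0, -30, 0));

var camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 1, 1000);
camera.position.set(60, 50, 60);
camera.lookAt(scene.position);

var renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

var light = new THREE.DirectionalLight(0xffffff);
light.position.set(20, 40, 30);
scene.add(light);

// Build a simple heightmap by displacing the vertices of a plane.
var groundGeometry = new THREE.PlaneGeometry(100, 100, 32, 32);
for (var i = 0; i < groundGeometry.vertices.length; i++) {
  groundGeometry.vertices[i].z = Math.random() * 4;   // toy height values
}
var groundMaterial = Physijs.createMaterial(
  new THREE.MeshLambertMaterial({ color: 0x888888 }),
  0.8,   // friction
  0.4    // restitution ("bounciness")
);

// HeightfieldMesh (mass 0 = static) lets the physics follow the displaced
// vertices instead of treating the ground as a flat plane (cf. PlaneMesh).
var ground = new Physijs.HeightfieldMesh(groundGeometry, groundMaterial, 0);
ground.rotation.x = -Math.PI / 2;
scene.add(ground);

// Drop some dynamic cubes onto the heightmap.
for (var j = 0; j < 10; j++) {
  var box = new Physijs.BoxMesh(
    new THREE.CubeGeometry(4, 4, 4),
    Physijs.createMaterial(new THREE.MeshLambertMaterial({ color: 0x2266aa }), 0.6, 0.6)
  );
  box.position.set(Math.random() * 40 - 20, 40 + j * 5, Math.random() * 40 - 20);
  scene.add(box);
}

function render() {
  scene.simulate();                // step the physics simulation in the worker
  renderer.render(scene, camera);
  requestAnimationFrame(render);
}
render();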

Anyway, nothing special - just testing what you can do with these libraries. For now I'm attaching only an image of the demo app, as the performance of my example is _really_ bad (while calculating the physics for the cubes hitting the heightmap it runs at only ~2 fps). Live demo, including sources, here.




Friday, 30 December 2011

Mesa - from GLSL to drivers

Mesa drivers just reached an important milestone on the way to supporting OpenGL 3.0. With the latest patch the version is officially bumped to GLSL 1.30 for the Gen4+ drivers. What is interesting is the path towards this milestone and everything happening around the Mesa shader compiler on the way from your shading language to machine code.

At a high level, the Mesa shader compiler takes your GLSL shader as input, tokenizes the string and parses it into an Abstract Syntax Tree (AST). From the AST the compiler builds its first IR, a high-level intermediate representation (HIR, the GLSL IR), a form optimized for the compiler's own use. Through optimization and linking this GLSL IR is then turned into Mesa IR, an IR designed to resemble GPU assembly. This is the classic compiler design with a front end and a back end, and it is what Mesa has followed from the beginning.
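As a toy illustration of that front end / back end split (not Mesa code - Mesa itself is C/C++ - just the same stages applied to a tiny made-up expression language):

// Toy illustration of the stages above: source string -> tokens -> AST ->
// flat, assembly-like instructions. Not Mesa code, just the same shape.

function tokenize(src) {
  // Numbers, identifiers and single-character operators.
  return src.match(/\d+\.?\d*|[A-Za-z_]\w*|[()+*\-\/]/g);
}

function parse(tokens) {
  // Recursive-descent parser building the AST ("expr" handles +, "term" handles *).
  var pos = 0;
  function factor() {
    var t = tokens[pos++];
    if (t === '(') { var e = expr(); pos++; return e; }   // pos++ consumes ')'
    return { kind: /\d/.test(t[0]) ? 'const' : 'var', value: t };
  }
  function term() {
    var node = factor();
    while (tokens[pos] === '*') { pos++; node = { kind: 'mul', left: node, right: factor() }; }
    return node;
  }
  function expr() {
    var node = term();
    while (tokens[pos] === '+') { pos++; node = { kind: 'add', left: node, right: term() }; }
    return node;
  }
  return expr();
}

// "Back end": walk the AST and emit flat three-address instructions - the moral
// equivalent of lowering the high-level IR towards Mesa IR / GPU assembly.
function lower(ast, code) {
  if (ast.kind === 'const' || ast.kind === 'var') return ast.value;
  var a = lower(ast.left, code);
  var b = lower(ast.right, code);
  var dst = 't' + code.length;
  code.push((ast.kind === 'add' ? 'ADD ' : 'MUL ') + dst + ', ' + a + ', ' + b);
  return dst;
}

var code = [];
lower(parse(tokenize('color * (intensity + 0.5)')), code);
console.log(code);   // [ 'ADD t0, intensity, 0.5', 'MUL t1, color, t0' ]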


When Tungsten Graphics started to sketch out a new driver model for Mesa - Mesa over Gallium - they also wanted to question this Mesa IR model. With Mesa over Gallium the people from Tungsten wanted to split the DRI driver and isolate the OS-dependent and OS-independent parts. Along with the new driver model, Gallium also introduced its own shader syntax called Tokenized Gallium Shader Instructions (TGSI). Interestingly, the drive behind TGSI came from DirectX: one of the goals was to keep it as close as possible to what proprietary Direct3D assembly looked like, modelled on AMD's Intermediate Language spec, which essentially is Direct3D assembly. The hope was that if a given GPU was well optimized for Windows, it would also perform well with TGSI.

In the TGSI model the Mesa vertex/fragment shaders are translated into a TGSI representation and passed to the Gallium Mesa state tracker, from where the driver code eventually converts them into GPU-specific instructions. Still, the TGSI model was far from perfect, especially as the TGSI IR was produced by first parsing the application's shader programs, generating Mesa IR from that, and only then converting the result to TGSI.


Towards GLSL 1.30 and a new model for IRs

Optimization of the two models continued; Intel, for example, did a lot of IR optimization by rewriting the ARB program parser. Still, the model of layering multiple IRs and the next GLSL specification, 1.30, required somewhat bigger changes to the current Mesa drivers.

On the Gallium/TGSI path there was an extra step: TGSI was generated from Mesa IR instead of directly from GLSL IR. What started as work to optimize and make Mesa GLSL 1.30 compliant thus became an effort to skip Mesa IR completely and generate TGSI directly from GLSL IR. This work also brought the first GLSL 1.30 support to the Mesa/Gallium drivers, as the translator added native integer support for Gallium drivers.
On the Gallium side there was also another effort to refactor the IR model by placing Mesa on top of LLVM, with the ultimate goal of having GLSL generate LLVM IR directly. The first ideas of this work were introduced a while ago and have been adopted by the people at LunarG, whose goal is to separate the HW-independent and HW-dependent parts of the IR, layering their shader compiler, called LunarGlass, similarly to the Gallium architecture.

Later, on the classic Mesa driver side, a new GLSL compiler was introduced and developed from scratch. The new GLSL compiler, version 2 (or at least take 2), aims to generate GLSL IR directly from assembly shaders and from the fixed-function pipeline and, most importantly, to use GLSL IR directly in the drivers rather than stepping through Mesa IR - the original Mesa intermediate representation. This is mainly because Mesa IR was designed and written in the days when ARB assembly - the OpenGL assembly language - was still the way to control the graphics pipeline, and a 1-to-1 mapping all the way down to HW instructions was seen as a benefit. That is no longer the case with modern GPUs and a higher-level shading language.
Along with the IR changes, this work also brought GLSL 1.30 support to the classic Mesa drivers, for example by no longer assuming that the only HW type is vec4.


Where we are now is that the pixel shader units themselves are still faster than what we need. Still, longer and more complicated shader programs will become more common even beyond games, and the usage of shaders is becoming more diverse. This means the extra performance will definitely be welcome in our Linux drivers as well.

Saturday, 28 May 2011

Mastering Memory

The Linux kernel has two existing memory management solutions - one based on GEM and the other on TTM. Both are used through DRM and designed to serve different DRM operations. Besides these two, the SoC world in particular is full of different out-of-tree memory allocators. During the past couple of months the Linaro people have taken the initiative to fix this problem - but how? Well, the idea is to introduce a new memory manager fitting the ARM needs.

The DRM-based and SoC solutions approach the problem from rather different angles. Where a DRM-based memory manager is designed to give applications synchronous access to the HW, the SoC problem setting in these discussions is turned somewhat upside down: in the SoC case the main point of the memory manager is how the different HW IPs/device drivers interconnect and share data. Allowing applications to share data is more of a second priority.

A good question here is why not stick with the existing Linux kernel memory managers and make the existing upstream project work - make DRM support sharing of data between device drivers and slim it down by making some of its more legacy parts optional. And even if DRM is not taken as the base of the solution, why has every SoC vendor ended up developing their own solution - UMP, PMEM, KGSL and now ION from Google? One of the challenges with this setup is the API. Graphics chipsets evolve so fast that freezing an API is not really an option. So, especially for SoC vendors, controlling the API is highly beneficial and definitely the desired way of doing things. Being able to control not only the drivers but also the API allows SoC vendors to ensure that their solution performs best and doesn't break anything around it.

So it will definitely be interesting to see whether Linaro is able to pull off a common solution that everyone agrees to use. It will also be interesting to see whether Google, with its weight, is able to drive this towards a common solution based on ION. The question is also whether the SoC vendors ultimately even want a common memory management solution.

Saturday, 19 March 2011

Competition or cooperation: Gnome-Shell and Unity

The new Gnome-Shell has started to land and take its final development steps. This landing has also raised some intense discussions around Gnome - what Gnome is, who the competitors are and what should get into Gnome.

Among the major distros, the Fedora camp has indicated it will use Gnome-Shell/Gnome3 in its upcoming Fedora 15 release, whereas the Canonical/Ubuntu guys will be using their own homemade Unity desktop together with Gnome3 applications. This split was initially set off back in October 2010 at the Ubuntu Developer Summit, where Canonical stated that all future releases would be based on Unity.

The currently ongoing discussions have been especially intense around libappindicator, which has become _the_ item to discuss in various blogs [1]. In short, the history of indicators has evolved from KNotificationItem to Canonical's StatusNotifier proposal and now to the rejection of libappindicator from the Gnome 3.0 module set - "there is nothing in GNOME needing it". To Canonical this statement quite clearly says that Unity is not part of Gnome. Naturally this has raised some discussion.


This gets really interesting when we discuss the future of Gnome. What we have here is the freedesktop.org project aiming for "interoperability and shared technology for X Window System desktops" and the two most popular Linux distros differentiating themselves. What we are also discussing is free software. Surely both efforts are exactly about that, but Canonical's copyright-assignment contribution agreements are something that Gnome does not want to depend on (in a scheme where libappindicator would become part of GTK+).

The two projects do have quite different directions, making them compete against each other rather than cooperate. I personally agree that healthy competition is always welcome. The danger is that too much competition can also ruin the chances of competing with the outside world - with the closed-source solutions, which are moving fast. Seeing some new innovation emerge from these Linux-based desktops is something we would all want, and that would require some big-time cooperation instead of competition.


[1]
* collaboration's demise
* Application Indicators: A Case of Canonical Upstream Involvement and its Problems
* Has GNOME rejected Canonical help?

Sunday, 6 February 2011

Way to Wayland - take 1

There is an interesting discussion going on about Wayland's current memory allocation model and how it could be scoped better to suit different graphics architectures.

The rendering model in Wayland is simple and therefore powerful. The key part is the ability to share a video memory buffer between the client and the compositor. Currently this is all done through DRM, which also makes DRM an essential part of the whole Wayland concept.

So what has now been under discussion in the Wayland forum is how the current model could:
a) provide interoperability between client and server, so that different Wayland clients could more easily talk to different Wayland server implementations;
b) isolate the important memory-handling part from the core protocol, allowing Wayland to be used more easily with other memory architectures.

The proposal is to split the DRM-specific part into a separate file, which would solve both of these problems and once again move Wayland a step closer to its place as the future Linux graphics solution.

Thursday, 6 January 2011

Some interesting statistics for the beginning of the new year

The year 2010 has ended, so it's nice to check how a couple of interesting OSS graphics components evolved during 2010 - the interesting ones being Mesa, XServer and the one everyone is talking about, Wayland. Alongside these components there has of course been a lot of graphics-related activity on the kernel side, with logic moving into the kernel (DRI drivers, the kernel-side GEM scheduler and kernel mode setting), plus plenty of activity around the toolkits, but as said, let's stick with these three.
So what has happened:

With Mesa we have:
157 people committing
and those people pushing 12816 commits (of different sizes)
(whereas in 2009 the figures were 118 people pushing 9719 commits)

With XServer we have:
99 people committing
and those people pushing 1305 commits (of different sizes)
(whereas in 2009 the figures were 109 people pushing 2057 commits)

With Wayland the figures are especially interesting, as it has received a _lot_ of publicity lately (Unity, Fedora). During 2010 there were already close to 20 contributors to the project, whereas a year earlier there were only two.


So a lot is happening at the head of mainline. If we look at these commits on a daily basis, Mesa averaged about 35 commits every single day (12816 commits / 365 days ≈ 35) and XServer about 3.5 commits per day - and that's not even mentioning the kernel, which sees over 100 commits per day.

Starting to deviate from this with a custom build behind closed doors means a big maintenance and development effort for anyone deciding to do so. On the other hand, this is sometimes unavoidable: for a commercial company that still aims to make a profit, you are bound to keep something behind closed doors.
So balancing between open-sourcing code and potentially exposing bits and pieces of the latest and greatest can get tricky and really needs to be thought through carefully.


PS. In MeeGo the problem is tackled by doing the XServer development against the git head - see.

Monday, 13 December 2010

SceneGraph in toolkits

SceneGraph architectures can be found in quite a few of the newly introduced architectures: Clutter with its latest rework of the SceneGraph to make it as retained as possible, Qt with its new SceneGraph architecture, and also Mozilla's Layers architecture takes its idea from the scene graph. Earlier this was the way games rendered themselves, but now this model is starting to define how toolkits render our 2D content.
The first idea with a scene graph is to start taking real advantage of GPU rendering. Instead of the applications doing the rendering themselves, in a SceneGraph the compositing is done in retained mode, constructing a tree structure of the scene to be visualized. The SceneGraph paradigm and problem setting is still quite different for toolkits/widgets compared to games. When we are talking about 2D UIs we do not have complex geometry, fancy viewport transformations or dynamic lighting; we are mainly talking about getting our textures displayed on the screen and blending them. This difference also puts requirements on the actual traversal algorithm and renderer, to ensure we can get everything out of the GPU from a 2D toolkit's point of view. The problem statement is also aimed more at optimizing performance than at managing the complexity of the view.

There is one particular thing a GPU cannot do efficiently - changing state. What GPUs are designed to do is load in a shader program and do their stuff. Simple. With the imperative way of rendering, where the view is constructed by the application hierarchically rendering its scene, you are bound to make a lot of state changes, as every single application is in charge of rendering itself - widget by widget and line by line, constantly uploading content to the GPU. So what we want to do is feed the GPU as much data as possible at once and let it do its job, by grouping our geometry, shader and texture data together.
The most important data (or node) from a toolkit's point of view is the appearance or geometry node defining the graphics primitives (vertex data and texture info - the data required to render your widget). This is data we want to keep around not just per frame but for the lifetime of the application, to minimize state changes.
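A rough sketch of that idea - the node layout, the material key and the gpu.bindMaterial/gpu.draw calls are all made-up placeholders, not the API of any of the toolkits mentioned above:

// Illustrative retained-mode node: geometry and material live for the
// lifetime of the widget, not just for a single frame.
function Node(geometry, material) {
  this.geometry = geometry;   // vertex/texture data needed to draw the widget
  this.material = material;   // shader program + texture/uniform state
  this.children = [];
}
Node.prototype.add = function (child) { this.children.push(child); return child; };

// Flatten the tree and group draw calls by material key so that each
// shader/texture state is bound once per frame instead of once per widget.
function collect(node, buckets) {
  if (node.geometry) {
    var key = node.material.id;   // any stable key identifying the GPU state
    (buckets[key] = buckets[key] || { material: node.material, geometries: [] })
      .geometries.push(node.geometry);
  }
  node.children.forEach(function (child) { collect(child, buckets); });
  return buckets;
}

function renderFrame(root, gpu) {
  var buckets = collect(root, {});
  Object.keys(buckets).forEach(function (key) {
    var bucket = buckets[key];
    gpu.bindMaterial(bucket.material);     // one state change per group...
    bucket.geometries.forEach(function (geometry) {
      gpu.draw(geometry);                  // ...then many draws with the same state
    });
  });
}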

The second important item is the drawing itself. Although GPUs are designed for drawing, unnecessary drawing is naturally something we want to avoid. An easy way to improve here is to draw less, or to draw only what is visible on the screen - which is exactly what these SceneGraph architectures aim for. With its tree structure the SceneGraph can use the z-buffer smartly to avoid overdraw. By this I don't mean the clipping and depth buffering the GPU does at draw time, but rather limiting the objects actually sent to the GPU, by choosing the rendering order smartly (front to back when possible) or, as in Mozilla's Layers case, placing the transparent areas into their own layer. The challenge is of course that modern UIs involve a lot of transparency, which forces us back to back-to-front rendering.
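A sketch of that render-order selection - again with a made-up item list and gpu interface, just to show the opaque front-to-back versus transparent back-to-front split:

// Illustrative render-order selection: the scene graph knows each item's depth
// and transparency, so it can decide what to send to the GPU and in what order.
function drawScene(items, gpu) {
  var opaque = items.filter(function (item) { return !item.transparent; });
  var transparent = items.filter(function (item) { return item.transparent; });

  // Opaque content front to back, so the depth test rejects hidden fragments
  // before any expensive shading is done.
  opaque.sort(function (a, b) { return a.depth - b.depth; });
  opaque.forEach(function (item) { gpu.draw(item); });

  // Transparent content back to front, since blending needs what is behind to
  // be drawn first - this is where heavily translucent UIs lose the benefit.
  transparent.sort(function (a, b) { return b.depth - a.depth; });
  transparent.forEach(function (item) { gpu.draw(item); });
}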

There are of course also challenges. SceneGraphs balance between memory consumption (caching too much) and performance. Another challenge is the varying nature of different GPUs and drivers, which sometimes behave a little differently and are good at different things.
For toolkit SceneGraphs the challenge is also the target: are they aiming only for optimized 2D rendering, or would they also like to serve games?