Monday, November 17, 2014

Opening DICOM files in the browser

Traditionally, there have been two main trends in the development of end-user applications. The first, dominant for many years, is the development of native solutions tied to a specific platform. The second came onto the scene with the adoption of the web and the possibilities it opened up for web-based applications.

When HTML5 was introduced, the possible scenarios for web based applications broadened as never before. Web applications started to appear even in the most unexpected domains.

The medical imaging field also benefited from this new set of tools and possibilities. Reading and rendering DICOM files was already possible in pre-HTML5 environments, but the new framework made things much simpler and easier to implement, with no need for obscure hacks.

In my particular case, nearly two years ago I decided to switch from a Flash (ActionScript) based solution for displaying medical images to a different set of tools, using just plain HTML and JavaScript.

I started to work on my own image viewer, while following and getting some ideas from other products, like dwv.

Since this application was also meant to run on mobile devices, I did my best to find an efficient solution. Among the requirements of the new product were minimizing memory use and maximizing performance.

One of the decisions to meet the performance goal was to rely on typed arrays, in particular DataView, since we had to read binary file headers containing heterogeneous data.
"Typed Arrays are a relatively recent addition to browsers, born out of the need to have an efficient way to handle binary data in WebGL..."
"DataView is the second type of view and it is meant for handling heterogeneous data. Instead of having an array-like API, the DataView object provides you a get/set API to read and write arbitrary data types at arbitrary byte offsets. DataView works great for reading and writing file headers and other such struct-like data."
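To make the get/set API concrete, here is a minimal sketch of reading a small struct-like header with DataView. The 12-byte layout and the function names (buildSampleHeader, readSampleHeader) are hypothetical and chosen only for illustration; this is not the real DICOM header layout nor the code from my viewer.

```javascript
// Hypothetical 12-byte little-endian layout, for illustration only:
//   bytes 0-1   uint16   group
//   bytes 2-3   uint16   element
//   bytes 4-7   uint32   length
//   bytes 8-11  float32  value

// Build a sample buffer so the example is self-contained.
function buildSampleHeader() {
  const buffer = new ArrayBuffer(12);
  const view = new DataView(buffer);
  view.setUint16(0, 0x0028, true); // true = little-endian
  view.setUint16(2, 0x0010, true);
  view.setUint32(4, 512, true);
  view.setFloat32(8, 1.5, true);
  return buffer;
}

// Read the heterogeneous fields back at their byte offsets.
function readSampleHeader(buffer) {
  const view = new DataView(buffer);
  return {
    group: view.getUint16(0, true),
    element: view.getUint16(2, true),
    length: view.getUint32(4, true),
    value: view.getFloat32(8, true),
  };
}

const header = readSampleHeader(buildSampleHeader());
```

Each getter takes a byte offset and an optional little-endian flag, which is what makes DataView convenient for mixed-type headers.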
Everything seemed perfect. The core language libraries offered tools that fit perfectly with the tasks we wanted to do. Using the core tools also matches some well-known programming principles:

- Do not reinvent the wheel
- Focus your efforts on the logic specific to your problem
- This is JavaScript: native code has already been compiled, while the code you write yourself has to be interpreted.
- Core libraries in general include highly optimized code.

Sadly, these premises turned out to be completely wrong in this case. All my efforts to get fast code were unsuccessful. The implementation of the DataView class has some kind of strange problem that leads to horrible performance.

This problem was noticed and reported a long time ago, but it still persists today.

I have run some performance tests on a range of different machines. The code using DataView turned out to be extremely slow compared with hand-written functions for reading raw data: between 20x and 50x slower, depending on the test platform. Relying on DataView in your code makes no sense given the current implementation problems. I do not see the point of this class given such bad results.
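For reference, "writing your own functions" can look like the sketch below: plain byte arithmetic over a Uint8Array, plus a rough timing helper to compare both approaches. All the names here (readUint16LE, readUint32LE, sumWithDataView, sumWithManualReads, timeMs) are hypothetical and are not the exact code I used in my tests. The numbers you get depend heavily on the browser and its version, so it is worth measuring on your own targets.

```javascript
// Hand-written little-endian readers over a Uint8Array.
function readUint16LE(bytes, offset) {
  return bytes[offset] | (bytes[offset + 1] << 8);
}

function readUint32LE(bytes, offset) {
  // ">>> 0" converts the signed 32-bit result back to an unsigned value.
  return (
    (bytes[offset] |
      (bytes[offset + 1] << 8) |
      (bytes[offset + 2] << 16) |
      (bytes[offset + 3] << 24)) >>> 0
  );
}

// Sum every 16-bit word in the buffer using DataView.
function sumWithDataView(buffer) {
  const view = new DataView(buffer);
  let sum = 0;
  for (let i = 0; i + 1 < buffer.byteLength; i += 2) {
    sum += view.getUint16(i, true);
  }
  return sum;
}

// Same work, using the hand-written reader.
function sumWithManualReads(buffer) {
  const bytes = new Uint8Array(buffer);
  let sum = 0;
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    sum += readUint16LE(bytes, i);
  }
  return sum;
}

// Very rough timing helper: runs fn(buffer) `repeats` times.
function timeMs(fn, buffer, repeats) {
  const start = Date.now();
  let result = 0;
  for (let r = 0; r < repeats; r++) {
    result = fn(buffer);
  }
  return { ms: Date.now() - start, result: result };
}

// Usage (sizes are arbitrary):
// const buffer = new Uint8Array(1 << 20).map((_, i) => i & 0xff).buffer;
// console.log("DataView:", timeMs(sumWithDataView, buffer, 50).ms, "ms");
// console.log("manual:  ", timeMs(sumWithManualReads, buffer, 50).ms, "ms");
```

Both sum functions do the same work, so comparing their return values is a quick sanity check before trusting the timings.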

DataView performs between 20x and 50x slower than raw reading on a modern computer
So, the final thought: if you want good performance from your JavaScript code, avoid using the DataView object. Otherwise you will end up losing a lot of your precious time writing useless code.