Wednesday, 26 August 2009

The sorry state of hardware video playback acceleration

I've been playing around with options for an HTPC for a while now. See my initial HTPC article for an introduction. BTW: it's almost finished, with another post to follow in a couple of months (I need to do some programming first).

Anyway, while buying hardware, I was looking for a discrete graphics card that would provide HW accelerated video playback, since my old Athlon X2 3800 isn't quite up to the job of playing full HD H264 content.

After looking around for a while, I decided to go for an ATI 4350 (30€), since Nvidia's entry level offerings were a tad too expensive for me (>50€). It was more a matter of principle than price. Since both manufacturers have advertised HW accelerated playback for years now, I was confident any offering would do the trick. Also, an initial search for "HW accelerated H264" quickly turned up some promising pages, "discouraging" me from digging deeper.

I could not have been more wrong...

As it turns out, HW accelerated video playback isn't nearly capable of providing an average user with what they want. Be it ATI or Nvidia, both have their issues and problems, not to mention the very concepts that HW based acceleration is currently built around. I should mention that Nvidia is currently a bit in the lead thanks to their successful CUDA advertising and support (CoreAVC released a CUDA accelerated codec). ATI, on the other hand, is betting on its OpenCL horse, but currently doesn't even offer a driver.

I compiled a short list of the problems I ran into while trying to enable any kind of HW acceleration. Note that I first tried just about every codec that was mentioned on various net forums. These problems are the best case I found after trying them all and failing with each and every one of them.

1. DXVA
The famed Microsoft framework for HW based video playback seems to be a one way street. There's no return information. Either the HW can play a codec or it cannot. If it can, playback will be accelerated; otherwise it will not, and some other codec will have to take charge.
What's worse is that the card won't just decode the video stream - the whole system works only if the decoded content is displayed immediately on the screen.

2. HW accelerated functions
Actually a very similar problem to the one listed above. This one was the most disappointing to me because ATI Avivo is supposed to provide some nice and quite powerful deinterlacing algorithms. Since this functionality only works in "all or nothing" mode, you can get deinterlacing only if Avivo can also decode the stream itself. Otherwise you're out of luck. The same goes for any other filter Avivo or PureVideo may or may not provide.

3. Supported video codecs
This one is just great! After playing around for a while, you find out that the number of codecs supported by either of the two chipmakers is frighteningly low. To make matters worse: if a chipmaker says its chip supports H264, that doesn't mean it will support everything encoded in H264. Lots of important codec features are not supported, and the stream also has to be encoded just the right way for even the supported features to work. As a result, my own camera clips, carefully transcoded into H264 (x264), of course can't be played back. I sure as hell am not transcoding them again. Anyway, even when I managed to find a video that the card was willing to play back, it was a DVD - which my Athlon can already deal with quite well. I don't need HW acceleration for low res MPEG2. I need it for H264, dammit - high resolution.
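
To make "encoded just the right way" a bit more concrete, here is a minimal Python sketch of one such constraint: the number of reference frames an H.264 level allows at a given frame size. The MaxDpbMbs values come from the level table of the H.264 spec; the function names are made up for this illustration, and the idea that a particular card stops at Level 4.1 is only an assumption for the example, not something ATI or Nvidia document here.

```python
# Max decoded picture buffer size in macroblocks (MaxDpbMbs) for a few
# H.264 levels, as listed in the level limits table of the H.264 spec.
MAX_DPB_MBS = {
    "3.1": 18000,
    "4.0": 32768,
    "4.1": 32768,
    "4.2": 34816,
    "5.0": 110400,
    "5.1": 184320,
}


def max_dpb_frames(width, height, level):
    """Return how many frames fit in the decoded picture buffer."""
    # Frame size in 16x16 macroblocks, rounded up (1080 lines -> 68 rows).
    mbs = ((width + 15) // 16) * ((height + 15) // 16)
    # The spec caps the buffer at 16 frames regardless of level.
    return min(MAX_DPB_MBS[level] // mbs, 16)


def fits_level(width, height, ref_frames, level):
    """Hypothetical helper: is this ref frame count within the level?"""
    return ref_frames <= max_dpb_frames(width, height, level)


if __name__ == "__main__":
    for level in ("4.1", "5.1"):
        frames = max_dpb_frames(1920, 1080, level)
        print("Level", level, "allows", frames, "frames at 1920x1080")
    # Assumed scenario: a full HD clip encoded with 8 reference frames,
    # played on a decoder that only accepts Level 4.1 streams.
    print("8 refs within L4.1:", fits_level(1920, 1080, 8, "4.1"))
```

Under these numbers a full HD stream stays within Level 4.1 only with 4 reference frames or fewer, so a clip encoded with more than that would fall outside what a Level 4.1 decoder has to accept.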

4. Additional filters
With DXVA you can just forget that. DXVA is a one way street. This means no additional filters. Yep, that also means subtitles. A disgrace.

Quite frankly, I can't help but be disappointed. I had a nice old 6600GT lying around and I bought the ATI card to help decode the more demanding content (from my full HD camera), which my poor old Athlon struggles with at 80-90% CPU usage. It turns out I just wasted my money.
Even the pathetic 300MHz ARM in my friend's MediaTank easily plays back just about any content he throws at it, thanks to properly implemented HW accelerated codecs. I can't believe that after at least five years of bragging, ATI and Nvidia can't provide decent acceleration for this.

It seems both of them will be saved by OpenCL in the end, but even with that coming up I don't believe we'll see a good solution until the FFMPEG project gurus implement it. By that time I'm betting my HD4350 will turn out to be a pretty weak card offering very little acceleration. Well, no matter - I just need to take 20-30% off my poor old CPU, and I'm hoping the little bugger will at least be able to do that much.

Until then I'll just have to swallow the occasional dropped frame and no postprocessing for my full HD content.


Anonymous said...

MPC-HC's internal subtitle filter works perfectly with DXVA. You could have done better research instead of saying there's no subtitle support.

You simply chose the wrong hardware. Nvidia hardware is far more compatible, even with non-compliant H.264 streams (aka L5.1, up to 16 reference frames), since driver release 178.24 last year.

ATI's DXVA support is pathetic and the ATI users know it very well, even asking for support on the ATI forums and mostly getting snubbed by ATI.

The Mediatank can't even play high quality lossless FLAC audio in an MKV container, which is what many people are using these days. I know this because I just saw a rather pathetic thread on a forum by someone complaining about it. It was hilarious, to say the least.

Anonymous said...

To get H/W de-interlacing you just need to use a codec with NV12 output format support and which passes on the interlace info e.g. FFDShow/Dscaler5 - it doesn't need to support DXVA.

Velis said...

OK, I may have been wrong with the deinterlacing stuff. Maybe it really is as simple as outputting interlaced content for the HW filter to handle.
As for MPC-HC's internal subtitle filter:
This is laughable. I initially tried to use Linux as the OS for my HTPC. In the end I gave up because everything was decoding video single threaded. Of course, my poor Athlon wasn't fit to handle full HD H264 on one core only.
Sure, I managed to compile mplayer with the FFMPEG-MT branch, but that was one player and I sure as hell wasn't going to use it for the HTPC. As a result I went for Windows.
This is just an example. Sure, there are plenty of specific solutions that work for HW acceleration, but those solutions don't fit all my needs. Maybe I was a bit careless when I said that the problems listed were best-case.
Perhaps there is a solution that works for somebody in all his scenarios. I even hinted at such a solution (transcoding to a properly supported codec and parameters).
Just not for me...

Unknown said...

I'm the one he's complaining about "have Media tank (HDX1000) playing all". Well, not quite all. I have problems with video streams over 8Mb/sec (the codec doesn't matter) and with AC3 6.1 sound. And my NMT cannot play video recorded by my camera (Canon).
I had also played with the idea of assembling my own HTPC, and even started to design a GUI for VLC suitable for TV, but then I just decided to go and order an NMT with a 1.5TB HDD. It can play music and video, show documents (.pdf tested), show pictures, run torrents, browse YouTube, show local weather... all out of the box.
It's not perfect, but it does what I want from it quite well. The only "rant" I have about it is quite bizarre: I had to open the box, readjust the motherboard and refit a cooling strip, because the assembly of the box was terrible. Since then, no more problems with overheating and freezing.
Told you, Velis, go and buy yourself an NMT :)