Matroska Certification - possible?
This page is a draft. It attempts to highlight one of the things I think might be keeping Matroska from being more widely accepted by large hardware vendors. Having worked for such a company, I've realized how important certifications are - not just for the company, but also for the normal people buying such devices.
Several hardware devices now claim to be able to play .mkv files. When you test them with real-life content, you might discover several unexpected problems, such as missing audio, missing subtitles and stretched images.
At the time of writing (Jan 2009) there is only a draft of a Matroska certification. As a result, anyone can make a media playback device and say "it plays matroska". Brands like DivX, XviD and THX all have certification processes that consist of a series of tests that the product is supposed to pass. Once all tests have passed, the product receives a stamp of approval. This is typically a logo on the product that guarantees the end-user a fair degree of compatibility.
For a video or audio codec, the certification usually consists of a minimum set of features that must be supported before the product can be certified. For the XviD codec there are four levels of certification depending on capabilities.
What should it cover?
For a container format like Matroska, things are slightly more tricky. Matroska never tried to impose any kind of limitations. On the contrary, it is supposed to support a very large range of audio, video and subtitle formats - more than you can reasonably expect a hardware device to support. So a Matroska certification could work in two ways:
- Container-format oriented. Certification just proves that the product can parse and demultiplex an .mkv file. There are no guarantees that you can actually play any of the .mkv files you encounter out there, as the certification only covers the container format itself. There would be no specific requirements for audio, video or subtitle capabilities.
This method lies well within the scope of the matroska.org project. It is easily verifiable and fairly easy to achieve and test.
The downside is that this kind of certification doesn't help the end-user who has to choose between various devices, as it does not guarantee that the device can play a reasonable set of audio, video or subtitle formats. This is why such a certification would not mean much to a normal non-technical person.
- A holistic approach to the .mkv format as a concept. To reach this level of certification, the product must be able to parse and demultiplex an .mkv file (as in the first method), but it must also be able to play a pre-determined minimum list of audio, video and subtitle formats stored in the Matroska container.
This approach would go against the open and flexible nature of Matroska being only a container format. It would dictate a fixed list of allowed audio, video and subtitle formats. A file using formats outside these lists would have to be considered "not certified". This list might be outdated in a decade, and changing the requirements would ruin the concept.
The benefit is that by wisely choosing a sensible list of formats, the number of requirements a device has to meet can be drastically reduced, bringing the number of combinations and test cases down to a reasonable size. This is crucial for hardware vendors trying to cope with user-generated content.
The biggest benefit would of course be that this kind of certification is harder to get, and thus it would be better proof that the device will play most files out there. And this is really what the end-user needs to know. Matroska would not just be a container format, but a sort of brand that end-users recognize and trust, just like mp3 files. Because of these obvious benefits, I propose this method. The rest of this page assumes we're going for the second method.
Deciding on a proper list of formats
This is the hard part. We must take a lot of factors into account, like who is using the format and how they produce .mkv files. We must look at where the content originates and what the intended target audience is. Gathering statistical information on existing content is also important but difficult, as different communities have different encoding tastes. Looking at various P2P networks will reveal a more or less coherent list of formats used. Looking at newsgroups, private FTP sites and such might give a different impression though. Public websites still haven't embraced Matroska very much, and do not provide sufficient statistical input.
My impression is that H.264 (usually using the www.x264.nl implementation) is by far the most common video format. If a secondary format should be allowed, it would be XviD, as far as I know.
Video codecs can be indicated in two ways in .mkv files: the official Matroska way, or the V_MS/VFW method (codec ID V_MS/VFW/FOURCC), which is a leftover from Video for Windows and should not be part of the certification requirements in my opinion.
There is much more variation in audio formats. Lots of video content originates from DVDs, where it is common to simply copy the AC3 sound track into the .mkv file. Others choose to convert the audio into AAC (mp4 audio), Ogg Vorbis or mp3. Both stereo and surround are common. For lossless audio, FLAC seems a logical choice, as it is the most widely used format and also the most efficient in terms of CPU vs. compression ratio.
For subtitles, I've seen SSA/ASS in more than 90% of the cases. For this reason, and for the sake of simplicity, I think this should be the allowed subtitle format. Whether to allow the sub/idx graphical subtitles from the DVD format can be debated.
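To make the distinction concrete, here is a minimal sketch of how a checker might tell the two signalling methods apart by looking at a track's codec ID string. The codec ID strings are the ones registered for Matroska; the function name, the return labels and the choice of which native IDs to list are my own assumptions for illustration.

```python
# Native Matroska codec IDs for the two video formats discussed above.
# Which formats end up on the certified list is still an open question.
NATIVE_VIDEO_IDS = {
    "V_MPEG4/ISO/AVC": "H.264",
    "V_MPEG4/ISO/ASP": "MPEG-4 ASP (e.g. XviD)",
}

# Compatibility mode: the real codec is hidden in a Video for Windows header.
VFW_COMPAT_ID = "V_MS/VFW/FOURCC"


def classify_video_codec(codec_id: str) -> str:
    """Return 'native', 'vfw-compat' or 'unlisted' (hypothetical labels)."""
    if codec_id == VFW_COMPAT_ID:
        return "vfw-compat"
    if codec_id in NATIVE_VIDEO_IDS:
        return "native"
    return "unlisted"
```

Under the proposal above, only tracks classified as "native" would count towards certification.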
- The device must be able to play all combinations of the above formats with 0 or 1 video track, 0 to 8 audio tracks and 0 to 8 subtitle tracks.
- The playback of H.264 (MPEG-4 AVC) must cover a reasonable set of profiles (maybe Baseline and Main?)
- The playback of mp3 should handle both 44100 and 48000 Hz, up to 320 kbit/s, including VBR.
- The audio tracks may have different audio codecs, sample rates, and number of channels. The device must be able to cope when the user switches between these.
- The device must be capable of time searching (seeking) in any of the allowed combinations of video/audio/subtitle codecs without bringing these out of synchronization.
- The device must be able to handle different image and pixel aspect ratios correctly.
- The device must be able to switch between the various audio tracks.
- The device must be able to switch between the various subtitle tracks, with "disabled" as one of the choices.
- The device must be able to time search beyond the 4 gigabyte point in large Matroska files.
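To get a feeling for the size of the test matrix implied by the track-count requirement, the combinations can be enumerated. This is only a sketch: the function names are hypothetical, and the reduced "boundary" set (none, one, maximum) is my own suggestion for keeping a test suite manageable, not part of any existing draft.

```python
from itertools import product

VIDEO_COUNTS = (0, 1)          # 0 or 1 video track
AUDIO_COUNTS = range(0, 9)     # 0 to 8 audio tracks
SUBTITLE_COUNTS = range(0, 9)  # 0 to 8 subtitle tracks


def track_layouts():
    """Every (video, audio, subtitle) track-count combination."""
    return list(product(VIDEO_COUNTS, AUDIO_COUNTS, SUBTITLE_COUNTS))


def boundary_layouts():
    """Reduced matrix: only no track, a single track and the maximum."""
    return list(product((0, 1), (0, 1, 8), (0, 1, 8)))
```

The full set has 2 x 9 x 9 = 162 layouts and the boundary set has 18, before any codec choice per track is taken into account. Multiplying by the codec choices is what makes a short list of allowed formats so important.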
This might be overly optimistic, but it would be extremely valuable to both end-users, people who encode files and hardware vendors, if a tool existed that was able to fully test whether a file is "Matroska certified" or not.
This could also be valuable in legal situations. Imagine a person who bought an expensive media player that doesn't play his files. A tool like this would help determine whether the playback device or the files are faulty, and could prevent a lawsuit.
Ideally such a piece of software should exist in two versions: a very user-friendly tool that would simply say "OK" or "Invalid" to each .mkv file, and a developer/expert version running all the same tests but showing verbose information.
The tool must test both the container format itself, and the validity of the audio, video and subtitle tracks. The following conditions will make the tool report a file as "Invalid":
- Not a Matroska file.
- Errors in the Matroska file format structure.
- If a codec or format is used that is not in the list of certified audio/video/subtitle codecs, the test fails.
- If a valid codec is used but with settings that bring it outside the chosen valid profiles (like a way too large image size, extreme bandwidth, missing keyframes etc.), the test fails.
- If the audio/video/subtitle streams contain corrupted data (this might be complicated to detect, but the tool should try), the test fails.
- Subtitles that are not UTF-8 encoded.
- An illegal number of audio channels (valid configurations are probably 1, 2 and 5.1 channels).
- An illegal image size or an erroneously indicated image size.
- Two or more video streams might also cause the test to fail (depending on profile?). Even though Matroska supports this, we cannot realistically expect a hardware device to cope with this.
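A few of the conditions above can be sketched in code. This is not a real validator: it assumes the track information has already been extracted by some Matroska parser into plain dictionaries (the keys "type", "codec_id" and "channels" are hypothetical), and the allowed lists simply mirror the formats discussed on this page. The first four bytes of a Matroska file are the EBML header ID 0x1A45DFA3; checking them only shows that the file is EBML, so a real tool would also have to read the DocType.

```python
EBML_MAGIC = b"\x1a\x45\xdf\xa3"  # EBML header ID at the start of the file

# Hypothetical certified lists, mirroring the formats discussed on this page.
ALLOWED_CODECS = {
    "video": {"V_MPEG4/ISO/AVC", "V_MPEG4/ISO/ASP"},
    "audio": {"A_AC3", "A_AAC", "A_VORBIS", "A_MPEG/L3", "A_FLAC"},
    "subtitle": {"S_TEXT/ASS", "S_TEXT/SSA"},
}
ALLOWED_CHANNELS = {1, 2, 6}  # mono, stereo, 5.1 (six discrete channels)


def looks_like_ebml(header: bytes) -> bool:
    """First check for 'Not a Matroska file': the EBML magic must be present."""
    return header[:4] == EBML_MAGIC


def is_utf8(data: bytes) -> bool:
    """True if the subtitle payload decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def check_tracks(tracks):
    """Return a list of human-readable problems (the expert view)."""
    problems = []
    video_count = sum(1 for t in tracks if t["type"] == "video")
    if video_count > 1:
        problems.append(f"{video_count} video tracks (at most 1 allowed)")
    for i, t in enumerate(tracks):
        allowed = ALLOWED_CODECS.get(t["type"], set())
        if t["codec_id"] not in allowed:
            problems.append(f"track {i}: codec {t['codec_id']} is not certified")
        if t["type"] == "audio" and t.get("channels") not in ALLOWED_CHANNELS:
            problems.append(f"track {i}: {t.get('channels')} audio channels")
    return problems


def verdict(problems):
    """The user-friendly view: a single word per file."""
    return "OK" if not problems else "Invalid"
```

The split between check_tracks and verdict follows the idea of two versions of the tool: the expert version would print every entry in the problem list, while the simple version would only show the result of verdict.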
Error resilience and robustness
I recommend disregarding the "level" indication in H.264, because a large portion of all x264-encoded content is flagged with a wrong level (often 5.1, despite being low bandwidth and low resolution). Also, the level in itself does not fully guarantee whether the file will play or not. The device might as well always try to play the file. The user will stop playback if performance is inadequate.
Damaged files do still occur, even in these times of checksums. How the device handles this should probably not be a part of the certification, but rather kept as a recommendation to anyone developing such a product. It does affect the user's overall perception of the quality of the product.
Website by Joachim Michaelis