Saturday, 8 March 2014

Project Introduction

Audio Data Compression Quality Project


This project aims to be open to everyone for its duration. My name is Chris, and I wholly encourage any and all contributions that my friends and colleagues, as well as any member of the wider public, care to make to this endeavour. Firstly, to this blog: if I am correct, anyone should be able to leave comments on posts here either anonymously, with no registration necessary, or with an existing Google account. Any questions, suggestions, or discussions regarding the project's topic are therefore gratefully received, as they will contribute to the completeness of the work.

The topic of the project is straightforward on the surface, but complex in the detail. The generalised, primary research question I am investigating is:
"Does the current trend in audio data compression codecs point towards a lossless standard in the near future?"

In order to understand and tackle this question, it is first necessary to establish some definitions, especially considering that this project is concerned in equal measure with the everyday consumer of music as a product, as well as with established professionals in the audio industries.

A codec is defined as 'a device or program that compresses data to enable faster transmission and decompresses received data.' Currently, the most popular and well-known example of a codec is most likely MPEG-1 Audio Layer III, or mp3. Other examples within the audio sphere include AAC, AIFF and FLAC, to name a few (strictly speaking, AIFF is a file format that normally stores uncompressed audio rather than a compression codec). Among the formats in that list that actually compress audio, FLAC is the odd one out, as it is the only one which describes itself as 'Lossless.' In fact, according to FLAC's developers at the website https://xiph.org/flac/ :
"FLAC stands for Free Lossless Audio Codec, an audio format similar to MP3, but lossless, meaning that audio is compressed in FLAC without any loss in quality."

The advantage of FLAC as a format should be immediately obvious, based on that definition. What hypothetical person would knowingly choose a lower quality over a higher one, for any product? The answer is of course that the situation is not as simple as it might seem. For example, a lossless codec produces a larger file than a lossy one for the same audio, and so storage space and internet bandwidth (for transmission) become limiting factors. An entire collection of your favourite albums all in FLAC format may take up considerable room on a hard drive. Lossy codecs were created as an attempt to solve this issue, which was particularly serious in an era where digital audio was just emerging from a world of analogue tape storage and digital memory was extremely limited (remember floppy disks?).
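To put rough numbers on the storage problem, here is a minimal sketch of the arithmetic. It assumes CD-quality source audio (44.1 kHz, 16-bit, stereo) and a hypothetical four-minute track; the function name is my own invention for illustration. FLAC is left out on purpose, because its output size depends on the material being compressed.

```python
def size_megabytes(bitrate_kbps: float, seconds: float) -> float:
    """Size in megabytes (10**6 bytes) of audio stored at a constant bitrate."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


# Assumption: CD-quality PCM, 44,100 samples/s x 16 bits x 2 channels
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps

track_seconds = 4 * 60  # hypothetical four-minute track

uncompressed = size_megabytes(CD_PCM_KBPS, track_seconds)
mp3_128 = size_megabytes(128, track_seconds)

print(f"Uncompressed PCM: {uncompressed:.2f} MB")
print(f"128 kbps mp3:     {mp3_128:.2f} MB")
print(f"Ratio:            {uncompressed / mp3_128:.1f} to 1")
```

Under those assumptions the uncompressed track is roughly eleven times the size of the 128 kbps mp3, which is the gap that made lossy codecs attractive in the first place.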

To clarify: the term 'Lossless' means that during the compression process (in which the data making up a given audio file is encoded more compactly in order to make the resultant file smaller) no audio data is 'lost': decoding the compressed file reproduces the original exactly. The logical opposite of this type of codec is therefore one described as 'Lossy.'
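The distinction can be shown in a few lines. This is only a sketch under stated assumptions: it uses Python's general-purpose zlib compressor as a stand-in for a lossless audio codec (it is not FLAC), a generated sine tone as a stand-in for real audio, and a deliberately crude "lossy" step that throws away the lower 8 bits of each sample. Real lossy codecs such as mp3 work very differently, using psychoacoustic models rather than simple truncation.

```python
import math
import struct
import zlib

# Stand-in for real audio: one second of a 440 Hz sine tone, 16-bit samples
SAMPLE_RATE = 44_100
samples = [round(20_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]
original = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: compress, decompress, then compare byte for byte
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
lossless_ok = restored == original

# Crude lossy step (illustration only): discard the 8 least significant bits
coarse = [(s >> 8) << 8 for s in samples]
lossy = struct.pack(f"<{len(coarse)}h", *coarse)
lossy_ok = lossy == original

print("Lossless round trip identical to original:", lossless_ok)
print("Lossy version identical to original:      ", lossy_ok)
```

The lossless round trip gives back exactly the bytes that went in, whereas the truncated version cannot be turned back into the original because the discarded bits are gone for good.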

The term lossy applies to every type of codec which permanently discards some of the original audio information in order to shrink the file, so that what is eventually played back on the user's computer, iPod, hi-fi system or other sound device is no longer identical to the original copy. This includes mp3 and AAC. Naturally, when data about a sound is removed, the sound becomes... less than it was. In some cases, this change is audible. The first part of this project intends to shed some light on exactly what parameters make that change audible to the consumer and to the professional.

The hypothesis: The majority of end-consumers will be able to perceive a difference between lossless (FLAC) and heavily compressed constant-bitrate (non-VBR) mp3 (e.g. 128 kbps), but will be unable to point to exactly what the difference is. Not knowing that, they will not be concerned and would not go out of their way to use lossless codecs such as FLAC instead.

In order to provide some [hopefully] useful information to music consumers who wish to know more about the topic, part of this project will involve the creation of a Wiki-style online resource with information about different compression options and the implications of each.

The course of this project falls into two distinct stages. Firstly, there is a need to evaluate the ability (or, indeed, the necessity) of a music consumer to perceive the difference between two given compression formats of differing audio quality. The second stage is dependent (to a degree) on the first, and involves the use of web-based information sources to communicate more information about the topic to said consumers. The decision-making process for how this second stage can be approached roughly follows this flow:



The first step is to find out whether this difference can be heard. If it can, that is no guarantee that the difference is worth paying for, and that is the deciding factor for most music consumers. Likewise, non-professional music listeners are less likely than professionals to pick out the same changes in a sound because A) they do not have the same experience of critical listening and B) they are unaware of the processes involved. Neither of these points is meant as a criticism; it is, however, a discrepancy which this project will account for and attempt to remedy if, indeed, such action is warranted or desired in general.
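One way to read "can be heard" is to ask whether a listener's score over repeated two-way choices is better than guessing would produce. The sketch below is purely an assumption about how such a score might be interpreted, not a description of the test design used in this project; the function name, the 16-trial session and the score of 12 are all hypothetical.

```python
from math import comb


def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing, where each guess succeeds with `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical listener: 12 correct identifications out of 16 trials
p = p_value_at_least(12, 16)
print(f"Chance of scoring 12 or more out of 16 by guessing: {p:.4f}")
```

A small probability suggests the listener is genuinely hearing something; a large one means the score is indistinguishable from guesswork. Whether a heard difference matters to the listener is a separate question, as noted above.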

A secondary (but prerequisite) research question must then be whether there is in fact a need for consumer-facing lossless audio codecs. If the difference is found to be simply negligible, then that is a satisfactory conclusion to the project. That is not to say, however, that the value of a wiki resource on the topic would necessarily be diminished; indeed, such a conclusion would in itself raise new questions regarding psychoacoustics and, sociologically, consumer habits.

The wiki should be arranged and produced in a way which is useful to a user at any level of knowledge. That means separating information into tiers and pages centred around what the user already knows. More complex concepts will be explained in easily navigated sub-sections, and an extensive glossary linking to supporting information elsewhere on the web will be necessary.

---

The purpose of this blog is to keep any interested parties following the project informed of its progress, and to present a chronological (if at times retrospective) log of activities and thought patterns for assessment.

The blog will cover some complex topics regarding digital audio theory, but these sections can be elaborated upon as necessary, and upon request, for anyone interested in learning more. Perhaps an apt objective for the emergent Wiki would be to ensure that, with its help, all the topics covered in this blog could be entirely understood by somebody with no prior knowledge of digital audio technologies.

The next blog post will cover some of the details of the listening tests to be performed, the first of which - a controlled test in a calibrated listening environment - will take place on Tuesday 11th March 2014 in the mastering studio of Confetti ICT in Nottingham. If any parties reading this would be interested in attending this session then please do not hesitate to get in contact, though some participants have already been sourced. Please also be aware that there will be an online version of the test to be performed in the participant's everyday listening environment, and the potential for repeat controlled tests in the future as time allows.

Thank you for reading. Do leave a comment with your thoughts.

Chris
