Ogg Vorbis FAQ Part 3

What is Ogg Vorbis?

Ogg Vorbis is a new audio compression format. Unlike today’s most popular digital audio formats such as MP3, VQF, and AAC, it is completely free, open, and patent-free.

What is the origin of the name?
Vorbis is part of the Ogg project, which aims to create a completely open multimedia system. Vorbis is the name of the audio compression scheme used to create Ogg Vorbis files.

What is the file extension used by Ogg Vorbis?
As part of the Ogg project, Vorbis files use the extension .ogg.

Is Vorbis a complete replacement for MP3? Or is it a complementary compression format?
Ogg Vorbis is designed to completely replace proprietary audio formats. This means that you can encode all of your own music content in Vorbis without hesitation. 🙂

When will Ogg Vorbis be completed?
Currently (as of November 2000), the final version 1.0 is about to be released. The file format itself has already been finalized, and Vorbis files created at this stage are guaranteed to be compatible with future decoders.
The format is designed with flexibility in mind, allowing developers to improve file size and sound quality without rendering existing encoders and players obsolete.

Why should an artist pay attention?
There are many reasons.
First, although many artists may not know it, MP3 is a lossy compression format. Much of the sound data is lost when music is converted to an MP3 file, and as a result the sound quality is lower than that of a CD.
Vorbis is also a lossy compression format, but it uses an excellent acoustic model to reduce the damage, so at the same file size it can deliver better sound quality than an MP3 file.
If you are an artist, you should also consider licensing issues around music formats. If you decide to sell your music in MP3 format, you are responsible for paying Fraunhofer (which holds patents on MP3) a flat royalty on the sales. With Vorbis, no patents or license fees apply, so you do not have to pay to sell your music, distribute it to others, or pass it on.

Why should music fans pay attention?
First of all, the quality is high. Files are also smaller than MP3s and will become even smaller as development progresses. Many software players already support it, and some of the major hardware players will soon do so as well.
With Vorbis, you can enjoy high-quality music using less storage space.
Using Vorbis means that you can choose encoders and decoders without being limited by licensing issues. Also, because most companies are unable (or unwilling) to pay for encoder licenses, the choice of encoders for creating MP3 files is likely to become limited in the future. This is why using Vorbis gives you more options for your encoder.

Why should developers pay attention?
The distribution of hardware and software players is greatly affected by audio-related patents and licensing issues.
With Vorbis, you can develop hardware and software players without any of these encoding and decoding restrictions.
Vorbis also offers a flexible, high-quality audio format. For more information, see http://www.xiph.org/ogg/vorbis/index.html.

Why should music companies pay attention?
Music companies should pay the most attention to Ogg Vorbis. While other technologies require a large investment to start a business, Vorbis offers startups a platform that is easily accessible and keeps start-up costs down. Because players are widely compatible and open, your customers, the music fans, will not be angered by incompatibilities and will be happy with the higher sound quality.

■ Licensing
What license applies to Ogg Vorbis?
The Ogg Vorbis specification is in the public domain and is completely free for commercial and non-commercial use. Software developers can independently develop software compliant with this specification without restrictions or fees. However, developers who want to use the open source software must follow its license terms (see below).

Ogg Vorbis FAQ Part 2

0. Introduction

The other day I gave a presentation on Ogg at a regular kmlug meeting using this material. Recently, however, Ogg-related sites seem to be inaccessible for some reason, so I am putting the binaries I obtained earlier here. Please use them when the sites below are not available.

1. What is Ogg Vorbis?
(Excerpt from http://homepage2.nifty.com/eangel/Mizuno/Software/ogg/)

Ogg Vorbis is a music/audio file format like “.mid”, “.wav” and “.mp3”. I’ll abbreviate it as Ogg below. First, let’s touch on the background behind the birth of this format.
As a music format there is already MP3, which is now so famous that there is no one who doesn’t know it, so why Ogg now? The answer is simple: MP3 is no longer free.
Below is an excerpt from ZDNet news.
Last September, Fraunhofer IIS-A began collecting royalties. This cost is considerable, to say the least. Fraunhofer IIS-A requires $5 for an encoder like MusicMatch, plus 1 cent per song for every MP3 downloaded or streamed, or 1% of total sales.
Fraunhofer IIS-A is the German research institute that created the MP3 format. In short, making an MP3 encoder or distributing a song now costs money.
In this way, rights issues around the online distribution of music files have become a big problem overseas. The result was the open Ogg Vorbis, which is not encumbered by copyright, patents, or other proprietary rights.
Ogg Vorbis aims at better sound quality than MP3, smaller file sizes, and avoidance of the rights issues mentioned above, and it is steadily moving toward realizing them. But it is important to note that Ogg is not yet a finished format. As of September 2000, it appears to have the functions that MP3 has at the moment.
Let’s examine two important points.
First, regarding file size, Ogg basically uses a variable bit rate. Variable Bit Rate (VBR) uses a high bit rate where high sound quality is required and a low bit rate where it is not. A low bit rate lowers the sound quality, but it also makes the file smaller. By making good use of this, Ogg achieves a smaller file size at the same sound quality as MP3. (By the way, MP3 also has VBR.) Generally, 128 kbps is said to be enough for MP3 to give satisfactory quality, but Ogg delivers the sound quality of 160 kbps, one step higher, at roughly the same file size.
The next point is sound quality.
First, the default bit rate of the Ogg encoder is 160 kbps, which shows that sound quality is a priority.
Additionally, popular MP3 encoders such as LAME and BladeEnc have announced that they will also support Ogg.
In fact, I made MP3 files and Ogg files from a CD of mine and listened to them. At the same bit rate, the file size was naturally smaller for Ogg, but the sound quality didn’t seem to be much different from MP3.
So I raised the Ogg bit rate by one step and compared the sound quality at almost the same file size. I also added the WAV file extracted from the CD to the comparison and listened over and over again, and I came to the conclusion that the WAV file had the clearest sound, followed by Ogg and then MP3.

Ogg Vorbis FAQ

Q. What is Ogg Vorbis?

Ogg Vorbis is a lossy audio compression standard similar to MP3, WMA, AAC, etc. The difference from these existing standards is that it is a free and open standard. The reference library is provided under a BSD-style license, and you can freely incorporate it into your own software or modify it. There is no obligation to pay any fee for doing so. One of the main features of Ogg Vorbis is that it carries fewer rights restrictions than other codecs and formats.

Q. What does the name mean?
Ogg is a container for various kinds of data. Vorbis refers to a lossy compressed audio format. Both are proper names, not abbreviations of meaningful words. In addition to Vorbis, Ogg can contain Theora (Ogg Theora) and FLAC (Ogg FLAC).

Q. What about the sound quality of Ogg Vorbis?
The sound quality is not bad. However, everyone’s ears are different, so try it out for yourself. In general, many people feel that it has an advantage over other formats at low bit rates (64 kbps to 128 kbps). The encoder (compression program) also makes a big difference. If possible, try different encoders.

Q. The bit rate (amount of data per second) is not constant.
Ogg Vorbis is based on quality-based VBR (variable bit rate), so the bit rate is not constant. If you need a constant bit rate, use the bit rate management mode. Even in this case, however, frame-by-frame constant bit rate (CBR) encoding of the kind MP3 offers is not possible (as is also true of many other formats).
Also, the tighter the bit rate constraint, the more sound quality usually suffers, and encoding becomes slower as well. It is better to use Vorbis’s native quality mode (VBR).

Q. What is lossy audio compression, anyway?
Lossy means irreversible: once compressed, the audio cannot be restored exactly. Because it does not fully return to the original, the sound quality may not be exactly the same as the original. However, it is possible to reduce the data while keeping the audible damage to a minimum. This is because the human ear is sensitive in some respects but very insensitive in others. Encoders for modern formats like Vorbis exploit these characteristics of human hearing to compress while preserving subjective sound quality as much as possible.
By the way, there is also lossless compression, which does not degrade the quality at all, but in general a large compression ratio cannot be expected from it. FLAC and Monkey’s Audio are relatively popular lossless audio compression formats, but the compressed size is in the range of 30% to 70% of the original.

Q. I want to listen to an Ogg file. How can I do that?
The easiest way is to use a software player that supports Ogg Vorbis. There are many compatible players for different platforms, so I can’t cover all of them here. Try typing “Ogg Vorbis Player” into an Internet search engine.
There are also hardware players, although they are still few in absolute number. With a portable player, for example, you can play Ogg Vorbis audio files on the go. You can find several such devices by searching for “Ogg Portable Player”.
Although rare, there are files with the .ogg extension that are not Vorbis, so be careful.
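
As a rough illustration of how to tell such files apart, the sketch below looks at the first Ogg page of a file and checks how the first packet starts. It relies on the 27-byte page header layout from RFC 3533 and on the well-known header signatures of Vorbis, Theora, and Ogg FLAC. The function names (`first_packet_codec`, `codec_of_file`) are made up for this example, and the code only inspects the first logical stream, so treat it as a minimal sketch and not a full Ogg parser.

```python
def first_packet_codec(data: bytes) -> str:
    """Guess the codec from the first Ogg page contained in `data`."""
    # An Ogg page starts with the capture pattern "OggS" followed by a
    # fixed header; byte 26 holds the number of segment-table entries.
    if len(data) < 27 or data[:4] != b"OggS":
        return "not ogg"
    n_segments = data[26]
    # The packet data begins right after the 27-byte header and the
    # segment table (one byte per segment).
    payload = data[27 + n_segments:]
    if payload.startswith(b"\x01vorbis"):
        return "vorbis"
    if payload.startswith(b"\x80theora"):
        return "theora"
    if payload.startswith(b"\x7fFLAC"):
        return "flac"
    return "unknown"


def codec_of_file(path: str) -> str:
    """Read the start of a file and classify it (hypothetical helper)."""
    with open(path, "rb") as f:
        # 512 bytes covers the largest possible page header (27 + 255)
        # plus the start of the first packet.
        return first_packet_codec(f.read(512))
```

For example, `codec_of_file("song.ogg")` would return "vorbis" for an ordinary Ogg Vorbis file and something else for a file that merely carries the .ogg extension.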

Q. I want to listen to Ogg Vorbis audio with my usual player.
To play it with WMP on Windows, you need to install a DirectShow filter. One of the most actively developed is the Xiph.Org DS filter. To play it in iTunes on Macintosh, install the plugin for QuickTime (this also works with the Windows version). Many other programs can also support Vorbis playback through plugins. Even if your player doesn’t play it by default, it is worth looking it up with a search engine.

Ogg Vorbis, the ogg vorbis audio format

Ogg Vorbis is a license-free audio compression format developed by the Xiph.org Foundation, a non-profit organization.

Ogg is a file (container) format specification and Vorbis is a compression format specification; together they are called Ogg Vorbis. The Ogg container is standardized as RFC 3533 and can store video and other audio formats as well as audio in Vorbis format. The standard extension is “.ogg”. The Vorbis format was developed as a license-free alternative after it became clear, once MP3 was already widely used in the field of audio compression, that MP3 could not be used freely because of patents owned by companies. The specification itself is released into the public domain with all rights waived, and can be freely used by anyone. The reference code intended as a guide for developing related software is published under a BSD-style license.

Vorbis compresses using the MDCT (Modified Discrete Cosine Transform), and the compressed data is basically Variable Bit Rate (VBR). At the same bit rate the sound quality is better than MP3, and at the same sound quality it compresses to a smaller size than MP3. However, it is more complicated to process than MP3 and consumes more memory during playback. Initially the encoding speed was noted as being slow, but with the contribution of programmers around the world, high-speed encoders have been developed and released.

In Vorbis, you select the compression ratio by adjusting a value called the “quality level”, which represents the sound quality, in 12 steps from -1 to 10. Gapless playback is supported at the format level, so the silence between songs can be skipped and tracks played back seamlessly.

Comparison of compressed files

My playback environment has become a bit more decent, so I listened again and compared.

The file is the same as the one used in the previous item. Sound from Creative’s SB Audigy LS sound card is fed to a Sony LBT-V610 (commonly known as Liberty; a model from around 1990, so this is a 17-year-old system; the speakers are rear bass-reflex 3-way, 2-unit types with a volume of about 20 liters) and played through its speakers. As last time, I used Lilith version 0.991 for playback. As a pseudo-blind test, I listened to each of the two sound sources being compared, switched to loop play mode, pressed the next-track button repeatedly with my eyes closed until I no longer knew which one would play first, and then played it and tried to guess which file was which.

For the Wave file (compressed to ape format and distributed) and the MP3 files, the previously used files were used as is. Lilith was used to encode Ogg Vorbis (v1.1.0) (faster and with better sound quality than the audio encoder). AAC was encoded with iTunes. For both ogg and AAC, only the bit rate was changed from the default settings (AAC’s “Use VBR Encoder” and “Optimize for Audio Files” options are not checked). Lilith was used for playback, except that the AAC file and the files compared against it were played with the VLC player.

First, compressed versus uncompressed: the atmosphere changes slightly even with MP3 at 256 kbps (lame). The compressed file has a subtle artificiality, or a slightly unnatural sense that some sound is missing. I did the pseudo-blind test 10 times and made one mistake.

Among the MP3s, lame at 128 kbps beat gogo at 128 kbps. I did not do a pseudo-blind test because they are so different that there is no comparison. Between lame at 192 kbps and at 256 kbps, 192 kbps is a little less lively and a bit dull. I did the pseudo-blind test 10 times and made one mistake. The gap from the Wave file also widens.

The comparison between ogg (setting 0.6: 220 kbps nominal, but equivalent to 192 kbps in terms of file size) and MP3 (lame: 192 kbps) is quite tricky. I made too many mistakes for the pseudo-blind test to work (I lost motivation after making 3 mistakes in 5 tries). For AAC at 192 kbps I did not do a pseudo-blind test because Lilith couldn’t read the m4a file, but it sounds brighter than ogg or MP3 (I can’t say anything definite since I have not done even a pseudo-blind test, but compared with the raw Wave, the brightness may have been deliberately exaggerated).

Ogg (setting 0.4: 172 kbps nominal, but equivalent to 128 kbps in terms of file size) and MP3 (lame: 128 kbps) are quite different; in the pseudo-blind test I made 1 mistake in 5 tries. The MP3 has more weight in its sound (in theory MP3 may cut the high frequencies, so that may be the cause). For AAC (128 kbps) versus MP3 (lame: 128 kbps) I haven’t done a pseudo-blind test, but AAC clearly sounds better.
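
To put the trial counts above in perspective, here is a small sketch that computes how likely each score would be from pure guessing. It assumes every trial is an independent 50/50 guess, which is my own simplification and not something the pseudo-blind procedure guarantees; the helper name `chance_of_at_least` is made up for this example.

```python
from math import comb


def chance_of_at_least(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right in
    `trials` independent coin-flip guesses (binomial tail, p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Scores reported in the listening tests: (description, correct, trials)
results = [
    ("10 trials, 1 mistake", 9, 10),
    ("5 trials, 1 mistake", 4, 5),
    ("5 trials, 3 mistakes", 2, 5),
]
for label, correct, trials in results:
    print(f"{label}: {chance_of_at_least(correct, trials):.4f}")
```

Under that assumption, 9 correct out of 10 would happen by luck only about 1% of the time, 4 out of 5 about 19% of the time, and 2 out of 5 about 81% of the time. So only the 10-trial runs are reasonably convincing; the 5-trial runs are too short to say much either way.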

After many tests, I noticed that sound recorded from a SoundFont was not very appropriate as a test sample (because it has already been processed) (for now, I have confirmed with a spectrum analyzer that there is “sound up to the limit”). If I have the opportunity next time, I would like to compare by instrument, studio recording versus live recording, and music genre. For now, though, I was able to judge whether the sound changes or not, so I will leave it at that (it was quite taxing both physically and mentally).

For now, with my earlier blunder in mind, I will write down my conclusions. AAC is excellent at 128 kbps. This is probably the result of AAC’s design concept being matched to low bit rates. 192 kbps is a close contest, but whereas MP3 and ogg stay true to the original sound, AAC seems to choose “high sound quality as a compressed file” over fidelity and so sounds slightly bright (as if reasoning from the compressed waveform that “it should have been like this before compression”).

Choosing among 160/192/256 kbps is quite delicate. Both “the size does not change that much, so 256 kbps leaves plenty of margin” and “the difference in sound quality is rarely a problem, so 160 kbps with its small size makes sense” are persuasive. It depends on the use, but taste and mood will play a large part. By the way, 160 : 192 : 256 = 1 : 1.2 : 1.6 = 0.833.. : 1 : 1.33.. = 0.625 : 0.75 : 1.

Realistic compression method

Regarding the bit rate, 192 kbps is sufficient for MP3.

A raw Wave file (44100 Hz x 16 bit x 2 ch = 1411 kbps, CD sound quality) occupies approximately 53 MB for 5 minutes, while MP3 takes a little under 5 MB at 128 kbps, a little over 7 MB at 192 kbps, and a little under 10 MB at 256 kbps; which to choose depends on the user. If you really want to stick to the reproducibility of the treble range, or if you plan to process the sound later, you can also use lossless compression. With lossless compression, the more striking the sound (higher sound pressure and thicker highs), the harder it is to compress, and with lossy compression, too, such sound tends to get distorted unless the bit rate is high. The impression of sound quality is strongly influenced by the playback environment; indeed, if you play at low volume on a sloppy device, even 128 kbps is indistinguishable from lossless compression.
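
The size figures can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 kbps = 1000 bit/s, 1 MB = 10^6 bytes) and ignores file headers and tags; the function names are made up for this example.

```python
def pcm_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def size_mb(kbps: float, seconds: float) -> float:
    """Approximate size in decimal megabytes of audio at a given bit rate."""
    return kbps * 1000 * seconds / 8 / 1_000_000


def average_kbps(size_bytes: int, seconds: float) -> float:
    """Inverse calculation: average bit rate implied by size and duration."""
    return size_bytes * 8 / seconds / 1000


cd_rate = pcm_kbps(44100, 16, 2)  # CD-quality stereo PCM
for rate in (cd_rate, 128, 192, 256):
    print(f"{rate:7.1f} kbps x 5 min = {size_mb(rate, 300):5.1f} MB")
```

By this calculation five minutes of CD-quality PCM comes to about 52.9 MB (42 MB would correspond to roughly four minutes), and the 128, 192, and 256 kbps files come to 4.8, 7.2, and 9.6 MB. The inverse function is handy for VBR files, where the average bit rate implied by the file size can differ from the nominal setting.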

I also prepared sample files (including the files explained in the next item). kanon_f_200.ape is a recording of the aforementioned fluid-mix sound with the Timidity++ amp at 200%, and kanon_e_150.ape is a recording of the aforementioned eaw-mix sound with the Timidity++ amp at 150%. Monkey’s Audio version 3.99 was used with the compression level set to standard (higher compression settings only slow down the process and don’t shrink the file much). The fluid-mix recording is approximately 3 seconds longer, probably reflecting a difference in the final sustain. fluid-mix gains sound pressure more easily and can go up to approximately 250%. Also, because the sound pressure is high, compression is not very effective. eaw-mix has noise-like ripples after 15 seconds (and 55 seconds), and 150% is barely the limit. If you ignore the momentary donzuki, it can go up to about 200%. For all the files below, the amp level was 200% for fluid-mix and 150% for eaw-mix.

kanon_f_gogo.MP3 and kanon_e_gogo.MP3 were generated with gogo.dll (version 3.13a) using the Timidity++ function. The settings are CBR 128 kbps, 44100 Hz sample rate, no emphasis, and joint stereo (for comparison, all of the following are also recorded as CBR + joint stereo + 44100 Hz sample rate without emphasis). It seems that the psychoacoustic model cannot be disabled in recent versions of either gogo.dll or lame (apparently not unless you change the build options and recompile). It doesn’t matter much, but when generated with the defaults, the ID3 tag’s “Genre” field is “Anime”. I don’t know whether that comes from Timidity++ or gogo.dll, but it is quite a default setting.

kanon_f_128.MP3 and kanon_e_128.MP3 were produced with lame (version 3.96.1) using encoding quality 2 (default) + psychoacoustic model + no preset. The files whose names end in 192 and 256 use the same settings with only the bit rate raised. If you use VBR and the like, you can improve the sound quality even at the same file size. The encoding took a bit more effort, and the result is clearly better than the sound produced by gogo.dll. This doesn’t mean that gogo.dll is inferior to lame; gogo.dll simply prioritizes processing speed, trading some bit rate efficiency and sound quality for it. Previously I posted a file encoded at encoding quality 0 (a higher sound quality mode), but lame 3.96.1 has a bug that can apparently add noise, so I replaced the sample.

To my ears, playing with Lilith version 0.991 from a Creative Sound Blaster PCI-128 (a gift: thanks, HGW) into Matsushita’s SA-AK15 (a gift: a very ordinary so-called “stereo”; whether it was poorly maintained or was like that from the start, it sounds about like “a bit better than a 30,000 yen class radio cassette player”) and listening through aiwa’s HP-X122 (discount headphones that actually sell for about 1,500 yen), the difference between lossless compression and 128 kbps is almost imperceptible even at high volume (at best, I feel like “the room I’m playing in has changed”, and that may well be a placebo effect, so treat it as within the margin of error). It’s a different story if you listen in front of speakers at high volume, but in practice you rarely do that (or rather, it’s a nuisance to the neighbors). So, although I put the figure of 192 kbps at the top, 128 kbps (160 kbps at best) is enough for listening in my room (well, in my case both my ears and my playback environment are so poor that this may not be very useful).
