
Why is 44,100 Hz used as the sample rate for high-quality audio?

Why did we choose 44.1 kHz as the recording sample rate?

The human ear hears sounds with frequencies between roughly 20 Hz and 20 kHz. By Nyquist’s theorem, the sampling rate must therefore be at least 40 kHz. Is this the reason for choosing 44.1 kHz?
In more detail: the sample rate is the number of samples (“frames”) recorded per second, and it has to be high enough to give high-quality audio.
According to the well-known sampling theorem associated with Nyquist, the sampling frequency must be at least twice the highest frequency to be recorded. Since the human ear can hear up to approximately 20 kHz, twice that is 40,000 samples per second, which is why a slightly higher value, 44,100, was proposed as a standard sampling frequency for high-fidelity audio.
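As a rough illustration of the theorem, the sketch below computes the minimum rate for a 20 kHz bandwidth and shows where a 20 kHz tone would land if it were sampled too slowly. The function names and the 30 kHz comparison rate are made up for this example; they are not from the discussion above.

```python
def nyquist_rate(max_frequency_hz: float) -> float:
    """Lowest sampling rate the theorem allows for a given bandwidth."""
    return 2 * max_frequency_hz


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a pure tone appears after sampling.

    Tones above half the sample rate fold back into the range
    0 .. sample_rate_hz / 2.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


HEARING_LIMIT_HZ = 20_000  # approximate upper limit of human hearing

print(nyquist_rate(HEARING_LIMIT_HZ))             # 40000
print(alias_frequency(HEARING_LIMIT_HZ, 44_100))  # 20000 (preserved)
print(alias_frequency(HEARING_LIMIT_HZ, 30_000))  # 10000 (aliased)
```

At 44.1 kHz the 20 kHz tone keeps its frequency, while at the hypothetical 30 kHz rate it would be recorded as a 10 kHz tone.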
It is true that, like any convention, the choice of 44.1 kHz is something of a historical accident, and there are several historical reasons behind it.
Of course, the sample rate must be higher than 40 kHz if you want high-quality audio with a 20 kHz bandwidth.
Using 48 kHz was discussed (it fits well with 24 fps film and the nominal 30 fps of North American television). However, given the physical size of the 120 mm disc, there was a limit to the amount of data a CD could store. An error detection and correction scheme is also needed, and it requires some data redundancy, so the amount of logical data a CD can store (about 700 MB) is only about half of the physical data. With all of this in mind, we were told that at 48 kHz a disc could not hold all of Beethoven’s Ninth Symphony, but that at a slightly lower rate the whole Ninth would fit on one disc. So 48 kHz was ruled out.
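To get a feel for the trade-off, here is a minimal sketch of how playing time shrinks as the sample rate rises when the audio capacity is fixed. It assumes 16-bit stereo samples and a disc sized for 74 minutes at 44.1 kHz; the 74-minute figure is an assumption for illustration and does not come from the text above.

```python
CHANNELS = 2                  # stereo
BYTES_PER_SAMPLE = 2          # 16-bit samples
ASSUMED_MINUTES_AT_44_1 = 74  # assumption: disc sized for 74 min at 44.1 kHz


def audio_bytes_per_second(sample_rate_hz: int) -> int:
    """Raw audio data rate, ignoring error-correction overhead."""
    return sample_rate_hz * CHANNELS * BYTES_PER_SAMPLE


# Fixed audio capacity implied by the assumption above.
capacity_bytes = ASSUMED_MINUTES_AT_44_1 * 60 * audio_bytes_per_second(44_100)


def playing_minutes(sample_rate_hz: int) -> float:
    """Playing time that the fixed capacity allows at a given sample rate."""
    return capacity_bytes / audio_bytes_per_second(sample_rate_hz) / 60


for rate in (44_100, 48_000):
    print(f"{rate} Hz: {playing_minutes(rate):.1f} minutes")
```

Under these assumptions, moving to 48 kHz would cost about six minutes of playing time.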
But why 44.1 kHz, and not 44.0 or 45.0 kHz or some other nice round number?
Then, around the turn of the 1980s, there was a product called the Sony F1 (PCM-F1), designed to record digital audio onto readily available videotape (Betamax, not VHS). It ran at 44.1 kHz (or, more precisely, 44.056 kHz). Using the same rate for CD therefore made it easy to transfer recordings from the F1 to CD, or in the other direction, without sample-rate conversion and interpolation.
My understanding of how this comes about is that the horizontal scan rate of black-and-white NTSC television was 15,750 Hz (15.75 kHz), and 44.1 kHz is exactly 2.8 times that. I’m not entirely sure, but I think this means you can have three stereo sample pairs per horizontal line, and for every 5 lines, where you would normally have 15 samples, there are 14 samples plus one extra slot that the F1 uses for parity or redundancy checking. 14 samples per 5 lines is the same as 2.8 samples per horizontal line, and at 15,750 lines per second that gives 44,100 samples per second.
In the early days of digital audio, the samples were stored on videotape as a pseudo-video signal, in which black and white levels represented the binary digits.
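The arithmetic in this explanation can be checked with a short sketch. It uses exact fractions so that the factor 2.8 is not subject to floating-point rounding; the constant names are chosen for this example only.

```python
from fractions import Fraction

LINE_RATE_HZ = 15_750   # horizontal line rate quoted above
SAMPLES_PER_GROUP = 14  # audio samples carried by each group of lines
LINES_PER_GROUP = 5     # lines in each group

samples_per_line = Fraction(SAMPLES_PER_GROUP, LINES_PER_GROUP)
sample_rate_hz = LINE_RATE_HZ * samples_per_line

print(float(samples_per_line))  # 2.8
print(sample_rate_hz)           # 44100
```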
The field structure of the 60 Hz television standard works out as follows: there are 60 fields per second and 245 usable lines per field (after excluding the 35 blanked lines of each 525-line frame). With three samples per line, that is 60 × 245 × 3 = 44,100 samples per second, or 44.1 kHz.
This convention was later adopted for the CD format for reasons of hardware compatibility (the first equipment used to make the master CDs for replication was video-based).
Later, with the advent of color television, the horizontal line rate had to be slowed slightly, to about 15,734 lines per second. On the Sony F1 this results in 44,056 samples per second.
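The same product can be written out step by step. The sketch below assumes the standard 525-line frame made of two interlaced fields, a detail not spelled out in the original post, and treats the 35 skipped lines as a per-frame count.

```python
FIELDS_PER_SECOND = 60
TOTAL_LINES_PER_FRAME = 525    # assumption: standard 60 Hz line count
SKIPPED_LINES_PER_FRAME = 35   # lines not used for audio data
FIELDS_PER_FRAME = 2           # interlaced video
SAMPLES_PER_LINE = 3

usable_lines_per_field = (
    TOTAL_LINES_PER_FRAME - SKIPPED_LINES_PER_FRAME
) // FIELDS_PER_FRAME

sample_rate_hz = FIELDS_PER_SECOND * usable_lines_per_field * SAMPLES_PER_LINE

print(usable_lines_per_field)  # 245
print(sample_rate_hz)          # 44100
```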
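A quick check of the color-era figure, as a sketch: it assumes the color line rate is the monochrome rate scaled by 1000/1001, which is where the value of roughly 15,734 comes from. That factor is not given in the text above. Using the rounded 15,734 directly would give about 44,055, so the exact ratio is kept here.

```python
from fractions import Fraction

MONO_LINE_RATE_HZ = 15_750
COLOR_SCALE = Fraction(1000, 1001)  # assumption: NTSC color slowdown factor
SAMPLES_PER_LINE = Fraction(14, 5)  # 2.8 samples per line, as described above

color_line_rate_hz = MONO_LINE_RATE_HZ * COLOR_SCALE
color_sample_rate_hz = color_line_rate_hz * SAMPLES_PER_LINE

print(round(color_line_rate_hz))    # 15734
print(round(color_sample_rate_hz))  # 44056
```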












