The word codec is a contraction of coder and decoder (or coding and decoding). A codec is software, and sometimes hardware, developed expressly to encode and decode media streams such as audio and video. Today there are dozens of codecs in use, some of them lossy and some lossless. This variety is driven by key considerations such as the storage space required for audio/video content, processing (as happens in cameras and studio audio) and streaming over networks such as the internet, where bandwidth and seamless playback come into play.
Raw files, huge sizes
Take audio. Audio may be in raw, uncompressed form such as PCM, typically stored as AIFF, WAV or RF64, where file size is determined by sample rate, bit depth, number of channels (mono or stereo) and duration. The bit rate follows from the first three. Raw audio is rarely used except for processing. Some form of lossy or lossless compression is usually applied, keeping in mind the required audio quality, storage space and streaming needs. Lossy compression reduces file size at the expense of some audio quality, to a degree that depends on the bit rate chosen. Lossless compression preserves the audio exactly but saves less space. For example:
One minute of uncompressed stereo audio at CD quality (44.1 kHz sample rate, 16-bit, 1411.2 kbps) stored as WAV, AIFF or PCM takes 10.584 MB, whereas a compressed file with a bit rate of 98 kbps takes 735 KB. The difference is tremendous and particularly relevant for storage and transmission over the internet. VoIP codecs aim to strike the right compromise by reducing size while retaining quality, at least for speech.
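The figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming decimal units (1 KB = 1000 bytes) and ignoring file header overhead; the function names are made up for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def file_size_kb(bit_rate_kbps, duration_s):
    """Approximate size in kilobytes: kilobits per second * seconds / 8."""
    return bit_rate_kbps * duration_s / 8


# CD quality: 44.1 kHz, 16-bit, stereo
cd_rate = pcm_bit_rate_kbps(44_100, 16, 2)
raw_kb = file_size_kb(cd_rate, 60)       # one minute, uncompressed
compressed_kb = file_size_kb(98, 60)     # one minute at 98 kbps

print(f"PCM bit rate: {cd_rate:.1f} kbps")
print(f"Uncompressed: {raw_kb / 1000:.3f} MB")
print(f"Compressed:   {compressed_kb:.0f} KB")
print(f"Ratio:        {raw_kb / compressed_kb:.1f} : 1")
```

In this example the compressed file is roughly one fourteenth the size of the raw one.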
Codecs in use for audio
You will find dozens of lossy and lossless compression codecs such as MP3 (MPEG audio), WMA, AAC, FLAC and Ogg Vorbis. These help to keep file sizes manageable for storage and transmission without overly affecting audio quality. However, for speech transmitted over the internet there are dedicated VoIP codecs, many of which apply even more compression. These include:
- G.711: 64 Kbps
- G.719: 32/48/64/128 Kbps, 20 ms frame size
- G.722: 48/56/64 Kbps
- G.723.1: 5.3/6.3 Kbps, 30ms frame size
- G.726: 16/24/32/40 Kbps
- iLBC: 15.2 Kbps, 20 ms frame size
- GSM: 13 Kbps, 20 ms frame size
- G.728: 16 Kbps
- G.729: 8 Kbps
- LPC10: 2.4 Kbps
As can be seen, VoIP codec bit rates vary from 2.4 Kbps to 128 Kbps, which helps keep audio streams fast and low in jitter over GSM networks or over the internet.
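To see what these bit rates mean on the wire, it helps to work out how many bytes one frame carries and what the packet headers add. The sketch below is a rough estimate, assuming one codec frame per packet, RTP over UDP over IPv4 (12 + 8 + 20 = 40 header bytes) and, for G.711, a 20 ms packet interval. These are assumptions for illustration, not figures from the list above, and link-layer overhead is ignored.

```python
# Assumption: RTP (12) + UDP (8) + IPv4 (20) headers, no extensions or options.
HEADER_BYTES = 12 + 8 + 20


def payload_bytes_per_frame(bit_rate_kbps, frame_ms):
    """Codec payload per frame: kbps * ms gives bits, divided by 8 for bytes."""
    return bit_rate_kbps * frame_ms / 8


def wire_rate_kbps(bit_rate_kbps, frame_ms, header_bytes=HEADER_BYTES):
    """Codec bit rate plus per-packet header overhead, one frame per packet."""
    packets_per_second = 1000 / frame_ms
    overhead_kbps = header_bytes * 8 * packets_per_second / 1000
    return bit_rate_kbps + overhead_kbps


codecs = [
    ("G.711", 64, 20),    # 20 ms packet interval is an assumption
    ("GSM", 13, 20),
    ("G.723.1", 6.3, 30),
]

for name, rate, frame_ms in codecs:
    payload = payload_bytes_per_frame(rate, frame_ms)
    total = wire_rate_kbps(rate, frame_ms)
    print(f"{name:8s} payload {payload:6.1f} B/frame, about {total:5.1f} kbps on the wire")
```

Under these assumptions the headers add a fixed cost per packet, so they weigh proportionally more on low bit rate codecs.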
Video – even bigger
It gets even worse in the case of video, which can generate huge file sizes in raw mode. This is further complicated by the use of various file types such as QuickTime H.264 MOV, ProRes MOV, DVCPRO HD, DNxHD, AVCHD, Red Raw, Arri Raw, Sony raw, Adobe Raw, Phantom cine and MP4 H.264 used by different camera makers, as well as resolutions that range from standard definition all the way up to 8K. Bit rates vary wildly, from 36 Mbps to 235 Mbps or even higher variable bit rates. In short, without lossy compression of some type it would be a challenge to transmit jitter-free video over the internet. Video file size varies depending on the codec, compression and bit rate chosen, with the last playing a significant role. Add audio to this mix and you have to factor in audio codecs and compression for streaming, where file size can be crucial to quality. For video, in general, the file size is arrived at by multiplying the bit rate by the duration; the compression ratio is already reflected in the bit rate chosen.
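As a quick illustration of how bit rate drives file size, here is a minimal sketch that converts the two bit rates mentioned above into the size of one minute of video. It assumes a constant bit rate, decimal units (1 MB = 1,000,000 bytes) and no container or audio overhead; the function name is hypothetical.

```python
def video_size_mb(bit_rate_mbps, duration_s):
    """Approximate size in megabytes: megabits per second * seconds / 8."""
    return bit_rate_mbps * duration_s / 8


low = video_size_mb(36, 60)     # one minute at 36 Mbps
high = video_size_mb(235, 60)   # one minute at 235 Mbps

print(f"36 Mbps for one minute:  {low:.1f} MB")
print(f"235 Mbps for one minute: {high:.1f} MB")
```

The same footage therefore takes about 270 MB or about 1.76 GB per minute depending purely on the bit rate selected.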
One frame of HD video in full color (1920 × 1080 pixels at 4 bytes per pixel) has a size of 8,294,400 bytes. At 30 frames per second, each second of uncompressed HD video takes about 249 MB, and one minute takes up 14.93 GB.
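These raw figures follow directly from the frame dimensions. The sketch below assumes a 1920 × 1080 frame stored at 4 bytes per pixel, which is the combination that matches the frame size quoted above; at 3 bytes per pixel (24-bit color) a frame would be 6,220,800 bytes instead.

```python
def raw_frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


frame = raw_frame_bytes(1920, 1080, 4)   # assumption: 4 bytes per pixel
per_second = frame * 30                  # 30 frames per second
per_minute = per_second * 60

print(f"One frame:  {frame:,} bytes")
print(f"One second: {per_second / 1e6:.1f} MB")
print(f"One minute: {per_minute / 1e9:.2f} GB")
```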
As a rule of thumb, one minute of compressed 480p Standard Definition video takes about 2 MB, 720p HD about 5 MB, 1080p HD about 20 MB and 4K about 84 MB. For uncompressed video, the bit rate is the frame size multiplied by the frame rate. A codec lets you lower the bit rate, and with it the file size, but at the expense of clarity, motion and color fidelity. If it were not for codecs you would find it difficult to watch Netflix and Amazon Prime.
Video codecs compress the components of the stream, which are then packaged into a wrapper or container file format such as .mp4, holding the audio codec stream, the video codec stream and captioning. Usual containers are .mp4, .mov and .wmv, and the codecs most commonly used are H.264, the newer H.265, AV1, VP9 and H.266/VVC.
H.265 and H.266 have royalty issues, which led to the development of the AV1 codec by the Alliance for Open Media, whose members include Google, Microsoft, Cisco, Mozilla, Netflix and Amazon, but it has yet to overtake older and well established codecs like H.264. In time AV1 could well prove popular for transmission over the internet. Google had earlier developed VP9 as a royalty-free alternative to HEVC (H.265), and it is finding more use in WebRTC audio-video chats and conferencing. However, there have been compatibility issues on Apple platforms.
It takes finesse to develop VoIP solutions that handle codecs for flawless streaming
The VoIP codec world and VoIP communications are further complicated because the calling party may use one set of protocols and codecs whereas the called party may use another. The mismatch means conversation is not possible unless you have something like a session border controller (SBC) to manage codecs and protocols, or build this capability into the call center software or IP PBX. Developers of solutions such as WebRTC integration for conferencing, for example, must consider variables such as different bandwidths and internet speeds at different points and yet ensure that there are no breaks or jitter in the audio and video streams. This is achieved through API integration, fine balancing of bit rate against the bandwidth available, and the use of scalable video coding (SVC), an extension of the H.264/MPEG-4 AVC compression standard.
Yes, we could go on, but suffice it to say that codecs, though in the background, play an important role in the quality of service and performance of the application. Should you opt for VoIP solutions, it pays to get your software from developers who are adept with codecs.
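The codec mismatch problem can be pictured as finding a codec both ends support. The sketch below is a deliberately simplified illustration, not how any particular SBC or IP PBX does it: the function name and codec lists are hypothetical, and real negotiation (for example the SDP offer/answer exchange) also covers payload types, sample rates and other parameters.

```python
def negotiate_codec(caller_codecs, callee_codecs):
    """Return the first codec in the caller's preference order that the
    callee also supports, or None if the two lists have nothing in common."""
    callee_supported = {codec.upper() for codec in callee_codecs}
    for codec in caller_codecs:
        if codec.upper() in callee_supported:
            return codec
    return None


# Both ends share G.711, so the call can proceed without transcoding.
print(negotiate_codec(["G.722", "G.711"], ["G.729", "G.711"]))

# No codec in common: something in the path, such as an SBC,
# would have to transcode between the two sides.
match = negotiate_codec(["G.729"], ["G.711", "G.722"])
print("transcoding needed" if match is None else match)
```

When the function returns None, the two endpoints cannot talk directly, which is the situation where a transcoding element in the call path becomes necessary.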