Audio on the Web:
Abstract: People only retain 20% of what they see and 30% of what they hear. But they remember 50% of what they see and hear, and as much as 80% of what they see, hear and do simultaneously (Computer Technology Research, 1993). Internet-based audio and video tools are now used by educators to support instructional, research and administrative activities. Incorporating Web-based multimedia elements such as video, animation and audio into the delivery of on-line course materials can enrich learning by facilitating and encouraging active student participation in the learning process. However, if not designed properly, the addition of audio and other multimedia elements will detract from rather than enhance Web-based instruction.
This paper details the design techniques and strategies educators can employ to record digital audio and add audio clips to their instructional Web pages to enhance their on-line course materials.
Adding audio and other multimedia elements such as animation and video to on-line course materials enriches the learning environment and improves knowledge retention by involving students in the learning process. Adding interactive elements to on-line course materials can improve student motivation, promote active rather than passive student participation and increase learner control. Advocates of new multimedia and Web-based technologies argue that possible benefits of such technologies include increased intrinsic motivation and the opportunity for students to learn in their own “style” and at their own “pace” (Weber et al. 1999). Effective design techniques and strategies should be employed to ensure digital audio elements enhance the delivery of on-line course materials. The foremost consideration in supplementing on-line course materials with digital audio is whether the audio adds value to the content of the page. If the audio clip does not augment the content in some way, the audio file only adds to the bandwidth the Web page requires and should be excluded. Digital audio is successfully used to provide sample phonetic, language and music clips; explain examination questions and answers; provide instructions for assignments and supplementary messages to explain Web page content; and archive lecture notes.
Two types of servers distribute audio files on the Internet: a standard HTTP Web server and a special-purpose media server. HTTP enables audio and video content to be delivered from a World Wide Web server. Although a standard HTTP Web server is not as robust or efficient as a streaming media server, it provides an adequate method for delivering small, short audio clips to a limited number of concurrent users. The preferred method of delivering audio content is to distribute audio files from a dedicated streaming media server. RealNetworks developed streaming media (audio and video) technology, introducing RealAudio in 1995 to address the inherent problems and inflexibility of distributing audio by HTTP Web servers:
1. The Transmission Control Protocol (TCP), which HTTP Web servers use to distribute information to Web browsers in small units known as data packets, resends a packet if the server does not receive confirmation that the browser received it without error. Retransmission of even a small number of data packets significantly increases the download time of an audio file.
2. Web servers distributing audio files were unable to stream live audio broadcasts and did not support interactive user controls such as pause, rewind, or fast-forward through an audio file.
3. Web servers are limited to distributing audio files to a small number of concurrent users at a given time.
RealNetworks designed its RealAudio server software around the User Datagram Protocol (UDP), a standard Internet transport protocol that does not retransmit lost packets, to address the limitations of distributing audio by HTTP Web servers. Using UDP avoided the lengthy download times caused by ongoing packet retransmissions. The RealAudio protocol supports two-way communication between the server and the player, enabling the user to rewind or fast-forward playback of the audio clip. The RealAudio server was designed to deliver just enough information to keep the audio stream playing, plus an additional amount held as a buffer to compensate for delays resulting from network congestion. The ability to stream audio enables the server to distribute audio files to a greater number of concurrent users using fewer resources.
Streaming audio is either broadcast live or delivered from a pre-recorded file. The RealNetworks dedicated media server, RealServer, supports SureStream technology: the ability to detect the connection speed of the user and serve the appropriately encoded (converted and compressed) audio file. An audio file can be encoded for multiple connection speeds such as 28K, 56K, or 112K, and the file distributed depends on the bandwidth available at the user's connection speed. Although RealNetworks Server (http://realnetworks.com/) is probably the most popular dedicated media server, QuickTime by Apple (http://www.apple.com/quicktime/) and Windows Media Audio by Microsoft (http://www.microsoft.com/windows/windowsmedia) are other options for streaming audio over the Internet.
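The idea of serving one of several encodings according to connection speed can be sketched in a few lines of Python. This is only an illustration: the function name and the selection rule (the highest encoded rate that does not exceed the connection speed) are assumptions, not a description of how RealServer or SureStream actually chooses a stream.

```python
# Hypothetical sketch: pick an encoded stream for a listener's connection.
# The rule below is an assumption for illustration, not RealServer's logic.

def pick_stream(available_kbps, connection_kbps):
    """Return the highest encoded bit rate that fits the connection speed.

    Falls back to the lowest available rate when no encoding fits.
    """
    if not available_kbps:
        raise ValueError("at least one encoded stream is required")
    fitting = [rate for rate in available_kbps if rate <= connection_kbps]
    return max(fitting) if fitting else min(available_kbps)


encoded_rates = [28, 56, 112]  # Kbps targets mentioned in the text
for speed in (14.4, 28.8, 56, 200):
    print(speed, "Kbps connection ->", pick_stream(encoded_rates, speed), "Kbps stream")
```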
Before adding audio to supplement on-line course materials, consider whether the sound elements support and enhance the content of the page. Audio requires additional bandwidth, so it is critical to identify the types of connections your target audience uses and to choose the file format, sound quality, compression scheme, and browser support accordingly.
The connection speed of your audience will ultimately define the quality, size and download time of the audio file created. Bandwidth, measured in Kilobits per second (Kbps), is the amount of data that can be sent through a network connection during a defined period of time. Connection speeds vary from slower speeds of 28.8 Kbps or less, through 56 Kbps, to faster connections of 200 Kbps or greater. A compromise is necessary between reaching the largest target audience with a lower quality audio file and limiting the audience with a higher quality audio file. Audio files created for distribution on the Internet may need to be encoded at a lower sampling rate or bit depth, which lowers sound quality but reduces file size and download time.
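The trade-off between file size and connection speed can be estimated with simple arithmetic: the number of bits to transfer divided by the bits per second the connection carries. The sketch below assumes 1 Kbps = 1,000 bits per second and ignores protocol overhead and network congestion, so real download times will be longer; the function name and the 600,000-byte example file are illustrative.

```python
# Minimal sketch: ideal download time for a file at a given connection speed.
# Assumes 1 Kbps = 1000 bits/s and no overhead, retransmission or congestion.

def download_seconds(file_size_bytes, connection_kbps):
    """Return the ideal transfer time in seconds."""
    if connection_kbps <= 0:
        raise ValueError("connection speed must be positive")
    bits_to_send = file_size_bytes * 8
    return bits_to_send / (connection_kbps * 1000)


clip_bytes = 600_000  # hypothetical audio clip of roughly 0.6 MB
for speed in (28.8, 56, 200):
    print(f"{speed} Kbps: {download_seconds(clip_bytes, speed):.1f} s")
```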
Historically, the file format chosen to distribute audio over the Internet depended on the computer platform generating the file. Apple developed AIFF for playback on Macintosh; NeXT Computer and Sun Microsystems jointly developed AU for UNIX; and Microsoft and IBM jointly developed WAV for Windows. Many other file formats are used to distribute audio over the Internet, such as MIDI, QuickTime, MPEG, and RealAudio, as well as newer media formats such as RealMedia, Shockwave Audio, Flash, and Windows Media.
To play a digital audio file, the appropriate plug-in or helper application that can play the format of the audio file must be installed on the user’s system. Helper applications run outside the browser and plug-ins run inside it; either is launched when a media element is invoked from within the Web page. RealMedia, QuickTime, and Windows Media Player are three common applications that play audio files. The browser identifies the plug-in or helper application that can play the audio file based on the file extension, such as AU, WAV, AIFF, RA, RM, QT, MOV, AVI, or MP3, by referring to the Multipurpose Internet Mail Extensions (MIME) configuration specified in the browser’s preferences settings. The browser and the server must be configured to support the MIME type of the audio element.
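The extension-to-MIME-type lookup described above can be illustrated with a small table. The mapping below is an assumption for illustration: it lists commonly used type strings (the plug-in type is the one that appears in the embed example in this paper), but the values actually configured on a given server or browser may differ.

```python
import os

# Illustrative table only; real server and browser configurations may differ.
AUDIO_MIME_TYPES = {
    ".au": "audio/basic",
    ".wav": "audio/x-wav",
    ".aiff": "audio/x-aiff",
    ".ra": "audio/x-pn-realaudio",
    ".ram": "audio/x-pn-realaudio",
    ".rpm": "audio/x-pn-realaudio-plugin",
    ".mp3": "audio/mpeg",
}


def mime_type_for(filename):
    """Return the MIME type for a file name, or None if the extension is unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return AUDIO_MIME_TYPES.get(extension)


print(mime_type_for("audioclips/audioclip.ram"))
print(mime_type_for("lecture.txt"))  # not an audio type in this table
```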
To create an audio file for distribution on the Internet, analog audio data is recorded into a digital format, typically an uncompressed format such as WAV, AU or AIFF. Before converting to a compressed audio format, it is preferable to start with the best quality “master copy” obtainable in an uncompressed format and then encode the file as RealAudio or RealMedia, QuickTime, MP3 or another compressed format. An audio codec (COder/DECoder) is a software scheme that encodes analog audio data and converts it to a binary format for processing on a computer and then decodes the data for playback on analog devices such as speakers or headphones. A number of compression methods are available to encode an audio file into a smaller file format suitable for distribution on the Internet. Lossy compression is a commonly used scheme that discards the high and low ends of the audio source file, trading quality for a greatly reduced file size. RealAudio, MPEG Layer III (MP3), QuickTime QDesign 2, and MS Audio are four popular audio products that provide a number of compression codecs designed for a variety of applications and connection speeds.
Sound is recorded from an analog source such as speaking into a microphone, input from a previously recorded cassette or reel-to-reel tape, or from a digital source such as CD, hard disk, or digital audio tape (DAT) using a sound editor. The digital file format generated is typically WAV, AIFF or AU. Record using a high quality sampling rate and bit depth such as CD quality, 44.1 kHz, 16 bit audio, and decrease to a lower level during the encoding step. Keep a backup of all original high quality source recordings.
To obtain a better sounding audio file, use a good quality microphone and digital interface sound card to help minimize hiss and distortion. Record using a high quality sampling rate, such as CD quality, 44.1 kHz with 16 bit depth resolution, and encode to the lowest quality sound needed in the Web page. If using pre-recorded audio files, use the best quality source file available. Experiment using different audio codecs and settings. Reducing the sampling rate is more effective than reducing bit depth for achieving smaller, good quality audio files for Internet distribution. Record in a quiet environment, minimizing room reverberations or reflections by choosing a smaller room with carpet and soft furniture.
If recording a single sound source such as spoken voice, keep the microphone close to the source and away from the sound of the computer fan. Avoid holding the microphone with your hand. Record speech in mono, and set the balance for all inputs to the center to prevent losing the right input channel, which some sound cards discard when recording in mono mode. Use a pop screen to reduce plosive “P”s and sibilant “S”s. Decrease distortion by setting input levels as close as possible to 0 dB without exceeding it during the loudest section of the file.
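The guidance on input levels can be checked after a test recording by measuring the loudest sample relative to full scale. The sketch below assumes 16 bit signed samples already loaded into a Python list; the function name and the sample values are illustrative and not tied to any particular sound editor.

```python
import math

FULL_SCALE_16_BIT = 32768  # magnitude of the largest 16 bit signed sample


def peak_db(samples):
    """Return the peak level in dB relative to full scale (0 dB is the maximum)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # silence has no measurable peak
    return 20 * math.log10(peak / FULL_SCALE_16_BIT)


test_recording = [0, 16384, -8192, 1200]  # hypothetical sample values
print(f"peak level: {peak_db(test_recording):.1f} dB")
```

A peak well below 0 dB suggests the input level can be raised; a peak at 0 dB suggests the loudest section may have clipped.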
Avoid damaging the computer speakers during playback by setting the volume of the input device (tape recorder, CD player, etc.) to a low level and increasing it slowly. Set the computer's volume to the half way mark and then adjust the loudness using the device’s volume control. If sound cannot be heard, check the operating system recording and playback levels and ensure the mute box is not marked for the master volume, line in, line out, microphone, CD and any other connections for the sound card.
The audio file should be edited to improve the quality of the sound file. Empty space at the beginning or end of the file should be deleted to eliminate long pauses. Normalization balances the sound wave by maximizing the input level of the loudest peak of the file after recording and should be used if the sound level is too low. Normalize the audio file to –0.5 dB. Equalization (EQ) adjusts the high and low tones in the audio file so that bass and treble can be made less or more pronounced.
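Peak normalization as described above amounts to multiplying every sample by one gain factor so that the loudest peak lands at the target level. The sketch below assumes samples are floating point values in the range -1.0 to 1.0, with 1.0 representing 0 dB; the function name is illustrative, and sound editors may implement normalization differently.

```python
# Minimal sketch of peak normalization, assuming float samples in [-1.0, 1.0].

def normalize_peak(samples, target_db=-0.5):
    """Scale samples so the loudest peak sits at target_db relative to full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    target_amplitude = 10 ** (target_db / 20)  # convert dB to a linear amplitude
    gain = target_amplitude / peak
    return [s * gain for s in samples]


quiet_clip = [0.10, -0.25, 0.20]  # hypothetical low-level recording
louder_clip = normalize_peak(quiet_clip)
print(max(abs(s) for s in louder_clip))
```

Because every sample is multiplied by the same factor, the relative balance within the clip is unchanged; only the overall level rises.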
Due to the tremendous size of the sound file generated from recording at a higher quality level, the digital source file must be encoded to compress the file into the appropriate sampling rate and bit depth required to distribute audio over the Internet. The size of the audio file is significantly affected by the sampling rate, bit depth, and the number of channels (mono or stereo). The sampling rate is the number of samples or snapshots of sound taken per second. For example, the sampling rate for CD quality audio is 44,100 samples per second. Bit depth defines the dynamic resolution of the audio. The greater the number of bits that are used to represent the sampled value, the more accurate the sample will be with respect to the original analog sound wave. Decreasing bit depth to 8 bits increases distortion and hiss, especially at higher sampling rates. Stereo requires two channels, doubling the file size required for recording in mono. The selection of these variables comes down to balancing sound quality against storage requirements. File size can be decreased most effectively without compromising sound quality by reducing the sampling rate and using one mono channel. The disk space required for one minute of sound varies significantly from 10.5 MB for 16 bit, 2 channel (stereo) CD quality audio to .6 MB for 8 bit, 1 channel (mono) telephone quality audio (Tab. 1).
| Sample Rate | Bit Depth | Channels | Disk Space for One Minute of Audio |
|---|---|---|---|
| 44.1 kHz | 16 | stereo | 10.5 MB |
| 44.1 kHz | 16 | mono | 5.2 MB |
| 44.1 kHz | 8 | stereo | 5.2 MB |
| 44.1 kHz | 8 | mono | 2.6 MB |
| 22.05 kHz | 16 | stereo | 5.2 MB |
| 22.05 kHz | 16 | mono | 2.6 MB |
| 22.05 kHz | 8 | stereo | 2.6 MB |
| 22.05 kHz | 8 | mono | 1.3 MB |
| 11.025 kHz | 16 | stereo | 2.6 MB |
| 11.025 kHz | 16 | mono | 1.3 MB |
| 11.025 kHz | 8 | stereo | 1.3 MB |
| 11.025 kHz | 8 | mono | .6 MB |
Table 1: Storage Requirements for One Minute of Audio
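The figures in Table 1 follow from a single product: sampling rate times bytes per sample times number of channels times duration. The sketch below reproduces that arithmetic for uncompressed audio, assuming 1 MB = 1,000,000 bytes; the function name is illustrative. Under that assumption the computed values come out slightly above the table entries (for example about 10.58 MB for CD quality stereo), which suggests the table values were cut off after one decimal place.

```python
# Minimal sketch: size of uncompressed audio, assuming 1 MB = 1,000,000 bytes.

def audio_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Return the storage needed for uncompressed audio, in bytes."""
    return sample_rate_hz * bit_depth * channels * seconds / 8


for rate in (44100, 22050, 11025):
    for depth in (16, 8):
        for channels, label in ((2, "stereo"), (1, "mono")):
            megabytes = audio_bytes(rate, depth, channels, 60) / 1_000_000
            print(f"{rate / 1000} kHz, {depth} bit, {label}: {megabytes:.2f} MB per minute")
```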
An encoding tool such as RealProducer (http://www.realnetworks.com/developers/index.html) is used to select a codec based on the bandwidth target (connection speed) and the audio content (music or voice), compress the original sound file and create a new audio file. If you need a large dynamic range of sound, use 16 bit resolution. If all sounds are about the same volume, 8 bit may be suitable. Full-bodied vocals require 16 bit resolution. RealNetworks’ RealAudio is consistently rated among the best at generating good quality speech and music audio clips at various bit rates (Tab. 2).
| Codec | Speech, Female (28 Kbps) | Speech, Male (28 Kbps) | Rock Music, 28 Kbps | Rock Music, 56 Kbps | Classical Music, 28 Kbps | Classical Music, 56 Kbps | Electronic Music, 28 Kbps | Electronic Music, 56 Kbps |
|---|---|---|---|---|---|---|---|---|
| a2b | Poor | Fair | Fair | Fair | Poor | Fair | Poor | Poor |
| Liquid Audio | Fair | Fair | Good | Good | Fair | Excellent | Fair | Good |
| MP3 | Good | Excellent | Poor | Poor | Poor | Poor | Fair | Fair |
| MS Audio | Excellent | Good | Good | Excellent | Fair | Excellent | Fair | Good |
| QuickTime | Fair | Fair | Good | Excellent | Good | Good | Fair | Excellent |
| RealAudio | Excellent | Excellent | Good | Excellent | Good | Excellent | Good | Good |
Table 2: Streaming Audio (PC Magazine, 1999)
Audio files are distributed over the Internet by file downloading or by streaming audio. Irrespective of the method chosen to incorporate an audio file for playback within a Web page, the user’s system must have the appropriate plug-in application installed that can play the audio file format, and the server and browser must be configured to support the MIME type of the audio element. Assist users by including a link in your Web page to the vendor Web site where the player can be downloaded.
The simplest method of adding an audio file to a Web page is to create a link to an audio file. When the user clicks the link, the audio file is downloaded and the player configured to play the file type begins playback once the complete audio file is downloaded.
<a href="audioclips/filename.ext">Listen to Narrative Audio Clip</a>
An audio file can be streamed from an HTTP Web server. Within the Web page, a link is established to a text-based metafile with a RAM file extension containing the URL of the file. The user clicks on the link in the Web page and the browser downloads the RAM metafile, launches the player and starts streaming the audio file.
Link in Web page: <A HREF="audioclips/audioclip.ram">Listen to Narrative Audio</A>
RAM file contents: HTTP://web.uvic.ca/~bgerth/audioclips/audioclip.rm
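The two pieces involved here, a one-line metafile and a link to it, can be generated with a short script when many clips need to be published. This is a sketch under assumptions: the function names are hypothetical, and the metafile is taken to contain nothing but the URL of the audio file, as in the example above.

```python
from html import escape

# Hypothetical helpers for publishing a clip through a RAM metafile.

def ram_metafile(audio_url):
    """Return the text content of the metafile: the audio URL on a single line."""
    return audio_url.strip() + "\n"


def metafile_link(metafile_path, link_text):
    """Return the HTML anchor that points at the metafile."""
    return '<a href="{}">{}</a>'.format(
        escape(metafile_path, quote=True), escape(link_text)
    )


print(ram_metafile("http://web.uvic.ca/~bgerth/audioclips/audioclip.rm"), end="")
print(metafile_link("audioclips/audioclip.ram", "Listen to Narrative Audio"))
```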
A variation on streaming audio is to embed the player into the Web page rather than launching and opening a separate window for the player. The link in the Web page is established with the embed tag. The noembed tag and the link inside it provide an alternate method for downloading the audio file if the browser does not support the embed tag. The type attribute specifies the type of plug-in player to embed in the page.
<embed src="audioclips/audioclip.rpm" type="audio/x-pn-realaudio-plugin" console="Clip1" controls="All" height=125 width=275 autostart=false loop=false></embed>
<noembed><a href="audioclips/audioclip.rm">Listen to Narrative Audio Clip</a></noembed>
The metafile is named with the file extension of RPM and contains the URL of the audio file:
http://web.uvic.ca/~bgerth/audioclips/audioclip.rm
Setting autostart to true forces the audio file to begin playing automatically when the page loads. Setting loop to true forces the audio clip to repeat until terminated by the user (not recommended); setting loop to X, where X is an integer, repeats the audio clip X times.
Many Web pages provide sound clips that can be downloaded for use on a Web page, often illegally. Before downloading a sound clip to distribute on your Web page, ensure you are not contravening copyright laws by obtaining permission to use the audio file from the copyright owner. Otherwise, create your own audio clip or find a site that offers free sound clips and has the right to offer them.
A simple method of protecting your own works is to indicate the intent to copyright by attaching a copyright symbol, the date, your name, and the term All Rights Reserved by the link within the Web page to load the file. Audio editing programs such as RealProducer enable the copyright information to be embedded within the audio file and the copyright information is displayed by the media player during playback.
This paper has explored some of the design techniques and strategies for incorporating digital audio into the delivery of on-line course materials. Factors such as the connection speed, the level of fidelity, the file format and the use of compression significantly affect the successful implementation of digital audio on the Internet. Digital audio, when carefully prepared and judiciously added, will enrich learning and enhance the delivery of Web-based instruction.
Hecht, J.B. (1999). Bleeding on the Edge II: Instructing with Live Audio, Video, and Text over the Internet. World Conference on the WWW and Internet, 1999, Association for the Advancement of Computing in Education, Charlottesville, VA. 1290-1291.
Ho, T.I. (1999). Experiences with Real-Time Streaming Audio/video in Delivering Web-based Courses. World Conference on the WWW and Internet, 1999, Association for the Advancement of Computing in Education, Charlottesville, VA. 1296-1297.
Krauss, D., Steffanos, G., & Steffanos, M. (1998). Streaming Audio. [WWW Document], http://www.skwc.com/WebClass/Task-Sound5.html.
Mudge, S.M. (1999). Deliver Multimedia Teaching Modules via the Internet. Journal of the Staff and Educational Development Association, 36 (1), 11-16.
PC Magazine. (1999). Streaming Audio. [WWW Document], http://www.zdnet.com/products/stories/reviews/0,4161,2313783,00.html.
RealNetworks, Inc. (1998). RealProducer User’s Guide Version G2. [WWW Document], http://docs.real.com/docs/produceruserguideg2.pdf.
RealNetworks, Inc. (1998). RealSystem G2 Production Guide. [WWW Document], http://docs.real.com/docs/smil/prodguideg2_7.pdf.
Weber, R.K., Schoon, P., & Gonzalez, A.R. (1999). Designing Multimedia and Web-based Units for Technology Integration: Motivation and Student Learning Styles. World Conference on the WWW and Internet, 1999, Association for the Advancement of Computing in Education, Charlottesville, VA. 1133-1137.
Copyright by AACE. Reprinted from the SITE 2001 Proceedings, March 2001 with permission of AACE (http://www.aace.org).