Creating Multimedia for the Web

Contents

Overview
Planning
Content Design and Development
Compressing Multimedia Content: Codecs
Hardware Requirements
Software Requirements

Creating Images
Planning
Production: Image source
Image File Formats
Sample Content Creation Scenario
Acquiring and Licensing Images

Creating Audio
Planning
Production: Audio source
Audio File Formats
Compressing Audio Content: Codecs
Sample Content Creation Scenario
Acquiring and Licensing Audio Content

Creating Video
Planning
Video source
Video File Formats
Video production
Compressing Video Content: Codecs
Sample Content Creation Scenario
Acquiring and Licensing Video Content

Abstract:

This section addresses the fundamental issues of creating multimedia content and how these issues relate to delivery on the World Wide Web (WWW). The stages in developing a multimedia project (planning, design, production, testing, and delivery) are discussed in terms of the different types of multimedia content (images, audio, and video). This section is not a discussion of interface design principles or a "how-to procedural guide" for different types of software used in creating multimedia content. For this type of information, many books are available, along with the user documentation for specific software packages.

This section is intended for the person just starting into multimedia content creation. By understanding the basic principles and concepts involved in developing and delivering general multimedia content, you will be better able to understand the principles and guidelines presented later on using Microsoft® NetShow™ to enhance network multimedia presentations.

Overview

What is multimedia?

The term multimedia has been used in computing for a number of years. A common definition of multimedia is the blending of multiple types of media (audio, text, images, video, and animation) to enhance the message that you want to deliver. You're very familiar with how much a sound track can add to the story of a movie or how the new generation of special effects can enhance visual impact. This is the professional level of multimedia, and while your project probably won't have the time, money, or resources to make the next Jurassic Park, you can create multimedia enhancements that help get your message across to your audience.

Steps in creating multimedia content

As with any project or plan, a series of phases occur: planning, production, building, testing, and delivery. The more attention paid to these phases, the better the outcome. In fact, the more effort put into the early planning stages, the better. Early planning saves many headaches later on; DON'T skip this vital stage. The more up-front planning you do, the better the likelihood of your multimedia creation meeting your audience's expectations.

Planning

Planning consists of several stages, from conceptualization to finalization of the concept, and moving on to production phases. Initial planning requires you to ask and answer some basic questions:

What do I want to create?

Why is it being created?

What is the message to be conveyed?

Who is the audience for the presentation?

Who is the client that the project is being done for?

How is the content to be delivered?

--multiple platforms

--multiple browsers

Take time to answer these questions before you start the actual content creation process. After this initial project conceptualization phase, several other stages are important.

Brainstorming

This phase is your opportunity to dynamically develop and gather ideas about the overall project creation and presentation. Gathering ideas from more than one perspective or viewpoint, without constraints or judgments on the range of ideas, gives the content producer a wide base from which to work. A common error in multimedia projects is starting out with a preconceived idea about what the presentation should be and how it should be created.

Several methods are commonly used to record information from brainstorming sessions, and all can be equally effective. Whiteboards, flipcharts, videotaping, and paper lists are all worth investigating. Taking notes on a computer may not be the most effective method unless the notes are made visible to everyone in the session. Remember that the purpose of brainstorming is to make a wide range of ideas visible to the content team. Experiment, but make sure you go through a brainstorming session, if only to clarify your own ideas and preconceived notions about the project.

Project Proposal

Once you have a clear understanding of the project scope and have formulated an idea about the content and presentation, a proposal should be developed. This proposal serves several purposes:

It requires you to present an organized plan for creating and delivering the presentation.

It summarizes the project scope and direction.

It identifies the audience.

It identifies the project message.

It identifies the client for whom the project is being done.

- Hired by an outside client?

- "Hired" by a group internal to your company?

- Personal project

It provides a plan of record that can receive a signoff from the client.

Storyboards

Storyboarding is a way of visualizing abstract concepts early in the project's life. Words convey only a certain part of a multimedia experience. A storyboard is an illustrated scene-by-scene plan for how the story is to be told, the message you want to convey, and how the overall audience experience will look and feel. Storyboards are a visual way to show the interaction of words, visuals, and actions over a timeline.

A storyboard can be as complex and detailed as resources, skills, and time allow, but even primitive (rough) drawings of the storyline displayed over time will greatly increase the chances of the final outcome achieving the goals you want.

Prototyping

This is a time of experimentation using different implementations of ideas from the storyboards. By putting together a small sampling of the design presentation, you can gather a representative model of the final content creation. Try to include at least one example of each media element in the instructional context planned for the final delivery. This prototype provides an opportunity to test your ideas on your intended audience and make revisions early in the project. Remember, a prototype is only a snapshot of the project with a balance between time, cost, and quality. Don't spend excessive time or money to produce a prototype of greater quality than is needed to present an accurate model of the final project.

Script writing

Once the concept for the multimedia project is understood and defined, the process of putting words, visuals, and actions to the storyline begins. This process is referred to as script writing, or scripting the project. Script writing as defined in this discussion refers to the narration or storyline, the visuals that support the narration, and how these two components interact to convey an overall message. Just as in a movie production, the audio track, the visuals and action sequences, and how the scene was actually shot must all work together to convey the story. In other phases of multimedia production, "scripting" can instead refer to authoring activities that use scripting languages such as Microsoft's VBScript, Netscape's JavaScript, or Macromedia's Lingo.

Textual Storyline/Content Script


Visuals	Timeline	Event/Action	Audio track
Logo, then speaker (casual, yet professional manner)-Opening	0-20	URL flip to questions posed...then features start appearing	How effective and captivating was your last presentation; lost a few people along the way? Give me just three minutes and I'll show you how Microsoft® NetShow can make your next presentation more powerful and compelling.
Talking head	20-25	URL flip to list of features	We're going to cover 6 main features of NetShow player and the authoring. There are many other features we could cover, but we don't have much time, so let's get going!

Streaming media label	26-45	URL flip to streaming media	How often have you wasted time waiting for a large amount of content to download over the network? NetShow streams content to users so you can receive information without these painful download times. NetShow takes a few seconds to get the information needed, and then starts playing. No wasted hard disk space or long download wait times.

Talking head	46-50	URL flip to markers page 1	Have you ever had to listen to an entire videotape because you didn't have a good way of jumping to the section you really wanted to listen to? NetShow uses markers, which are like media bookmarks.

Markers label	51-1:10	URL flip to markers page 2	Markers can define sections, chapters, scenes, or other logical points in time. You can move forward or backward quickly and easily to a marker and then continue playing from that place.

Content design

Ideally, a team can be assembled based on the necessary project skills: graphic designers, scriptwriters, audio and video production personnel, an instructional designer, and a producer, at a minimum. However, in reality, many of these tasks end up being handled by less than a full team; a few people may have to handle many tasks. Keep in mind, however, that the skills involved in designing great-looking visuals or recording great video and audio scenes are quite different from the project management and visionary skills of the producer. Content design can only be as good as the resources dedicated to produce the multimedia elements. Audiences today expect TV and movie quality and impact even when viewing computer-based multimedia content; it's better to have a simple, clear, well-done presentation than develop a large, complex presentation with poor audio, video, and images.

Often an overlooked part of multimedia content creation is the information design stage. Regardless of the individual pieces of multimedia content, it is the organization and presentation of the content that makes the project a success. Information design organizes and integrates all the media pieces into a clear and accurate representation of the information. Spend time in the early planning stages to develop a clear, effective information design that puts all the multimedia elements and the overall project message in the best possible perspective.

Content development

The processes and procedures for developing the individual multimedia elements vary depending on the source of the media:

Does the media already exist?

Do new media elements have to be created?

The following sections on images, audio, video, and animation address both of these scenarios. Obviously, if the media already exists the tasks are more related to editing. New media production requires far more work, from planning through testing and delivery. However, the opportunities for producing media elements that are better suited to your project and for providing overall higher-quality sources for possible later use are much greater and are generally worth the time, cost, and effort.

Compressing multimedia content: codecs

Stay with me now; finally, the "techie" topic you'd rather not hear about! But believe me, when you start developing audio and video content, the success (or, unfortunately, the lack of success) of your content creation depends to a great extent on compression technology. Graphic images can also be compressed, as you'll hear about under Image File Formats, but the complexities are fewer than when working with audio and video. This discussion presents an overview of media compression, some of the common codecs used for compressing audio and video, and suggestions for when to use certain codecs.

Before we start discussing the technicalities of compression and decompression, be aware that compression is not the only way to reduce the size of image, audio, and video files and thereby decrease the bandwidth required to transmit the files over a network. Your initial planning should provide some clues about how you need to develop your content based on delivery bandwidth and the expected quality based on the audience and the message you want to get across.

If your project doesn't have to be the next Jurassic Park, then you have several options available to you before you think about using codecs. By reducing the frame rate of your video, you can reduce the data rate proportionally. Or, if the frame rate is of utmost importance, halving both the width and the height of the video window decreases the file size by a factor of 4. For example, if you start with a 10 MBps video that is 320 x 240 at 20 fps, and reduce the size to 160 x 120 and the frame rate to 15 fps, the data rate drops to about 1.88 MBps (10 x 1/4 x 3/4).
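
The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration and assumes uncompressed video, where the data rate is proportional to the number of pixels per frame times the frame rate; the function name scaled_data_rate is made up for this sketch.

```python
def scaled_data_rate(rate, old_size, new_size, old_fps, new_fps):
    """Estimate the data rate after changing frame size and frame rate.

    Assumption: uncompressed video, so the data rate scales with
    (pixels per frame) x (frames per second).
    """
    old_w, old_h = old_size
    new_w, new_h = new_size
    pixel_ratio = (new_w * new_h) / (old_w * old_h)  # 1/4 when both sides are halved
    fps_ratio = new_fps / old_fps
    return rate * pixel_ratio * fps_ratio


# The example from the text: 10 MBps at 320 x 240, 20 fps,
# reduced to 160 x 120 at 15 fps.
new_rate = scaled_data_rate(10.0, (320, 240), (160, 120), 20, 15)
print(f"{new_rate:.3f} MBps")  # 1.875 MBps, which rounds to 1.88
```

Trying a few window sizes and frame rates this way during planning gives a rough idea of whether the content fits the delivery bandwidth before any codec is chosen.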

While you can't decrease the window size of audio, you can decrease the sampling rate (frequency) and the bits per sample for similar decreases in file size, and thereby decrease the data rate. This chart provides a representative sampling of the relationship between audio sample rate, bits/sample, and data rate.

Audio Type	Sample rate (frequency)	Bits per sample	Mono or Stereo	Data per minute
Telephone-quality speech	11 kHz	8	Mono	662 KB
High-quality speech	11 kHz	16	Mono	1.32 MB
Music	22 kHz	16	Mono	2.65 MB
Music	22 kHz	16	Stereo	5.3 MB
CD-quality	44.1 kHz	16	Stereo	10.6 MB
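
The figures in this chart follow from multiplying sample rate, bytes per sample, number of channels, and 60 seconds. The sketch below reproduces them under two assumptions that the chart does not state: that "11 kHz" and "22 kHz" stand for the common 11,025 Hz and 22,050 Hz rates, and that 1 MB is counted as 1,000,000 bytes.

```python
def audio_bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Size in bytes of one minute of uncompressed (PCM) audio."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


# (label, sample rate in Hz, bits per sample, channels)
# The exact sample rates are assumptions; the chart rounds them to 11, 22, and 44.1 kHz.
rows = [
    ("Telephone-quality speech", 11025, 8, 1),
    ("High-quality speech", 11025, 16, 1),
    ("Music (mono)", 22050, 16, 1),
    ("Music (stereo)", 22050, 16, 2),
    ("CD-quality", 44100, 16, 2),
]

for label, rate, bits, channels in rows:
    size = audio_bytes_per_minute(rate, bits, channels)
    print(f"{label}: {size / 1_000_000:.2f} MB per minute")
```

Under these assumptions the results agree with the chart to within rounding.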

Codecs (short for compressor/decompressor) are the key to media compression and decompression. Typically, a codec converts between an uncompressed format and a compressed format. By using codecs to compress audio and video data into smaller packages, network and multimedia applications provide richer, fuller content and don't consume as much hard disk space or network bandwidth as non-compressed media. Two stages exist when using codecs: compression, or encoding, and decompression, or decoding.

There are many different codecs, each with advantages and disadvantages, depending on what you want to accomplish by the compression process. Understanding a few basics about how compression works and the pros and cons of the main codecs helps the content creator to choose a codec that best fits their needs.

There are two categories of compression:

lossless compression

Image, audio, and video data contain redundant information. For example, the same color may exist in many pixels of an image or video frame. Rather than representing every pixel by this color, a lossless compression scheme will represent the color once, and then "remember" how many other pixels of this color are represented in the original image. When the image is decompressed, all the pixels are represented, displaying an exact copy of the original.

This type of compression generally results in a lower overall compression ratio than a lossy-type compression scheme, which we discuss next. Common compression ratios are from 2:1 to 3:1.
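
The "represent the color once and remember how many" idea described above is commonly called run-length encoding. The following is a minimal sketch of that idea only; the function names and the list-of-pairs storage are assumptions made for illustration and do not reflect the file layout of any particular codec.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into [value, count] pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # same color as the previous pixel: extend the run
        else:
            runs.append([value, 1])   # new color: start a new run
    return runs


def rle_decode(runs):
    """Expand [value, count] pairs back into the original pixel list."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


row = ["blue"] * 6 + ["white"] * 3 + ["blue"] * 3
encoded = rle_encode(row)
print(encoded)  # [['blue', 6], ['white', 3], ['blue', 3]]

# Lossless: decoding returns an exact copy of the original row.
assert rle_decode(encoded) == row
```

Twelve pixels are stored as three pairs here, but an image with few repeated neighbors would gain little, which is one reason lossless ratios tend to be modest.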

lossy compression

Rather than looking for redundant information and "remembering" the amount and location of this redundancy, a lossy compression scheme discards data that is less important to perception. For example, certain pixels are actually removed from the image, but the overall appearance is not degraded or is degraded only slightly.

Lossy compression provides a greater compression ratio, but the decompressed media may not appear exactly as did the original, uncompressed media. Generally this is not a major problem with images, but this may not be a good solution for compressing some types of audio. The human ear is much more sensitive to lost audio information than the human eye is to lost visual information. For example, a music track might not suffer from loss of some data, but a voice track may sound poor. Fortunately however, voice data is of low bandwidth, so high compression ratios are not normally needed.
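
One simple way to see the lossy trade-off is quantization: rounding values to a coarser scale so that nearly equal neighbors become identical and compress better. This is a stand-in chosen for illustration, not the method used by any codec named in this section, and the step size of 16 is an arbitrary assumption.

```python
def quantize(samples, step):
    """Round each value down to a multiple of step. Detail is lost for good."""
    return [(s // step) * step for s in samples]


def count_runs(values):
    """Count runs of identical adjacent values (fewer runs compress better)."""
    if not values:
        return 0
    return 1 + sum(1 for a, b in zip(values, values[1:]) if a != b)


# One row of 8-bit brightness values with slight noise around 200 and 90.
row = [200, 201, 199, 202, 200, 90, 91, 89, 92, 90]
coarse = quantize(row, 16)

print(coarse)                                   # [192, 192, 192, 192, 192, 80, 80, 80, 80, 80]
print(count_runs(row), "->", count_runs(coarse))  # 10 -> 2
print(max(a - b for a, b in zip(row, coarse)))    # 12, the largest error introduced
```

The quantized row collapses to two runs, but the original values cannot be recovered, which is exactly the difference from a lossless scheme.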

Two other terms are commonly heard when discussing the methods used by codecs to compress data: intraframe, or spatial compression, and interframe, or temporal compression. The key point to understand is that intraframe compression compresses each frame of video on its own, while interframe compression stores some frames, referred to as key frames, in full and then records only the differences between each following frame and the frame that precedes it. These "difference" frames are often referred to as P (predictive) or delta frames; the key frames are referred to as I frames.

Without getting into a detailed description of the differences between lossy, interframe compression for .avi file formats and lossy, interframe compression for MPEG files, we should mention one thing. The key frames in MPEG compression are referred to as "I" frames, and the changed or delta frames are referred to as "B" and "P" frames. P frames, or predictive frames, get their information from the preceding I or P frame. The "B" in B frames stands for bi-directional, which means that this type of frame gets its information from the I or P frames both before and after it in the video stream.

You ask, "So what?" Well, video compressed with an intraframe method is easier to edit and quicker to compress; however, the amount of overall compression is usually less than with interframe compression methods. So if you need maximum compression ratios you'll probably end up using an interframe compression scheme like MPEG, Intel Indeo Interactive, or VDOWave.
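
The key-frame-plus-differences idea can be sketched as follows. This is a deliberately simplified, lossless illustration: each "frame" is a short list of pixel values, only the first frame is a key frame, and each later frame is stored as a dictionary of changed positions. Real interframe codecs add motion estimation, periodic key frames, and lossy coding, none of which is shown here, and the function names are hypothetical.

```python
def encode_frames(frames):
    """Keep the first frame whole (key frame); store later frames as changes only."""
    key = list(frames[0])
    deltas = []
    previous = key
    for frame in frames[1:]:
        # Record only the positions whose value differs from the previous frame.
        changes = {i: new for i, (old, new) in enumerate(zip(previous, frame)) if old != new}
        deltas.append(changes)
        previous = frame
    return key, deltas


def decode_frames(key, deltas):
    """Rebuild every frame by applying each delta to the frame before it."""
    frames = [list(key)]
    for changes in deltas:
        frame = list(frames[-1])
        for position, value in changes.items():
            frame[position] = value
        frames.append(frame)
    return frames


# A small bright object (value 5) moving right across a dark background.
frames = [
    [0, 0, 0, 0, 5, 5, 0, 0],
    [0, 0, 0, 0, 0, 5, 5, 0],
    [0, 0, 0, 0, 0, 0, 5, 5],
]
key, deltas = encode_frames(frames)
print(deltas)  # [{4: 0, 6: 5}, {5: 0, 7: 5}]
assert decode_frames(key, deltas) == frames
```

Each delta frame holds two values instead of eight. The sketch also hints at why editing is harder: a delta frame cannot be displayed or cut without first rebuilding the frames it depends on.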

Codecs for the Microsoft Windows® platform are commonly referred to as ACM or VCM compatible. These terms refer to Audio Compression Manager and Video Compression Manager, respectively. These are standard implementations for audio and video compression on Microsoft Windows platforms. This standardization allows files to be opened, played, and saved using ACM or VCM-compliant codecs installed routinely with Windows operating systems or applications such as Microsoft's Video for Windows. These video codecs include:

MS Video 1 (MS-CRAM)
Supermac Cinepak
Microsoft RLE
Intel Indeo R2.1/Raw
Intel Indeo R2.1/YVU9
Intel Indeo R3.1 (IV31)
Intel Indeo R3.2 (IV32)
Intel Indeo Interactive 4.1

QuickTime is a video format from Apple Computer, Inc. that provides a common multi-platform video format that includes video and audio codecs for high-quality playback and authoring. Several of the common video codecs, such as Cinepak and JPEG, can be used in content played back in a QuickTime format. QuickTime provides two software codecs for video, a video compressor and a compact video compressor. These codecs compress and decompress video running on computers using Apple's Macintosh operating system and Microsoft's Windows-based operating systems.

Note: In the context of this discussion of codecs, bit rate refers to approximate network data rates of:

Low: 40 Kbps or lower

Medium: 50 to 150 Kbps

High: greater than 150 Kbps

The audio and video codecs listed below are examples of common software compression and decompression solutions. Most of these codecs are available through various software applications, such as video editing and audio editing packages, or directly from the developer of the codec. This list is for reference purposes only and is not meant to be comprehensive or a comparison of the quality and performance of the various codecs.

Video Codecs

Software Codec	Company	Best Used For
Indeo Video Interactive R4.1	Intel Corp.	Full motion, 24-bit video at mid-to-high bit rates; slow compression times even on fast machines; higher-quality video than Indeo 3.2, Microsoft Video, or Microsoft RLE; video displays best on fast processors.
Indeo Video R3.2	Intel Corp.	Useful for 24-bit video at mid-to-high-bit-rates; best used on raw video source media that hasn't been previously compressed with another lossy compressor; has low CPU utilization; quality comparable to Cinepak with lower bit rates.
VDOnet VDOwave	VDOnet Corp.	Low-to-mid-bit-rate video; small window sizes; optimized for Internet delivery of high quality, low rate video.
H.263	Intel Corp.	Video telephony standard designed for low-bit-rate video over 28.8 Kbps connections.
MPEG-4	Microsoft Corp.	A limited implementation of the MPEG-4 video standard; excellent for low-to-mid-bit-rate video delivery.
TrueMotion® RT (Duck)	The Duck Corp.	Full motion, mid-to-high-bit-rate video. Provides excellent video quality and playback performance.
ClearVideo	Iterated Co.	Low-bit-rate video delivery for Video for Windows and QuickTime platforms.
Cinepak	Radius Corp.	Full motion, high-bit-rate video. Provides good video quality with good playback performance.
Microsoft Video 1	Microsoft Corp.	Full motion, moderate quality video with low CPU overhead, 320 x 240 or smaller, 15 fps or less. Supports only 8-bit (256) color.
Microsoft Run-Length Encoding (RLE)	Microsoft Corp.	Intended for compressing clean graphic images such as bitmaps. It has a low CPU overhead, but does not handle rapid, complex scene changes well.
Indeo Video Raw (YVU9C)	Intel Corp.	Useful for capturing uncompressed video of high quality. This is NOT the same as capturing with no compression; in other words, raw video. Large files and high bit rates, but excellent image quality. This is the BEST source, along with Raw video, of video content to be compressed by other methods later.

Hardware Codecs

Hardware Codec	Company	Best Used For
Motion-JPEG	ISO and Consultative Committee, International Telegraph and Telephone	Intended for compressing a series of JPEG images. No audio capabilities are available with Motion JPEG. Motion JPEG is generally quicker in displaying images than MPEG; however the file size is two to three times larger than an equivalent MPEG video.
MPEG-1	ISO and Consultative Committee, International Telegraph and Telephone	Intended for delivery of high-quality, 30-fps motion video at a frame size of 352 x 240 compressed to a data rate of approximately 150 KBps (kilobytes per second; in other words, equal to single-speed CD-ROM performance).
MPEG-2	ISO and Consultative Committee, International Telegraph and Telephone	Intended as a broadcast video standard providing 720 x 480 playback at 30 fps. To achieve this high quality, the data rate is very high, ranging from about 500 KBps to greater than 2 MBps. Because of this high data rate, MPEG-2 is currently better suited for dedicated video servers.
DVI (Digital Video Interactive)	Intel Corp.	Based on a chip set developed by Intel and used by IBM for video and audio compression and decompression. The software portion of DVI requires this special, proprietary hardware, hence the term hardware codec. To date, this codec has not received widespread use; however, more recent hardware advances might change this scenario. Currently this codec is unlikely to be part of the content producer's arsenal for compressing video.

Audio Codecs

Software Codec	Company	Best Used For
DSP Group TrueSpeech	DSP Group, Inc.	Low-to-mid-bit-rate voice-oriented sound; excellent all around audio codec.
Microsoft Network (MSN) Audio codec	Microsoft Corp.	Low-to-mid-bit-rate audio, both voice and music.
Lernout & Hauspie CELP 4.8kbit/s	Lernout & Hauspie	Low-bit-rate voice audio; not optimized for music.
Microsoft PCM Converter	Microsoft Corp.	Uncompressed, high-quality audio for higher bit rate content.
Microsoft Adaptive Delta Pulse Code Modulation (ADPCM)	Microsoft Corp.	High-quality compressed audio for higher bit-rate content; good for audio stream associated with high-bit-rate video.
Microsoft Interactive Multimedia Association (IMA) ADPCM	Microsoft Corp.	High quality compressed audio for higher bit-rate content.
Fraunhofer IIS MPEG Layer 3	Fraunhofer	High-quality audio with low bit rates; mono and stereo; NetShow and Shockwave use FHG for audio; works better with "mixed audio signals" than pure voice; excellent all around audio codec.
G.723	Intel Corp.	High-quality audio for low bit rates; good voice and music reproduction.
Voxware	Voxware, Inc.	High-quality speech for low bit rates
Microsoft Groupe Special Mobile (GSM) 6.10	Microsoft Corp.	Mid-to-high-bit-rate voice-oriented sound.
Microsoft Consultative Committee for International Telephone and Telegraph (CCITT) G.711 A-Law and u-Law	Microsoft Corp.	Provided for compatibility with telephone standards for Europe and North America.

Other codecs will undoubtedly be developed as demand grows for the "magic codec" that can improve data quality while still providing fast network delivery.

Hardware requirements

More is always better...memory, disk space, processor speed, video speed, and so on! The processes involved in creating video and audio put huge demands on the hardware to rapidly and accurately move large amounts of information to the computer. You quickly find yourself spending more time waiting than working if the hardware is not up to this. While no single list of hardware requirements is the ultimate answer because of unique needs and ever-changing hardware availability, the following hardware components are representative of an efficient multimedia creation system regardless of the platform type. This list represents hardware for general multimedia creation, not for the creation of Microsoft NetShow specific content, which will be addressed in other sections.

Processor:	Intel-based CPU: 133 MHz minimum	166 MHz or 200 MHz recommended
	Macintosh-based CPU: PowerMac 7200 minimum	8500+ recommended
RAM:	32 MB (minimum)	64 MB recommended
Hard disk space:	2 GB (minimum)	Ultra fast wide SCSI controller; 4 GB+ recommended
Video:	2-MB video memory with monitor capable of displaying 24-bit color (16.7 million colors) at a minimum of 800 x 600 resolution and a refresh rate of >72 Hz
Sound:	16-bit sound card and speakers
Video capture card:	Truevision, Intel, Videum, Miro, FAST, Data Translation, DPS, etc. PCI-based cards provide faster data transfer than ISA- or NuBus-based cards
Digital camera:	Sony, Kodak, Epson, Olympus, and Ricoh. An alternative to videotape as media source; most useful for frame, not motion capture
Videotape deck:	VHS (minimum)	S-Video or Beta deck recommended
Scanner:	24-bit color scanner
Microphone:	best quality possible; unidirectional
Audio mixer:
Graphics tablet :	Wacom, Kurta, or CalComp pen or puck tablets
Backup device:	Removable media such as Iomega or Syquest units

Software requirements

Software, like hardware, is continuously changing and improving, and the specific requirements of each content producer likely differ. If you intend to work with images, audio, and video, however, certain categories of software are necessities. This list is meant to be representative of some of the software useful in developing images, audio, video, and animation media elements.

Image editing (Raster)	Adobe Photoshop, Macromedia Xres, Microsoft Image Composer, JASC PaintShop Pro, Corel Draw! 7
Image editing (Vector)	Adobe Illustrator, Deneba Canvas, Macromedia Freehand, Macromedia Flash, Corel Draw! 7
Image utilities:	DeBabelizer, JASC Media Center, QuarterDeck HiJaak and HiJaak Pro, MetaTools Kai's Power Tools, Adobe Gallery Effects, Ulead Systems MPEG Converter
Audio recording and editing	Sound Forge, SoundEdit
Video editing	Adobe Premiere, Asymetrix Digital Video Producer, Hollywood FX, Adobe After Effects, MetaTools Final Effects, Ulead Systems Media Studio Pro, Corel Lumiere
Animation creation	Macromedia Director, Macromedia Flash, GIF Animator, GIF Construction Set

Creating Images

Planning

Two of the first questions you should ask when confronted with developing new images or re-purposing existing images for a multimedia project are:

What will be the delivery mechanism for the images?
Will the images be displayed on Web pages delivered over a dial-in modem, over a high-speed corporate network, from a CD-ROM drive, or from a user's local hard disk?

These questions are vital, since they determine many of the required characteristics of the images, such as color depth, color palette, image dimensions, and the file format in which the image should be saved. All these factors determine how quickly a user receives an image, how quickly the image is displayed on the user's screen, and how good the quality of the image is. The answers to these questions lie in the concepts of bandwidth and data rate.

Bandwidth can be defined as the amount of information a phone line, network, or other delivery mechanism can transmit in a certain time period. For example, a 28.8 modem connection refers to the ability to transmit and receive 28,800 bits of data/information per second (28.8 Kbps), while a high-speed local area network in a corporation might be able to carry 10 million bits of data per second (10 Mbps). Data rate in the context of this section refers to the amount of data or bits transmitted in a certain time period, usually expressed as kilobits per second (Kbps), kilobytes per second (KBps), or megabits per second (Mbps). This table gives an idea of the image size at different color depths and screen resolutions.

# of colors	Bits per pixel	Image Size at selected display resolutions
		640 x 480	800 x 600	1024 x 768
256	8	307 KB	480 KB	786 KB
65,536	16	614 KB	960 KB	1.57 MB
16,777,216	24	921 KB	1.44 MB	2.36 MB
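
The uncompressed sizes for a full-screen image follow from width x height x bits per pixel, and dividing by the connection speed gives a rough transfer time. The sketch below assumes an ideal 28.8 Kbps link with no protocol overhead and no compression, so real downloads will differ; the function names are made up for this illustration.

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes of a width x height image."""
    return width * height * bits_per_pixel // 8


def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealized transfer time: ignores protocol overhead and compression."""
    return size_bytes * 8 / link_bits_per_second


MODEM_BPS = 28_800  # the 28.8 Kbps modem used as an example in the text

for bits in (8, 16, 24):
    for width, height in ((640, 480), (800, 600), (1024, 768)):
        size = image_bytes(width, height, bits)
        seconds = transfer_seconds(size, MODEM_BPS)
        print(f"{width} x {height} at {bits} bits: "
              f"{size / 1000:.0f} KB, about {seconds:.0f} s at 28.8 Kbps")
```

For instance, a 640 x 480 image at 8 bits per pixel is 307,200 bytes, which would take roughly 85 seconds over this idealized modem link. Numbers like these are why image dimensions, color depth, and file format matter so much for Web delivery.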

Platforms and Browsers

Once the issues are understood for determining how fast the media will be delivered to the user, the content creator must consider the user's (client's) hardware and software capabilities. For example, the computer operating system and the type and version of Internet browser have an impact on the file format of images displayed, the quality of images displayed, and the overall presentation of the content.

Image source

Images come from two major sources: existing images and newly created images.

Existing images

Existing images include artwork obtained from clip-art packages, stock photo art, or any other image that already exists in digital form. Working with images that already exist, commonly referred to as "re-purposing an image," generally involves four main activities:

Obtaining the image in digital form.
Resizing.
Color-palette manipulation for Web delivery.
Converting the file format for Web delivery.

If the image requires editing to change its visual quality and appearance, these steps generally are the same as those used on newly created images.

Creation of new images

Does an image that is to be captured exist in an analog form, or is the image to be created entirely in a digital domain? The answer to this question determines your next steps.

Digital: The creation of new digital images can be accomplished in two main ways:

(1) A graphic designer creates the image with image editing software.

(2) The image is captured with a digital camera or image scanner and transferred to the computer for manipulation with image editing software.

Analog: If the image exists in analog form, the equipment used to capture the image is generally the same as that used for capturing video images, which is covered in more detail under the Video section. In summary, however, the most common source of images in an analog form is videotape. Using a variety of video or still-image capture hardware, specific images are converted to digital form and transferred to the computer for storage and later editing.

This list of video and image capture hardware is by no means comprehensive, nor does it recommend any specific component. It is meant to give a representative range of equipment offerings available for video and image capture. Also, this list does NOT represent the video capture boards that have been tested with Microsoft NetShow; these are capture solutions for general multimedia work prior to any NetShow-specific production.

Video capture boards
Intel Smart Video Recorder III	PC	Intel Corp.
Bravado 1000	PC/Mac	Truevision Corp.
TARGA 1000 and 1000 Pro	PC/Mac	Truevision Corp.
TARGA 2000	PC/Mac	Truevision Corp.
miroVideo DC10-30 series	PC/Mac (30 only)	Miro
Videum and PCMCIA Video Capture	PC	Winnov
VideoVision	Mac	Radius
Image Manipulation System PCI150	PC	Imageman
Nogatech PCMCIA Conferencing Card	PC	Nogatech
Osprey 1000	PC	Osprey
SE100	PC	Creative Labs
Wakeboard Multimedia Pro	PC	Digital Video Arts
FAST's AV Master	PC	FAST
Broadway	PC/Mac	Data Translation
Hollywood and Perception series	PC/Mac	Digital Processing Systems, Inc.
AzeenaVision 500	PC	Azeena Technologies

Image capture boards
All of the above video capture boards can be used to capture individual frames.
Snappy Video Snapshot	Play Corp.
AZer FunTV	AVerMedia

Digital cameras: still
Sony	DSC-F1
Ricoh	RDC-2
Specom Technology	VisionCam
Kodak	DC20, DC40, DC50
Connectix	Color QuickCam
Apple	QuickTake 150
Canon	PowerShot 600
Nikon	E2N
Epson	PhotoPC
Olympus	D-200L

Digital cameras: video
Sony	DCR-VX1000
Sharp	Viewcam VL-D500U, VL-DC1U
Panasonic	DVCPRO AJ-D700, PV-DV1000
JVC	GR-DV1 MiniDV CyberCam

Image Scanners
AGFA SnapScan	300-dpi
HP Scanjet 5p	300-dpi
Mustek Paragon 600 II	300-dpi
Epson Expression 636	600-dpi
HP Scanjet 4c	600-dpi
Microtek ScanMaker	600-dpi
Mustek Paragon 1200	600-dpi
Nikon Scantouch 210	600-dpi
Ricoh FS2	600-dpi
UMAX Vista	600-dpi

Image File Formats

Understanding file formats is an important part of working with images. A variety of file formats have been created to serve various needs. Some file formats are optimized for high image quality or ease of editing and scaling to various sizes, while other formats are optimized for small size and rapid display on a computer monitor. As discussed earlier in Planning, knowing how the image is to be delivered can easily solve most of the issues related to the different file formats.

Generally, if the image is to be displayed within a browser, the choices are limited to JPEG, GIF, BMP, and more recently, PNG. However, this selection is likely to change rapidly as other technologies, such as Macromedia's Shockwave for FreeHand, allow additional file formats to be delivered within a Web browser. Even with this limited number of choices for Internet delivery, it is important to have a basic understanding of the features of each image type.

Three types of images are commonly referenced when talking about file formats:

raster, or "bitmap/paint" type images
vector, or "draw" images
metafiles

Bitmap or raster images are made up of a series of pixels, each having distinct properties such as the color depth and pixel color. If a bitmap image is enlarged, you see a series of pixels, each of which can be edited with the appropriate image editing software. Because a bitmap image is made up of individual pixels, certain characteristics are associated with a bitmap:

The image generally does not scale well (the pixels get pulled apart or squeezed together).
The file size for a bitmap is larger than a vector or draw image.
The print output of a bitmap image may not be high quality.
Many compression schemes are applied to bitmap images to make the image smaller.
Special effects are easy to do with bitmap images, as each pixel can be independently altered.
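
To see why bitmap files tend to be large, the uncompressed size can be estimated from the pixel dimensions and the color depth. The following Python sketch is illustrative only: the function name and the 640 x 480 example image are assumptions, and real files add a header and may be compressed.

```python
def raw_bitmap_bytes(width, height, bits_per_pixel):
    """Approximate size of the uncompressed pixel data in bytes.

    Ignores file headers, palettes, and row padding, so real files differ.
    """
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical 640 x 480 image at the color depths listed for BMP.
for depth in (1, 4, 8, 24):
    size = raw_bitmap_bytes(640, 480, depth)
    print(f"640 x 480 at {depth}-bit color: {size:,} bytes")
```

Each added bit of color depth increases the size in direct proportion, which is one reason compression schemes are so commonly applied to bitmap images.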

Common bitmap file formats
BMP -- Windows and OS/2 Bitmap	A Windows and OS/2 file format that supports 1, 4, 8, and 24-bit color.
CLP -- Windows Clipboard	A Windows Clipboard native format, and can therefore contain many different kinds of data, including raster, vector, and metafile formats.
DIB -- Windows Device Independent Bitmap	A Windows and OS/2 file format that supports 1, 2, 4, 8, and 24-bit color.
EPS -- Encapsulated PostScript	Encapsulated PostScript Language is a device-independent page description language for printers that displays on a computer monitor using a metafile. EPS supports a variety of drawing elements, advanced font handling, halftones, color effects, and color separations. The bitmap portion of an EPS file contains a replica of the EPS image in a bitmap format such as TIFF.
GIF -- CompuServe	CompuServe Graphics Interchange Format is a bitmap format owned by CompuServe. This file format has become extremely popular on the Internet due to its small size, transparent background, single and multiple images per file (animated GIFs), interlaced and non-interlaced images, and color depths of up to 256 colors (8-bit).
ICO -- Windows Icon	A Microsoft Windows file format used to display icons commonly representing files and programs. ICO supports 1, 4, 8, and 24-bit colors and can contain several images displayed as animated icons.
JPG -- JPEG (Joint Photographic Experts Group)	JPEG is a compression method for bitmap images. JPEG is a lossy compression scheme, meaning that some image data is lost. That is, as the compression ratio increases, both file size and image quality decrease. JPEG is, however, a very common Internet file format for graphics because of the small file size and fairly high image quality even at high compression ratios. JPEG generally works much better with photographic images than with line drawings or simple bitmaps.
MAC -- MacPaint	The original Apple Macintosh format for black-and-white bitmapped images. This format supports only black and white and a palette of patterns with 38 standard fills. This file format does not work well with photographic images, and the image size is limited to 720 x 576 pixels.
MSP -- Microsoft Paint	Microsoft Paint is a raster format with file type and extension MSP. This format is used primarily with the Microsoft Windows 2.0 paint program. MSP uses RLE compression and supports two colors.
PCD -- Kodak Photo CD	This is a proprietary format developed by Eastman Kodak for storing bitmap images on CDs. Photo CD allows high-quality digital storage and manipulation of photographic images. PCD stores images in multiple resolutions: 64 x 96, 128 x 192, 256 x 384, 512 x 768, 1024 x 1536, 2048 x 3072, 4096 x 6144.
PCX -- PC Paintbrush	PC Paintbrush is a bitmap format developed originally by ZSoft Corp. for their paint program. PCX can store images of up to 64K x 64K pixels in 24-bit RGB color only.
PNG -- Portable Network Graphics	Due to licensing issues with CompuServe's GIF format, Portable Network Graphics was developed as a GIF replacement for use on the Internet. PNG supports RGB color and grayscale and color depths of 1, 2, 4, 8, 16, 24, 32, 48, and 64 bits. PNG also supports interlacing, transparency, and an alpha channel.
PSD -- Adobe Photoshop	This is a proprietary format developed by Adobe Systems for Photoshop. This format supports 1, 8, and 24-bit color with alpha channel and multi-layer image support and RGB, CMYK, indexed color palette, and grayscale color models.
RLE -- Windows Bitmap	A Microsoft Windows bitmap format used primarily for storing pictures and clip art. This format supports 24-bit color. RLE stores images in a compressed form using a lossless compression scheme referred to as run-length encoding.
TGA -- Truevision	Truevision Targa is a bitmap format supporting 8, 15, 16, 24, and 32-bit color and an alpha channel. TGA is no longer a standardized image format; many variations exist and not all programs that support TGA may be able to display a particular file with a TGA extension.
TIF -- TIFF	Tagged Image File Format is a bitmap with many variations; not all applications that support TIF will support a particular TIF file. TIF allows for 24 or 32-bit color, with the maximum color depth and color palette dependent on the class of TIF. Photographic images are commonly stored in a TIF format if cross-platform compatibility is an issue.
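
The run-length encoding mentioned in the MSP and RLE entries above is lossless because it stores each run of identical pixels as a count and a value, from which the original pixels can be rebuilt exactly. The following Python sketch shows the idea only; the function names are illustrative, and the (count, value) pairs are an assumption that does not match the actual byte layout of any of the file formats in the table.

```python
def rle_encode(pixels):
    """Encode a sequence of pixel values as a list of (count, value) runs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            # Same value as the previous pixel: lengthen the current run.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Different value: start a new run of length 1.
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Rebuild the original pixel sequence from (count, value) runs."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


row = [0, 0, 0, 0, 1, 1, 0, 0, 0]   # one row of a two-color image
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)    # lossless: decoding restores the row
```

Images with large areas of a single color, such as clip art, compress well with this scheme, while photographic images with few repeated pixels gain little.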

Vector, or "draw" images are made up of a series of lines that are stored as mathematical functions rather than individual pixels. If you enlarge a vector image you do not see individual pixels, only a line representation of the image. Only recently are software packages being developed that allow some of the special effects commonly associated with bitmap images. Vector images have unique characteristics compared to bitmap images.

The image scales very well.
The images print at extremely high quality.
The file size is very small in comparison to a bitmap.
More specialized software is required to edit vector/draw images.
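
The difference in scaling behavior can be illustrated with a small Python sketch. It assumes a vector line is stored simply as two endpoints and a bitmap as a list of pixel rows; both representations and the function names are simplifications for illustration, not the layout of any real file format.

```python
def scale_line(line, factor):
    """Scale a vector line, stored as two (x, y) endpoints, by a factor."""
    (x0, y0), (x1, y1) = line
    return ((x0 * factor, y0 * factor), (x1 * factor, y1 * factor))


def scale_bitmap_nearest(bitmap, factor):
    """Enlarge a bitmap (a list of pixel rows) by repeating every pixel.

    This is simple pixel replication for whole-number factors only.
    """
    scaled = []
    for row in bitmap:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


# The vector line is still described by just two points after scaling.
print(scale_line(((0, 0), (3, 2)), 4))

# The bitmap becomes a grid of enlarged, blocky pixels.
for row in scale_bitmap_nearest([[1, 0], [0, 1]], 2):
    print(row)
```

The scaled line needs no more storage than the original, while the enlarged bitmap holds four times as many pixels without gaining any detail.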

Common vector/draw file formats
CGM -- Computer Graphics Metafile	Computer Graphics Metafile is a standard for vector graphics interchange between vector draw packages and contains a bitmap element, thereby gaining the metafile reference.
DXF -- AutoCAD	AutoCAD Drawing Exchange Format is a computer-aided design (CAD) vector format. DXF is Autodesk's format for moving AutoCAD drawings to and from other applications.
EPS -- Encapsulated PostScript	Encapsulated PostScript Language is a device-independent page description language for printers that displays on a computer monitor using a metafile. EPS supports a variety of drawing elements, advanced font handling, halftones, color effects, and color separations. The bitmap portion of an EPS file contains a replica of the EPS image in a bitmap format, such as TIFF.
GEM -- GEM Metafile	GEM Metafile is the native vector file format for applications running under the Graphical Environment Manager (GEM) desktop. GEM supports RGB color representation, Bezier curves, and graduated fills.

Metafiles are a combination of vector and bitmap components that attempt to capture the advantages of each format (primarily the output advantages and small file size of vector files and the on-screen display quality of bitmap images).

Common metafile file formats
AI -- Adobe Illustrator	AI is the Adobe Illustrator native metafile file extension made up of Encapsulated PostScript Language. Just as with native EPS, AI supports a variety of drawing primitives, advanced font handling, halftones, color effects, and color separations. Many AI files contain enough information about the file itself to enable the reading application to place it on a page and send it to a PostScript printer. This information may also contain a bitmap of the AI image in a format such as TIF.
CDR -- CorelDRAW	CorelDRAW is a true metafile format created by Corel supporting both raster and vector elements.
CMX -- Corel Metafile Exchange	Corel Metafile Exchange is a metafile format that supports raster and vector information and the full range of Pantone, RGB, and CMYK colors.
DRW -- Micrografx Draw	Micrografx Draw is a metafile format used for graphics information exchange by Micrografx applications. This format is a true metafile, handling Bezier curves, splines, parabolas, fountain fills, rasters, and compression. This format also supports up to 16.7 million colors (24-bit).
EMF -- Enhanced Metafile Format	Enhanced Metafile Format is a metafile format that is the native internal file format of Windows® 95 and Windows NT®. EMF supports both raster and vector information and 24-bit RGB color. Most Windows 95 applications and the Windows Clipboard support this format.
EPS -- Encapsulated PostScript	Encapsulated PostScript Language is a device-independent page description language for printers that displays on a computer monitor using a metafile. EPS supports a variety of drawing elements, advanced font handling, halftones, color effects, and color separations. The bitmap portion of an EPS file contains a replica of the EPS image in a bitmap format such as TIFF.
PICT1 -- Macintosh PICT	Macintosh PICT1 is the native metafile format for the Macintosh Clipboard. PICT1 supports up to 8-bit color but does not, however, support grayscale, color corrections, or Bezier curves.
PICT2 -- Macintosh PICT	Macintosh PICT2 is a newer metafile for the Macintosh Clipboard. PICT2 supports 8, 24, and 32-bit colors and the RGB color model. Similar to PICT1, PICT2 does not support Bezier curves.
WMF -- Windows Metafile	Windows Metafile Format is the native internal file format of Windows 3.x. WMF supports both raster and vector information and 24-bit RGB color. Most Windows applications and the Windows Clipboard support this format.

Sample Content Creation Scenario:

Depending on your hardware requirements and budget, you can obtain nearly any level of image production and editing functionality. There are software packages in the freeware and shareware domain that are available for free or for only a small cost, as well as professional graphics packages costing thousands of dollars and requiring dedicated hardware. This section will not cover the high-end graphic and hardware solutions generally sold as packaged bundles or for professional production purposes.

This content creation scenario is an overview of the planning and processes that go into creating a new image. It is not meant to be comprehensive, nor is it a tutorial on individual software packages. Consult the documentation for your specific software package if you need help on detailed usage. This scenario briefly presents the processes involved in several separate production areas. Many different tools are available to do this work; we've told the story behind just one way of creating a new image. The example covers:

1) New image creation

2) Image Editing - Special effects

3) Image Utilities

(a) Image sizing and cropping

(b) Color palette manipulation

(c) File format conversion

2:00 P.M. Wednesday:

Oh great, they need this logo by Friday in order to get it ready for the printer and implemented into the Web pages; good luck! This type of work always takes much more time than we have...we get agreement on the message, and then comes the hard stuff; getting buyoff on the colors, the size, the overall presentations, that "look-and-feel stuff," and on and on.

10:30 A.M. Thursday:

Well, amazingly I got a buyoff on some key issues. First of all, I'm only going to make the logo for the Web page right now; I won't go into all the reasons for these changes now. At least I have the requirements for the Web page, and the corporate colors finalized. So let's get going on actually producing the image!

I've got a choice of several apps that I could use to start creating the logo, but I'll use Adobe Photoshop. Much of the creation, editing, special effects, and image manipulation can be done within Photoshop, but I could use separate programs for at least some of the steps.

Within Photoshop I open a new file and set the resolution to 96 dpi and the color mode to RGB. I'll accept the default image size and white background for now; those details can be easily changed later.

I know the logo has mostly text with special effects on a background of a specific color, so I'll make sure I use layers with the text and background. That way I can edit the text and background separately, which will save me lots of time when I'm experimenting.

First I select the Text tool and then set the font, size, and attributes such as anti-aliased and bold. Go ahead and type my text; if the size looks right, let's go with it and move on.

I know I want the text to have a texture rather than a solid color or gradient. I've used a golden marble texture for some Web page backgrounds that looks really good; let's go with that. To get the texture to replace the text color, there are several ways to go. Depending on the program you might be using you could, for example, literally paint the texture onto the selected text.

Using Photoshop, I'll first get the texture onto the clipboard by opening the texture image and then copying it. Next, select the text using the selection tool. We're almost there; all you have to do now is Paste Into… Presto! You have the texture inside the selection and can now move the texture around until you get the exact effect you want.

OK, now comes some cool stuff; special effects! Several different special-effects packages, such as MetaTools Kai's Power Tools and Alien Skin's Black Box 2.0 or Eye Candy 3.0, are available, and most work with any of the popular image editing packages that accept plug-in filters. I want to give my logo a 3-D effect, so I'll use the Eye Candy filters.

I'm using the Magic Wand selection tool to select only the text. Now I'll select the special-effect filter that gives the effect I want; let's go with Inner Bevel. I'll play with the bevel controls to get the beveled effect I want, all the time watching the image preview to see exactly what the changes do to the logo. I could play with these options all day but I'm running out of time, so this is fine for now.

Almost there; let's get the background finished and then we can add the logo. First I'll make a selection the size and shape of the background to hold the text. Now I'll set the color to match the corporate requirements. Because I already know the Pantone process color number, this is going to be easy. Select a custom color and then pick the Pantone color number; now fill in the previously selected background shape.

Only a few more steps and I might make it out of here after all! Because the text and the background are on separate layers, it's easy to re-position the text on the background. Once that's done, all that's left is to merge the two layers and apply one more effect that gives the overall 3-D look that I'm looking for.

Select the background color with the Magic Wand and then invert the selection. Now, using the Outer Bevel special effects, I'll play with the settings until the depth, shadow, and overall 3-D look are just what I want. There, that's it!

Up to now I've been working in high color, which means I've got a color selection of over 65,000 colors. None of the browsers currently display this many colors, and the image would be too big anyway. So I'll reduce the palette and then save the logo in the file format for display in our Web pages.

Photoshop allows me to prepare an indexed palette of 256 colors, which gives a very good representation of high color (16-bit color) or true color (24-bit color) originals. I could also do this in other separate programs such as DeBabelizer or HiJaak. All I have to do is select Indexed color under the Image mode menu; presto! it looks great.
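
The color counts behind these terms follow from the bit depth: each added bit doubles the number of available colors. The Python sketch below shows that arithmetic, along with one very simple way of mapping a 24-bit color onto a fixed 256-entry palette (3 bits of red, 3 of green, 2 of blue). That fixed mapping is an assumption chosen for brevity; it is not how Photoshop, DeBabelizer, or HiJaak build an indexed palette, and those programs generally produce better results by choosing a palette that suits the image.

```python
def colors_for_depth(bits):
    """Number of distinct colors available at a given bit depth."""
    return 2 ** bits


def to_332_index(red, green, blue):
    """Map a 24-bit RGB color (0-255 per channel) to an 8-bit palette index.

    Keeps the top 3 bits of red, top 3 bits of green, and top 2 bits of blue.
    """
    return ((red >> 5) << 5) | ((green >> 5) << 2) | (blue >> 6)


def from_332_index(index):
    """Return the approximate RGB color that a palette index stands for."""
    red = (index >> 5) & 0b111
    green = (index >> 2) & 0b111
    blue = index & 0b11
    return (red * 255 // 7, green * 255 // 7, blue * 255 // 3)


for depth in (8, 16, 24):
    print(f"{depth}-bit color: {colors_for_depth(depth):,} colors")

# A hypothetical golden color, reduced to the fixed palette and back.
index = to_332_index(212, 175, 55)
print(index, from_332_index(index))
```

The color that comes back is close to, but not the same as, the original, which is the trade-off accepted in exchange for the smaller file.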

Only two steps left: set the final size of the logo and save it. I could have set the final size at the beginning when I opened a new page, but I wanted room to experiment. So I'll copy the logo and then open a new page with the dimensions of my final logo; paste it and presto! ready to save it.

Because this image is for a Web page I have limited options for file formats. Basically my choices are GIF and JPEG if I want the best compatibility between different browsers. As Photoshop doesn't allow me to save directly as a JPEG if I use an indexed 256 color palette, I'll save the logo as a GIF. If I want to change file formats later, there are several other programs that can do that for me, or I could save the logo as an RGB mode JPEG, and if necessary convert it to 256 colors in another program. But for now, we're done!

6:30 P.M. Thursday:

One key point I want to make here is the similarity between many of the steps I just went through when creating a new image and what I would have done if the image already existed. In fact, many times you'll use existing and new images and composite them to get the look you want.

Acquiring and Licensing Images:

As discussed earlier, there are two common methods of acquiring images: creation of new images and the re-purposing of existing images. It is becoming much easier to obtain existing images; professional stock photo services license and sell literally thousands of images in all common file formats. There is also a huge repository of public-domain images that can generally be used without licensing issues. For example, many of the materials the United States government produces are in the public domain, and numerous public-domain image libraries exist on the Internet. However, it is not always easy to be certain that a multimedia element is in the public domain.

Generally, if you produce your own new content or hire a professional design service to do the work for you, you will own the copyright to your creations and can use them as you wish. With the increasing availability of scanners and digital cameras, it is becoming very easy to "create" new images. However, remember that just because you go to the trouble of scanning an image or taking a digital picture doesn't mean you have the legal right to use that image. For example, you generally should get a legal release from anyone who appears in your pictures. You can't just take someone's picture and use it!

Numerous legal issues exist when dealing with any multimedia element, be it images, audio, video, or animation. This discussion is not meant to be a comprehensive coverage of legal issues, nor is it intended to provide legal advice; use the information only as a guideline for some of the issues you should be aware of when you're using existing media elements.

In order to use existing content in your multimedia project, three general types of arrangements are possible.

1. Copyright permissions and releases

How the content is to be used or altered has an impact on the legal steps you'll need to take to use the media. Permission letters or release forms signed by the owner of the copyright might suffice if the media is to be used unaltered. If you plan on editing the media, a more comprehensive legal agreement may be necessary. Either way, legal advice is important before you find out that one of the parties interested in your multimedia creation is interested for reasons other than your artistic creativity!

2. Individual or customer releases

Some states have laws against using any person's name and likeness (for example, a photograph) for commercial purposes without prior written consent. "Commercial purposes" usually doesn't just mean selling your multimedia product; it also means promoting any of your company's products. It is wise, if not legally required depending on the state where you reside, to get a signed release from the company and individual if you use any image that is identifiable as that person or company.

3. Trademark agreements

This can be a complex area, since your multimedia project may appear to promote or sponsor a trademarked product or media element. Again, it is important to gain legal advice before using any content that is clearly associated with another company's image or product line.

Creating Audio

Planning

You should ask the same types of questions when you're working with audio and video as you did with images:

What will be the delivery mechanism for the audio?
Will the audio be played from Web pages delivered over a dial-in modem, over a high speed corporate network, from a CD-ROM drive, or from a user's local hard disk?

Again, the answers to these questions are to be found in the concepts of bandwidth and data rate. These questions guide you in determining many of the characteristics of the audio that you will work with, such as the sampling rate, the overall quality of the audio, and whether the audio is mono or stereo. All of these factors determine how quickly a user hears the audio after their request and how good the quality of the audio is once they hear it.

As discussed in other sections, bandwidth is defined as the amount of information a network can carry in a certain time period. For example, a 28.8 modem connection refers to the ability to transmit and receive 28,800 bits of data per second (28.8 Kbps), while a high-speed local area network in a corporation might be able to carry 10 million bits (10 Mbps) or more of data per second. Data rate in the context of this section refers to the amount of data or bits transmitted in a certain time period and is usually expressed in kilobytes (KB), kilobits (Kb), megabits (Mb), or megabytes (MB). This table gives an idea of the amount of audio data associated with different perceived qualities of sound.

Audio Type	Sample rate (frequency)	Bits per sample	Mono or Stereo	Data rate/minute
Telephone-quality speech	11 kHz	8	Mono	662 KB
High-quality speech	11 kHz	16	Mono	1.32 MB
Music	22 kHz	16	Mono	2.65 MB
Music	22 kHz	16	Stereo	5.3 MB
CD-quality	44.1 kHz	16	Stereo	10.6 MB
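
The figures in the table above follow from multiplying the sample rate, the bytes per sample, the number of channels, and 60 seconds. The Python sketch below reproduces them and estimates how long one minute of each type would take to transfer over a 28.8 Kbps modem. It assumes the exact sample rates are 11,025, 22,050, and 44,100 Hz (the table rounds them to 11, 22, and 44.1 kHz), and it ignores protocol overhead and compression; the function names are illustrative.

```python
def bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio data produced in one minute, in bytes."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


def seconds_to_transfer(num_bytes, link_bits_per_second):
    """Ideal transfer time over a link, ignoring overhead."""
    return num_bytes * 8 / link_bits_per_second


# (label, sample rate in Hz, bits per sample, channels)
AUDIO_TYPES = [
    ("Telephone-quality speech", 11025, 8, 1),
    ("High-quality speech", 11025, 16, 1),
    ("Music (mono)", 22050, 16, 1),
    ("Music (stereo)", 22050, 16, 2),
    ("CD-quality", 44100, 16, 2),
]

MODEM_BPS = 28800  # the 28.8 Kbps modem used as an example in the text

for label, rate, bits, channels in AUDIO_TYPES:
    size = bytes_per_minute(rate, bits, channels)
    wait = seconds_to_transfer(size, MODEM_BPS)
    print(f"{label}: {size:,} bytes per minute, about {wait:.0f} s at 28.8 Kbps")
```

Even telephone-quality speech takes roughly three times longer to transfer than to play at this connection speed, which is why compression is covered later in this section.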

Platforms and Browsers

Once the issues are understood for determining how fast the audio will be delivered to the user, the content creator must consider the user's (client's) hardware and software capabilities. For example, the computer operating system, the type and version of Internet browser, and the audio capabilities of the sound card and speakers determine the type and quality of audio the user hears. Microsoft's Internet Explorer version 3.01 or later allows a content producer to use both .wav and .mid audio files, giving a much greater richness to audio delivery on the Web.

Audio source

You've heard this story before: there are two major sources of audio content, existing audio and newly created audio.

Existing audio

Existing audio includes sound clips from "sound clip" packages, stock audio collections, or any other audio clip that already exists in digital form. Working with audio that already exists generally involves five main activities:

Obtaining the audio in digital form.
Normalizing the audio track.
Editing the audio for effects (optional).
Compressing the audio track.
Converting the file format for Web delivery.

If the audio requires editing to change the audio quality and fidelity, these steps generally are the same as those used on newly created audio material.

Creation of new audio material
Does the audio that is to be captured exist in an analog form, or is the audio to be created entirely in a digital domain?

Digital: The creation of new digital audio material can be summarized as:

Capturing audio directly to the computer through audio recording software and a microphone. In this process, the audio is never stored in an analog form that requires conversion to digital.

Using music creation software to create musical pieces for themes, background music, and so on. Generally, these are much longer pieces and involve first steps using MIDI file formats or other specialized audio formats. At some point following creation, the audio files may be converted to .wav for use in multiple Internet browsers for compatibility.

Examples of software packages that can develop entire musical scores
Microsoft Music Producer	Microsoft Corp.
Master Tracks Pro	Passport Designs, Inc.
ConcertWare+	Great Wave Software
Deluxe Music Construction Set	Electronic Arts
MidiSoft Music Set	MidiSoft Corp.

Analog: If the audio exists in analog form, the equipment used to record the audio is generally the same as that used for playback and recording of digital audio on your computer: a multimedia sound board. The most common source of analog audio is videotape or a cassette recording. Using a variety of audio recording hardware, audio clips are converted to digital form and are directly transferred to the computer for storage and later editing.

This list of audio recording hardware is by no means comprehensive, nor does it recommend any specific component. It is meant to give a representative range of equipment offerings available for recording and playing back audio.

Audio recording boards
Sound Blaster AWE32/64	PC	Creative Labs	Playback and recording
Ensoniq	PC		Playback and recording
TARGA 1000 and 1000 Pro	PC/Mac	Truevision Corp	Recording only
TARGA 2000	PC/Mac	Truevision Corp	Recording only
miroVideo DC10-30 series	PC	Miro	Recording only
Videum	PC	Winnov	Playback and recording
Turtle Beach Tahiti+	PC/Mac	Turtle Beach	Playback and recording

Production

"How audio works"

When the "tree falls in the forest, but no one is around," sound waves are generated in the air. These sound waves, or variations in air pressure, are detected by the human ear or a microphone and are converted to electrical impulses. In the case of the ear, these impulses are then received and recognized by the brain as sound. Audio equipment such as a cassette tape recorder receives these varying electrical impulses from the microphone and stores them to be heard later as "sound." This sound is referred to as analog sound and is most commonly stored on magnetic tape.

The amount of change in air pressure is perceived as loudness; no change equals silence, while the greater the change in pressure, the louder the sound. The technical term for loudness is amplitude, and the unit of measure is the decibel (dB). A reference point for "normal" sound is 0 dB; negative numbers represent a lower volume, while positive numbers represent more loudness.
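
Decibel values express a ratio between an amplitude and a reference amplitude on a logarithmic scale, which is why levels below the reference come out negative. A minimal Python sketch, assuming the common convention of 20 times the base-10 logarithm of the amplitude ratio and a reference amplitude of 1.0 (the function name is illustrative):

```python
import math


def amplitude_to_db(amplitude, reference=1.0):
    """Level of an amplitude relative to a reference, in decibels."""
    if amplitude <= 0:
        return float("-inf")  # silence has no finite decibel value
    return 20 * math.log10(amplitude / reference)


for amplitude in (0.25, 0.5, 1.0, 2.0):
    print(f"amplitude {amplitude}: {amplitude_to_db(amplitude):+.1f} dB")
```

An amplitude equal to the reference gives 0 dB, and each doubling or halving of the amplitude changes the level by about 6 dB.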

The rate at which the sound wave changes is perceived as pitch, or what we think of as how high or low the tone is. The more technical term for pitch or the rate of the sound wave is frequency, commonly expressed as cycles per second or hertz (Hz). For example, the range of human hearing is generally said to extend from about 20 Hz up to between 16,000 and 20,000 Hz. An audio system's bandwidth is its ability to reproduce a range of sound frequencies.

Different types of sounds have varying bandwidths; for example, speech is generally between 200 Hz and 5,000 Hz, while a full, rich audio CD will have a much wider bandwidth, possibly encompassing nearly the entire audible range.

Graphically, a sound wave looks like:

OK, what does this have to do with creating audio for multimedia use? When you record or edit audio files, you'll see references to sample frequency, audio bandwidth, and sampling rate. Also, recording and editing processes with computers are based on digital rather than analog sound. Having a general understanding of what these terms mean gives you a better chance of getting the quality of audio that best fits your needs.

Audio file formats

Several different file formats exist for audio files. These formats can be conveniently divided into two categories: digital formats such as .wav, and music "information" files known as MIDI. Wave files are digital files that are commonly used for sound effects, sound clips, voice, and recorded music. MIDI (Musical Instrument Digital Interface) files do not contain sampled sounds in digital form; the sounds are instead synthesized by the computer. A MIDI file is a set of instructions that reproduce the pitch, tone, and duration of each sound. MIDI files are much smaller than digitized files such as .wav, but require more complex software and hardware for recording and playback.

Common digital audio file formats are:
.aif	A standard format for the Apple Macintosh.
.asf	The file format for Microsoft's NetShow streaming audio. This format can contain multiple data types in addition to audio only. This file format supports many compression schemes and works extremely well on low-bit-rate network connections; in other words, 14.4 and 28.8 Kbps.
.au	The standard format for sound files on the NeXT and Sun Sparc computers.
.avi	While not specifically a digital audio file, an .avi file can contain only an audio track or an audio track interleaved with video content and supports many compression schemes.
.ra	The file format for Progressive Networks' streaming audio. It is optimized for low bit rates on 14.4 and 28.8 Kbps network connections.
.snd	This audio file format actually has many variations running on Macintosh, NeXT, Sun, and software-specific platforms.
.voc	A common sound format for PCs created by Creative Labs. This file format supports both 8 and 16-bit data.
.vox	A file format used with specialized voice boards supporting a 4-bit ADPCM compression scheme, which expands to 16-bit on playback.
.wav	A standard Windows audio format that supports many compression schemes. This file format supports 8 and 16-bit data as both mono and stereo audio tracks.
Raw	The native 8 or 16-bit digital sound file format that is not compressed; this is commonly referred to as PCM audio.
.avi, .mov	While these are not specifically an audio format, these video file formats can contain audio, and in fact, can be made up of only audio content. In this case, the audio is optimized according to the audio codec used to compress the audio track.
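
As an illustration of the .wav entry above, the following Python sketch builds a short 8-bit mono .wav file in memory using the `wave` module from the Python standard library. The function name, the 440 Hz tone, and the 11,025 Hz sample rate are assumptions chosen for the example; the data is written to an in-memory buffer rather than to a file on disk.

```python
import io
import math
import wave


def make_tone_wav(frequency_hz=440.0, seconds=1.0, sample_rate=11025):
    """Return the bytes of an 8-bit mono PCM .wav file containing a sine tone."""
    frame_count = int(sample_rate * seconds)
    frames = bytearray()
    for n in range(frame_count):
        value = math.sin(2 * math.pi * frequency_hz * n / sample_rate)
        # 8-bit .wav samples are unsigned, with 128 as the silent midpoint.
        frames.append(int(128 + 127 * value))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(1)          # 1 byte = 8 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


data = make_tone_wav()
print(f"{len(data):,} bytes for one second of 8-bit mono audio")

# Read the header back to confirm what was stored.
with wave.open(io.BytesIO(data), "rb") as check:
    print(check.getnchannels(), check.getsampwidth(), check.getframerate())
```

The file is only slightly larger than the raw sample data, because an uncompressed .wav file is essentially PCM audio preceded by a short header.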

Only a small discussion of MIDI is given in this section. Many excellent references exist if you are interested in this type of audio. MIDI is not actually a sound, but rather a control for electronic musical instruments such as synthesizers and drum machines. A .mid audio file is simply another type of digital data that can be stored, edited, and replayed from a computer.

MIDI format is not supported in Windows or Macintosh video files (.avi and .mov, respectively). Currently only Microsoft's Internet Explorer version 3.0 or later supports .mid files as a source of audio in a browser. So at this point, let's move on.

Recording audio sounds or digitizing audio is usually one of two processes:

Analog audio is captured (commonly referred to as sampled) and is then converted to digital form.
Audio is sampled with digital equipment and no conversion to digital form is required.

In either case, the audio signal can be recorded and converted to a stereo or mono audio track. One important point is that if the original audio is not in stereo, saving the file as stereo doesn't make for a true stereo experience. Granted, the way the original stereo file was recorded and mixed has a great impact on its stereo sound quality. However, merely splitting a mono file into two sound tracks to make it stereo is not the same as creating a quality, mixed stereo file.

When an analog audio signal is digitized or sampled, two properties of the analog audio are recorded: the amplitude, or loudness, of the audio and the time span of the audio signal. When audio is sampled, instantaneous recordings of the sound wave are made over time. The number of samples taken per second is referred to as the sampling rate. The sampling rate MUST be at least two times the highest audio frequency to be reproduced. Now here's where those things called "hertz" become real!

If you are to reproduce audio frequencies up to 10,000 Hz (10 kHz), a sampling rate of at least 20,000 samples per second (20 kHz) is required. Most sound recording boards for computers have the capability of recording sound at various sampling rates, generally 8 kHz, 11 kHz, 22 kHz, and 44.1 kHz. One trade-off of sampling at a high rate is file size; the more samples you take per second, the larger the audio file.
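The two-times rule is easy to check with a few lines of arithmetic. The following is only a minimal Python sketch, not part of any audio tool; the function names are made up for illustration, and it assumes that the nominal "11 kHz" and "22 kHz" sound card settings correspond to 11,025 Hz and 22,050 Hz.

```python
# Common sound card sampling rates mentioned in the text, in Hz.
# Assumption: the nominal "11 kHz" and "22 kHz" settings are 11,025 and 22,050 Hz.
STANDARD_RATES_HZ = [8_000, 11_025, 22_050, 44_100]


def minimum_sampling_rate(highest_frequency_hz):
    """Return the lowest sampling rate that satisfies the two-times rule."""
    return 2 * highest_frequency_hz


def pick_standard_rate(highest_frequency_hz):
    """Return the lowest standard rate that meets the minimum, or None if none does."""
    needed = minimum_sampling_rate(highest_frequency_hz)
    for rate in STANDARD_RATES_HZ:
        if rate >= needed:
            return rate
    return None


if __name__ == "__main__":
    print(minimum_sampling_rate(10_000))  # 20000
    print(pick_standard_rate(10_000))     # 22050
    print(pick_standard_rate(4_000))      # 8000
```

So for the 10 kHz example, the 22 kHz setting is the lowest common sound card rate that qualifies.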

Now that a series of individual samples has been taken at the sampling rate, one more step is required before we have a true digital audio sample. This step is referred to as quantization. Think of each individual sample as being stored at one of a fixed number of levels, or "richness." Sound recording boards are either 8-bit or 16-bit. This means that if audio is recorded as 8-bit sound, then there are 256 levels of richness. If the audio is recorded as 16-bit sound, the number of levels of richness increases to 65,536.

This means that a sound recorded as 16-bit audio can distinguish 256 times as many levels as the same sound recorded as 8-bit audio, while taking up only twice the storage space. The overall quality of a 16-bit recording over an 8-bit one is very noticeable; the human ear is very sensitive to sound quality! This is a key point when recording audio: always record at the highest quality possible for your source material. You can always degrade an audio sample, but you can't make it better than what you started with. If you can get 44.1 kHz, 16-bit audio samples as your source material, do it. Everything will sound much better after you start editing and compressing the sounds later.
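To make quantization a little more concrete, here is a small Python sketch. It is an illustration only: the quantize function and its assumption that the incoming signal lies between -1.0 and 1.0 are hypothetical and do not describe how any particular sound board works.

```python
def quantization_levels(bits):
    """Number of distinct levels available with the given bits per sample."""
    return 2 ** bits


def quantize(value, bits):
    """Map a value in the range -1.0..1.0 to the nearest of 2**bits levels.

    Assumption for this sketch: values outside the range are clamped.
    """
    levels = quantization_levels(bits)
    clamped = max(-1.0, min(1.0, value))
    # Scale to 0..levels-1 and round to the nearest whole level.
    return round((clamped + 1.0) / 2.0 * (levels - 1))


if __name__ == "__main__":
    print(quantization_levels(8))    # 256
    print(quantization_levels(16))   # 65536
    print(quantization_levels(16) // quantization_levels(8))  # 256
    print(quantize(1.0, 8), quantize(1.0, 16))  # 255 65535
```

The loudest possible value lands on level 255 in 8-bit audio and on level 65,535 in 16-bit audio, so the 16-bit recording has far finer steps to work with between silence and full volume.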

Compressing Audio Content

As presented earlier in this audio section, audio files can get quite large depending on their sample rate (frequency), the number of bits per sample (in other words, their perceived quality or richness), and whether they are stereo or mono. For example, a 22 kHz, 16-bit mono music sample requires a data rate of about 2.65 MB per minute to hear it as it was intended when it was digitized. The key to playing audio files over a network of limited bandwidth or storing audio files on a computer hard disk is compression with codecs. Many codecs exist for differing audio needs. Some codecs are optimized for voice, while others are best suited to low-to-high-bit-rate music samples. This list of codecs is not comprehensive, but provides a summary of the commonly used audio codecs in multimedia content.

Audio Codecs

Software Codec	Company	Best Used For
DSP Group TrueSpeech	DSP Group, Inc.	Low-to-mid-bit-rate voice-oriented sound; excellent all around audio codec.
Microsoft Network (MSN) Audio codec	Microsoft Corp.	Low-to-mid-bit-rate audio, both voice and music.
Lernout & Hauspie CELP 4.8kbit/s	Lernout & Hauspie	Low-bit-rate voice audio; not optimized for music.
Microsoft PCM Converter	Microsoft Corp.	Uncompressed, high-quality audio for higher bit-rate content.
Microsoft Adaptive Delta Pulse Code Modulation (ADPCM)	Microsoft Corp.	High-quality compressed audio for higher bit-rate content; good for audio stream associated with high-bit-rate video.
Microsoft Interactive Multimedia Association (IMA) ADPCM	Microsoft Corp.	High-quality compressed audio for higher bit-rate content.
Fraunhofer IIS MPEG Layer 3	Fraunhofer	High-quality audio with low bit-rates; mono and stereo; NetShow and Shockwave use FHG for audio; works better with "mixed audio signals" than pure voice; excellent all around audio codec.
G.723	Intel Corp.	High-quality audio for low bit rates; good voice and music reproduction.
Voxware	Voxware, Inc.	High-quality speech for low bit rates.
Microsoft Groupe Special Mobile (GSM) 6.10	Microsoft Corp.	Mid-to-high-bit-rate voice-oriented sound.
Microsoft Consultative Committee for International Telephone and Telegraph (CCITT) G.711 A-Law and u-Law	Microsoft Corp.	Provided for compatibility with telephone standards for Europe and North America.
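To see why codecs like these are needed, it helps to work out the uncompressed data rate quoted at the start of this section. The Python sketch below is a rough illustration with made-up function names; it assumes "22 kHz" means 22,050 samples per second and that 1 MB is 1,000,000 bytes.

```python
def pcm_bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed (PCM) audio size for one minute, in bytes."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * 60


def pcm_kilobits_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed (PCM) audio data rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


if __name__ == "__main__":
    # 22 kHz (assumed 22,050 Hz), 16-bit, mono
    print(pcm_bytes_per_minute(22_050, 16, 1))     # 2646000 bytes, about 2.65 MB
    print(pcm_kilobits_per_second(22_050, 16, 1))  # 352.8
    # 44.1 kHz, 16-bit, stereo
    print(pcm_bytes_per_minute(44_100, 16, 2))     # 10584000 bytes, about 10.6 MB
```

At roughly 353 Kbps, even the 22 kHz mono example needs more than twelve times the capacity of a 28.8 Kbps modem connection, which is the gap a codec has to close.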

Sample Content Creation Scenario

Just as with images and video, depending on your hardware requirements and budget, you can produce a range of sounds from beeps and clicks to full musical scores. You can find audio software as freeware or shareware, or you can pay thousands of dollars to get professional audio recording and editing capabilities. This section will not cover high-end professional audio production.

This content creation scenario is an overview of the planning and processes that go into creating a new audio clip. It is not meant to be comprehensive, nor is it a tutorial on individual software packages. Consult the documentation for your specific software package if you need help on detailed usage. This scenario briefly presents the processes involved in several separate production areas. Many different tools are available to do this work; I've told the story behind just one way of creating a new audio clip. The example covers:

Recording a new sound from videotape
Audio editing
Audio file preparation

Several months ago, the company I work for made a video for Human Resources recruitment efforts. For this video, a professional music sound track was created. The project I'm working on right now is to develop an illustrated audio presentation using NetShow; you can learn all about this by checking out the section on Creating Illustrated Audio using NetShow in the NetShow Content Creation Authoring Guide. Simply put, illustrated audio is a synchronized audio track with images, text captions like closed captioning on TV, and, if desired, different VBScript or JavaScript features, depending on how much scripting you want to do.

I already have the images I'll use; I just need some cool background music to listen to during various parts of the presentation. The sound track on our recruitment video would be exactly what I'm looking for. I just have to figure out what I actually need for the illustrated audio project and then how I get the audio from videotape to the format I need.
First, I have to determine what type of audio file I need for the final presentation, the browser, and the bandwidth of the network over which the content will be played. The browser and bandwidth issues are easy; this is for internal use in our company and IE 3 is our corporate standard. The network issue is determined for me; our information services group allows delivery of content up to 100 Kbps. The NetShow server acts as a gateway for this point; the server can be set up by an administrator to deliver content only if it is less than a predetermined data rate.
Because the audio track is on videotape, I have to digitize it from the tape source. This can be done in two general ways, one being much better than the other. The first way is to merely play the videotape and record the sound using a microphone plugged into my sound card. Sure, this is easy, but the quality would be terrible. I'd get too much background noise from the air conditioner, people talking in the hall, and so on. So that's not a viable option.
The other method, and the one I'll use, is to record the audio directly from the videotape into the sound card. This way I get a sample as close to the original as possible using a standard computer sound card and a VHS tape deck.
I won't go into the exact cable hookups between the VCR and the sound card, but basically the left and right audio-out terminals on the VCR are connected to the line-in or microphone terminal on the sound card. The next step of setting up and balancing the recording volume actually is the best determining factor for which sound card connection to use.
Balancing your recording levels means optimizing the recording volumes so that you get the highest possible distortion-free volume without getting extraneous background noise from other devices.
Open the Properties dialog box for the sound card in your machine and select the recording properties. This should display the master volume recording panel. Some trial and error is involved, so for starters, place the master volume level at mid-range and set the line-in or microphone level near the maximum setting.
You might see other devices, such as CD-ROM or Synthesizer (MIDI), in the recording panel. Mute all non-essential devices; this cuts down feedback and background noise.
Now open the sound recording program that you'll be using. One of the simplest is Sound Recorder, which comes with the accessories when Windows is installed. More sophisticated sound recording packages, such as Sound Forge from Sonic Foundry, should be used if you want to record very long pieces of audio. Sound Recorder records the audio to the memory (RAM) of your computer, so unless you have LOTS of RAM you'll only be able to record short audio clips. Look for an audio recording and editing package that allows you to record to disk so that you can capture much larger pieces of audio.
Start the playback of the VCR and select the Record button in your sound recording program. You should see a wave pattern display of the audio as it is playing. What you want to do to balance your recording volume is look for the highs and lows of the audio as it is being recorded. If the volume settings are too high, you see the wave patterns being clipped off and appearing to fill the recording window. If the settings are too low, you see only very small variations in the wave pattern.
If the sound is being clipped off because the volume is too high, move the line-in or microphone volume setting down until most of the volume range is displayed without clipping.
If the sound is too low, increase the volume setting. OK, here's the trick with the line-in or microphone input. Depending on your sound card and the volume of the audio source, you might not get enough volume using the line-in input. In this case try the microphone input, as the volume sensitivity through the microphone input is generally greater than with the line-in input.
I only had to make minor volume adjustments with this audio source. Generally, if the audio has been professionally recorded, part of the recording process is to balance the source and get maximum volume with minimal distortion. This simplifies the recording process on my side, as I don't have to try and adjust for wide swings in volume. Keep this point in mind when you're recording audio that will be source material for some future work.
In my sound recording program, set the recording options for the highest-quality sound possible; with most sound boards this is 44.1 kHz, 16-bit stereo sound. I'll need to change this for the audio used in the illustrated audio, because this high-quality audio would be far too large to deliver over our corporate network, not to mention the Internet bandwidth.
OK, let's rewind the tape and start the playback from the point I want to record. Hit the record button and stand by. When finished recording, save the file immediately. Remember the last time you didn't save the file?
In the Save dialog box, many different options are available depending on what codecs I've installed previously. At this point, I want to keep the audio as high-quality as possible. I'll select the uncompressed attribute, which saves the file as PCM, with the same options used for the recording. In this case, 44.1 kHz, 16-bit audio.
Now that I have my digital source file, I need to do some editing and then compress the file into a final delivery format. I'm going to use Sound Forge to edit and compress my file. So I reopen the file and first play it back to see that it sounds like I expect it to.
There are a few clicks, pops, and other artifact noises that I want to get rid of. Sound Forge and some other robust audio editing tools have the capabilities to analyze the audio file and remove extraneous noise. After I've done this and put a fade in and fade out at the beginning and ending, I'm ready to compress the file and degrade it to a bandwidth closer to what I'll need in the final piece.
I save the file as 22 kHz, 16-bit stereo using the MS ADPCM codec. This does some compression, but retains very high-quality sound. The file is probably still too large for my illustrated audio, but I'll use the ASF Editor for final determination of a codec. If you'd like to get more information on the ASF Editor, take a look at the Content Creation Authoring Guide section on Creating Illustrated Audio or refer to the documentation on the ASF Editor that is available when you download the NetShow tools.
The audio file is basically ready for use in a specific project, so I'll move on to the next steps of making the illustrated audio. Talk with you later.
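The level-balancing step in the scenario above relies on eyeballing the wave display, but the same check can be done on the sample values themselves. The following Python sketch is only an illustration of the idea, assuming signed 16-bit samples already loaded into a list; the function names and the "too low" threshold of 25 percent of full scale are arbitrary choices for this example, not values from any recording package.

```python
FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def peak_fraction(samples):
    """Return the loudest sample as a fraction of full scale (0.0 to 1.0)."""
    if not samples:
        return 0.0
    return min(1.0, max(abs(s) for s in samples) / FULL_SCALE)


def clipped_fraction(samples):
    """Return the share of samples sitting at (or beyond) full scale."""
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE)
    return clipped / len(samples)


def describe_level(samples, low_peak=0.25, max_clipped=0.0):
    """Classify a recording level; both thresholds are assumptions for this sketch."""
    if clipped_fraction(samples) > max_clipped:
        return "too high"
    if peak_fraction(samples) < low_peak:
        return "too low"
    return "ok"


if __name__ == "__main__":
    print(describe_level([0, 32767, -32768, 32767, 1000]))  # too high
    print(describe_level([0, 200, -150, 90]))               # too low
    print(describe_level([0, 20000, -18000, 5000]))         # ok
```

A single sample at full scale isn't always audible clipping, so treat a result of "too high" as a prompt to look at the wave display and lower the input volume, not as a final verdict.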

Acquiring and Licensing Audio Content

As discussed earlier, there are two common methods of acquiring audio: creation of new sounds and the use of existing sound material. It is becoming much easier to obtain existing audio clips; professional audio services license and sell literally thousands of sound clips in all common file formats. There is also a huge repository of public-domain audio that can generally be used without having to deal with licensing issues. For example, many sites on the Web have free sound clips that can be used in your multimedia projects.

However, it is MUCH harder to get complete audio pieces such as music. Most music is created and composed for commercial purposes and this immediately makes the legal issues much more complex, or at least much more expensive.

Generally, if you produce your own new content or hire a professional design service to do the work for you, you will own the copyright to your creations and can use them as you wish. With the increasing availability of high-quality audio recorders, it is becoming very easy to "record" new audio material. However, remember that just because you go to the trouble, for example, of recording an interview conducted personally, or recording a concert or a movie or TV sound track, this doesn't mean you have the legal right to use that recording. For example, you should get a legal release from anyone who can be identified in your recordings and get legal permission before using any commercial presentations. This type of media has strict copyrights to prevent this kind of use and distribution.

Remember that the legal issues involved in dealing with any multimedia content that you do not own are complex. This discussion is not meant to be a comprehensive coverage of legal issues nor is it intended to provide legal advice; use the information only as a guideline for some of the issues you should be aware of when you're using existing media elements.

In order to use existing content in your multimedia project, three general types of arrangements are possible.

1. Copyright permissions and releases

2. Individual or customer releases

Some states have laws against using any person's name and likeness, such as their voice, for commercial purposes without prior written consent. "Commercial purposes" usually doesn't mean selling just your multimedia project; it also means promoting any of your company's products. It is wise, if not legally required depending on the state where you reside, to get a signed release from the company and individual if you use any voice recording or image that is identifiable as that person or company.

3. Trademark agreements

Video

Before we start discussing the processes involved in creating video, a "quick and dirty" definition of video is in order. A key factor in what is perceived as video is the frame rate, or frames per second (fps). The human eye is very sensitive to motion. Below about 8 fps, motion is generally seen as varying degrees of jerky slide shows. However, depending on the type of scenes and motion between scenes, there is some variation in the fps and the interpretation as video. Sometimes this is not as obvious as it seems!

For example, a close-up video of a person, commonly referred to as a "talking head," might seem to allow for a lower frame rate and still look like video because there isn't as much going on in each scene. However, when the human eye has fewer moving details to focus on and these details are largely associated with mouth movement and the synchronization of audio, watch out. The end result is often interpreted as very poor video because the lip synch with the audio is poor. Other factors, such as how close the camera is, how stable it is, and how many head movements the person is making, all impact the final video outcome.

Compare this to fast-paced "MTV-like" or sports video that has rapid cuts to different scenes and no voice synchronization. High-motion video such as this can often be reduced in frame rate as much or more and still be acceptable. The eye doesn't have as many details to focus on, the scenes are often presented in a jerky succession anyway, and the audio is not synchronized to lip movements. These factors make the end result look better than one might think when considering just the frame rate.

So there is still a degree of "black magic" associated with creating great-looking video. These issues and more are discussed in greater detail in the video section of Multimedia Basics and in the Creating NetShow Video section of this Web site.

Creating Video

Planning

When working with video, the initial planning questions are the same as for images or audio:

What will be the delivery mechanism for the video?
"Will the video be played from Web pages delivered over a dial-in modem, over a high speed corporate network, from a CD-ROM drive, or from a user's local hard disk?"

Determining the answers to these questions turns out to be related to bandwidth and data-rate issues similar to those encountered when working with audio. These questions guide you in determining many of the characteristics of the video, such as frame rate, data rate, frame size, and codec selection. These factors determine how quickly you see the video, the video quality, and the synchronization of audio to the video stream.

As discussed in other sections, bandwidth is defined as the amount of information a network can carry in a certain time period. For example, a 28.8 modem connection refers to the ability to transmit and receive 28,800 bits of data/information per second (28.8 Kbps), while a high-speed local area network in a corporation might be able to carry 10 million bits (10 Mbps) or more of data per second. Data rate in the context of this section refers to the amount of data or bits transmitted in a certain time period and is usually expressed in kilobits (Kb), kilobytes (KB), megabits (Mb), or megabytes (MB) per second. This table gives an idea of the amount of video data associated with different sizes and frame rates of video.

Video Window Size	Frame rate (fps)	Bits per pixel	Data rate/second	Data rate/minute
640 x 480	30	24	27.65 MB	1,660 MB
640 x 480	15	24	13.82 MB	830 MB
320 x 240	15	24	3.46 MB	207 MB
320 x 240	10	24	2.30 MB	138 MB
160 x 120	15	24	864 KB	52 MB
160 x 120	10	24	576 KB	34.6 MB

These video window sizes are used for example purposes only; once you start working with different video codecs you'll find that your final choice for a window size depends on your objectives for the content, an understanding of your audience, and the codec selected for preparation of the video file. The major point is that video requires huge amounts of data to be transferred even when targeting very small frame sizes. The goals of effective video production are to understand how the balance of video quality and data rates are related and to develop a process that optimizes the video quality at the targeted data rate.
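The figures in the table above come from simple multiplication: width times height times bytes per pixel times frames per second. The Python sketch below approximately reproduces them and also estimates how much compression a given network connection would demand. The function names are invented for this illustration, and it assumes 1 KB is 1,000 bytes and 1 MB is 1,000,000 bytes.

```python
def video_bytes_per_second(width, height, fps, bits_per_pixel=24):
    """Uncompressed video data rate in bytes per second."""
    return width * height * (bits_per_pixel // 8) * fps


def compression_ratio_needed(width, height, fps, link_bits_per_second,
                             bits_per_pixel=24):
    """How many times smaller the video must become to fit the connection."""
    uncompressed_bits = video_bytes_per_second(width, height, fps, bits_per_pixel) * 8
    return uncompressed_bits / link_bits_per_second


if __name__ == "__main__":
    print(video_bytes_per_second(640, 480, 30))   # 27648000 bytes, about 27.65 MB
    print(video_bytes_per_second(320, 240, 15))   # 3456000 bytes, about 3.46 MB
    # Smallest row of the table over a 28.8 Kbps modem, ignoring audio and overhead.
    print(compression_ratio_needed(160, 120, 10, 28_800))  # 160.0
```

Even the smallest window at 10 fps would have to shrink by a factor of about 160 to fit a 28.8 Kbps connection, before any audio is added, which is why codec choice dominates so much of video planning.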

Platforms and Browsers

Once the issues are understood for determining how fast the video will be delivered to the user, the content creator must consider the user's (client's) hardware and software capabilities. For example, the computer operating system, the type and version of Internet browser, and the graphic and video capabilities of the video card in the user's computer determine the overall video quality displayed to the user.

Video source

You've heard this story before; just like with audio, there are two major sources for your video content: existing video and newly created video.

Existing video

Existing video includes clips from "video clip" packages, stock video collections, or any other video clip that already exists in digital form. Working with video that already exists generally involves five main activities:

Obtaining the video in digital form.
Editing the video for frame rate, frame size, and start and end points.
Editing the video for effects (optional).
Compressing the video track.
Converting the file format for Web delivery.

Creation of new video material.
Does the video that is to be captured exist in an analog form or is the video to be created entirely in a digital domain?

If you have read the Multimedia Basics section "Creating Audio," you will quickly notice the similarity between the planning processes for audio and video. In fact, most of the time when working with video, you have an associated audio component. A good way to think of video is as a combination of three elements (images, audio, and video). Video is a superset of concepts and processes used with images and audio.

Digital: The creation of new digital video material can be summarized as:

Capturing video directly to the computer using a digital video camera. In this process the video is never stored in an analog form that requires conversion to digital.

Using animation software such as Macromedia's Director or Flash that allows the synchronization of multiple images along a timeline and path and then the final output as a digital video file such as an .avi or QuickTime .mov file.

Analog: If the video exists in analog form, special video capture equipment is required. In most cases this is a video capture card that is installed in a computer. The most common source of analog video is videotape, either in VHS or S-VHS format. Other videotape formats such as Hi-8 or Beta-SP are sometimes used. Another source of analog video is directly capturing a television signal such as NTSC, PAL, or SECAM.

Video capture boards
Intel Smart Video Recorder III	PC	Intel Corp.
Bravado 1000	PC/Mac	Truevision Corp.
TARGA 1000 and 1000 Pro	PC/Mac	Truevision Corp.
TARGA 2000	PC/Mac	Truevision Corp.
miroVideo DC10-30 series	PC	Miro
Videum and PCMCIA Video Capture	PC	Winnov
VideoVision	Mac	Radius
Image Manipulation System PCI150	PC	Imageman
Nogatech PCMCIA Conferencing Card	PC	Nogatech
Osprey 1000	PC	Osprey
SE100	PC	Creative Labs
Wakeboard Multimedia Pro	PC	Digital Video Arts
FAST's AV Master	PC	FAST
Broadway	PC/Mac	Data Translation
Hollywood and Perception series	PC/Mac	Digital Processing Systems, Inc.
AzeenaVision 500	PC	Azeena Technologies

Digital cameras-motion
Sony	DCR-VX1000
Sharp	Viewcam VL-D500U, VL-DC1U
Panasonic	DVCPRO AJ-D700, PV-DV1000
JVC	GR-DV1 MiniDV CyberCam

Production

The term "video production" has a definitional problem! Are we referring to something analogous to Hollywood's movie production or are we talking about producing video in digital form for use on a computer? We'll address both of these scenarios since they are, from the perspective of this discussion, portions of an overall process of creating, capturing, or digitizing video for use in a computer-based environment. Whether you are creating, capturing, or digitizing video, two processes are likely to be involved:

1. Analog video is created, and then captured and converted to digital form.
2. Video is sampled or created directly into digital form using a digital camera.

In either case, the video signal is ultimately stored as a pure digital signal for editing and output using a computer and a variety of software applications.

File Formats

Several different file formats exist for working with digital video files on a computer. These formats can be conveniently divided into three categories: streaming digital video files (.asf), digital video files (.avi and .mov), and hardware-based video files (.mpg). ASF files are digital files that are used to deliver streaming video using Microsoft's NetShow server and client player. The primary advantage of this file format is that the file can start playing right away, without first being downloaded in full to your hard disk. AVI and MOV are currently the most common digital video file formats. MPG is a file format based on a hardware-dependent compression scheme.

Common digital video file formats
.asf	Microsoft Active Streaming Format. The file format for Microsoft's NetShow streaming video. This format can contain multiple data types in addition to video. It supports many compression schemes and works well on a wide range of network connections, from low Internet bit rates of 14.4 and 28.8 Kbps to higher intranet bit rates of 56 Kbps and above.
.avi	Audio-Video Interleaved. A standard video format for computers operating on Microsoft Windows or IBM OS/2.
.mov	QuickTime video. The file format developed by Apple Computer that displays video, audio, and animation on Macintosh and Windows.
.mpg	Moving Picture Experts Group (MPEG). A hardware-based motion video standard developed to approximate VCR-quality video at 30 fps.

Producing New Analog Video

Have you ever wanted to make movies like Steven Spielberg? Well, maybe you won't have millions of dollars to spend on your next video project, but some of the ideas we'll discuss next are part of a Hollywood director's vocabulary.

Just how involved you'll be in the actual filming process depends on the size of the team you can assemble, the equipment you have available, and the amount of time and money you can spend. Regardless of whether you hire a video production group to do most of the work or you work with a one- or two-person team, knowing something about the concepts and processes involved in producing new analog video will prove useful in any video project.

The actual process of recording the video or "shots" isn't as simple as merely taking a camera and shooting the scenes you want. Shooting video is a complex process made up of many components. The foundations for shooting video are camera angle, camera movement, and composition.

Camera angle is the position of the camera, which gives the viewpoint or perspective of the story to be presented by the video. The video should have a frame or visual area in which the important action occurs. A common rule is to mentally divide the visual area into nine equal rectangles (three rows by three columns). Then align the action at one of the four intersections of the dividing lines. This provides a visual reference point for action, making it easier for the viewer to keep focused on the action and, as you'll see later, helps keep the amount of camera motion to a minimum. This point is important later when you have to be concerned about the overall motion in a scene and how effectively the codecs interpret this motion.
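If you want to know where those four intersections fall for a particular frame size, the arithmetic is straightforward. This is a minimal Python sketch with a made-up function name; it assumes pixel coordinates measured from the top-left corner of the frame and rounds to whole pixels.

```python
def thirds_intersections(width, height):
    """Return the four points where the lines dividing a frame into thirds cross.

    Coordinates are (x, y) in pixels from the top-left corner, rounded to
    whole pixels (an assumption made for this sketch).
    """
    xs = (round(width / 3), round(2 * width / 3))
    ys = (round(height / 3), round(2 * height / 3))
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    print(thirds_intersections(300, 240))
    # [(100, 80), (200, 80), (100, 160), (200, 160)]
    print(thirds_intersections(640, 480))
    # [(213, 160), (427, 160), (213, 320), (427, 320)]
```

Placing the main subject near one of these points, and keeping it there, gives the viewer a steady reference without requiring the camera to chase the action.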

Camera movement is a very important point when shooting video for compression into low or medium data rates. The details of codec selection are discussed in the following section, "Compression and Storage." Suffice it for now to say that the movements of all objects within a video scene, whether intentional or unintentional, impact the overall video compression.

However, camera movement might be a vital part of the video shoot; so let's cover some of the important types of camera movements and when to use or not use them. One of the most common mistakes in shooting video is allowing excess movement, be it intentional or unintentional. Intentional movement, while planned, might actually detract from the video content. For example, the viewer could have a hard time focusing on the main point of the video shot, or you might find that the video can't be efficiently compressed because of the way the codec interprets motion. This topic is discussed in more detail in the "Video Codec" section later in this article. Unintentional motion is defined here as movement resulting from an unstable camera, such as when you're holding the camera and walking.

If I were to give just one suggestion for shooting video, it would be to use a tripod for the camera. Even though some newer consumer video cameras have "auto-stabilizing" capabilities, there is only so much they can do to counteract the "jittery hand-held camera." If you can't use a tripod, try to position the camera on some stable object, such as against a wall or on a table.

An overview of commonly used camera movements in video recording is:

Panning

The camera remains stationary and follows the action horizontally. This technique is used in presenting a panorama of a landscape.

Tracking

The camera moves on a horizontal plane across the scene following the action.

Tilting

The camera remains stationary but moves up and down (for example, when showing the height of a tall tree).

Craning

The camera moves vertically following the action.

Zooming

The camera remains stationary, but an object you are focused on increases and decreases in size. A common example is when a scene with a person in it "zooms" in to show a close-up of the person's head.

Dollying

The camera moves in and out from the scene (for example, if the video is taken from the perspective of a person walking down a trail).

The important point to realize in all these camera movements is exactly that: movement. Anytime the individual frames of video are changing their perspective, in addition to motion occurring in the scene itself, the complexity is greatly increased. This increased scene complexity makes it much more difficult to compress the video to a low or medium bit rate while keeping high quality.
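A crude way to see why camera movement adds complexity is to count how many pixels change from one frame to the next. The Python sketch below builds tiny synthetic 8 x 8 frames for this purpose; everything in it (the frame size, the background pattern, the function names) is invented for illustration. Real codecs are far more sophisticated, and many of them estimate motion and can partly compensate for a smooth pan, so treat the raw pixel count only as a rough proxy.

```python
WIDTH, HEIGHT = 8, 8


def background(x, y):
    """A textured background so that shifting the camera changes pixel values."""
    return (x * 7 + y * 13) % 256


def render(camera_offset, object_x):
    """Render one frame: the background seen from camera_offset plus a 2x2 bright object."""
    frame = [[background(x + camera_offset, y) for x in range(WIDTH)]
             for y in range(HEIGHT)]
    for y in (2, 3):
        for x in (object_x, object_x + 1):
            frame[y][x] = 255
    return frame


def changed_fraction(frame_a, frame_b):
    """Fraction of pixels whose value differs between two frames."""
    total = len(frame_a) * len(frame_a[0])
    changed = sum(1 for row_a, row_b in zip(frame_a, frame_b)
                  for a, b in zip(row_a, row_b) if a != b)
    return changed / total


if __name__ == "__main__":
    first = render(camera_offset=0, object_x=2)
    # Camera on a tripod: only the object moves one pixel to the right.
    print(changed_fraction(first, render(camera_offset=0, object_x=3)))  # 0.0625
    # Camera moves one pixel: the whole background shifts behind the object.
    print(changed_fraction(first, render(camera_offset=1, object_x=2)))  # 0.9375
```

With a fixed camera only about 6 percent of the pixels change, while a one-pixel camera move changes about 94 percent of them, which gives a feel for how much more work a codec faces when the camera itself is moving.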

Composition is closely associated with camera movement. Composition in the context of this discussion refers to the ratio between the amount of action and the overall view or size of the scene. For example, if the key action in a scene is a person standing in front of a fountain and the overall view is from far away, this would be referred to as a long shot. If the person, from head to foot, took up most of the frame, this would be a medium shot. A close-up, as you can guess, is when the person's face fills the frame.

The important point in composition, relative to producing video that works well in a compressed form delivered over a network, is that rapid or numerous changes between different types of shots add to the complexity of a scene.

Production equipment

As discussed earlier, a variety of video capturing and recording devices exist. In this discussion of producing new video, we're focusing on the actual video recording process. Therefore, two main categories of video cameras should be mentioned: analog video cameras and digital video cameras. We will not discuss specific brands and features of either type of video camera, but we will mention several points related to quality and ease of use that you should consider in creating a multimedia presentation.

The old saying "garbage in, garbage out" is especially true when producing video for use in multimedia presentations. One of the main keys to great video is great equipment. The money you spend when buying or renting a high-end video camera (rather than just using your consumer camera that's been sitting in the closet) pays off tremendously in the end result. With that said, here are a few guidelines regarding video camera equipment.

Currently, analog video cameras are much more common than digital video cameras. This is likely to change in the future, but for now you're more likely to work in the analog realm while doing video recording. Two general categories of analog cameras are consumer-level and commercial-level. This division is based not so much on cost as it is on quality.

Consumer-level cameras commonly record in VHS or 8mm format. The commercial-level cousins are S-VHS and Hi-8 format, respectively. Even higher-quality commercial cameras use the Beta format. Based on the premise that your video and audio source should be as high quality as possible, the best choices for an analog camera are:

Betacam
S-VHS or Hi-8
VHS or 8mm

Once you have the video recorded, remember that you still have to go through a conversion to digital form. This requires three important equipment-related items: first, a video player (VCR) capable of playing your video source format; second, output connections on that VCR of the appropriate type for your capture card; and last, a video capture card that can accept the video format of your recording. A key point, often overlooked, is that the playback equipment, capture equipment, and video recording format must all be compatible. For example, you can record in high-quality Beta format, but if your VCR can't play Beta tapes or your capture card doesn't have suitable connections for a Beta VCR, you're in big trouble. In summary, plan, plan, and then plan again!

Using a digital video camera removes many of the issues described above for analog video cameras. First, you don't have to do an analog-to-digital conversion, which can decrease video quality even with the best equipment and processes. Second, you don't have to deal with video format compatibility issues between the camera, video player, and capture card.

Capturing and converting existing video

Once you have your video, the next steps depend on whether the video is in analog or digital form. If it is already in a digital form, the capture/digitization process, described next, is unnecessary. Your next concern would be the editing and compression steps. However, if you have an analog video source, we have to discuss the capture/digitization process necessary to convert your analog video into digital video.

Equipment

One of the first steps necessary for digitizing video is getting a video capture card. As with most things in life and, as you're finding out, with multimedia, money plays a big part. In general, video capture boards that are capable of very high-quality video capture cost considerably more than the "consumer-level" video capture boards. To put this in perspective: consumer-level video capture boards can be purchased for a sum ranging from a few hundred dollars up to around $1,000, while the production-level video capture boards can easily run up to $3,000 or $4,000. If you want to go with a dedicated video system for video capture and video editing, be prepared to spend $10,000 or more.

This list of video and image capture hardware is by no means comprehensive, nor does it recommend any specific component. It is meant to give a representative range of equipment costs available for video and image capture. For a more complete list of video capture equipment, see the earlier section in this article.

Video capture boards
Under $1000
Intel Smart Video Recorder III	PC	Intel Corp.
Bravado 1000	PC/Mac	Truevision Corp.
miroVideo DC10, 20 & 30 series	PC	Miro
Videum	PC	Winnov
Broadway	PC/Mac	Data Translation
FAST's AV Master	PC	FAST
Over $1000
Hollywood and Perception series	PC/Mac	Digital Processing Systems, Inc.
TARGA 1000 and 1000 Pro	PC/Mac	Truevision Corp.
TARGA 2000 and 2000 DTX	PC/Mac	Truevision Corp.
VideoVision	Mac	Radius

Process

Once you have installed a video capture board, turn your attention to the software that you'll use to actually perform the video capture. Video capture software can be divided into two groups: the software accompanying your video capture board, and video editing software that can communicate with your video capture board. Generally, dedicated video editing software has more functionality than hardware-specific software provided with your board.

Several commonly used video capture software packages are:

Adobe Premiere
Microsoft VidCap
Asymetrix Digital Video Producer
Ulead Systems Media Studio Pro
Corel Lumiere

After the video stream has been digitized, the main objective is to set the desired frame rate, frame size, and data rate. These factors have a direct impact on the success of your delivery. Improper selection of any of these video attributes causes quality and bandwidth problems. If the video requires editing, such as removing unwanted frames or adding transitions, most video editing packages allow you to do this at the same time that you set the size and the frame and data rates.

Setting these attributes is directly linked to the compression scheme you select. Certain codecs are more effective when, for example, certain frame sizes are selected. VDOwave is designed to work best with 160 x 112 frame sizes, while MPEG-4 allows a much wider selection of frame sizes. Each attribute has its own characteristics that you need to understand in order to select the most effective codec for your content. The next section, "Compressing Video Content," should be read carefully to answer many of your questions concerning this production phase.

If you want a general overview of the video capture through editing and conversion process, read the "Sample Content Creation Scenario" section later in this article.

Compressing Video Content

As discussed earlier in other sections, video files can get quite large depending on their frame rate, frame size, and color depth. For example, a 320 x 240, 15-fps video clip captured at 24-bit color can require a data rate of 3.46 MB per second for the video alone, before any audio is added. The key to playing video files over a network of limited bandwidth or storing video files on a computer hard disk is compression with codecs. Many codecs are available for differing video needs. For example, some codecs are optimized for low bit rates, while others are optimized for high bit rates. Others have optimal frame sizes and frame rates. As part of the planning process, it's important to determine what these factors are for your content; with this understanding you'll be better able to select a codec optimal for your video needs. This list of codecs is not comprehensive, but provides a summary of video codecs commonly used when creating multimedia content.
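The 3.46 MB per second figure can be checked with a little arithmetic. The following is a minimal sketch in Python; the function name is made up for illustration, and it counts only the picture data (no audio, no file overhead).

```python
def uncompressed_video_rate(width, height, fps, bits_per_pixel):
    """Return the raw (uncompressed) video data rate in bytes per second."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps


# The example from the text: 320 x 240, 15 fps, 24-bit color.
rate = uncompressed_video_rate(320, 240, 15, 24)
print(f"{rate:,} bytes/s = {rate / 1_000_000:.2f} MB/s")
# prints: 3,456,000 bytes/s = 3.46 MB/s
```

Changing any one of the four inputs shows how quickly frame size, frame rate, and color depth drive the data rate up or down before a codec is applied.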

Video Codecs

Note: In the context of this discussion of codecs, bit rate refers to the approximate network data rates of:

Low: 40 Kbps or lower

Medium: 50 to 150 Kbps

High: greater than 150 Kbps

As a point of reference, a 2x CD-ROM drive has a data transfer rate of approximately 300 KBps (kilobytes per second), or about 2.4 Mbps (megabits per second); and we think a CD-ROM drive is slow!
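Kilobytes per second and kilobits per second are easy to mix up, so it can help to convert everything to one unit before comparing. The sketch below is only an illustration: the function names are invented, a 2x CD-ROM drive is taken as roughly 300 kilobytes per second, and rates between 40 and 50 Kbps (which the note above leaves unassigned) are treated as medium by assumption.

```python
BITS_PER_BYTE = 8


def kilobytes_to_kilobits(kilobytes_per_second):
    """Convert a rate in kilobytes per second to kilobits per second."""
    return kilobytes_per_second * BITS_PER_BYTE


def bit_rate_class(kbps):
    """Map a rate in Kbps onto the low/medium/high classes in the note.

    Assumption: the unassigned 40-50 Kbps gap is counted as medium.
    """
    if kbps <= 40:
        return "low"
    if kbps <= 150:
        return "medium"
    return "high"


cd_2x = kilobytes_to_kilobits(300)   # 2x CD-ROM, roughly 300 kilobytes/s
print(cd_2x, bit_rate_class(cd_2x))  # 2400 high
print(bit_rate_class(28.8))          # low
```

Seen this way, even a "slow" 2x CD-ROM drive delivers many times the data rate of a 28.8 Kbps modem connection.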

Software Codec	Company	Best Used For
Indeo Video Interactive R4.1	Intel Corp.	Full motion, 24-bit video at mid-to-high-bit-rates; slow compression times even on fast machines; higher-quality video than Indeo 3.2, Microsoft Video, or Microsoft RLE; video displays best on fast processors.
Indeo Video R3.2	Intel Corp.	Useful for 24-bit video at mid-to-high-bit-rates; best used on raw video source media that hasn't been previously compressed with another lossy compressor; has low CPU utilization; quality comparable to Cinepak with lower bit-rates.
VDOnet VDOwave	VDOnet Corp.	Low-to-mid-bit-rate video; small window sizes; optimized for Internet delivery of high-quality, low-rate video.
H.263	Intel Corp.	Video telephony standard designed for low-bit-rate video over 28.8 Kbps connections.
MPEG-4	Microsoft Corp.	A limited implementation of the MPEG-4 video standard; excellent for low-to-mid bit-rate video delivery.
TrueMotion® RT (Duck)	The Duck Corp.	Full motion, mid-to-high-bit-rate video. Provides excellent video quality and playback performance.
ClearVideo	Iterated Co.	Low-bit-rate video delivery for Video for Windows and QuickTime platforms.
Cinepak	Radius Corp.	Full motion, high-bit-rate video. Provides good video quality with good playback performance.
Microsoft Video 1	Microsoft Corp.	Full motion, moderate quality video with low CPU overhead, 320 x 240 or smaller, 15 fps or less. Supports only 8-bit (256) color.
Microsoft Run-Length Encoding (RLE)	Microsoft Corp.	Intended for compressing clean graphic images such as bitmaps. It has a low CPU overhead, but does not handle rapid, complex scene changes well.
Indeo Video Raw (YVU9C)	Intel Corp.	Useful for capturing uncompressed video of high quality. This is NOT the same as capturing with no compression, in other words, raw video. Large files and high bit rates, but excellent image quality. This is the BEST source, along with Raw video, of video content to be compressed by other methods later.
Hardware Codecs
Motion-JPEG	ISO and Consultative Committee, International Telegraph and Telephone	Intended for compressing a series of JPEG images. No audio capabilities are available with Motion JPEG. Motion JPEG is generally quicker in displaying images than MPEG; however, the file size is two to three times larger than an equivalent MPEG video.
MPEG-1	ISO and Consultative Committee, International Telegraph and Telephone	Intended for delivery of high-quality, 30-fps motion video at a frame size of 352 x 240 compressed to a data rate of approximately 150 KBps (kilobytes per second; in other words, equal to single-speed CD-ROM performance).
MPEG-2	ISO and Consultative Committee, International Telegraph and Telephone	Intended as a broadcast video standard providing 720 x 480 playback at 30 fps. To achieve this high quality, the data rate is very high, ranging from about 500 KBps to greater than 2 MBps. Because of this high data rate, MPEG-2 is currently best suited for dedicated video servers.
DVI (Digital Video Interactive)	Intel Corp.	Based on a chip set developed by Intel and used by IBM for video and audio compression and decompression. The software portion of DVI requires this special, proprietary hardware, hence the term hardware codec. To date, this codec has not received widespread use; however, more recent hardware advances might change this scenario. Currently this codec is unlikely to be part of the content producer's arsenal for compressing video.

Sample Content Creation Scenario

The creation of video content is the compilation of work with images, audio, and video. Video creation may involve only video capture and conversion steps, or you might want to do additional video and audio editing. Even basic video creation commonly involves a wide variety of content creation software applications, from image creation and editing, to audio and video creation and editing. This section will not cover high-end professional video production.

This content creation scenario is an overview of the planning and processes that go into creating a new video clip. Just as with the image and audio content creation scenarios, it is not meant to be comprehensive, nor is it a tutorial on individual software packages. Consult the documentation for your specific software package if you need help on detailed usage. This scenario briefly presents the processes involved in producing a video clip. Many different tools are available to do this work; this story tells about one way of creating a new video clip from videotape. The example covers:

Capturing video from videotape
Video editing
Audio editing
Video file compression

As I described in the image creation scenario, Human Resources recently had a videotape produced to help their recruitment efforts. An outside video production company was hired to develop the script, do the actual video shoot, provide the talent, and ultimately deliver a finished videotape in both S-VHS and VHS format. Human Resources wants to use Microsoft NetShow to develop a series of video recruitment clips for the Internet; the creation of these Internet-ready video clips is my task. This scenario doesn't describe the details of producing the NetShow-delivered content; you can learn more about this by checking out the section on Creating Video using NetShow in the Content Creation Authoring Guide. This section focuses on the steps through the production of the NetShow streaming file format, ASF.

The videotape is available in two formats: S-VHS and the more common VHS. A couple of decisions are important here. First, I should select the highest-quality source available, which in this case is the S-VHS videotape. However, this format requires an S-VHS player and a video capture card with S-Video connectors. So far, so good, as my equipment does meet both requirements.

First, I have to get the S-VHS deck hooked up to the video capture board in my computer. I'm using a Truevision Targa 1000 video capture board, so the cable hookup is going to be different if you have another type of video capture board. But never fear, the general idea is the same. OK, where did I put those S-Video and audio cables?

Because the audio track is on videotape, I have to digitize it from the source tape. This can be done in two general ways, one being much better than the other. The first way is to merely play the videotape and record the sound using a microphone plugged into my sound card. Sure, this is easy, but the quality won't be what I want. I'd get too much background noise from the air conditioner, people talking in the hall, and so on. So that's not an option.

The other method, and the one I'll use, is the same as described in the audio content creation scenario: record the audio directly into the sound card from the videotape. This way the sample comes out as close to the original as is possible using a standard computer sound card. The one difference from the audio-only scenario is that the audio is captured during the video capture process, not as a separate recording step.

One of the most important early steps when working with audio is balancing and optimizing your recording levels. This means optimizing the recording volumes so you get the highest possible distortion-free volume without getting extraneous background noise from other devices.

One way to do this is to open the master volume recording panel and select the recording properties option. Some trial and error is involved, so for starters, place the master volume level at mid-range and set the line-in or microphone level near the maximum setting.

You might see other devices such as CD-ROM or Synthesizer (MIDI) in the recording panel. You should mute all non-essential devices; this cuts down feedback and background noise.

Even though I won't be capturing the audio in a separate step, I need to use a sound recording application to judge the optimal volume setting. One of the simplest is Sound Recorder, which comes with the accessories when Windows is installed. More sophisticated sound recording packages, such as Sound Forge from Sonic Foundry, should be used if you want to record very long pieces of audio. Sound Recorder records the audio to the memory (RAM) of your computer, so unless you have LOTS of RAM you'll only be able to record short audio clips. For this step, that's not a problem; this generally takes only a few minutes.
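To get a feel for why a recorder that holds everything in RAM only suits short clips, here is a rough sketch of the arithmetic. The format (44.1 kHz, 16-bit, stereo) and the 16 MB of free memory are assumptions chosen for illustration, not settings taken from this scenario, and the function names are made up.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels


def seconds_that_fit(ram_bytes, bytes_per_second):
    """How many seconds of audio fit in a given amount of memory."""
    return ram_bytes / bytes_per_second


# Assumed format: 44.1 kHz, 16-bit, stereo (CD-quality audio).
rate = pcm_bytes_per_second(44_100, 16, 2)
print(rate)  # 176400 bytes per second

# Assumed free memory: 16 MB. Roughly a minute and a half of audio.
print(round(seconds_that_fit(16 * 1024 * 1024, rate)))
```

For a quick level check that lasts a few minutes at a lower-quality setting this is fine; for long recordings, a package that records to disk is the safer choice.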

Start the playback of the VCR and select the Record button in your sound recording program. You should see a wave pattern display of the audio as it is playing; a volume display also is shown in the master volume control panel. What you want to do to balance your recording volume is look for the highs and lows of the audio as it is being recorded. If the volume settings are too high, you see the wave patterns being clipped off and appearing to fill the recording window. If the settings are too low, you see only very small variations in the wave pattern. In the master volume control panel you'll see red when the volume is too high and green when the volume is within normal level.

If the sound is being clipped off because the volume is too high, decrease the line-in or microphone volume setting until most of the volume range is displayed without clipping.

If the sound is too low, increase the volume setting. OK, here's the trick with the line-in or microphone input. Depending on your sound card and the volume of the audio source, you might not get enough volume using the line-in input. In this case, try the microphone input, as the volume sensitivity through the microphone input is generally greater than with the line-in input.
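The visual check described above (clipped peaks mean too high, a nearly flat wave means too low) can also be expressed as a simple test on the recorded samples. This is only a sketch under stated assumptions: the samples are signed 16-bit values already loaded into a list, and the two thresholds are arbitrary illustrative numbers, not values from any recording package.

```python
FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def check_levels(samples, clip_fraction=0.001, quiet_peak=0.1):
    """Classify a test recording as 'too high', 'too low', or 'ok'.

    clip_fraction and quiet_peak are assumed thresholds for illustration.
    """
    if not samples:
        raise ValueError("no samples to analyze")
    # Loudest sample, as a fraction of full scale.
    peak = max(abs(s) for s in samples) / FULL_SCALE
    # Share of samples sitting at the limit, which is what clipping looks like.
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE) / len(samples)
    if clipped > clip_fraction:
        return "too high"
    if peak < quiet_peak:
        return "too low"
    return "ok"


print(check_levels([32767, -32768, 32767, 100]))  # too high
print(check_levels([100, -200, 150]))             # too low
print(check_levels([12000, -15000, 8000]))        # ok
```

A result of "too high" corresponds to lowering the line-in or microphone level, and "too low" to raising it or switching inputs as described above.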

Because I've worked with this videotape before, I had an idea of the settings I would need. Generally, if the audio has been professionally recorded, part of the recording process is to balance the source and get maximum volume with minimal distortion. This simplifies the recording process on my side, as I don't have to try and adjust for wide swings in volume. Keep this point in mind when you're recording audio that will be source material for some future work.

The audio and video are captured using Adobe Premiere. Other video editing programs can also capture video, and in most cases the process is very similar to the one I describe here. Once Premiere is started, I go to the File menu and select Capture>Movie Capture.

I want to make sure the video signal is actually passing through the video capture board. I just start the videotape playing and I should see the video in the Preview window. The first thing to check if you don't see the video is the Video Source dialog box. Notice that after you select the Capture menu item a new menu appears, Movie Capture. This menu has all the settings for video, audio, and video source. In the Video Source dialog box you'll see two choices: VHS and S-Video. In my case I'm working with an S-VHS deck, so I've selected the S-Video setting. If this setting is right and you still don't get any video signal, check the cables. But today things are going great; I see and hear my video! At this point I'm ready to set my video and audio recording options to suit my project needs.

Under the Movie Capture menu I see Recording Options and the Audio Recording Options. Most of the choices in these dialog boxes are straightforward, but it's always a good idea to check each setting every time you capture video. The main point is that I want to capture the video and audio at the highest quality possible and then do any editing and compression after I have a high-quality digital video source.

The general settings I use for digitizing source video and audio are as follows. Remember that these options are dependent on the video capture board you are using; these are optimized for the Truevision Targa 1000.

Looks like I'm ready to do the actual video and audio capture now. I've already determined what portion of the videotape I want to capture. I can always do final frame-level edits after I get the captured video saved to disk. I rewind the video and start Play; when I get to my predetermined point to start the capture, I press the Record button in the Preview window in Premiere. As soon as I get to the end of my capture, I stop the process by hitting Esc or clicking the mouse button.

I replay the preview of my capture to see if it looks and sounds OK. That wasn't too complicated; it didn't drop any frames and it's pretty good quality; I'd better save it and move on to the next steps.

The file format used by the Targa 1000 for the original video capture is a hardware-dependent JPEG that can only be viewed using the codec provided with the video capture board. That's not too convenient if I want anyone else to view this file; also, the size and high data rate result in a huge file. Two minutes of this video at my settings results in a source .avi file of nearly 600 MB!
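Those two numbers (about 600 MB for two minutes) are enough to estimate the capture data rate and to plan disk space for longer captures. The sketch below uses only those figures; the function names and the ten-minute example are made up for illustration.

```python
def capture_rate_mb_per_s(file_size_mb, duration_s):
    """Average data rate of a captured file in MB per second."""
    return file_size_mb / duration_s


def disk_space_mb(rate_mb_per_s, duration_s):
    """Disk space needed to capture duration_s seconds at a given rate."""
    return rate_mb_per_s * duration_s


# From the text: roughly 600 MB for a two-minute capture.
rate = capture_rate_mb_per_s(600, 2 * 60)
print(rate)  # 5.0 MB per second

# Hypothetical ten-minute capture at the same settings.
print(disk_space_mb(rate, 10 * 60))  # 3000.0 MB
```

At about 5 MB per second, both free disk space and sustained disk throughput need checking before a long capture.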

So the next steps I need to take are to convert this file to a more convenient file format for digital editing and portability. Premiere allows me to set up a Pre-Set configuration for working with files having a common configuration. This allows me to set all the necessary parameters such as frame size, frame rate, codec, and so on. Having done this recently, I have my Source Video pre-set ready to go. All I have to do is open a new file and select the Source Video pre-set.

To convert the video capture file I have to import the file by means of File>Import>File. Once the video file is displayed in the Project window, I drag-and-drop the video clip onto video track 1 of the Construction window. OK, that's done.

This series of steps prepares for the conversion of the video file to the output and compression configuration that is necessary for my final video source file. Go to the Make menu and select Make Movie. After I give my file a name and home, I need to set the output and compression options. These dialog boxes give an idea of the most common settings I use for this stage of video capture, conversion, and compression. Again these settings might be different from the ones you commonly use, but considering my hardware setup, they work quite well in preparing a file to be the video source for all future work.

Once the file is compressed and saved, I'm ready to do any fine tuning of the video source. There are a couple of small edits I want to make, so I open a new project, import the compressed source file, and then drag-and-drop it onto the first video track of the Construction window. This is the same process I just did with the original captured video file.

I'm not going to have to do much editing other than adding a fade-in from black and a fade-out to black. This is easily done by selecting the Cross Dissolve transition from the Transitions window and dragging it to the beginning and ending frames of the video file. Premiere has a Transitions track in the Construction window, so you can't go wrong with this step. I want each transition to play for one second, both on the fade-in and on the fade-out. All I have to do is position the transitions below the appropriate times on the video track.

Now I have to set the final output and compression options that are needed for the HR recruitment Web site. Because this content is to be delivered on the Internet, I have to target a low bit rate of 28.8 Kbps. This narrows my options; the frame size is going to be small, around 160 x 112 or 176 x 144 depending on the specific codec I choose. That choice is also narrowed down because of the low data rate necessary to deliver video over the Internet. The codec choices depend on what has been installed on my system; yours might be different.

The current solutions for delivering video over the Internet are commonly based on streaming technology. The one I've used, and the one I'll talk about more, is Microsoft's NetShow software. One of NetShow's delivery targets is the Internet, so it includes some optimized video and audio codecs. Excuse the digression; let's get back to selecting the optimal codec for the compression phase of this work. Basically, I have two codecs that are optimized for high-quality, low-bit-rate video delivery: Microsoft's MPEG-4 and VDOnet's VDOwave. Both work well when compressing video for Internet delivery. The MPEG-4 codec gives a few more options and I'm more familiar with all the ins and outs of this codec, so that's what I'll go with.

Once I select the MPEG-4 codec, the Configure button becomes active. This is where I set the final Internet target data rate and image quality settings. The main thing to remember is that while the Internet is discussed in terms of 28.8 Kbps, it rarely, if ever, can deliver that much data consistently. I'll bet most of the time you connect to the Internet, you don't get a 28.8 Kbps connection! With the NetShow-based streaming technology I target between 20 and 24 Kbps. Remember that's the TOTAL bandwidth, including video, audio, error correction, and any other data that ensures a more error-free transmission. So figuring backwards from this number, the actual video setting should start around 16 Kbps. The key factors in how much you can vary this number are the amount of motion in the video and the quality of the audio that you require. The better the audio, the less bandwidth you have for video. And the more motion in a video, the more bandwidth you need to allocate to the video to get good image and motion quality. It's a balancing act between audio and video within a small bandwidth.
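The "figuring backwards" step is simple subtraction, and it can be sketched as follows. Only the numbers from the text are used (a 20 to 24 Kbps total target and a 16 Kbps starting point for video); the function name is invented, and how the remainder is split between audio and error correction is left open because it depends on the codecs chosen.

```python
def non_video_budget_kbps(total_kbps, video_kbps):
    """Bandwidth left for audio, error correction, and other data."""
    if video_kbps > total_kbps:
        raise ValueError("video rate exceeds the total target")
    return total_kbps - video_kbps


VIDEO_START_KBPS = 16  # starting video setting from the text

for total in (20, 24):  # low and high ends of the total target
    left = non_video_budget_kbps(total, VIDEO_START_KBPS)
    print(f"total {total} Kbps -> {left} Kbps for audio and overhead")
# prints:
# total 20 Kbps -> 4 Kbps for audio and overhead
# total 24 Kbps -> 8 Kbps for audio and overhead
```

Raising the audio quality eats into the 4 to 8 Kbps that is left, which is why a higher-quality audio track usually means lowering the video setting below 16 Kbps.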

Looks like I'm ready to go; I've got my frame size and frame rate set and the audio and video codecs are selected and configured. To get details on the options available to you when using NetShow for developing Internet or intranet-based audio and video content, look at the "Creating NetShow Audio" and "Creating NetShow Video" sections of this guide.

After the video is compressed I can preview the video clip; looking good! One main step remains; that is to convert the .avi file into the NetShow streaming file format, ASF, and confirm that the final data rate is close to my target rate. This is the easy part.

All I have to do is open an MS-DOS-based command window and navigate to the directory location where I saved my video file. Then I type a simple command that converts the .avi file to an .asf file ready for the Web. Here's the way it looks for this project:

Vidtoasf -leadtime 5000 -in filename.avi

The new .asf file looks great when previewed and the final data rate as shown in the Properties>Detail dialog box is right on at 23,724 bps. If I hurry I can get this file delivered to our Web designers before they disappear for the day, and I'm out of here too. Bye now….

Acquiring and Licensing Video Content

As discussed earlier, there are two common methods of acquiring audio: creation of new sounds and the use of existing sound material. The same issues are relevant when dealing with video acquisition and licensing. It is becoming easier to obtain existing video clips. Professional video services license and sell thousands of video clips in the common file formats, most often .avi and .mov (QuickTime). There is also public-domain video that can generally be used without encountering licensing issues. For example, numerous companies that develop graphics software offer CDs of free video clips that can be used in your multimedia projects.

Generally, if you produce your own new content or hire a professional design service to do the work for you, you will own the copyright to your creations and can use them as you wish. With the increasing availability of high-quality video recorders and video capture equipment, it is becoming much easier to "record" new video material. However, remember that just because you go to the trouble, for example, of personally recording an interview doesn't mean that you have the legal right to use that video. For example, you generally should get a legal release from anyone who can be identified in the video. Some of the highest-quality video available is on videotapes of commercial movies, but using it without permission is always a big no-no! Generally, it is nearly impossible to get permission to use commercial movies, or if you do, it will be very expensive.

An important point to remember: the legal issues involved in dealing with any multimedia content that you do not own are complex. This discussion is not meant to be a comprehensive coverage of legal issues, nor does it attempt to provide legal advice; use the information only as a guideline for some of the issues you should be aware of when you're using existing media elements.

In order to use existing content in your multimedia project, three general types of arrangements are possible.

1. Copyright permissions and releases

2. Individual or customer releases

Some states have laws against using any person's name or likeness, such as their voice, for commercial purposes without prior written consent. "Commercial purposes" usually doesn't mean just selling your multimedia product; it also means promoting any of your company's products. It is wise, if not legally required depending on the state where you reside, to get a signed release from the company and individual if you use any image that is identifiable as that person or company.

3. Trademark agreements