From Game Tech Wiki

In 1982, Mattel introduced a new peripheral for the Intellivision: the Intellivoice, a voice synthesis device that produces speech when used with certain games. The Intellivoice was original in two respects: not only was this capability unique to the Intellivision system at the time (although a similar device was available for the Odyssey2), but the speech-supporting games written for the Intellivoice actually made the speech an integral part of the gameplay. Unfortunately, the amount of speech that could be compressed into a 4K or 8K ROM cartridge was limited, and the system did not sell as well as Mattel had hoped.


It connected to the Master Component like any cartridge, but provided a new port for inserting games and a dial to control the voice volume. The idea was to give Mattel's "graphically superior" games yet another advantage over their Atari cousins: an understandable, human-sounding voice.

The signal generated by the speech synthesizer was integrated with the standard Intellivision data bus using other components of the Intellivoice: an interface chip and an amplifier. As a result, games designed for the Intellivoice took advantage of the extra hardware, and those that weren't simply ignored it. Plugging a voice-enabled game directly into the Master Component (bypassing the speech synthesis hardware) worked as well; however, you lost any voice generated by the game. In general, this rendered the games virtually unplayable.

Voice editing was crucial, since each game cartridge could only hold 4 to 8K of voice data. Words had to be digitized at the lowest possible sampling rate at which they could be understood; often, the sampling rate would be changed three or four times within the same word - lower for vowels, higher for consonants - to save space.

Despite these space-saving efforts, the number of words that could be fit into a voice game was extremely limited, which probably contributed to the Intellivoice's failure. While orders for the initial voice game releases were around 300,000 each, orders for the fourth game, TRON Solar Sailer, released later, hit only 90,000.

A restyled Intellivoice, designed to match the Intellivision II, appeared in the January 1983 Mattel Electronics catalog; a working prototype, however, was never built. The module shown in the catalog was merely a carved and painted block of wood.



The Intellivoice was designed around the General Instrument SP-0256 chip. General Instrument was the same company that produced the microprocessor at the heart of the Intellivision. This speech synthesis chip was remarkable for its time. It contained two digital filters, a ROM to hold speech data (simple words and phrases spoken in a male voice that could be used in a wide variety of games) and a microcontroller that could assemble the base data into phrases as the game was being played. The filters could be used to simulate the way the human voice is actually produced, so that the simple phrases could be made to sound different in various situations.

The Intellivoice (model 3330) consists of a VLSI speech synthesizer, an LSI buffer/interface chip, an active audio filter/amplifier section, and provisions for current assistance to the Master/Keyboard Component's +5V power supply.


The speech synthesizer is the General Instrument SP-0256 Orator. The SP-0256 incorporates four basic functions:

  • A software programmable digital filter that can be made to model a VOCAL TRACT.
  • A 16K ROM which stores both speech data (Resident ROM or RESROM) and instructions (the program).
  • A Microcontroller which controls the data flow from the ROM to the digital filter, the assembly of the "word strings" necessary for linking speech elements together, and the amplitude and pitch information to excite the digital filter.
  • A Pulse Width Modulator that creates a digital output which is converted to an analog signal when filtered by an external low pass filter.

The SP-0256 can also accept serial speech data from an external source.

For the 3330, the RESROM contains a variety of words and phrases that may be useful in video games. The PROGRAM consists of 17 different parameters used by the VOCAL TRACT model to imitate human speech patterns.


The buffer/interface chip (General Instrument SPB-640) contains the logic required to interface the speech synthesizer to the Master/Keyboard Component cartridge bus.

Controlling input to the buffer/interface chip is primarily from the Master Component Microprocessor bus signals. Other controlling inputs are generated by the speech synthesizer during speech production.

The buffer/interface chip has three methods of transmitting data to the speech synthesizer and peripherals connected to the stacking connector.

The first speech-oriented data transference method causes the synthesizer to produce speech segments contained in its internal ROM (RESROM): the buffer/interface chip allows the address of the desired speech segment to pass onto an 8-bit peripheral data bus connecting the buffer/interface and synthesizer chips, and sets the proper control lines for the synthesizer to generate the segment.

The second method of moving speech data allows the Master/Keyboard Component to load custom speech data into the synthesizer: data from the game cartridge is loaded into the buffer/interface chip's 640-bit FIFO array and converted to serial data, and the buffer/interface chip sets the proper control lines for the synthesizer to read the serial data and convert it to speech.
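The word-in, bit-out behaviour of this second method can be illustrated with a small Python model. This is only a sketch under stated assumptions: the 640-bit capacity comes from the text above, but the 10-bit word width, the least-significant-bit-first order, and the names `SerialFifo`, `load_word` and `shift_out` are hypothetical choices made for the illustration, not documented details of the SPB-640.

```python
from collections import deque

FIFO_CAPACITY_BITS = 640  # from the text: 640-bit FIFO array
WORD_BITS = 10            # assumption: 10-bit words (not stated in the text)


class SerialFifo:
    """Toy model of a word-in, bit-out FIFO; not the real chip's logic."""

    def __init__(self, capacity_bits=FIFO_CAPACITY_BITS, word_bits=WORD_BITS):
        self.capacity_bits = capacity_bits
        self.word_bits = word_bits
        self._bits = deque()

    def free_bits(self):
        """Number of bits that can still be queued."""
        return self.capacity_bits - len(self._bits)

    def load_word(self, word):
        """Queue one word; return False (and store nothing) if it does not fit."""
        if self.free_bits() < self.word_bits:
            return False
        # Assumption: the least-significant bit is shifted out first.
        for i in range(self.word_bits):
            self._bits.append((word >> i) & 1)
        return True

    def shift_out(self):
        """Return the next serial bit, or None when the FIFO is empty."""
        return self._bits.popleft() if self._bits else None


fifo = SerialFifo()
# Try to load 100 words; only as many as fit in 640 bits are accepted.
accepted = sum(fifo.load_word(0x155) for _ in range(100))
# The synthesizer side then pulls bits out one at a time.
first_bits = [fifo.shift_out() for _ in range(10)]
print(accepted, first_bits, fifo.free_bits())
```

In this model the loader has to check the return value of `load_word` and retry once the consumer has drained some bits, which is the usual reason for placing a FIFO between a fast bus and a slower serial consumer.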

Finally, the buffer/interface chip also allows moving data to and from peripherals through the top-mounted stacking connector: the buffer/interface chip sets the proper control lines for the peripheral bus to carry bi-directional microprocessor data.


The output of the speech synthesizer is not conventional audio, but a 40 kHz digital Pulse Width Modulated (PWM) signal. When viewed on an oscilloscope, this appears to be a square wave whose edges rapidly expand and contract as speech generation takes place.

A series of filters (an LM-324C Quad OP Amp and related components) converts the PWM signal to conventional audio which is then amplified (an LM-358C Dual OP Amp and related components, including volume control) and fed to the Master Component.
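The principle of recovering audio from a PWM stream can be sketched in a few lines of Python. This is a simplified illustration, not a model of the actual LM-324C filter chain: the 40 kHz rate is taken from the text, while the 64-step duty-cycle resolution, the 1 kHz test tone, and the per-period averaging used in place of an analog low-pass filter are all assumptions made for the demo.

```python
import math

PWM_RATE_HZ = 40_000      # from the text: 40 kHz PWM output
STEPS_PER_PERIOD = 64     # assumption: duty-cycle resolution chosen for this demo
TONE_HZ = 1_000           # assumption: test tone inside the stated passband


def pwm_encode(samples, steps=STEPS_PER_PERIOD):
    """Turn samples in [-1, 1] into a 0/1 stream, one PWM period per sample."""
    stream = []
    for s in samples:
        high = round((s + 1) / 2 * steps)   # number of "high" ticks this period
        stream.extend([1] * high + [0] * (steps - high))
    return stream


def low_pass(stream, steps=STEPS_PER_PERIOD):
    """Average each PWM period; a crude stand-in for the analog filter chain."""
    out = []
    for i in range(0, len(stream), steps):
        duty = sum(stream[i:i + steps]) / steps
        out.append(duty * 2 - 1)            # map duty cycle back to [-1, 1]
    return out


n = PWM_RATE_HZ // TONE_HZ  # 40 PWM periods cover one cycle of the tone
tone = [math.sin(2 * math.pi * TONE_HZ * k / PWM_RATE_HZ) for k in range(n)]
recovered = low_pass(pwm_encode(tone))
max_error = max(abs(a - b) for a, b in zip(tone, recovered))
print(f"periods: {len(recovered)}, worst-case error: {max_error:.4f}")
```

The pulse widths carry the amplitude, so averaging over each period recovers the waveform to within the duty-cycle quantization step; the real hardware does the equivalent smoothing with analog components.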

The effective passband for the speech signals is from 150 Hz to 5 kHz. Within this is also a 3 dB/octave bass pre-emphasis.


The stacking connector has its connections arranged in such a way as to allow a future power supply to fill the 3330 and game cartridge power requirements, and boost the power capability of the Master/Keyboard Component's power supply.

Power supply boosting can be accomplished by allowing power input to pin six of the stacking connector. This unregulated voltage is applied through an 8.2 ohm, 2 W resistor to the Master/Keyboard Component Vcc on pin 43 of the cartridge port to supply an approximately 270 mA boost.
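The resistor figures above can be sanity-checked with Ohm's law. The short Python calculation below uses only the values quoted in the text (8.2 ohm, 2 W, roughly 270 mA, and the +5V rail); the implied unregulated input voltage is an inference that assumes Vcc sits at exactly 5 V, not a documented specification.

```python
R_OHMS = 8.2          # from the text: series resistor value
I_BOOST_A = 0.270     # from the text: approximately 270 mA boost
P_RATING_W = 2.0      # from the text: resistor power rating
VCC_V = 5.0           # from the text: +5V supply (assumed exact here)

v_drop = I_BOOST_A * R_OHMS        # Ohm's law: V = I * R
p_diss = I_BOOST_A ** 2 * R_OHMS   # power dissipated in the resistor: P = I^2 * R
v_unreg = VCC_V + v_drop           # inferred input voltage, not a documented figure

print(f"drop across resistor: {v_drop:.2f} V")
print(f"resistor dissipation: {p_diss:.2f} W of {P_RATING_W:.0f} W rating")
print(f"implied unregulated input: about {v_unreg:.1f} V")
```

Under these assumptions the resistor drops a little over 2 V and dissipates well under its 2 W rating, which is consistent with the component values quoted.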