Converting WAV to MP3 and back
A brief introduction to the Windows Audio Compression Manager API.
Article submitted by: Peter Morris
From humble beginnings
GOOD NEWS! That was a long time ago, and since then things have come a long way (and phone calls are also much cheaper). With the ever-increasing popularity of the internet, media has become higher in quality yet smaller in size. There are now numerous streaming audio formats around, and even streaming video, and all of this has been made accessible to people on very low bandwidths. That's not all: not only have these formats (QuickTime, RealAudio, and even MP3) become more popular, they have also become more accessible to the developer.
Codecs
So, what is the point of a codec? Well, a codec is a little bit like an ActiveX component. ActiveX components allow developers to implement functionality within their applications without having to write all of the code involved (e.g., embedding a Word document). Codecs do the same sort of thing, but concentrate on converting media from one format into another. For example, if you wanted to write an application which took audio data from an audio CD and converted it into an MP3, the only work you would need to do yourself would be:
1. Extract the audio data from the track
ACM and API
The ACM really belongs in MMSystem.pas (which handles Windows multimedia) but, for some reason, it has been omitted. The first task, therefore, is to find a copy of MSACM.pas, which is a Delphi translation of the ACM API headers. The one I found most useful was the conversion by Francois Piette, which was posted on Project Jedi (www.Delphi-Jedi.org).
ACM requires the developer to undertake the following steps in order to convert media between formats.
1. You must decide on your input and output formats. These are based on the TWaveFormatEX record but, be warned, this record is not actually large enough to store the information required by most codecs. For this reason I used my own TACMFormat record, which is no more than a TWaveFormatEX record with a few extra bytes tagged on at the end. You really have no way (that I know of) of finding out what these extra bytes mean, or how they should be set. My solution to this problem was to use acmFormatChoose to let the developer choose the formats at design time, and then have these values streamed by the IDE as a custom property (more on this later).
2. You must then open an ACM stream. This is done by calling acmStreamOpen, passing the input and output formats so that the ACM is aware of what is required of it. At this point the ACM will either return a valid handle to a stream, or return an error code such as ACMERR_NOTPOSSIBLE to indicate that the requested conversion cannot be performed (more on this later).
3. The next step is to determine the size of the output buffer. You call acmStreamSize to tell the ACM how many bytes you intend to supply each time, and it returns the required size of the output buffer (this will nearly always be overestimated, to ensure that you supply a large enough buffer).
4. The final preparation step is to prepare a header. All we need to do here is call acmStreamPrepareHeader, passing the stream handle we received from acmStreamOpen. The header that we prepare also tells the ACM the address of our "source" buffer and the address of our "destination" buffer (the ACM does not allocate this memory for us; we need to allocate it ourselves).
At this point, all of our preparation work is done, so actually requesting the conversion is a very simple step: call acmStreamConvert. This routine requires us to supply the stream handle (so that it knows which formats we are working with) and our header (so that it knows the size and location of the source and destination buffers). On return, cbSrcLengthUsed within the header holds the number of source bytes that were consumed, and cbDstLengthUsed holds the number of bytes written to the destination buffer. Your ACM session is now ready for another chunk of data!
Once you have finished with your ACM session it is time to free the resources we have used. This is the simplest part of all. The header is released using acmStreamUnprepareHeader, and the stream is released using acmStreamClose.
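The steps above can be sketched as a small console program. This is only a sketch under stated assumptions: it declares the ACM routines itself as externals in msacm32.dll instead of using MSACM.pas (whose parameter declarations may differ), it assumes a 32-bit target, and it converts between two plain PCM formats so that no codec-specific extra bytes are needed. The helpers MakePcm and Check are made up for the example.

```pascal
program AcmConvertSketch;
{$APPTYPE CONSOLE}
uses
  Windows, MMSystem, SysUtils;

type
  HACMSTREAM = THandle;
  // Mirrors ACMSTREAMHEADER from msacm.h (assumption: 32-bit layout).
  TACMStreamHeader = packed record
    cbStruct, fdwStatus, dwUser: DWORD;
    pbSrc: Pointer;
    cbSrcLength, cbSrcLengthUsed, dwSrcUser: DWORD;
    pbDst: Pointer;
    cbDstLength, cbDstLengthUsed, dwDstUser: DWORD;
    dwReservedDriver: array[0..9] of DWORD;
  end;

const
  ACM_STREAMSIZEF_SOURCE = 0;
  ACM_STREAMCONVERTF_BLOCKALIGN = 4;
  InBytes = 16000; // one second of 16-bit mono PCM at 8 kHz

// Local declarations of the ACM entry points (not taken from MSACM.pas).
function acmStreamOpen(var has: HACMSTREAM; had: THandle;
  pwfxSrc, pwfxDst: PWaveFormatEx; pwfltr: Pointer;
  dwCallback, dwInstance, fdwOpen: DWORD): MMRESULT; stdcall; external 'msacm32.dll';
function acmStreamSize(has: HACMSTREAM; cbInput: DWORD;
  var dwOutputBytes: DWORD; fdwSize: DWORD): MMRESULT; stdcall; external 'msacm32.dll';
function acmStreamPrepareHeader(has: HACMSTREAM; var pash: TACMStreamHeader;
  fdwPrepare: DWORD): MMRESULT; stdcall; external 'msacm32.dll';
function acmStreamConvert(has: HACMSTREAM; var pash: TACMStreamHeader;
  fdwConvert: DWORD): MMRESULT; stdcall; external 'msacm32.dll';
function acmStreamUnprepareHeader(has: HACMSTREAM; var pash: TACMStreamHeader;
  fdwUnprepare: DWORD): MMRESULT; stdcall; external 'msacm32.dll';
function acmStreamClose(has: HACMSTREAM; fdwClose: DWORD): MMRESULT; stdcall;
  external 'msacm32.dll';

// Hypothetical helper: turn an ACM error code into an exception.
procedure Check(Res: MMRESULT; const Step: string);
begin
  if Res <> MMSYSERR_NOERROR then
    raise Exception.CreateFmt('%s failed with code %d', [Step, Integer(Res)]);
end;

// Hypothetical helper: fill in a plain PCM format (no extra codec bytes).
function MakePcm(Rate: DWORD; Channels, Bits: Word): TWaveFormatEx;
begin
  FillChar(Result, SizeOf(Result), 0);
  Result.wFormatTag := WAVE_FORMAT_PCM;
  Result.nChannels := Channels;
  Result.nSamplesPerSec := Rate;
  Result.wBitsPerSample := Bits;
  Result.nBlockAlign := Channels * Bits div 8;
  Result.nAvgBytesPerSec := Rate * Result.nBlockAlign;
end;

var
  SrcFmt, DstFmt: TWaveFormatEx;
  Stream: HACMSTREAM;
  Hdr: TACMStreamHeader;
  SrcBuf, DstBuf: Pointer;
  OutBytes: DWORD;
begin
  try
    // Step 1: decide on the input and output formats.
    SrcFmt := MakePcm(8000, 1, 16);
    DstFmt := MakePcm(8000, 1, 8);
    // Step 2: open the stream; this fails if no installed codec can convert.
    Check(acmStreamOpen(Stream, 0, @SrcFmt, @DstFmt, nil, 0, 0, 0), 'acmStreamOpen');
    try
      // Step 3: ask how large the output buffer must be for InBytes of input.
      Check(acmStreamSize(Stream, InBytes, OutBytes, ACM_STREAMSIZEF_SOURCE),
        'acmStreamSize');
      SrcBuf := nil; DstBuf := nil;
      try
        SrcBuf := AllocMem(InBytes);  // zero-filled, i.e. 16-bit silence
        DstBuf := AllocMem(OutBytes);
        // Step 4: describe both buffers in a header and prepare it.
        FillChar(Hdr, SizeOf(Hdr), 0);
        Hdr.cbStruct := SizeOf(Hdr);
        Hdr.pbSrc := SrcBuf; Hdr.cbSrcLength := InBytes;
        Hdr.pbDst := DstBuf; Hdr.cbDstLength := OutBytes;
        Check(acmStreamPrepareHeader(Stream, Hdr, 0), 'acmStreamPrepareHeader');
        try
          // Convert one chunk; repeat this call for each further chunk.
          Check(acmStreamConvert(Stream, Hdr, ACM_STREAMCONVERTF_BLOCKALIGN),
            'acmStreamConvert');
          Writeln('Consumed ', Hdr.cbSrcLengthUsed, ' bytes, produced ',
            Hdr.cbDstLengthUsed, ' bytes');
        finally
          acmStreamUnprepareHeader(Stream, Hdr, 0);
        end;
      finally
        FreeMem(DstBuf); FreeMem(SrcBuf);
      end;
    finally
      acmStreamClose(Stream, 0);
    end;
  except
    on E: Exception do Writeln(E.Message);
  end;
end.
```

For a real MP3 conversion the two TWaveFormatEx values would be replaced by formats obtained from acmFormatChoose, since the extra bytes of a compressed format cannot sensibly be filled in by hand.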
Choosing a format
The idea here is that we can still access the TWaveFormatEX data by referring to the Format part of the record, yet RawData supplies us with enough room for any additional data required for any of the individual codecs.
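A record of the kind described above could be declared along the following lines. This is a minimal sketch, not necessarily the declaration used in the supplied components: the variant-record layout and the 128-byte size of RawData are assumptions chosen for illustration.

```pascal
program AcmFormatRecordSketch;
{$APPTYPE CONSOLE}
uses
  MMSystem;

type
  // Assumed layout: both variants overlay the same memory, so Format gives
  // typed access to the first bytes of RawData.
  TACMFormat = packed record
    case Integer of
      0: (Format: TWaveFormatEx);
      1: (RawData: array[0..127] of Byte); // 128 is an arbitrary choice
  end;

var
  Fmt: TACMFormat;
begin
  FillChar(Fmt, SizeOf(Fmt), 0);
  Fmt.Format.wFormatTag := WAVE_FORMAT_PCM;
  Fmt.Format.nChannels := 1;
  Writeln('TWaveFormatEx size: ', SizeOf(TWaveFormatEx), ' bytes');
  Writeln('TACMFormat size:    ', SizeOf(TACMFormat), ' bytes');
  Writeln('Room left for codec-specific data: ',
    SizeOf(TACMFormat) - SizeOf(TWaveFormatEx), ' bytes');
end.
```

Because the record starts with a TWaveFormatEx, the address of Fmt.Format can be handed to any routine that expects a pointer to a TWaveFormatEx, while the routine is still free to write past the fixed fields.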
Although we do not know either the size of this additional data, or what it represents, it is still a simple task to acquire it. This is done using acmFormatChoose.
AcmFormatChoose requires only one parameter of type TACMFormatChooseA. This parameter is a simple structure holding the following (relevant) information.
· pwfx: A pointer to a TWaveFormatEX structure to receive the result (we actually pass TACMFormat)
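As a rough illustration of the call, the sketch below shows the format chooser dialog and reports what was picked. It rests on several assumptions: the record and the function are declared locally to mirror ACMFORMATCHOOSEA in msacm.h for a 32-bit target (the declarations in MSACM.pas may differ), TACMFormat is a guessed layout with a 128-byte buffer, and only the fields needed for a basic call are filled in.

```pascal
program AcmFormatChooseSketch;
{$APPTYPE CONSOLE}
uses
  Windows, MMSystem;

type
  // Assumed shape of TACMFormat: TWaveFormatEx plus spare bytes.
  TACMFormat = packed record
    case Integer of
      0: (Format: TWaveFormatEx);
      1: (RawData: array[0..127] of Byte);
  end;

  // Mirrors ACMFORMATCHOOSEA from msacm.h (assumption: 32-bit layout).
  TACMFormatChooseA = packed record
    cbStruct: DWORD;
    fdwStyle: DWORD;
    hwndOwner: HWND;
    pwfx: PWaveFormatEx;       // receives the chosen format
    cbwfx: DWORD;              // size of the buffer behind pwfx
    pszTitle: PAnsiChar;
    szFormatTag: array[0..47] of AnsiChar;
    szFormat: array[0..127] of AnsiChar;
    pszName: PAnsiChar;
    cchName: DWORD;
    fdwEnum: DWORD;
    pwfxEnum: PWaveFormatEx;
    hInstance: THandle;
    pszTemplateName: PAnsiChar;
    lCustData: Longint;
    pfnHook: Pointer;
  end;

const
  ACMERR_CANCELED = 515; // ACMERR_BASE (512) + 3

function acmFormatChoose(var afc: TACMFormatChooseA): MMRESULT; stdcall;
  external 'msacm32.dll' name 'acmFormatChooseA';

var
  Fmt: TACMFormat;
  Chooser: TACMFormatChooseA;
  Res: MMRESULT;
begin
  FillChar(Fmt, SizeOf(Fmt), 0);
  FillChar(Chooser, SizeOf(Chooser), 0);
  Chooser.cbStruct := SizeOf(Chooser);
  Chooser.pwfx := @Fmt.Format;   // we actually pass the larger TACMFormat
  Chooser.cbwfx := SizeOf(Fmt);  // so the ACM may use the extra bytes too
  Chooser.pszTitle := 'Choose a format';

  Res := acmFormatChoose(Chooser);
  if Res = MMSYSERR_NOERROR then
    Writeln('Chosen: ', PAnsiChar(@Chooser.szFormatTag), ' / ',
      PAnsiChar(@Chooser.szFormat), ' (', Fmt.Format.cbSize, ' extra bytes)')
  else if Res = ACMERR_CANCELED then
    Writeln('Dialog was cancelled')
  else
    Writeln('acmFormatChoose failed with code ', Res);
end.
```

After a successful call, the cbSize field of the chosen format says how many codec-specific bytes follow the fixed TWaveFormatEx fields, which is a quick way to check that the RawData buffer was large enough.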
A bit of Chinese whisper
The first two to note (simply because they are the easiest to explain) are
1. The codec you specified on your machine may not be available on a client machine
The final one is a little more complicated, and warrants the phrase "Chinese Whisper".
Not all ACM formats are interchangeable, for example (I am just making these up, so if they actually work don't write saying that I was wrong) you may not be able to convert
You need to find a "middle man". This is quite often a PCM format, as most (if not all) codecs were designed to convert PCM into a more suitable format. The above example would therefore be achieved like so.
Converting to "MP3 16BIT STEREO" would probably require yet another step (between the PCM and MP3, in order to convert 8 bit PCM to 16 bit PCM).
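The "middle man" idea can be sketched by asking the ACM whether a direct conversion exists and, if not, whether both legs through a PCM format do. This is an illustration under assumptions: acmStreamOpen is declared locally for a 32-bit target, CanConvert, DescribeRoute and MakePcm are hypothetical helpers, and the demonstration uses PCM formats throughout because the extra bytes of a real compressed format would have to come from acmFormatChoose.

```pascal
program AcmRouteSketch;
{$APPTYPE CONSOLE}
uses
  Windows, MMSystem;

const
  ACM_STREAMOPENF_QUERY = 1; // only ask whether the conversion is possible

// Declared locally; phas is an untyped pointer here so that nil can be
// passed, which the ACM accepts when the QUERY flag is set.
function acmStreamOpen(phas: Pointer; had: THandle;
  pwfxSrc, pwfxDst: PWaveFormatEx; pwfltr: Pointer;
  dwCallback, dwInstance, fdwOpen: DWORD): MMRESULT; stdcall;
  external 'msacm32.dll';

function MakePcm(Rate: DWORD; Channels, Bits: Word): TWaveFormatEx;
begin
  FillChar(Result, SizeOf(Result), 0);
  Result.wFormatTag := WAVE_FORMAT_PCM;
  Result.nChannels := Channels;
  Result.nSamplesPerSec := Rate;
  Result.wBitsPerSample := Bits;
  Result.nBlockAlign := Channels * Bits div 8;
  Result.nAvgBytesPerSec := Rate * Result.nBlockAlign;
end;

// True if some installed codec can convert Src directly into Dst.
function CanConvert(const Src, Dst: TWaveFormatEx): Boolean;
begin
  Result := acmStreamOpen(nil, 0, @Src, @Dst, nil, 0, 0,
    ACM_STREAMOPENF_QUERY) = MMSYSERR_NOERROR;
end;

// Try the direct route first, then the two-step route through Middle.
function DescribeRoute(const Src, Middle, Dst: TWaveFormatEx): string;
begin
  if CanConvert(Src, Dst) then
    Result := 'direct conversion'
  else if CanConvert(Src, Middle) and CanConvert(Middle, Dst) then
    Result := 'two conversions, via the PCM middle man'
  else
    Result := 'not possible with the installed codecs';
end;

var
  Src, Middle, Dst: TWaveFormatEx;
begin
  Src := MakePcm(8000, 1, 8);      // stand-in for the source format
  Middle := MakePcm(8000, 1, 16);  // candidate middle man
  Dst := MakePcm(44100, 2, 16);    // stand-in for the target format
  Writeln('Route: ', DescribeRoute(Src, Middle, Dst));
end.
```

When the two-step route is chosen, each leg gets its own stream, header and buffers, and the destination buffer of the first stream becomes the source buffer of the second.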
I think you will now understand why this section is called "Chinese Whisper". (If anyone can tell me the meaning of that phrase I would appreciate it !)
The hidden talents of ACM
Imagine a simple application which takes input from your microphone, compresses it to a format suitable for streaming over a very low bandwidth, and then transmits it to a destination PC over TCP/IP, while at the same time it receives compressed data, decompresses it, and plays it out of your speaker (in other words, a simple internet phone). Did I say simple? Well, actually, yes! This sounds like a lot of work, and probably is (except that with the components supplied it is actually very simple).
This is where the hidden talents of the ACM come into play. Quite a few of the ACM codecs are wave-mappable, which basically means that they may be treated as a standard WAVE device when playing or recording audio.
For example, it is quite easy to open a wave-in device using a GSM format. Once you receive a buffer of data from the wave input device, it is already compressed and ready for transmitting. Conversely, as soon as data is received through your TCP/IP socket, it is possible to play this data directly through a wave-out device.
1. Data in from MIC
The standard PCM data would be far too large for real-time streaming over a modem, whereas GSM 6.10 can be transmitted at roughly 1.6 KB/second (1,625 bytes per second at 8 kHz), and MP3 16BIT MONO can be streamed at a mere 2 KB/second.
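The arithmetic behind these figures is easy to check. The sketch below computes the data rate of uncompressed PCM and of GSM 6.10 (which, as stored in WAV files, packs 320 samples into a 65-byte block) and compares them with the nominal ceiling of a 56k modem; the choice of a 56k modem is an assumption made for the comparison.

```pascal
program AudioBandwidthSketch;
{$APPTYPE CONSOLE}

// Bytes per second of uncompressed PCM audio.
function PcmBytesPerSec(SampleRate, Channels, BitsPerSample: Integer): Integer;
begin
  Result := SampleRate * Channels * (BitsPerSample div 8);
end;

const
  GsmSamplesPerBlock = 320;       // samples in one GSM 6.10 WAV block
  GsmBytesPerBlock = 65;          // bytes in one GSM 6.10 WAV block
  ModemBytesPerSec = 56000 div 8; // assumed link: nominal 56k modem ceiling

begin
  // 44100 * 2 * 2 = 176400
  Writeln('PCM 44.1 kHz 16-bit stereo: ', PcmBytesPerSec(44100, 2, 16), ' bytes/s');
  // 8000 * 1 * 2 = 16000
  Writeln('PCM 8 kHz 16-bit mono:      ', PcmBytesPerSec(8000, 1, 16), ' bytes/s');
  // 25 blocks per second * 65 bytes = 1625
  Writeln('GSM 6.10 at 8 kHz:          ',
    8000 div GsmSamplesPerBlock * GsmBytesPerBlock, ' bytes/s');
  // 56000 bits / 8 = 7000
  Writeln('56k modem ceiling:          ', ModemBytesPerSec, ' bytes/s');
end.
```

Even telephone-quality PCM at 16,000 bytes per second is more than twice what such a modem could carry, while the GSM stream fits several times over.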
Apart from Win2K not being able (or even allowed) to create MP3, there is something else worth mentioning about this format. Although it can quite easily be treated as a wave-out device, it did not seem to work as a wave-in device, which is why I found it necessary to convert the PCM data to MP3 manually (which turned out to make quite a nice demonstration project).
Components, demos and source code
For this reason I have included three components, and two demonstrations (demos were compiled in Delphi 5)
Components
1. TACMConvertor : This really serves two purposes. Firstly, it converts data between two different media formats. Secondly, even if you do not intend to convert the raw data manually, this component comes in useful for specifying the input/output formats of ACM streams. (The right-click component editor allows you to select the formats from the acmFormatChoose dialog at design time.)
2. TACMIn : This component is used for receiving data from your microphone. You can specify a standard PCM format, or you can specify any format capable of being mapped through the WaveIn device.
3. TACMOut : This component is used for replaying audio through your audio output. Again, you may choose to output in PCM format, or in any other format capable of being mapped through your WaveOut device. The NumBuffers property specifies how many buffers you want filled before you start to play. This is not much use when you want instant audio (internet telephones), but can come in useful when you want to do audio broadcasting over the internet and want to buffer some extra audio just in case your connection speed fluctuates.
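The pre-buffering behaviour described for NumBuffers can be pictured with a small model. This is not the code of TACMOut; TPrebufferGate is a hypothetical class that only captures the rule "do not start playing until the requested number of buffers has arrived".

```pascal
program PrebufferSketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical helper, not part of the supplied components.
  TPrebufferGate = class
  private
    FNumBuffers: Integer;
    FQueued: Integer;
    FPlaying: Boolean;
  public
    constructor Create(ANumBuffers: Integer);
    // Call once per received buffer; returns True once playback may run.
    function BufferArrived: Boolean;
  end;

constructor TPrebufferGate.Create(ANumBuffers: Integer);
begin
  inherited Create;
  FNumBuffers := ANumBuffers;
end;

function TPrebufferGate.BufferArrived: Boolean;
begin
  Inc(FQueued);
  // Playback starts when enough buffers are queued and then stays on.
  if FQueued >= FNumBuffers then
    FPlaying := True;
  Result := FPlaying;
end;

var
  Gate: TPrebufferGate;
  I: Integer;
begin
  Gate := TPrebufferGate.Create(3);
  try
    for I := 1 to 5 do
      if Gate.BufferArrived then
        Writeln('buffer ', I, ': playing')
      else
        Writeln('buffer ', I, ': still buffering');
  finally
    Gate.Free;
  end;
end.
```

A larger value delays the start of playback by that many buffers but gives the connection more time to recover from a dip before the audio runs dry.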
Demos
The first demo is really quite simple. The TACMConvertor is used only to specify the input and output formats. This demo opens an ACMIn and an ACMOut at the same time. Audio in is piped almost immediately back out, but with a slight delay, making you sound a little like Elvis Presley (although I am not an Elvis fan, "All Shook Up" was the first song that sprang to mind when I tested it).
The second demo is a little more complicated and comes in two parts.
The first part (Demo2.dpr) acts as a server. It has a server socket listening on port 6565 for new connections. At the same time it takes audio in from the MIC, converts it into MP3 16BIT 8 kHz MONO (2 KB/second) and pipes it out to every connected client.
The second part (Demo2Client.dpr) acts as a client. The first edit box requires the IP address of the server, whereas the second (SpinEdit) input is the number of additional buffers that you require. Once you click connect (and the requested number of buffers has been filled) you will start to hear the audio from the server. MP3 16BIT 8 kHz MONO is surprisingly good quality, and also surprisingly low bandwidth.
Components and demos are available from the downloads page.
Well, that just about completes this article. I hope you have enjoyed reading about it much more than I enjoyed having to work it all out (hehe).
Don't forget to post your questions, concerns, views and comments to this article on the Delphi Programming Forum.
All graphics (if any) in this feature created by Zarko Gajic.