Proper AVAudioRecorder Settings for Recording Voice?


Solution 1

You'll want to read the iPhone Application Programming Guide section titled Using Sound in iPhone OS, and the Audio Queue Services Programming Guide. (Edit: These links are outdated, the Using Sound in iPhone OS has been edited out of the current Application Programming Guide, but the Audio Queue Services Programming Guide is updated and moved.)

Most sounds in human voices are in the middle range of human hearing. Recorded speech is easily understood even when digitized with very low data rates. You can stomp all over a voice recording, yet still have a useful file. Therefore, your ultimate use for these recordings will guide your decisions on these settings.

First you need to choose the audio format. Your choice will be determined by what you want to do with the audio after you record it. Your current choice is IMA4. Maybe you'll want a different format, but IMA4 is a good choice for the iPhone. It's a fast encoding scheme, so it won't be too taxing for the limited iPhone processor, and it supplies 4:1 compression, so it won't take up too much storage space. Depending upon the format you choose, you'll want to make further settings.

Your current sample rate, 44.1 kHz, is the same as the standard for CD audio. Unless you're after a high-fidelity recording, you don't need a rate this high, but you don't want to use arbitrary rates either. Most audio software can only understand rates at specific steps like 32 kHz, 24 kHz, 16 kHz, or 12 kHz.
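If you start from some arbitrary target rate, one way to land on a rate that tools are likely to accept is to snap to the nearest common value. This is only a sketch: the list of rates and the function name are assumptions for illustration, not something taken from Apple's documentation.

```python
# Common sample rates in Hz. This list is an assumption for illustration,
# not an exhaustive set of what any particular tool accepts.
STANDARD_RATES_HZ = [8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000]


def nearest_standard_rate(requested_hz):
    """Return the common sample rate closest to the requested one."""
    return min(STANDARD_RATES_HZ, key=lambda rate: abs(rate - requested_hz))


print(nearest_standard_rate(15000))
print(nearest_standard_rate(30000))
```

The value you get back is what you would pass for AVSampleRateKey.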

Your number of channels is set to 2, for stereo. Unless you're using additional hardware, the iPhone only has one microphone, and 1 mono channel should be sufficient. This cuts your data needs in half.
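To get a feel for what these choices mean for file size, here is a rough back-of-the-envelope sketch. It assumes 16-bit source samples and takes the 4:1 IMA4 compression figure at face value, so treat the results as estimates; the function names are made up for this example.

```python
def pcm_bytes_per_second(sample_rate_hz, channels, bit_depth=16):
    """Uncompressed linear PCM data rate in bytes per second."""
    return sample_rate_hz * channels * bit_depth // 8


def ima4_bytes_per_second(sample_rate_hz, channels, bit_depth=16, ratio=4):
    """Approximate IMA4 data rate, assuming a nominal 4:1 compression ratio."""
    return pcm_bytes_per_second(sample_rate_hz, channels, bit_depth) / ratio


current = ima4_bytes_per_second(44100, 2)  # 44.1 kHz, stereo
voice = ima4_bytes_per_second(16000, 1)    # 16 kHz, mono

print(f"44.1 kHz stereo: {current * 60 / 1e6:.2f} MB per minute")
print(f"16 kHz mono:     {voice * 60 / 1e6:.2f} MB per minute")
print(f"Savings factor:  {current / voice:.1f}x")
```

Dropping to mono and a lower sample rate compounds: the two reductions multiply, so the file shrinks by much more than half.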

The three Linear PCM settings you are using seem to be just for Linear PCM format recordings. I think they have no effect in your code, since you are using the IMA4 format. I don't know the IMA4 format well enough to tell you which settings you'll need to make, so you'll have to do some additional research if you decide to continue using that setting.

Solution 2

One thing to consider is that for a long time the traditional land-line voice companies, since going digital, have used 8-bit, 8 kHz sampling, which works out to 64 kbit/s per voice channel. This is why trunk lines come in the sizes they come in. A T1 is 24 64k channels; where signaling bits are taken out of the channel itself, about 56k is left for the voice data.

So if you want POTS quality, 8-bit/8 kHz should be fine. Adjust up based on your needs.
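As a rough check on how small telephone-grade audio is compared with the settings in the question, the sketch below compares uncompressed bit rates. It assumes the 8-bit, 8 kHz sampling of standard digital telephony (one 64k channel); the function name is illustrative.

```python
def bit_rate(bits_per_sample, sample_rate_hz, channels=1):
    """Uncompressed bit rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


# Assumption: 8-bit samples at 8 kHz, as in standard digital telephony.
phone_channel = bit_rate(8, 8000)
# The question's starting point: 16-bit, 44.1 kHz, stereo (before compression).
cd_style = bit_rate(16, 44100, channels=2)

print(f"Telephone channel:      {phone_channel // 1000} kbit/s")
print(f"16-bit 44.1 kHz stereo: {cd_style / 1000:.1f} kbit/s")
print(f"Ratio:                  {cd_style / phone_channel:.2f}x")
```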


Author: TechZen

Cowboy Apple Developer

Updated on April 24, 2020

Comments

  • TechZen
    TechZen about 4 years

    I am adding a voice memo capability using AVAudioRecorder and I need to know the best settings for the recorder for recording voice.

    Unfortunately, I know nothing about audio to the extent I am not even sure what terms to google for.

    Currently, I am using the following which I copied from somewhere for testing purposes:

    recorderSettingsDict=[[NSDictionary alloc] initWithObjectsAndKeys:[NSNumber numberWithInt:kAudioFormatAppleIMA4],AVFormatIDKey,
                            [NSNumber numberWithFloat:44100.0],AVSampleRateKey,
                            [NSNumber numberWithInt: 2],AVNumberOfChannelsKey,
                            [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
                            [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                            [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                            nil];
    

    or:

    defaultSettings =     {
        AVFormatIDKey = 1768775988;
        AVLinearPCMBitDepthKey = 16;
        AVLinearPCMIsBigEndianKey = 0;
        AVLinearPCMIsFloatKey = 0;
        AVNumberOfChannelsKey = 2;
        AVSampleRateKey = 44100;
    };
    

    This works but I don't know if it's optimal for voice in terms of quality, speed, file size etc.

    The AVAudioRecorder Class Reference lists many settings constants but I have no clue which ones to use for voice.

    Barring that, if someone knows of a good "AudioFormats for Dummies" resource I will take that as well. (Note: I've been through the Apple docs and they assume a knowledge base in digital audio that I do not possess.)

    • TechZen
      TechZen about 14 years
      Man, I'm thinking it was a tactical error to post this on the day the iPad was announced.
  • TechZen
    TechZen about 14 years
    Thanks for taking the time to write such an answer, but as I said in the OP, the Apple documentation does not address what settings are optimal for voice. The information about the channels and sample rate is useful.
  • Mr. Berna
    Mr. Berna about 14 years
    OK, if I were making voice memos to be used in the app recording the memos, I'd set AVFormatIDKey to kAudioFormatAppleIMA4, AVSampleRateKey to 16000.0, AVNumberOfChannelsKey to 1, and leave everything else at the defaults.
  • lambmj
    lambmj over 12 years
    Excellent answer, thanks. Fwiw, there are some very good sessions from WWDC 2010 that cover this topic as well. In particular, the Fundamentals of Digital Audio session is full of good info. The slide deck from that presentation is also very useful. Look at slides 51 & 52 in particular.
  • Guntis Treulands
    Guntis Treulands over 11 years
    I would still suggest 2 channels, as the user may listen to the recording on headphones, not to mention send the file to a computer via email.
  • TahoeWolverine
    TahoeWolverine over 7 years
    kiok45 - replying 6 years after a post and complaining about broken links isn't productive; perhaps share the updated links! Otherwise, pointing it out and moving on is your best bet; it's no less correct, just Apple changed their pages.
  • Lance Samaria
    Lance Samaria over 2 years
    @GuntisTreulands what happens if there is only 1 channel and the user emails the audio file to someone?
  • Guntis Treulands
    Guntis Treulands over 2 years
    @LanceSamaria it has been so long since I worked on this, and now, re-reading the question and answers, I would say that 2 channels do not matter much for audio recording. If I understand correctly, if only 1 channel is recorded, it will simply sound identical on both the left and right sides (if listening on headphones, for example). So there is no problem when emailing it to someone. It will still be playable and sound pretty good.
  • Guntis Treulands
    Guntis Treulands over 2 years
    @LanceSamaria one of the reasons I doubt my earlier choice (2 channels) is that until recently (the X or XS was the first?) iPhones recorded audio in mono anyway. When I tested stereo recording on an XS outdoors in the wind, quality was much worse, so I now always turn off stereo recording and record in mono. From a sound perspective (when recording conversations), mono sounds better on headphones (I guess I am not used to hearing one person only on the left side and the other on the right. Weird. I like hearing them in the middle.)
  • Lance Samaria
    Lance Samaria over 2 years
    @GuntisTreulands thank you for your response. I'm not an audio expert so I had to piece together my audioPlayer using a bunch of different answers from SO. I noticed that 90% of the answers used 2 channels. Thanks for the info and Happy Coding!