Microphone Component

The microphone component provides a way to record sound during an experiment. You can even transcribe the recording to text! Take a look at the documentation on Creating a Google Cloud Speech API key to get started with that.

When using a mic recording, specify the start time relative to the start of the routine (see Start below) and a stop time (i.e., the duration in seconds). A blank duration evaluates to recording for 0.000 s.

The resulting sound files are saved in .wav format (at the specified sampling frequency), one file per recording. The files appear in a new folder within the data directory (the subdirectory name ends in _wav). The file names include the unix (epoch) time of the onset of the recording with milliseconds, e.g., mic-1346437545.759.wav.
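Because the file name embeds the epoch onset time, the recording onset can be recovered after the experiment. A minimal sketch (the helper function is hypothetical, not part of PsychoPy) for the pattern described above:

```python
# Recover a human-readable onset timestamp from a recording file name
# such as "mic-1346437545.759.wav".
from datetime import datetime

def recording_onset(filename):
    """Parse the epoch onset time embedded in a mic recording file name."""
    stem = filename.rsplit(".", 1)[0]     # strip the ".wav" extension
    epoch = float(stem.split("-", 1)[1])  # "mic-<epoch>" -> seconds since epoch
    return datetime.fromtimestamp(epoch)

print(recording_onset("mic-1346437545.759.wav"))
```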

It is possible to stop a recording that is in progress by using a code component. On every frame, check for a condition (such as the key 'q', or a mouse click) and call the microphone component's stop() method. The recording will end at that point and be saved.

Categories:

Responses

Works in:

PsychoPy, PsychoJS

Parameters

Basic

The required attributes of the stimulus, controlling its basic function and behaviour

Name

Everything in a PsychoPy® experiment needs a unique name. The name should contain only letters, numbers and underscores (no punctuation marks or spaces).

Start

When the Microphone Component should start, see Defining the onset/duration of components.

Expected start (s)

If you are using frames to control timing of your stimuli, you can add an expected start time to display the component timeline in the routine.

Start type

How do you want to define your start point?

Options:

  • time (s)

  • frame N

  • condition

Stop

When the Microphone Component should stop, see Defining the onset/duration of components.

Expected duration (s)

If you are using frames to control timing of your stimuli, you can add an expected duration to display the component timeline in the routine.

Stop type

The duration of the recording in seconds; blank = 0 sec

Options:

  • duration (s)

Device

Information about the device associated with this Component. Keyboards, speakers, microphones, etc.

Device label

A label by which to refer to this Component's associated hardware device. If you use the same device for multiple Components, be sure to use the same label here.

Device

What microphone device would you like to use to record? This only affects local experiments - online experiments ask the participant which mic to use.

Channels

Record two channels (stereo) or one (mono, smaller file). Select ‘auto’ to use as many channels as the selected device allows.

Options:

  • Auto

  • Mono

  • Stereo

Sample rate (Hz)

How many samples per second (Hz) to record at

Exclusive control

Take exclusive control of the microphone, so other apps can’t use it during your experiment.

Max recording size (KB)

To avoid excessively large output files, what is the biggest file size you are likely to expect?

Transcription

Transcribe audio

Whether to transcribe the audio recording and store the transcription

Transcription backend

What transcription service to use to transcribe audio?

Options:

  • Google: Uses Google’s cloud based speech-to-text engine, requiring a key from Google to use. We highly recommend taking a look at the documentation on Creating a Google Cloud Speech API key to get started.

  • Whisper (OpenAI): Uses an open-source speech recognition AI. Requires the psychopy-whisper plugin to be installed, and will work better with a dedicated graphics card (as the model uses GPU to speed up processing)

Transcription language

What language you expect the recording to be spoken in, e.g. en-US for English

Expected words

The list of words to listen for; if blank, the transcriber will listen for all words in the chosen language.

If using the built-in transcriber, you can set a minimum % confidence level using a colon after the word, e.g. ‘red:100’, ‘green:80’. Otherwise, default confidence level is 80%.
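The `word:confidence` syntax above can be unpacked into (word, threshold) pairs. A sketch (not PsychoPy's internal parser) applying the default 80% when no colon is given:

```python
# Split "word:confidence" entries into (word, threshold) pairs,
# falling back to the default 80% when no confidence is given.
def parse_expected_words(entries, default=80):
    parsed = []
    for entry in entries:
        word, sep, conf = entry.partition(':')
        parsed.append((word, int(conf) if sep else default))
    return parsed

print(parse_expected_words(['red:100', 'green:80', 'blue']))
# -> [('red', 100), ('green', 80), ('blue', 80)]
```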

Speaking start / stop times

Tick this to save times when the participant starts and stops speaking

Whisper model (if Transcription backend is “Whisper”)

Which model of Whisper AI should be used for transcription? Details of each model are available in the Whisper documentation.

Options:

  • tiny

  • base

  • small

  • medium

  • large

  • tiny.en

  • base.en

  • small.en

  • medium.en

Whisper device (if Transcription backend is “Whisper”)

Which device to use for transcription?

Options:

  • auto

  • gpu

  • cpu

Data

What information about this Component should be saved?

Save onset/offset times

Store the onset/offset times in the data file (as well as in the log file).

Sync timing with screen refresh

Synchronize times with screen refresh (good for visual stimuli and responses based on them)

Output file type

What file type should output audio files be saved as?

Options:

  • default

  • aiff

  • au

  • avr

  • caf

  • flac

  • htk

  • svx

  • mat4

  • mat5

  • mpc2k

  • mp3

  • ogg

  • paf

  • pvf

  • raw

  • rf64

  • sd2

  • sds

  • ircam

  • voc

  • w64

  • wav

  • nist

  • wavex

  • wve

  • xi

Full buffer policy

What to do when we reach the max amount of audio data which can be safely stored in memory?

Options:

  • Discard incoming data

  • Clear oldest data

  • Raise error

Trim silent

Trim periods of silence from the output file

Testing

Tools for testing, debugging and checking the performance of this Component.

Disable Component

Disable this Component
