This Streamlit app allows users to record audio from their microphone, send it to the OpenAI API for processing, and receive both text and audio responses. It leverages OpenAI's gpt-4o-audio-preview
model to transcribe and respond to audio input.
- Audio Input: Record audio directly from your microphone.
- OpenAI API Integration: Send audio to OpenAI for transcription and response generation.
- Text and Audio Output: Receive a text transcription of your audio and an optional audio response.
- User-Friendly Interface: Simple and intuitive interface powered by Streamlit.
Before running the app, ensure you have the following:
- Python 3.9 or higher: The app is built using Python.
- OpenAI API Key: You need an API key from OpenAI to use their services. Sign up at OpenAI if you don't have an account.
- Streamlit: The app is built using the Streamlit framework.
-
Clone the repository (if applicable):
bash
Copy
git clone https://github.com/skitsanos/streamlit-audiogpt.git cd streamlit-audiogpt
-
Install dependencies:
bash
Copy
pip install streamlit openai
-
Set up your OpenAI API key:
-
Create a
.env
file in the root directory of the project. -
Add your OpenAI API key to the
.env
file:Copy
OPENAI_API_KEY=your-api-key-here
-
Alternatively, you can set the API key directly in the app code (not recommended for production).
-
-
Navigate to the project directory:
bash
Copy
cd your-repo-name
-
Run the Streamlit app:
bash
Copy
streamlit run app.py
-
Open the app in your browser:
- Streamlit will provide a local URL (e.g.,
http://localhost:8501
). - Open the URL in your browser to use the app.
- Streamlit will provide a local URL (e.g.,
- Record Audio:
- Click the "Speak into your microphone" button to start recording.
- Stop recording when you're done.
- Process Audio:
- The app will send your audio to the OpenAI API for processing.
- A spinner will indicate that the app is working on your request.
- View Results:
- The app will display:
- A text transcription of your audio.
- A text response from the OpenAI model (if applicable).
- An audio response (if available).
- The app will display:
- Repeat:
- Record new audio and repeat the process as needed.
- Model: You can change the OpenAI model in the
app.py
file by modifying themodel
parameter in theclient.chat.completions.create
call. - Voice: The app uses the
alloy
voice for audio responses. You can change this by modifying theaudio
parameter in the API call. - Styling: You can customize the app's appearance by modifying the CSS in the
st.markdown
section.
- API Key Issues: Ensure your OpenAI API key is correctly set in the
.env
file or app code. - Audio Recording Problems: Make sure your microphone is working and that the app has permission to access it.
- Network Errors: Check your internet connection if the app fails to communicate with the OpenAI API.
If you'd like to contribute to this project, feel free to open an issue or submit a pull request. Contributions are welcome!