30 May, 2024

Integrating Azure Cognitive Services with OpenAI for Interactive Voice-based AI Responses 🤖🎤

Pairing a speech service with a language model makes it possible to build applications you can simply talk to. Azure Cognitive Services handles the audio side, turning speech into text and text back into speech, while OpenAI's language models generate the answers in between. Together they let a script listen to a spoken question, produce a response, and read it aloud.

In this blog post, we’ll explore a Python script that achieves this integration, enabling a fully interactive voice-based AI system.


Overview

The script leverages Azure’s Speech SDK to recognize spoken language and OpenAI’s language model to generate responses. The process involves several key steps:

  1. Loading Environment Variables: Using dotenv to manage API keys securely.
  2. Configuring Azure Speech SDK: Setting up speech recognition and synthesis.
  3. Invoking OpenAI API: Using the recognized speech to generate a response.
  4. Synthesizing AI Response: Converting the AI-generated text back to speech.

Let’s delve into the details.

Prerequisites

Before you start, ensure you have the following:

  • Azure Cognitive Services subscription with Speech API.
  • OpenAI API key.
  • Python environment with the necessary libraries installed: pip install azure-cognitiveservices-speech python-dotenv langchain-openai (the last one is imported as langchain_openai).

Code Walkthrough

Below is the complete script with comments to guide you through each section:

import os
from dotenv import load_dotenv
import azure.cognitiveservices.speech as speechsdk
from langchain_openai import OpenAI

# Load environment variables from a .env file
load_dotenv()

# Retrieve API keys from environment variables
AZURE_SPEECH_KEY = os.getenv("AZURE_SPEECH_KEY")
AZURE_SPEECH_REGION = os.getenv("AZURE_SPEECH_REGION")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Initialize the OpenAI language model with the API key
llm = OpenAI(api_key=OPENAI_API_KEY)

# Configure Azure Speech SDK for speech recognition
speech_config = speechsdk.SpeechConfig(subscription=AZURE_SPEECH_KEY, region=AZURE_SPEECH_REGION)
speech_config.speech_recognition_language = "en-US"

# Set up the audio input from the default microphone
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

print("You can speak now. I am listening ...")

# Start speech recognition
speech_recognition_result = speech_recognizer.recognize_once_async().get()

# Check the result of the speech recognition
if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
    output = speech_recognition_result.text
    print(output)  # User question

    # Invoke the OpenAI API with the recognized speech text
    result = llm.invoke(output)
    print(result)  # AI Answer

    # Configure audio output for speech synthesis
    audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

    # Synthesize the AI response into speech
    speech_synthesizer_result = speech_synthesizer.speak_text_async(result).get()

elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(speech_recognition_result.no_match_details))

elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = speech_recognition_result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

Step-by-Step Breakdown

1. Loading Environment Variables

Using the dotenv package, we load sensitive API keys from a .env file, ensuring they are not hard-coded in the script.
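
One thing to watch for: os.getenv returns None when a variable is missing, and that tends to surface only later as an authentication error. Here is a minimal sketch of an up-front check, assuming the three variable names used in the script. The helper name missing_settings is hypothetical, not part of dotenv or either SDK.

```python
import os

# Variable names taken from the script above.
REQUIRED_VARS = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION", "OPENAI_API_KEY")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank.

    `env` is any mapping of names to values; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


# Possible use right after load_dotenv():
#     missing = missing_settings()
#     if missing:
#         raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Passing the mapping in as an argument keeps the check easy to try out with a plain dictionary, without touching your real environment.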

2. Configuring Azure Speech SDK

We set up the Azure Speech SDK with the necessary subscription key and region. This configuration allows the SDK to access Azure’s speech recognition services.

3. Speech Recognition

The SpeechRecognizer object listens on the default microphone. recognize_once_async() captures a single utterance, and calling .get() on the returned future blocks until the transcription is ready. The script then checks the result's reason and, on success, reads the recognized text.

4. Invoking OpenAI API

The recognized text is then passed to OpenAI’s language model through llm.invoke, which returns the generated response as text. The script handles a single question per run and keeps no conversation history, so the model sees only the current utterance.
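
Because the answer will be read aloud, it can help to ask the model for a short, plain-text reply. The sketch below shows one possible way to wrap the transcribed question before sending it. The function name build_spoken_prompt and the instruction wording are my own assumptions, not something the original script does.

```python
def build_spoken_prompt(question, max_sentences=3):
    """Wrap a transcribed question with instructions suited to spoken output."""
    # Collapse stray whitespace left over from transcription.
    question = " ".join(question.split())
    if not question:
        raise ValueError("question is empty")
    return (
        f"Answer the question below in at most {max_sentences} short sentences. "
        "Use plain text only, with no lists, markdown, or code.\n\n"
        f"Question: {question}"
    )
```

In the script, this would replace the direct call with something like result = llm.invoke(build_spoken_prompt(output)).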

5. Speech Synthesis

Finally, the AI-generated text response is synthesized back into speech using Azure’s SpeechSynthesizer, providing a spoken response through the default speaker.
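
If you want to choose a specific voice, one option is to send SSML instead of plain text. The following is a rough sketch of building a minimal SSML document with the standard library. The helper name to_ssml is hypothetical, and the default voice name en-US-JennyNeural is an assumption, so check which voices are available in your region.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Wrap plain text in a minimal SSML document for a named voice."""
    # escape() protects &, < and > in the model's answer;
    # quoteattr() quotes and escapes the attribute values.
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )
```

The resulting string can be passed to the synthesizer's speak_ssml_async method in place of speak_text_async. Escaping matters here because a model response containing a character such as & would otherwise produce invalid XML.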

Error Handling

The script includes basic error handling for different outcomes of the speech recognition process:

  • RecognizedSpeech: When speech is successfully recognized.
  • NoMatch: When no speech is recognized.
  • Canceled: When the recognition process is canceled, potentially due to errors.
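
The script makes one attempt and then exits. If you would rather give the user another chance when nothing is recognized, a small retry wrapper is one way to do it. This is only a sketch: listen_with_retries and its parameters are hypothetical names, and the two callables stand in for the SDK calls so the logic stays independent of Azure.

```python
def listen_with_retries(recognize_once, is_recognized, max_attempts=3):
    """Call recognize_once() until is_recognized(result) is true.

    Returns the first recognized result, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = recognize_once()
        if is_recognized(result):
            return result
        print(f"Attempt {attempt} of {max_attempts}: nothing recognized.")
    return None
```

With the script above, recognize_once could be lambda: speech_recognizer.recognize_once_async().get() and is_recognized could be lambda r: r.reason == speechsdk.ResultReason.RecognizedSpeech. Retrying makes sense for NoMatch; a Canceled result caused by a bad key or region will most likely fail again, so you may want to stop early in that case.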

Conclusion

Integrating Azure Cognitive Services with OpenAI creates a powerful platform for developing interactive voice applications. This combination leverages the robust speech recognition and synthesis capabilities of Azure with the advanced natural language understanding of OpenAI. Whether for virtual assistants, customer support, or other innovative applications, this integration showcases the potential of modern AI technologies.

Feel free to experiment with the code and adapt it to your specific use case. The possibilities are endless when combining these advanced tools in creative ways.

28 May, 2024

Creating a Fake Data Generator with Python 🤖📅

In the world of data science and machine learning, having access to diverse datasets for experimentation is crucial. However, real-world data can be scarce, sensitive, or simply not available. This is where generating fake data becomes invaluable. In this post, I’ll walk you through creating a fake data generator using Python’s Faker library and displaying this data in a well-styled table using pandas.

[Figures: Python output; HTML output]

What You’ll Need

Before we dive in, ensure you have the following installed:

  • Python 3.x
  • pandas library: pip install pandas
  • Faker library: pip install faker

Step-by-Step Guide

Let’s build a script that generates a set of fake personal data and displays it in a structured format.

1. Import Necessary Libraries

First, import the Faker and pandas libraries:

from faker import Faker
import pandas as pd

2. Initialize Faker

Create an instance of the Faker class:

fake = Faker()

3. Generate Fake Data

Define a dictionary to store the fake data:

data = {
    'First Name': [],
    'Last Name': [],
    'DOB': [],
    'Email': [],
    'City': [],
    'Country': [],
    'ZipCode': [],
}

Use a loop to populate the dictionary with fake data, here 10 records:

for i in range(10):
    data['First Name'].append(fake.first_name())
    data['Last Name'].append(fake.last_name())
    data['DOB'].append(fake.date_of_birth())
    data['Email'].append(fake.email())
    data['City'].append(fake.city())
    data['ZipCode'].append(fake.zipcode())
    data['Country'].append(fake.country())
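
If you find yourself adding more columns, the same idea can be written as a small helper that takes a mapping of column names to generator functions. This is just a sketch of an alternative to the loop above. The name build_columns is hypothetical, and it fills the table column by column rather than row by row.

```python
def build_columns(generators, n_rows):
    """Build a dict of lists by calling each column's generator n_rows times.

    `generators` maps a column name to a zero-argument callable.
    """
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    return {
        name: [make() for _ in range(n_rows)]
        for name, make in generators.items()
    }
```

With Faker, you would pass the bound methods themselves, for example build_columns({'First Name': fake.first_name, 'City': fake.city}, 10), and hand the result to pd.DataFrame as before.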

4. Create a DataFrame

Convert the dictionary to a pandas DataFrame:

df = pd.DataFrame(data)

Print the DataFrame to see the generated data:

print(df)

5. Style the DataFrame (Optional)

To display the DataFrame with borders, define a style and save the styled DataFrame to an HTML file:

styles = [
    {'selector': 'table', 'props': [('border-collapse', 'collapse')]},
    {'selector': 'th, td', 'props': [('border', '1px solid black'), ('padding', '5px')]},
]

styled_df = df.style.set_table_styles(styles)
styled_df.to_html('styled_table.html')

6. Open the Styled Table (Optional)

The styled DataFrame is saved as styled_table.html. You can open this file in a web browser to view the data in a neat, tabular format with borders.

print("Styled table saved as 'styled_table.html'. Open this file in a web browser to view the table with borders.")

Full Code

Here’s the full script:

from faker import Faker
import pandas as pd

fake = Faker()

data = {
    'First Name': [],
    'Last Name': [],
    'DOB': [],
    'Email': [],
    'City': [],
    'Country': [],
    'ZipCode': [],
}

# Generate 10 fake records
for i in range(10):
    data['First Name'].append(fake.first_name())
    data['Last Name'].append(fake.last_name())
    data['DOB'].append(fake.date_of_birth())
    data['Email'].append(fake.email())
    data['City'].append(fake.city())
    data['ZipCode'].append(fake.zipcode())
    data['Country'].append(fake.country())

df = pd.DataFrame(data)

print(df)

# Generate a styled DataFrame and view it by saving the DataFrame to an HTML file and then opening that file in a web browser
# Define the table style to add borders
styles = [
    {'selector': 'table', 'props': [('border-collapse', 'collapse')]},
    {'selector': 'th, td', 'props': [('border', '1px solid black'), ('padding', '5px')]},
]

styled_df = df.style.set_table_styles(styles)
styled_df.to_html('styled_table.html')

print("Styled table saved as 'styled_table.html'. Open this file in a web browser to view the table with borders.")

Conclusion

Using Python’s Faker library, you can easily generate realistic fake data for various purposes, from testing to developing machine learning models. By combining it with pandas, you can structure this data in a DataFrame and apply styles to make it more readable and visually appealing. This approach is not only practical but also enhances the presentation of your data, making it easier to analyze and share.

Open the styled_table.html file in your web browser, and you'll see a beautifully formatted table containing the fake data, ready for use in your projects. Happy coding!