Complete Setup & Deployment Guide
The Hausa AI Chatbot is a conversational system for the Hausa language. It pairs a GPT model, either fine-tuned or prompted for Hausa, with Google Cloud's Speech-to-Text and Text-to-Speech APIs so that users can converse by text or by voice.
Key Objectives:
┌─────────────────┐
│   Frontend UI   │  (hausa_chatbot.html)
│   - HTML/JS     │
│   - Tailwind    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Backend API   │  (Flask Application)
│   - app.py      │
│   - Routes      │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌────────┐ ┌──────────┐
│  GPT   │ │  Google  │
│  API   │ │  Cloud   │
└────────┘ └──────────┘
Components:
• Frontend: Single-page application with voice recording capabilities
• Backend: Flask API handling GPT and Google Cloud integrations
• GPT Model: Fine-tuned or prompted for Hausa language
• Google Cloud: Speech-to-Text and Text-to-Speech services
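To make the backend's role more concrete, here is a minimal sketch of how a chat request might be turned into the role-tagged message list a GPT model expects. It assumes the request carries a `message` string and a `history` list of `{"role", "content"}` dictionaries. The names `build_messages`, `DEFAULT_SYSTEM_PROMPT`, and `max_turns` are illustrative and are not taken from `app.py`.

```python
from typing import Dict, List

Message = Dict[str, str]

# Hypothetical default; the repository's actual system prompt may differ.
DEFAULT_SYSTEM_PROMPT = "You are a helpful Hausa assistant."


def build_messages(
    message: str,
    history: List[Message],
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    max_turns: int = 10,
) -> List[Message]:
    """Assemble the message list sent to the GPT model for one chat request."""
    if not message.strip():
        raise ValueError("message must not be empty")

    # Keep only well-formed user/assistant turns from the client-supplied history.
    cleaned = [
        {"role": turn["role"], "content": turn["content"]}
        for turn in history
        if isinstance(turn, dict)
        and turn.get("role") in ("user", "assistant")
        and isinstance(turn.get("content"), str)
    ]

    # Cap the context at the most recent turns to bound prompt size.
    recent = cleaned[-max_turns:] if max_turns > 0 else []

    return (
        [{"role": "system", "content": system_prompt}]
        + recent
        + [{"role": "user", "content": message}]
    )
```

Filtering the history on the server side means a client cannot inject its own system message, and the turn cap keeps long conversations from growing the prompt without limit.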
# Clone the repository
git clone https://github.com/adab-tech/adab-tech.github.io.git
cd adab-tech.github.io/backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your API keys
Steps:
# Get your API key from https://platform.openai.com
# Add to .env file:
OPENAI_API_KEY=sk-your-api-key-here
Prepare data in conversation format:
{
  "messages": [
    {"role": "system", "content": "You are a helpful Hausa assistant."},
    {"role": "user", "content": "Sannu, ina kwana?"},
    {"role": "assistant", "content": "Lafiya lau. Na gode. Yaya kake?"}
  ]
}
python data_preprocessing.py
# Or use programmatically:
from data_preprocessing import HausaDataPreprocessor
preprocessor = HausaDataPreprocessor()
data = preprocessor.load_from_csv('hausa_data.csv')
preprocessor.validate_data(data)
preprocessor.save_for_finetuning(data, 'hausa_training.jsonl')  # same file name the upload step opens
Ensure your training data is in JSONL format (one JSON object per line) with at least 10 examples; 100 or more are recommended.
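A quick local check before uploading can catch malformed lines early. The sketch below assumes the conversation format shown earlier, with each example collapsed onto a single line as JSONL requires. The function name `validate_jsonl_lines` and the specific checks are illustrative; they are not the behavior of `HausaDataPreprocessor.validate_data`, whose implementation is not shown in this guide.

```python
import json
from typing import List

ALLOWED_ROLES = {"system", "user", "assistant"}
MIN_EXAMPLES = 10  # minimum stated in this guide; 100+ is recommended


def validate_jsonl_lines(lines: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the data looks usable."""
    problems = []
    examples = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing or empty 'messages' list")
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append(f"line {number}: unexpected role in {message!r}")
            elif not isinstance(message.get("content"), str) or not message["content"].strip():
                problems.append(f"line {number}: empty content for role {message['role']}")
        if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
            problems.append(f"line {number}: no assistant reply to learn from")
        examples += 1
    if examples < MIN_EXAMPLES:
        problems.append(f"only {examples} examples; at least {MIN_EXAMPLES} are required")
    return problems
```

Passing the lines of the training file to this function and printing whatever it returns gives a short report; an empty result suggests the file is at least structurally ready for upload.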
import openai

# Note: these calls use the pre-1.0 openai Python SDK interface (openai<1.0).
# Upload training file
with open("hausa_training.jsonl", "rb") as f:
    response = openai.File.create(
        file=f,
        purpose='fine-tune'
    )
file_id = response.id
print(f"File uploaded: {file_id}")

# Create fine-tuning job
response = openai.FineTuningJob.create(
    training_file=file_id,
    model="gpt-3.5-turbo"
)
job_id = response.id
print(f"Fine-tuning job created: {job_id}")

# Check status
status = openai.FineTuningJob.retrieve(job_id)
print(f"Status: {status.status}")
Note:
Fine-tuning can take several hours to complete, so check the job status periodically. Once the job succeeds, copy the resulting fine-tuned model ID (the job's fine_tuned_model field) into your backend configuration.
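Rather than re-running the status check by hand, the monitoring step can be wrapped in a small polling loop. The sketch below is deliberately independent of the SDK: it takes any zero-argument callable that returns the current status string. The name `wait_for_job`, the one-minute interval, and the six-hour timeout are assumptions chosen for illustration.

```python
import time
from typing import Callable

# Terminal states reported by the OpenAI fine-tuning API.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 6 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the job reaches a terminal state or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        print(f"Status: {status}")
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the status check from this guide, a call might look like `wait_for_job(lambda: openai.FineTuningJob.retrieve(job_id).status)`. The `sleep` parameter exists only so the loop can be exercised in tests without actually waiting.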
from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="ha-NG",  # Hausa (Nigeria)
    alternative_language_codes=["en-US"],
    enable_automatic_punctuation=True
)
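The recognition config above expects 16-bit linear PCM sampled at 16 kHz, so audio in any other shape is likely to be rejected or transcribed poorly. As a hedged sketch, the standard-library `wave` module can check a WAV file against those settings before it is sent. The helper name `check_wav_for_linear16` is hypothetical, and the mono requirement is an assumption based on the config not setting a channel count.

```python
import io
import wave
from typing import BinaryIO, List

EXPECTED_RATE_HZ = 16000   # matches sample_rate_hertz in the config
EXPECTED_SAMPLE_BYTES = 2  # LINEAR16 means 16-bit signed PCM samples
EXPECTED_CHANNELS = 1      # assumption: mono, since no channel count is configured


def check_wav_for_linear16(stream: BinaryIO) -> List[str]:
    """Return mismatches between a WAV stream and the recognition config."""
    problems = []
    # wave.open raises wave.Error if the stream is not an uncompressed PCM WAV file.
    with wave.open(stream, "rb") as wav:
        if wav.getsampwidth() != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16-bit")
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {EXPECTED_RATE_HZ} Hz")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"audio has {wav.getnchannels()} channels, expected mono")
    return problems


def make_silent_wav(rate: int, sample_bytes: int, channels: int) -> io.BytesIO:
    """Build a tiny in-memory WAV file, useful for trying out the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_bytes)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * sample_bytes * channels * 10)
    buffer.seek(0)
    return buffer
```

An empty result means the file matches the configured encoding and sample rate; anything else should be resampled or re-encoded first, or the config adjusted to match the audio.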
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()
voice = texttospeech.VoiceSelectionParams(
    language_code="ha-NG",
    name="ha-NG-Standard-A"
)
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3,
    speaking_rate=0.9,
    pitch=0.0
)
Available Hausa Voices:
# Start the backend server
cd backend
python app.py
# Server will run on http://localhost:5000
# Open hausa_chatbot.html in browser
# Using AWS Elastic Beanstalk
eb init -p python-3.8 hausa-chatbot
eb create hausa-chatbot-env
eb deploy
# Or using Docker
docker build -t hausa-chatbot .
docker run -p 5000:5000 hausa-chatbot
The frontend is already configured for GitHub Pages. Simply push to the main branch.
git add hausa_chatbot.html
git commit -m "Add Hausa chatbot"
git push origin main
# Access at: https://adab-tech.github.io/hausa_chatbot.html
# Test health endpoint
curl http://localhost:5000/api/health
# Test chat endpoint
curl -X POST http://localhost:5000/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "Sannu", "history": []}'
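The same chat request can be issued from Python using only the standard library, which is convenient for scripted smoke tests. This sketch mirrors the curl command above. The function names `build_chat_request` and `send` are illustrative, and because the response schema is not documented here, `send` simply returns whatever JSON the server replies with.

```python
import json
import urllib.request
from typing import Dict, List, Optional

BASE_URL = "http://localhost:5000"  # local development server from this guide


def build_chat_request(
    message: str,
    history: Optional[List[Dict[str, str]]] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {"message": message, "history": history or []}
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        # ensure_ascii=False keeps Hausa characters such as ɓ, ɗ, ƙ readable in the body.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply (schema not specified in this guide)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_chat_request("Sannu"))` performs the round trip. Keeping request construction separate from sending lets the payload be inspected without a live server.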
Issue: API Key Not Working
Solution: Verify that your API key is set correctly in .env and that it has the required permissions.
Issue: Google Cloud Authentication Failed
Solution: Ensure GOOGLE_APPLICATION_CREDENTIALS points to a valid JSON credentials file.
Issue: Microphone Access Denied
Solution: Enable microphone permissions in your browser settings. Use HTTPS in production, because browsers only allow microphone access from secure contexts (HTTPS or localhost).
Issue: CORS Errors
Solution: Ensure flask-cors is installed and properly configured in the backend.
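The first two issues can often be caught before the server starts. Below is a minimal preflight sketch that inspects the two environment variables named in this guide. The function name `preflight` is hypothetical, and the check assumes the values from `.env` have already been loaded into the process environment (for example by the shell or a dotenv loader), which this guide does not specify.

```python
import json
import os
from typing import List, Mapping

PLACEHOLDER_KEY = "sk-your-api-key-here"  # example value shown in this guide


def preflight(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return configuration problems that would cause API key or auth failures."""
    problems = []

    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        problems.append("OPENAI_API_KEY is not set")
    elif key == PLACEHOLDER_KEY:
        problems.append("OPENAI_API_KEY still holds the placeholder value")

    cred_path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if not cred_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(cred_path):
        problems.append(f"GOOGLE_APPLICATION_CREDENTIALS does not point to a file: {cred_path}")
    else:
        try:
            with open(cred_path, encoding="utf-8") as handle:
                json.load(handle)  # only checks that the file parses as JSON
        except (OSError, ValueError) as exc:
            problems.append(f"credentials file is not readable JSON: {exc}")

    return problems
```

This only checks that the settings are present and well-formed. It cannot tell whether a key is valid or has the right permissions, which still requires a real API call.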
For support and updates, see the GitHub repository: https://github.com/adab-tech/adab-tech.github.io