AI/ML Speech-to-Text Python Engineer
48 Commercial Plaza, Cavalry Ground Extension, Lahore
About IRG Global
IRG Global is a leading technology company specializing in software development, AI-powered solutions, digital transformation, and BPO services. With 8+ years of experience delivering high-performance tech solutions worldwide, IRG Global is now expanding its AI division and seeking top talent to build next-generation speech technologies.
About the Role
IRG Global is hiring an AI/ML Engineer with strong expertise in Speech-to-Text (STT) technologies and Python development. You will work on building advanced speech recognition systems, integrating cutting-edge models, and optimizing performance for real-time applications.
Key Responsibilities
- Build, train, and maintain Speech-to-Text (ASR) models using state-of-the-art frameworks.
- Develop Python-based STT pipelines and integrate them into IRG Global’s products and platforms.
- Fine-tune ASR models such as OpenAI Whisper, Wav2Vec2, DeepSpeech, or similar.
- Preprocess and manage audio datasets for training and evaluation.
- Optimize inference for low latency, high accuracy, and scalability.
- Deploy models using APIs, Docker, and cloud or on-prem infrastructure.
- Collaborate with product, development, and QA teams for end-to-end solution delivery.
- Research emerging speech technologies and propose improvements.
Required Skills & Qualifications
- Bachelor’s or Master’s degree in Computer Science, AI, Data Science, or a related field.
- Strong command of Python and ML frameworks and libraries (PyTorch, TensorFlow, NumPy, scikit-learn).
- Experience working with Speech-to-Text frameworks:
  - OpenAI Whisper
  - DeepSpeech
  - Wav2Vec2 (Hugging Face)
  - Or other ASR engines
- Understanding of audio processing, feature extraction, and signal processing.
- Experience working with ASR datasets (LibriSpeech, Common Voice, etc.).
- Hands-on experience with model deployment (APIs, Docker, cloud/on-prem).
- Solid grasp of machine learning concepts, model tuning, and performance evaluation.
Preferred Skills
- Experience with real-time audio streaming and WebSocket integrations.
- Familiarity with NLP, text normalization, and acoustic modeling.
- Experience with GPU optimization or CUDA is a plus.
- Knowledge of LLMs and multimodal AI systems.
Soft Skills
- Strong analytical and problem-solving mindset.
- Ability to collaborate with cross-functional teams.
- Good communication skills and a proactive approach.
Why Join IRG Global?
- Work in an innovative, fast-growing AI environment.
- Opportunity to build transformative speech and AI technologies.
- Competitive salary, growth opportunities, and a collaborative work culture.
- Exposure to global clients and major industry projects.
Salary
PKR 80,000 to PKR 100,000