Training Azerbaijani language models on Amazon SageMaker AI
TL;DR
Azercell Telecom partnered with AWS to build a production-ready Azerbaijani LLM on Amazon SageMaker AI for telecom use cases and a customer-facing chatbot. The team had to adapt foundation models to a morphologically rich language with limited training data and no proven blueprint. A six-week collaboration with the AWS Generative AI Innovation Center delivered the production framework.
Nauti's Take
Building an Azerbaijani LLM in six weeks is a real step forward, and a sign that low-resource languages need not remain an afterthought to English-first models. The risk: such models stay tied to one vendor (here AWS, via Amazon SageMaker AI) and to limited training data, which can deepen lock-in and bias.
Telcos and agencies working in local languages should treat this AWS stack as a blueprint, not the only recipe.