Achieving 95% Accuracy in GenAI-based Classification
SMS Country, a global communications platform, automated the classification of millions of daily message routing decisions using a GenAI-based classification agent—achieving 95% accuracy with full explainability.
The Challenge
SMS Country processes over 50 million messages daily across 200+ countries, each requiring routing decisions that optimize for delivery rate, cost, and regulatory compliance by destination country. Manual rule-based routing systems, built over years, had grown to thousands of overlapping rules that were brittle to new operator relationships, inconsistent in edge cases, and impossible to explain to enterprise customers who questioned routing decisions. The operations team spent 30+ hours per week maintaining routing rules and resolving classification disputes.
The Solution
Eficens built a GenAI-based routing classification agent that replaced the rule-based system with a model-driven approach. The classifier uses a fine-tuned LLM to analyze each message routing request, considering destination country, message type (transactional, promotional, OTP), sender profile, historical delivery data, and current operator status to produce a ranked routing recommendation with a plain-language explanation of the decision logic.
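As a rough illustration of the input and output shapes described above, the following Python sketch models a routing request and a ranked recommendation with an explanation. All class, field, and function names here are hypothetical, and the scoring itself is left out; the case study does not publish the actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoutingRequest:
    # Hypothetical subset of the inputs named in the text.
    destination_country: str
    message_type: str  # "transactional", "promotional", or "otp"
    sender_id: str


@dataclass
class RouteOption:
    operator: str
    score: float  # higher means more preferred (an assumption)


@dataclass
class RoutingRecommendation:
    ranked_routes: List[RouteOption]
    explanation: str  # plain-language rationale for the ranking


def rank_routes(options: List[RouteOption], explanation: str) -> RoutingRecommendation:
    """Order candidate routes from best to worst and attach the explanation."""
    ranked = sorted(options, key=lambda option: option.score, reverse=True)
    return RoutingRecommendation(ranked_routes=ranked, explanation=explanation)
```

Keeping the explanation alongside the ranking in one object means a downstream system can log or display both together when a customer questions a routing decision.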
Implementation
Training Data Preparation
The foundation of the classifier was a curated training dataset of 2.8 million historical routing decisions, each annotated by routing specialists with the "correct" routing decision and the reason for it. The most recent six months of data were held out of training and reserved for validation, and the remaining data was cleaned to remove periods of known system instability. Feature engineering extracted 47 input features from each routing request, including country-operator reliability scores computed from rolling 30-day delivery statistics.
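The time-based holdout and the rolling 30-day reliability score can be sketched as follows. This is a minimal illustration, assuming that the held-out period is the latest one, that each record carries a `date`, and that reliability is simply delivered messages divided by sent messages over the window. The record layout and function names are hypothetical.

```python
from datetime import date, timedelta


def split_by_cutoff(records, cutoff):
    """Records dated before the cutoff go to training; the rest are held out."""
    train = [r for r in records if r["date"] < cutoff]
    validation = [r for r in records if r["date"] >= cutoff]
    return train, validation


def reliability_score(daily_stats, as_of, window_days=30):
    """Delivered / sent over the window_days ending at as_of (inclusive).

    Returns None when no traffic was recorded in the window.
    """
    start = as_of - timedelta(days=window_days - 1)
    sent = 0
    delivered = 0
    for stat in daily_stats:
        if start <= stat["date"] <= as_of:
            sent += stat["sent"]
            delivered += stat["delivered"]
    return delivered / sent if sent else None
```

Splitting by date rather than at random keeps the validation set strictly later than the training data, so the measured accuracy reflects how the model handles traffic it has not seen.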
Model Development and Validation
The classifier was developed in three iterations. The first iteration used a standard fine-tuned classification model, achieving 88% accuracy on the validation set—a substantial improvement over the rule-based system's 76% but below the 95% target. Analysis of misclassifications revealed that the most common failure mode was insufficient reasoning about regulatory constraints for specific destination countries. The second iteration added a retrieval step that augmented each classification request with the latest regulatory guidance for the destination country, improving accuracy to 93%. The third iteration added an adversarial dataset of historically disputed routing decisions, fine-tuning the model on cases where the initial decision was incorrect, reaching 95.3% accuracy.
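The retrieval step from the second iteration, and the accuracy metric used to compare iterations, might look roughly like the sketch below. The guidance store is a hypothetical in-memory dictionary keyed by placeholder country codes with made-up text; a production system would query a maintained source, and the prompt layout is an assumption rather than the format Eficens used.

```python
# Hypothetical guidance store. "XX" is a placeholder code and the text is
# illustrative only, not real regulatory guidance.
GUIDANCE = {
    "XX": "Example only: promotional traffic requires a registered sender ID.",
}


def build_prompt(request, guidance_store):
    """Prepend the destination country's guidance to the classification request."""
    guidance = guidance_store.get(
        request["destination_country"], "No guidance on file."
    )
    return (
        "Regulatory guidance:\n"
        f"{guidance}\n\n"
        "Routing request:\n"
        f"destination_country={request['destination_country']}\n"
        f"message_type={request['message_type']}\n"
    )


def accuracy(predictions, labels):
    """Fraction of predictions that match the specialist-annotated labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)
```

Because the guidance is looked up at request time, updated regulatory text can take effect without retraining the model.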
Production Deployment and Monitoring
The classifier was first deployed in shadow mode, running in parallel with the rule-based system for 30 days without affecting live routing decisions, which allowed a direct comparison of the two systems' recommendations. The shadow evaluation confirmed the accuracy improvement and identified a small set of message categories where the classifier underperformed; these were excluded from the initial production rollout and addressed in the next training iteration. Full production deployment followed a phased rollout: 10% of traffic on day 1, 50% on day 7, and 100% on day 21.
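Shadow mode and the phased rollout can be sketched in a few lines of Python. The schedule values come from the text; everything else is an assumption for illustration, including the function names, the in-memory log, and the choice to assign traffic by hashing a message ID so that the same message always lands in the same bucket.

```python
import hashlib

# (first day the share applies, share of traffic sent to the classifier)
ROLLOUT_SCHEDULE = [(1, 0.10), (7, 0.50), (21, 1.00)]


def route_in_shadow(request, rules_route, classifier_route, log):
    """Serve the rule-based decision; record the classifier's for comparison."""
    live = rules_route(request)
    shadow = classifier_route(request)
    log.append({"live": live, "shadow": shadow, "agree": live == shadow})
    return live


def rollout_fraction(day):
    """Share of traffic routed by the classifier on a given rollout day."""
    fraction = 0.0
    for start_day, share in ROLLOUT_SCHEDULE:
        if day >= start_day:
            fraction = share
    return fraction


def use_classifier(message_id, day):
    """Deterministically assign a message to the classifier or the rules."""
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < rollout_fraction(day) * 10_000
```

Hashing rather than random sampling keeps the assignment reproducible, which makes it easier to trace a disputed routing decision back to the system that produced it.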