Human-in-the-Loop Active Learning for Continuous Model Improvement in Enterprise AI Pipelines

Authors

  • Avneet Bansal

Keywords:

human-in-the-loop, active learning, generative AI pipelines, Amazon A2I, SageMaker Ground Truth, confidence calibration, document extraction

Abstract

Enterprise document extraction systems built on large language models accumulate months of human correction data without feeding those corrections back to the extraction pipeline. The result is a structural learning gap: the same errors recur across successive batches while the correction history that would resolve them sits unread in QA logs. This article describes a closed-loop Human-in-the-Loop (HITL) active learning architecture that eliminates this gap. The framework captures extraction events, correction events, and prompt configuration change events in a unified event store; applies active learning query strategies (principally uncertainty sampling and disagreement sampling) to concentrate human review on the instances most likely to produce informative corrections; and propagates validated correction signals through a prompt evolution policy that enforces regression protection before changes are deployed. The implementation layer uses Amazon Augmented AI (A2I) for review workflow management and Amazon SageMaker Ground Truth for annotation, auto-labeling, and labeling model training. Evaluated across production enterprise document extraction deployments covering 130+ field types and six document categories, the framework produced a 38% reduction in false-positive review traffic and improved the self-healing rate from 41% to 81% over eight consecutive production batches. The architecture is domain-agnostic and generalizes to any AI pipeline in which humans correct model outputs at scale and prompt-level intervention is the primary optimization lever.
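The two query strategies named in the abstract can be illustrated with a short sketch. The abstract does not give the article's scoring functions, so this example assumes a least-confidence score for uncertainty sampling and a normalized vote entropy over repeated extraction runs for disagreement sampling. The `Extraction` record, the function names, and the equal default weighting are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import log


@dataclass(frozen=True)
class Extraction:
    """One extracted field value (hypothetical record layout)."""
    doc_id: str
    field: str
    # Values returned by several extraction runs, e.g. different prompts.
    candidates: tuple[str, ...]
    # Model confidence in the top value, assumed to lie in [0, 1].
    confidence: float


def uncertainty(e: Extraction) -> float:
    """Least-confidence score: higher means the model is less sure."""
    return 1.0 - e.confidence


def disagreement(e: Extraction) -> float:
    """Vote entropy over candidate values, normalized to [0, 1]."""
    n = len(e.candidates)
    counts: dict[str, int] = {}
    for value in e.candidates:
        counts[value] = counts.get(value, 0) + 1
    if n < 2 or len(counts) == 1:
        return 0.0  # a single run or a unanimous vote carries no disagreement
    entropy = -sum((c / n) * log(c / n) for c in counts.values())
    return entropy / log(n)  # log(n) is the entropy when every run differs


def select_for_review(
    items: list[Extraction], budget: int, weight: float = 0.5
) -> list[Extraction]:
    """Rank items by a weighted mix of both scores and keep the top `budget`."""

    def score(e: Extraction) -> float:
        return weight * uncertainty(e) + (1.0 - weight) * disagreement(e)

    return sorted(items, key=score, reverse=True)[:budget]
```

Calling `select_for_review(batch, budget=50)` would send the 50 highest-scoring extractions to human review. The `weight` parameter is an assumed tuning knob that trades calibrated confidence against cross-run disagreement.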

Published

14.02.2026

How to Cite

Avneet Bansal. (2026). Human-in-the-Loop Active Learning for Continuous Model Improvement in Enterprise AI Pipelines. International Journal of Intelligent Systems and Applications in Engineering, 14(1s), 1490–1498. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/8376

Section

Research Article