Chapter
Efficient Data Curation Using Active Learning for a Video-Based Fire Detection
Language
English
Abstract
Video-based fire detection is a crucial object detection problem that depends on accurate and reliable data. However, collecting and labeling fire-related data is time-consuming and expensive, which makes it difficult to obtain sufficient data for training machine learning models. To address this challenge, uncertainty-based active learning techniques can be used to iteratively select the most informative samples for labeling. This can reduce the amount of labeled data needed to achieve high model performance and can even be used to prune less informative samples from the training data. Traditional sampling-based uncertainty estimation methods are computationally expensive. Hence, an efficient state-of-the-art approach based on prior-network ensemble distillation is evaluated on an internal dataset; it still incurs a relatively high computational overhead, which makes production deployment difficult. A biased softmax differencing-based uncertainty approach and a feature-based hard data mining approach are proposed and compared with the distillation approach. The novel approaches are found to have a very low uncertainty estimation overhead compared with the ensemble distillation approach and traditional sampling techniques. The methods are evaluated in the context of curating the unlabeled pool data and improving the training data. For completeness, the experiments are performed on three different data sizes, and overall, the frame-wise selection strategy proves better than the sequence-wise querying strategy. The Principal Component Analysis (PCA)-based hard data mining outperformed the other methods and improved model performance by 16.33% on the AUC2% metric compared with random data selection. The approach even outperformed the main network trained on the full data by 7.33%, thereby improving the training data while using only the informative 26.39% of the data. The results indicate that the novel data mining approach provides efficient curation of both training and pool data.
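The abstract contrasts frame-wise and sequence-wise querying driven by a softmax-difference uncertainty score. The sketch below illustrates that general idea only. It assumes a plain margin score (one minus the gap between the two largest softmax probabilities), because the abstract does not define the "biased" variant, and the function names and the `(sequence_id, frame_index)` keying are hypothetical rather than taken from the chapter.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def margin_uncertainty(logits):
    """1 - (top-1 prob - top-2 prob): close to 1 when the two best classes tie.

    Assumption: a plain margin score, not the chapter's biased variant.
    """
    top = sorted(softmax(logits), reverse=True)
    return 1.0 - (top[0] - top[1])


def select_frames(frame_scores, budget):
    """Frame-wise querying: pick the `budget` most uncertain frames overall.

    frame_scores maps (sequence_id, frame_index) -> uncertainty score.
    """
    ranked = sorted(frame_scores, key=frame_scores.get, reverse=True)
    return ranked[:budget]


def select_sequences(frame_scores, budget):
    """Sequence-wise querying: rank whole sequences by mean frame uncertainty
    and take complete sequences while they still fit in the frame budget."""
    by_seq = {}
    for (seq_id, _), score in frame_scores.items():
        by_seq.setdefault(seq_id, []).append(score)
    ranked = sorted(
        by_seq, key=lambda s: sum(by_seq[s]) / len(by_seq[s]), reverse=True
    )
    chosen, used = [], 0
    for seq_id in ranked:
        if used + len(by_seq[seq_id]) > budget:
            continue  # this sequence would exceed the labeling budget
        chosen.append(seq_id)
        used += len(by_seq[seq_id])
    return chosen


# Toy pool: two sequences of two frames each, with made-up two-class logits.
logits_by_frame = {
    ("a", 0): [4.0, 0.0],
    ("a", 1): [0.1, 0.0],
    ("b", 0): [0.0, 3.0],
    ("b", 1): [0.0, 2.5],
}
scores = {key: margin_uncertainty(v) for key, v in logits_by_frame.items()}
print(select_frames(scores, budget=2))
print(select_sequences(scores, budget=2))
```

With the same labeling budget, frame-wise selection can mix the most uncertain frames from different sequences, whereas sequence-wise selection also labels the confident frames of any sequence it picks. This is one plausible reading of why the abstract reports frame-wise selection as the better strategy.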
Keywords
Uncertainty Estimation; Active Learning; Object Detection; Outlier Detection; Feature-based cluster analysis; Video-based Fire Detection
DOI
10.36253/979-12-215-0289-3.60
ISBN
9791221502893, 9791221502893
Publisher
Firenze University Press
Publisher website
https://www.fupress.com/
Publication date and place
Florence, 2023
Series
Proceedings e report, 137
Classification
Computing and Information Technology