{'title': 'A weighted multi-source domain adaptation approach for surface defect detection', 'authors': 'Bing Hu\n\nJianhui Wang\n\nSchool of Information Science and Engineering, Northeastern University, Shenyang, China\n\n', 'abstract': '', 'sections': [{'heading': 'Appendix A', 'text': 'The beginning of Appendix A section.'}, {'heading': '1 Introduction', 'text': ''}, {'heading': '2 Related Method', 'text': 'The beginning of 2 Related Method section.'}, {'heading': 'Surface defect detection', 'text': 'In previous research on surface defect detection, edge detection [23, 24], clustering [25], and other image-processing methods were commonly used. However, applying these methods requires image pre-processing, with parameters adjusted according to the experience and knowledge of the inspector to obtain smoother images and more distinct defect features. These methods are time-consuming, labour-intensive, and subjective.\n\nMachine learning methods, such as support vector machines, CNNs, and random forests, have also been used to detect defects (anomalies). As mentioned above, one of the main problems of supervised deep learning methods is that collecting training data is usually expensive in terms of time and resources. Unsupervised learning methods, such as the deep autoencoder (DAE) and the generative adversarial network (GAN) [26], also have limitations when dealing with surface defect detection on complex textures. GAN-based implementations may be unreliable because their reconstruction results are unpredictable. Although a DAE can achieve better reconstruction quality, care must be taken when the defect and the texture share similar latent features: an overly powerful decoder will reconstruct the defective area as well, making defects indistinguishable in the template-comparison operation.\n\nIn this paper, we focus on defect detection for surfaces with complex textures. 
Unlike ordinary product surfaces, these products have particular patterns on the surface, which often interfere with identifying defects. Therefore, it is difficult to detect faults using traditional threshold-segmentation methods. Template matching is a standard method for defect detection on complex textures [22]. However, the background features are not constant, owing to the irregular surface, uneven illumination, and other factors, which makes matching difficult. Therefore, none of these methods meets our requirements. Many researchers have focused on the task of defect detection in complex textures. In [27], a transfer learning approach used pre-trained networks and obtained promising results. In [28], a Bayes classifier that can adapt to changing conditions is proposed; this classifier achieves good performance when trained with a small sample set. In [29], a particularly robust method named adjacent evaluation completed local binary patterns is proposed, improving the recognition rate of hot-rolled steel strip surface defects.\n\n'}, {'heading': 'Domain adaptation', 'text': 'Transfer learning aims to make a classifier trained on a label-rich domain (the source domain) perform well on a label-scarce domain (the target domain). Domain adaptation (DA) is a typical example of transfer learning. It tackles the setting in which the source and target domains share the same feature space and category space, and only the feature distributions differ. Some previous methods achieve this by minimizing explicit domain-discrepancy metrics. Maximum mean discrepancy (MMD) is the most commonly used metric to reduce distribution shift [30, 31]. Such methods also include correlation alignment [32], Kullback-Leibler (KL) divergence [33], and \\(\\mathcal{H}\\)-divergence [34]. Another widely used approach in DA is based on generative adversarial networks. 
It uses a domain discriminator to confuse the target domain with the source domain so as to learn invariant features between different domains [35, 36, 37, 38].\n\nMulti-source domain adaptation (MSDA) assumes that data are collected from multiple source domains with different distributions. Compared with single-source domain adaptation, this is a more realistic scenario. Ben-David et al. [34] express the target distribution as a weighted combination of multiple source distributions. The deep cocktail network (DCTN) [39] proposed a \\(k\\)-way domain discriminator and class classifier for digit classification and real-world object recognition. Peng et al. [40] proposed a moment-matching approach for MSDA, which aims to transfer knowledge from multiple labelled source domains to an unlabelled target domain.\n\n'}, {'heading': '3 Proposed method', 'text': 'In MSDA, there are \\(m\\) source domains \\(S_{1}\\), \\(S_{2}\\),..., \\(S_{m}\\) and a target domain \\(T\\). Each source domain \\(S_{j}=\\{(x_{i}^{S_{j}},y_{i}^{S_{j}})\\}_{i=1}^{N_{S_{j}}}\\) consists of \\(N_{S_{j}}\\) i.i.d. labelled samples, where \\(y_{i}^{S_{j}}\\in\\{1,2,...,K\\}\\) (\\(K\\) is the number of classes) and \\(x_{i}^{S_{j}}\\) follows the source distribution \\(X^{S_{j}}\\). Similarly, the target domain \\(T=\\{x_{i}^{T}\\}_{i=1}^{N_{T}}\\) consists of \\(N_{T}\\) i.i.d. unlabelled samples, where \\(x_{i}^{T}\\) follows the target distribution \\(X^{T}\\). The MSDA problem aims to train a model on samples from the multiple source domains and the target domain that minimizes the testing error on the target \\(T\\).\n\nIn MSDA, samples from multiple source domains can provide richer feature information about the objects for the target domain. With more supporting data, the decision boundary of the features can be further refined. However, the different distributions of the source domains increase the difficulty of learning domain-invariant features. 
In the task of defect detection, the size and proportion of texture features in the samples are much larger than those of defect features, so texture is inevitably represented as the salient feature during feature extraction. Therefore, aligning all domains without considering the correlation and consistency between the different source domains and the target domain is unreliable. To address this, our idea is to prioritize aligning the target domain with those source domains whose samples are more difficult to separate from the target's. Inspired by the work in [39] and [41], we use a weighted adaptation network to solve this issue.\n\nThe overview of the proposed method is shown in Figure 1; it is based on the domain-adversarial training framework. There are three subnets in the network: the feature extractor, the (multi-source) domain discriminator, and the (multi-source) classifier. There are two feature-extraction functions with unshared weights, \\(F_{s}\\) and \\(F_{t}\\), which are employed by the source domains and the target domain, respectively. We build \\(m\\) discriminators \\(D=\\{D_{S_{j}}\\}_{j=1}^{m}\\) and \\(m\\) classifiers \\(C_{S}=\\{C_{S_{j}}\\}_{j=1}^{m}\\). For each source domain \\(S_{j}\\), the specific domain discriminator \\(D_{S_{j}}:F\\rightarrow\\{0,1\\}\\) distinguishes whether the input feature comes from the source domain \\(S_{j}\\) or the target domain \\(T\\). Similarly, the \\(m\\) classifiers accept features \\(F_{s}(x)\\) or \\(F_{t}(x)\\) and output, via the softmax function, the probability that the sample belongs to each class. The discriminator and classifier of the source domain \\(S_{j}\\) are independent of the other sources.\n\nWe first pre-train the network to obtain each source-domain classifier \\(C_{S_{j}}\\) and the source feature extractor \\(F_{s}\\). 
In the pre-training phase, the classification loss of \\(C_{S_{j}}\\) can be described as follows:\n\n\\[\\min_{F_{s},C_{S}}\\mathcal{L}_{cls}\\left(C_{S},F_{s}\\right)=-\\sum_{j=1}^{m}\\mathbb{E}_{(x,y)\\sim\\left(X^{S_{j}},Y^{S_{j}}\\right)}\\left[y\\log C_{S_{j}}\\left(F_{s}\\left(x\\right)\\right)+(1-y)\\log\\left(1-C_{S_{j}}\\left(F_{s}\\left(x\\right)\\right)\\right)\\right] \\tag{1}\\]\n\nSince each source domain is trained with full supervision, the best representations of the classifier and feature extractor can be obtained. Then we use adversarial training to reduce the distance between the target-domain and source-domain distributions. Intuitively, the transfer network of a source domain can provide better performance if the source-domain distribution is closer to the target. Conversely, a source domain whose distribution is farther from the target will reduce transfer performance. We use the domain discriminators to indicate the distribution distance between the target domain and each source domain.\n\nWe fix \\(F_{s}\\) and \\(C_{S}\\) to \\(\\bar{F}_{s}\\) and \\(\\bar{C}_{S}\\) when the network converges, and then optimize the domain discriminator \\(D\\) and the target feature extractor \\(F_{t}\\); the objective is as follows:\n\n\\[\\min_{F_{t}}\\max_{D}\\mathcal{L}_{adv}\\left(D,F_{t}\\right)=\\frac{1}{m}\\sum_{j=1}^{m}\\left(\\mathbb{E}_{x\\sim X^{S_{j}}}\\left[\\log D_{S_{j}}\\left(\\bar{F}_{s}\\left(x\\right)\\right)\\right]+\\mathbb{E}_{x\\sim X^{T}}\\left[\\log\\left(1-D_{S_{j}}\\left(F_{t}\\left(x\\right)\\right)\\right)\\right]\\right) \\tag{2}\\]\n\nIt is worth noting that if the target feature extractor \\(F_{t}\\) and the source feature extractor \\(F_{s}\\) share weights (\\(F_{t}=F_{s}\\)), or if both change during adversarial training, the optimization will oscillate. 
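To make the two objectives above concrete, here is a minimal NumPy sketch (our illustration, not the paper's code) that evaluates the summed per-source classification loss of Equation (1) and the \(1/m\)-averaged adversarial objective of Equation (2) on toy arrays; the classifier and discriminator outputs are assumed to be probabilities in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def bce(p, y):
    """Binary cross-entropy of predicted probabilities p against labels y (Eq. 1 form)."""
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

m = 3  # toy setup: m source domains
cls_losses = []
for _ in range(m):
    p_src = rng.uniform(size=32)          # classifier outputs C_{S_j}(F_s(x)) on source samples
    y_src = rng.integers(0, 2, size=32)   # ground-truth labels
    cls_losses.append(bce(p_src, y_src))
pretrain_loss = np.sum(cls_losses)        # summed over the m source domains, as in Eq. (1)

# Adversarial objective (Eq. 2): each D_{S_j} scores source features high, target features low.
adv_terms = []
for _ in range(m):
    d_src = rng.uniform(size=32)  # D_{S_j}(F_s_bar(x)) on source samples
    d_tgt = rng.uniform(size=32)  # D_{S_j}(F_t(x)) on target samples
    adv_terms.append(np.mean(np.log(d_src)) + np.mean(np.log(1 - d_tgt)))
adv_loss = np.mean(adv_terms)  # the 1/m average over domains
```

In a real implementation the probabilities would come from the network heads; here they are random placeholders used only to show how the terms combine.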
To solve this problem, reference [36] used domain confusion to replace the adversarial objective. In this paper, \\(F_{t}\\) and \\(F_{s}\\) have different parameters and \\(F_{s}\\) is fixed. Therefore, we can use Equation (2) to update only \\(F_{t}\\) and \\(D\\) during adversarial training, similar to the original GAN. In the meantime, to avoid vanishing gradients at the start of training, we use \\(\\bar{F}_{s}\\) to initialize \\(F_{t}\\).\n\nSimultaneously, when the domain discriminator \\(D\\) has converged to the optimum for the current feature extractor, we use it to indicate the probability that a sample comes from the source- or target-domain distribution. It is difficult to determine which domain a sample belongs to if its score is close to 0.5, and such samples are more likely to come from a source domain whose distribution is close to the target's.\n\nFigure 1: The overview of the proposed method. (\\(F\\): feature extractor, \\(C\\): classifier, \\(D\\): domain discriminator, \\(GRL\\): gradient reversal layer, \\(L\\): Equation (7) weighted domain loss)\n\n[MISSING_PAGE_FAIL:5]\n\nand optimize it with learning rate \\(2\\times 10^{-4}\\) and momentum terms \\(\\beta_{1}=0.5\\), \\(\\beta_{2}=0.999\\).\n\nIt is worth noting that although in the experiments we used grayscale images as the standard input for training and testing to speed up computation, our method also works with colour samples. Benefiting from the convolution-based ResNet, when using RGB images as input it is only necessary to adjust the depth of the corresponding convolution kernels.\n\n'}, {'heading': 'Experiments on digit recognition', 'text': 'The Digits-five dataset is widely used in the performance evaluation of MSDA. The dataset consists of samples from five different sources, namely MNIST [47], MNIST-M [48], SVHN [49], USPS and Synthetic Digits [48]. Following [39], for MNIST, MNIST-M, SVHN, and Synthetic Digits, we sample 25,000 images for training and 9000 for testing from each dataset. 
The entire 9298 images of USPS are used as one domain.\n\nWe compared our method with four state-of-the-art domain adaptation methods: the deep adaptation network (DAN) [50], the domain-adversarial neural network (DANN) [35], the deep cocktail network (DCTN) [39], and moment matching for multi-source domain adaptation (M3SDA) [40]. For the Source Only and single-source experiments, we follow the source-combine setting in [40]: all source-domain data are combined into a single source. For a fair comparison, all the deep learning models use ResNet-50 as the backbone. We run each experiment five times and report the mean and deviation.\n\nThe results are shown in Table 1. Our proposed method achieves a 91.6% average accuracy, outperforming the other baselines by a large margin.\n\n'}, {'heading': 'Experiments on DAGM', 'text': 'The DAGM 2007 dataset covers many types of manufacturing material surfaces in industry. The samples are shown in Figure 2. It comprises 8050 training images and 8050 testing images with a size of 512 \\(\\times\\) 512 in 8-bit grayscale PNG format. There are ten classes of artificially generated surfaces with specific textures in DAGM. The dataset provides 2112 ground-truth images to identify the defect region. 
In each experiment, we set one of the classes as the target domain and the rest as source domains.\n\n\\begin{table}\n\\begin{tabular}{l c c c c} \\hline \\hline\n**Methods** & **Acc** & **mAP** & **AUC** & **F1** \\\\ \\hline Faster R-CNN (source combine) & 0.89 & 0.66 & 0.87 & 0.68 \\\\ Faster R-CNN (source only) & 0.80 & 0.43 & 0.64 & 0.56 \\\\ AnoGAN & 0.71 & 0.51 & 0.57 & 0.56 \\\\ AE (SSIM) & 0.76 & 0.54 & 0.70 & 0.60 \\\\ Our method & 0.91 & 0.78 & 0.92 & 0.82 \\\\ \\hline \\hline \\end{tabular}\n\\end{table}\nTable 2: Comparison with related work on the DAGM dataset (Accuracy, mAP, ROC AUC, F1-measure)\n\nFigure 3: Examples of segmented sample images from DAGM\n\nFigure 2: Examples of sample images from DAGM\n\nAs shown in Figure 3, we cropped each image used for training and testing into 64 patches with a size of 77 \\(\\times\\) 77 (the patches in the first row and the first column are 64 \\(\\times\\) 64) before inputting them into our network. Neighbouring patches have a 20% overlap, to avoid a patch containing only the edge of a defect. We divide the patches into abnormal (positive) and normal (negative) according to the number of abnormal pixels in the corresponding ground-truth image; the threshold is calculated as half of the smallest side length of the defects in the dataset. Finally, the pixel values of all patches are normalized to the range [\\(-\\)1, 1] to avoid excessive deviations in the calculation. This pre-processing is needed for the following reasons: (1) Industrial cameras usually have a considerably high resolution in industrial surface defect detection; cropping reduces computation and alleviates the problem of insufficient training data. (2) Image-level annotation is more efficient than pixel-level annotation in the network training stage, and in the inference stage we can locate the defect position based on the image-level predictions. 
(3) The scaling operation does not affect network performance because a CNN is used as the feature extractor.\n\nA single accuracy score is not sufficient on the unbalanced dataset in defect detection. Therefore, we adopt several comprehensive indicators, namely Accuracy, Precision, Recall, and F1-score, shown in Equations (12)-(15).\n\n\\[\\text{Accuracy }=\\frac{\\textit{TP}+\\textit{TN}}{\\textit{TP}+\\textit{TN}+\\textit{FP}+\\textit{FN}} \\tag{12}\\]\n\n\\[\\text{Precision }=\\frac{\\textit{TP}}{\\textit{TP}+\\textit{FP}} \\tag{13}\\]\n\n\\[\\text{Recall }=\\frac{\\textit{TP}}{\\textit{TP}+\\textit{FN}} \\tag{14}\\]\n\n\\[F_{1}=\\frac{2\\times\\text{Precision }\\times\\text{Recall}}{\\text{Precision }+\\text{Recall}} \\tag{15}\\]\n\nThe prediction results are divided into true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). TP denotes defective images correctly predicted as defective, TN denotes normal images correctly predicted as normal, FP denotes normal images wrongly predicted as defective, and FN denotes defective images wrongly predicted as normal. AUC is the area under the receiver operating characteristic (ROC) curve, which is used to evaluate the performance of a classification model.\n\nWe compare our proposed method with three state-of-the-art defect detection algorithms, including one supervised learning method and two unsupervised learning methods. Faster R-CNN, as a supervised method, performs excellently in the defect detection task [18, 51]. This method can perform well by fine-tuning a pre-trained cross-domain model when the target-domain samples are unbalanced; fine-tuning can be seen as a simple transfer learning method.\n\nThe experimental results on DAGM are shown in Table 2. We obtained the results by binary classification of the cropped images in the test dataset. 
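The indicators in Equations (12)-(15) can be computed directly from the binary patch predictions. A small sketch (the helper name is ours, following the convention above that defective patches are the positive class):

```python
import numpy as np

def detection_metrics(y_true, y_pred):
    # Defective patches are the positive class (label 1), normal patches label 0.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # defective, predicted defective
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # normal, predicted normal
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # normal, predicted defective
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # defective, predicted normal
    accuracy = (tp + tn) / (tp + tn + fp + fn)                      # Eq. (12)
    precision = tp / (tp + fp) if (tp + fp) else 0.0                # Eq. (13)
    recall = tp / (tp + fn) if (tp + fn) else 0.0                   # Eq. (14)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)                         # Eq. (15)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = detection_metrics(
    [1, 1, 1, 0, 0, 0, 0, 1],   # ground-truth patch labels
    [1, 1, 0, 0, 0, 1, 0, 1])   # predicted patch labels
# acc = prec = rec = f1 = 0.75 for this toy example
```

The zero-division guards matter in practice: on a defect-free test image every positive count can be zero.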
The results show that the model pre-trained with the source-combine setting performs better than the one using a single source domain: defect features generalize better when multi-source domain data are referenced. However, because the different texture features in samples from different domains interfere with the extraction of defect features, treating this interference equally does not yield accurate decision boundaries. Therefore, the method has insufficient defect detection capability under complex textures.\n\nIn the two unsupervised learning methods, GANomaly [21] and the SSIM Autoencoder [52], only the normal samples of the target domain are used for training. As shown in Table 2, the two methods have a common shortcoming: their performance varies across domains. This is because both methods depend on the quality of the reconstructed image. When the target sample has a relatively stable structural texture, the methods perform well. Once this structure is destroyed, the model cannot reconstruct the texture features from the training data. As described in the previous section, this is not applicable to detecting surface defects of industrial products.\n\nWe visualize the data distribution as two-dimensional features in Figure 4. Red points indicate defective data, and green points indicate normal data. We can see that the feature boundary between normal and defective samples is more apparent after using our method for domain adaptation.\n\nFigure 4: T-SNE visualization of the features mapped from the well-trained network, (a) source only, (b) our method (normal: green; defective: red)\n\nConclusion\n\nIn this paper, we propose a multi-source domain adaptation method for detecting surface defects of industrial products. The method uses reweighted adversarial domain adaptation: more weight is assigned to the target-related source domains in the adaptation process, which achieves better adaptation performance. 
The method effectively addresses the issue of sparse or unbalanced target data in surface defect detection, and it copes with the interference of complex textures when detecting defects. The experiments show that our proposed method has advantages over previous domain adaptation methods on the Digits-five dataset and achieves satisfactory results in defect detection, especially for complex textures. In future work, we will continue developing our approach and applying it to various material surface defect detection tasks.\n\n'}, {'heading': 'Data Availability Statement', 'text': ''}, {'heading': 'Funding Information', 'text': 'None.\n\n'}, {'heading': 'Conflict of Interest', 'text': 'The authors declare that they have no conflict of interest.\n\n'}, {'heading': 'References', 'text': '* [1] Sun X., Gu J., Tang S., Li J.: Research progress of visual inspection technology of steel products--a review. Appl. Sci. 8(11), 2195 (2018)\n* [2] Aghdam S.R., Amid E., Imani M.F.: A fast method of steel surface defect detection using decision trees applied to LBP based features. In: 2012 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), Singapore, 18-20 July 2012\n* [3] Kim C.-W., Koivo A.J.: Hierarchical classification of surface defects on dusty wood boards. Pattern Recogn. Lett. 15(7), 713-721 (1994)\n* [4] Tsanakas J.A., Chrysostomou D., Botsaris P.N., Gasteratos A.: Fault diagnosis of photovoltaic modules through image processing and Canny edge detection on field thermographic measurements. Int. J. Sustainable Energy 34(6), 351-372 (2015)\n* [5] Mak K.L., Peng P., Yiu K.F.C.: Fabric defect detection using morphological filters. Image Vision Comput. 27(10), 1585-1592 (2009)\n* [6] Heydarzadeh M., Nourani M.: A two-stage fault detection and isolation platform for industrial systems using residual evaluation. IEEE Trans. Instrum. Meas. 
65(10), 2424-2432 (2016)\n* [7] Bai X., Fang Y., Lin W., Wang L., Ju B.: Saliency-based defect detection in industrial images by using phase spectrum. IEEE Trans. Ind. Inf. 10(4), 2135-2145 (2014)\n* [8] Wang H., Zhang J., Tian Y., Chen H., Sun H., Liu K.: A simple guidance template-based defect detection method for strip steel surfaces. IEEE Trans. Ind. Inf. 15(5), 2798-2809 (2019)\n* [9] Liu W., et al.: SSD: Single shot multibox detector. arXiv:1512.02325 (2015) [Online]. Available: [https://ui.adshs.harvard.edu/abs/2015arXiv151202325L](https://ui.adshs.harvard.edu/abs/2015arXiv151202325L)\n* [10] Redmon J., Divvala S., Girshick R., Farhadi A.: You only look once: Unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27-30 June 2016\n* [11] Ren S., He K., Girshick R., Sun J.: Faster R-CNN: Towards real-time object detection with region proposal networks. In: Cortes C., Lawrence N.D., Lee D.D., Sugiyama M., Garnett R. (eds.) Advances in Neural Information Processing Systems, vol. 28, pp. 91-99. Curran Associates (2015)\n* [12] He K., Gkioxari G., Dollar P., Girshick R.: Mask R-CNN. arXiv:1703.06870 (2017) [Online]. Available: [https://ui.adshs.harvard.edu/abs/2017arXiv170306870f](https://ui.adshs.harvard.edu/abs/2017arXiv170306870f)\n* [13] Lin H., Li B., Wang X., Shu Y., Niu S.: Automated defect inspection of LED chip using deep convolutional neural network. J. Intell. Manuf. 30(6), 2525-2534 (2019)\n* [14] Hu B., Wang J.: Detection of PCB surface defects with improved Faster R-CNN and feature pyramid network. IEEE Access 8, 108335-108345 (2020)\n* [15] Lin T.-Y., Goyal P., Girshick R., He K., Dollar P.: Focal loss for dense object detection. 
In: Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22-29 October 2017\n* [16] Kervadec H., Bouchtiba J., Desrosiers C., Granger E., Dolz J., Ben Ayed I.: Boundary loss for highly unbalanced segmentation. In: International Conference on Medical Imaging with Deep Learning, Zurich, Switzerland, 6-8 July 2019\n* [17] Ke M., Lin C., Huang Q.: Anomaly detection of Lego images in the mobile phone using convolutional autoencoder. In: 2017 4th International Conference on Systems and Informatics (ICSAI), Hangzhou, China, 11-13 November 2017\n* [18] Haselmann M., Gruber D.P., Tabatabai P.: Anomaly detection using deep learning based image completion. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17-20 December 2018\n* [19] Baur C., Wiestler B., Albarqouni S., Navab N.: Deep autoencoding models for unsupervised anomaly segmentation in brain MR images. arXiv preprint arXiv:1804.04488 (2018) [Online]. Available: [http://arxiv.org/abs/1804.04488](http://arxiv.org/abs/1804.04488)\n* [20] Akcay S., Atapour-Abarghouei A., Breckon T.P.: GANomaly: Semi-supervised anomaly detection via adversarial training. arXiv:1805.06725 (2018) [Online]. Available: [https://ui.adsabs.harvard.edu/abs/2018arXiv180506725A](https://ui.adsabs.harvard.edu/abs/2018arXiv180506725A)\n* [21] Shimodaira H.: Improving predictive inference under covariate shift by weighting the log-likelihood function. J. Stat. Plann. Infer. 90(2), 227-244 (2000)\n* [22] Zadrozny B.: Learning and evaluating classifiers under sample selection bias. 
Paper presented at the Proceedings of the Twenty-First International Conference on Machine Learning, Banff, Alberta, Canada, 4 July 2004 [https://doi.org/10.1145/1015330.1015425](https://doi.org/10.1145/1015330.1015425)\n* [23] 32nd Annual Conference on IEEE Industrial Electronics, Paris, France, 6-10 November 2006\n* [24] Abdel-Qader I., Abudayyeh O., Kelly M.E.: Analysis of edge-detection techniques for crack identification in bridges. J. Comput. Civil Eng. 17(4), 255-263 (2003)\n* [25] Wu S., Wu Y., Cao D., Zheng C.: A fast button surface defect detection method based on Siamese network with imbalanced samples. Multimedia Tools Appl. 7(8), 3647-36468 (2019)\n* [26] Goodfellow I., Pouget-Abadie J., Mirza M., et al.: Generative adversarial nets. In: NIPS, Montreal, Canada, 8-13 December 2014\n* [27] He Y., Song K., Meng Q., Yan Y.: An end-to-end steel surface defect detection approach via fusing multiple hierarchical features. IEEE Trans. Instrum. Meas. 69(4), 1493-1504 (2020)\n* [28] Xiao M., Jiang M., Li G., Xie L., Yi L.: An evolutionary classifier for steel surface defects with small sample set. EURASIP J. Image Video Process. 2017(7), 48 (2017)\n* [29] Song K., Yan Y.: A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surf. Sci. 285, 858-864 (2013)\n* [30] Long M., Zhu H., Wang J., Jordan M.I.: Deep transfer learning with joint adaptation networks. Paper presented at the Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, Sydney, Australia, 6-11 August 2017 [http://proceeding.mlmlpress.v/7/009/17a.html](http://proceeding.mlmlpress.v/7/009/17a.html)\n* [31] Ghifary M., Kleijn W.B., Zhang M.: Domain adaptive neural networks for object recognition. arXiv:1409.6041 (2014) [Online]. 
Available: [https://ui.iadsabs.harvard.edu/abs/2014arXiv1409.60416](https://ui.iadsabs.harvard.edu/abs/2014arXiv1409.60416)\n* [32] Sun B., Feng J., Saenko K.: Return of frustratingly easy domain adaptation. In: Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, Arizona, 12-17 February 2016\n* [33] Zhuang F., Cheng X., Luo P., Pan S.J., He Q.: Supervised representation learning: Transfer learning with deep autoencoders. In: Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, Argentina, 25-31 July 2015\n* [34] Ben-David S., Blitzer J., Crammer K., Kulesza A., Pereira F., Vaughan J.W.: A theory of learning from different domains. Mach. Learn. 79(1), 151-175 (2010)\n* [35] Ganin Y., Lempitsky V.: Unsupervised domain adaptation by backpropagation. In: Proceedings of the 32nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Lille, France, 6-11 July 2015 [Online]. Available: [http://proceedings.mlr.press/v37/gnanin15.html](http://proceedings.mlr.press/v37/gnanin15.html)\n* [36] Tzeng E., Hoffman J., Saenko K., Darrell T.: Adversarial discriminative domain adaptation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21-26 July 2017\n* [37] Bousmalis K., Silberman N., Dohan D., Erhan D., Krishnan D.: Unsupervised pixel-level domain adaptation with generative adversarial networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21-26 July 2017\n* [38] Sankaranarayanan S., Balaji Y., Castillo C.D., Chellappa R.: Generate to adapt: Aligning domains using generative adversarial networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-23 June 2018\n* [39] Xu R., Chen Z., Zuo W., Yan J., Lin L.: Deep cocktail network: Multi-source unsupervised domain adaptation with category shift. 
In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-23 June 2018\n* [40] Peng X., Bai Q., Xia X., Huang Z., Saenko K., Wang B.: Moment matching for multi-source domain adaptation. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), 27 October-2 November 2019\n* [41] Zhang J., Ding Z., Li W., Ogunbona P.: Importance weighted adversarial nets for partial domain adaptation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-23 June 2018\n* [42] Zhang W., Xu D., Ouyang W., Li W.: Self-paced collaborative and adversarial network for unsupervised domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 43(6), 2047-2061 (2021)\n* [43] ECCV 2020. Cham, Springer International Publishing, 608-624 (2020)\n* [44] Jager M., Knoll C., Hamprecht F.A.: Weakly supervised learning of a classifier for unusual event detection. IEEE Trans. Image Process. 17(9), 1700-1708 (2008)\n* [45] He K., Zhang X., Ren S., Sun J.: Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27-30 June 2016\n* [46] Kingma D.P., Ba J.: Adam: A method for stochastic optimization. arXiv:1412.6980 (2017)\n* [47] LeCun Y., Bottou L., Bengio Y., Haffner P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278-2324 (1998)\n* [48] Ganin Y., et al.: Domain-adversarial training of neural networks. In: Csurka G. (ed.) Domain Adaptation in Computer Vision Applications, Advances in Computer Vision and Pattern Recognition. Cham, Springer (2017)\n* [49] Netzer Y., Wang T., Coates A., Bissacco A., Wu B., Ng A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, 16-17 December 2011\n* [50] Long M., Cao Y., Wang J., Jordan M.I.: Learning transferable features with deep adaptation networks. 
In: Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6-11 July 2015\n* [51] Sun X., Gu J., Huang R., Zou R., Giron Palomares B.: Surface defects recognition of wheel hub based on improved Faster R-CNN. Electronics 8(5), 481 (2019)\n* [52] Bergmann P., Löwe S., Fauser M., Sattlegger D., Steger C.: Improving unsupervised defect segmentation by applying structural similarity to autoencoders. In: Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Prague, Czech Republic, 25-27 February 2019\n\nHow to cite this article: Hu, B., Wang, J.: A weighted multi-source domain adaptation approach for surface defect detection. IET Image Process. 16, 2210-2218 (2022). [https://doi.org/10.1049/ipr2.12484](https://doi.org/10.1049/ipr2.12484)\n\n'}, {'heading': 'Appendix A A1 | Ablation study', 'text': 'Compared with the original MSDA method using adversarial training, our proposed approach contains two improvements. One is that we utilize different weights \\(\\pi_{S_{j}}\\) for the different source domains in multi-source domain adversarial training. The other is the use of pseudo-labels to update the target feature extractor. We designed a set of ablation experiments on DAGM to verify the importance of each part. In the first experiment, we set the same \\(\\pi_{S_{j}}\\) for all source domains; the target classifier was likewise composed of an even combination of the source classifiers. Table 3 shows that the model breaks down when the target domain is aligned to every source domain equally. In fact, during the training process the network hardly converges, which confirms that negative transfer reduces model performance when the target domain is aligned to an unrelated source domain. 
Moreover, when the target data is shifted, applying pseudo-labels to constrain the feature extractor can efficiently avoid confusion in the classification system and thus improve model performance.\n\n'}, {'heading': 'A2 | Adversarial learning', 'text': 'In this paper, we choose the GRL to solve the minimax game between \\(F_{t}\\) and \\(D\\). It works by inserting a gradient reversal layer (GRL) that multiplies the gradient of \\(F_{t}\\) by \\(-1\\), so that \\(F_{t}\\) and \\(D\\) are learned simultaneously. The algorithm flow is summarized in Algorithm 1. An alternative solution is to train the two objectives iteratively; for our method, this adversarial learning algorithm is given in Algorithm 2.\n\n'}]}
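The GRL mechanism described above can be sketched in a few lines of NumPy (our illustration of the technique from [35], not the paper's code; deep learning frameworks implement it as a custom autograd function):

```python
import numpy as np

class GradReverse:
    """Gradient reversal layer (GRL) sketch: identity in the forward pass,
    gradients multiplied by -lambda in the backward pass, so a gradient step
    that decreases the discriminator loss w.r.t. D simultaneously pushes the
    feature extractor to increase it (the minimax game)."""
    def __init__(self, lam=1.0):
        self.lam = lam
    def forward(self, x):
        return x                        # features pass through unchanged
    def backward(self, grad_output):
        return -self.lam * grad_output  # sign flip on the way back to the extractor

grl = GradReverse(lam=1.0)
features = np.array([0.2, -0.5, 1.3])
out = grl.forward(features)                               # identical to the input
grad_to_extractor = grl.backward(np.array([0.1, 0.3, -0.2]))  # negated gradient
```

Because the reversal happens only in the backward pass, both the discriminator and the feature extractor can be updated in a single optimization step instead of alternating the two objectives.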