Abstract
Machine learning as a tool for automation has grown rapidly in the past two decades. Growth has come from innovations in hardware, such as powerful graphics processing units (GPUs) and cloud computing. Along with hardware advances, there has been an explosion in software packages, algorithms, and tools for performing machine learning. This innovation has produced a landscape full of vendor offerings with bold claims about accuracy and precision but little subject matter expertise. The corrosion industry is no exception and faces unique risks in evaluating machine learning tools. Unlike many consumer-grade tools, the costs of corrosion detection and estimation errors are unbalanced: a false negative (the algorithm reports no corrosion when corrosion is in fact present) is far more costly than in consumer applications. In addition, corrosion exhibits many mechanisms and morphologies, so operators must be cautious and understand the limitations of any one-size-fits-all machine learning tool. For any algorithm, the output is only as good as the data used to train it, so operators must be aware of possible sources of error and bias in training datasets. This work examines these constraints and offers a checklist for algorithm developers to follow when assessing corrosion with machine learning.