Abstract
Microbiologically influenced corrosion (MIC) is a significant source of pitting corrosion affecting oil and gas pipelines, wells, and a variety of surface facilities. Understanding of MIC is greatly enhanced by DNA and protein sequencing technologies. This paper highlights the need to understand the methods used to generate the data, the quality of the data, and the limitations of data interpretation. It does so through a case study involving metagenomic and proteomic analysis of pig envelope debris and seawater samples from various locations within a seawater injection system suspected of suffering from MIC. In this study, sequencing was performed both with and without 16S rDNA gene amplification. Following bioinformatic analysis, the 16S sequence data and the shotgun-based sequence data gave dramatically different results. We also showed that the choice between a RefSeq (NCBI) database downloaded in 2013 and an updated version (2015) significantly affected data interpretation. One particular organism, Sedimenticola selenatireducens, dominated the samples by relative abundance when the updated database was used, but was not identified at all when the 2013 database was used. Further, proteomic information was used to confirm the presence and abundance of particular organisms and expressed genes.