Structural Health Monitoring (SHM) is crucial for ensuring the safety and reliability of civil infrastructures, particularly bridges and viaducts. In this paper, we introduce a novel approach using Transformer neural networks with a Masked Auto-Encoder architecture as Foundation Models for SHM. These models demonstrate the ability to learn generalizable representations from multiple large datasets through self-supervised pre-training, outperforming traditional methods on tasks such as Anomaly Detection (AD) and Traffic Load Estimation (TLE).
We delve into the background of Transformer Neural Networks, highlighting their significance in NLP applications and deep learning. The Transformer architecture's innovation lies in self-attention mechanisms that capture long-range dependencies efficiently. We adapt an architecture derived from the Vision Transformer (ViT) for processing vibration data by breaking it into fixed-size patches and applying attention-based Transformers.
Masked Autoencoders play a vital role in our approach, serving as scalable self-supervised learners for Computer Vision applications. By reconstructing image patches with missing information during training, these autoencoders encourage robust visual feature representation learning. Our specific masked autoencoder architecture is detailed to showcase its effectiveness in downstream tasks.
Our Foundation Models achieve state-of-the-art performance on AD and TLE tasks across three operational viaducts. For AD, we achieve near-perfect accuracy within a short monitoring time span compared to traditional methods like Principal Component Analysis (PCA). On TLE tasks, our models outperform existing approaches significantly on evaluation metrics like R$^2$ score, MAE%, and MSE%.
The paper is structured to provide necessary background information in Section 2 before delving into SHM literature overview in Section 3. We describe the viaducts considered for data collection and labeling processes in Section 4. Our foundation model approach is detailed in Section 5, covering the processing pipeline, model architecture, and training procedure. Experimental results are presented in Section 6 while conclusions are drawn in Section 7.
Overall, our work showcases a promising direction for ML research in SHM by leveraging transformer-based masked autoencoders for effective anomaly detection and traffic load estimation. Future research avenues may include exploring larger pre-training datasets and optimizing model hyperparameters for enhanced performance. The code and pre-trained models are open-sourced for further exploration at https://github.com/eml-eda/tle-supervised.
- Structural Health Monitoring (SHM) is crucial for ensuring safety and reliability of civil infrastructures, especially bridges and viaducts.
- Transformer neural networks with a Masked Auto-Encoder architecture are used as Foundation Models for SHM, outperforming traditional methods in tasks like Anomaly Detection (AD) and Traffic Load Estimation (TLE).
- The innovation of Transformer architecture lies in self-attention mechanisms that efficiently capture long-range dependencies.
- Masked Autoencoders serve as scalable self-supervised learners for Computer Vision applications, encouraging robust visual feature representation learning.
- The Foundation Models achieve state-of-the-art performance on AD and TLE tasks across three operational viaducts, showing near-perfect accuracy on AD and significant improvement on evaluation metrics like R$^2$ score, MAE%, and MSE% for TLE tasks.
Summary
1. Checking if bridges and viaducts are safe is very important.
2. New computer models called Transformer neural networks help with this by finding problems and estimating traffic.
3. Transformers work well because they can understand far-away connections easily.
4. Another type of model, Masked Autoencoders, helps computers learn to see better.
5. These models are really good at finding issues in bridges and viaducts and estimating traffic accurately.
Definitions
- Structural Health Monitoring (SHM): Continuously checking that structures like bridges are safe.
- Transformer neural networks: Computer models that use attention to connect far-apart parts of the data.
- Masked Auto-Encoder: A type of model that learns patterns without being told the answers, by filling in hidden parts of its input.
- Anomaly Detection (AD): Finding things that are not normal or expected.
- Traffic Load Estimation (TLE): Estimating how much traffic is crossing a bridge or viaduct.
Introduction
Structural Health Monitoring (SHM) is a critical aspect of ensuring the safety and reliability of civil infrastructures, particularly bridges and viaducts. The ability to detect anomalies and accurately estimate traffic load can help prevent potential disasters and ensure the longevity of these structures. In recent years, there has been a growing interest in using machine learning (ML) techniques for SHM due to their ability to handle large datasets and learn complex patterns.
In this paper, we introduce a novel approach that utilizes Transformer neural networks with a Masked Auto-Encoder architecture as Foundation Models for SHM. These models demonstrate superior performance compared to traditional methods on tasks such as Anomaly Detection (AD) and Traffic Load Estimation (TLE). Our work showcases the potential of leveraging deep learning techniques for effective SHM.
Background
Transformer neural networks have gained significant attention in natural language processing (NLP) applications due to their ability to capture long-range dependencies efficiently through self-attention mechanisms. This innovation has also shown promising results in computer vision tasks, leading to the development of Vision Transformers (ViT). ViTs break images into fixed-size patches and apply attention-based Transformers, making them suitable for processing vibration data from civil infrastructures.
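To make the self-attention idea concrete, here is a minimal single-head sketch in NumPy. It is only an illustration of generic scaled dot-product attention, not the implementation used in this work: the token count, dimensions, and random projection matrices are all assumptions chosen for the example, and in a real model the projections are learned.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x:             (num_tokens, d_model) sequence of patch embeddings
    w_q, w_k, w_v: (d_model, d_head) projections (random here, learned in practice)
    Returns the attended tokens and the attention weight matrix.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every token is scored against every other token, so distant
    # positions interact in a single step.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
num_tokens, d_model, d_head = 6, 8, 4
x = rng.normal(size=(num_tokens, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Each row of `weights` sums to one and says how much one patch attends to every other patch, regardless of how far apart they are in the sequence.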
Masked Autoencoders are another crucial component of our approach. They serve as scalable self-supervised learners for Computer Vision applications by reconstructing image patches with missing information during training. This encourages robust visual feature representation learning, making them ideal for SHM tasks.
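The masking step can be sketched as follows. This is a hedged illustration of random patch masking in general: the mask ratio of 0.75 is the value commonly associated with the original Masked Autoencoder work in computer vision, and is an assumption here rather than a setting confirmed by this text. The patch count and dimension are likewise hypothetical.

```python
import numpy as np

def random_mask(patches, mask_ratio, rng):
    """Randomly split patches into a visible subset and a masked subset.

    patches: (num_patches, patch_dim)
    Returns the visible patches plus the sorted kept and masked indices.
    """
    num_patches = patches.shape[0]
    num_masked = int(round(mask_ratio * num_patches))
    perm = rng.permutation(num_patches)
    masked_idx = np.sort(perm[:num_masked])
    keep_idx = np.sort(perm[num_masked:])
    return patches[keep_idx], keep_idx, masked_idx

rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 32))  # hypothetical: 16 patches of 32 values
visible, keep_idx, masked_idx = random_mask(patches, mask_ratio=0.75, rng=rng)
print(visible.shape, len(masked_idx))
```

Only the `visible` patches would be shown to the encoder; the model is then asked to reconstruct the patches at `masked_idx`.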
Literature Overview
The use of ML techniques in SHM is not new; however, most existing approaches rely on hand-crafted features or require extensive manual labeling of data. Recent studies have explored deep learning methods for SHM but often face challenges such as limited data availability or a lack of generalizability across different structures.
Our approach addresses these limitations by leveraging Transformer-based masked autoencoders for SHM. This combination allows our models to learn generalizable representations from multiple large datasets through self-supervised pre-training, making them more effective in handling real-world data.
Data Collection and Labeling
To evaluate the performance of our approach, we collected vibration data from three operational viaducts. The data was labeled by experts based on the presence or absence of anomalies and the corresponding traffic load at the time of measurement. This process ensured that our models were trained on high-quality and accurately labeled data.
Foundation Model Approach
Our foundation model approach consists of a processing pipeline, model architecture, and training procedure.
Processing Pipeline
The first step in our pipeline is breaking the vibration data into fixed-size patches. These patches are then fed into a Masked Autoencoder for feature representation learning. The autoencoder is built from attention-based Transformer layers, which capture long-range dependencies across patches effectively.
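As a rough sketch of the patching step, the snippet below cuts a one-dimensional signal into non-overlapping fixed-size patches. The text does not specify the input representation, patch size, or sampling rate, so the single-channel signal, `patch_size=64`, and `fs=100` are all assumptions made for illustration.

```python
import numpy as np

def patchify(signal, patch_size):
    """Cut a 1-D signal into non-overlapping patches of length patch_size.

    Trailing samples that do not fill a whole patch are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    num_patches = signal.shape[0] // patch_size
    return signal[: num_patches * patch_size].reshape(num_patches, patch_size)

fs = 100                                  # hypothetical sampling rate in Hz
t = np.arange(1000) / fs                  # 10 s of samples
signal = np.sin(2 * np.pi * 3.0 * t)      # synthetic stand-in for a vibration trace
patches = patchify(signal, patch_size=64)
print(patches.shape)                      # (15, 64): 960 samples used, 40 dropped
```

Each row of `patches` then plays the role of one input token for the Transformer.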
Model Architecture
Our specific masked autoencoder architecture is designed to handle image patches with missing information during training effectively. It consists of an encoder network that maps input images to latent representations and a decoder network that reconstructs images from these representations.
The Transformer network used in our approach is adapted from Vision Transformers (ViT) and consists of multiple layers with self-attention mechanisms for capturing long-range dependencies efficiently.
Training Procedure
Our models are trained using self-supervised learning techniques on large unlabeled datasets before fine-tuning on labeled SHM datasets. This pre-training step helps in learning generalizable representations that can be applied to different structures without extensive manual labeling efforts.
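A self-supervised pre-training objective of this kind can be illustrated with a reconstruction loss. The sketch below computes the mean squared error only on masked patches, which is the recipe of the original Masked Autoencoder work; whether the models described here use exactly this loss is not stated in the text, so treat it as an assumption. The arrays are synthetic.

```python
import numpy as np

def masked_mse(target, prediction, masked_idx):
    """Mean squared reconstruction error restricted to the masked patches."""
    diff = prediction[masked_idx] - target[masked_idx]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
target = rng.normal(size=(8, 16))     # hypothetical: 8 patches of 16 samples
prediction = target.copy()
masked_idx = np.array([1, 4, 6])
prediction[masked_idx] += 0.5         # reconstruction error on the masked patches
prediction[0] += 10.0                 # error on a visible patch, ignored by the loss
loss = masked_mse(target, prediction, masked_idx)
print(loss)
```

Because no labels appear anywhere in this objective, it can be optimized on large unlabeled recordings before the model is fine-tuned on a smaller labeled set.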
Experimental Results
We evaluated our Foundation Models' performance on AD and TLE tasks across three operational viaducts. On AD, our models achieved near-perfect accuracy within a short monitoring time span compared to traditional methods like Principal Component Analysis (PCA). On TLE tasks, our models outperformed existing approaches significantly on evaluation metrics like R$^2$ score, MAE%, and MSE%.
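For readers unfamiliar with the regression metrics, the following sketch computes R$^2$, MAE, and MSE from their standard definitions. The text does not say how the percentage variants (MAE%, MSE%) are normalized, so they are deliberately left out, and the sample values are made up purely for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Standard R^2, mean absolute error, and mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = float(np.mean(residual ** 2))
    mae = float(np.mean(np.abs(residual)))
    ss_res = float(np.sum(residual ** 2))                   # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse}

# Made-up traffic load values and estimates, not results from the paper.
y_true = [10, 12, 8, 15, 11]
y_pred = [11, 12, 7, 14, 11]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

An R$^2$ close to 1 means the estimates explain most of the variation in the true traffic load, while MAE and MSE measure the average size of the errors.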
Conclusion
Our work showcases the potential of leveraging Transformer-based masked autoencoders for effective SHM. Our Foundation Models demonstrate superior performance on AD and TLE tasks, making them a promising direction for ML research in SHM. Future research can explore larger pre-training datasets and optimize model hyperparameters for enhanced performance. The code and pre-trained models are open-sourced for further exploration at https://github.com/eml-eda/tle-supervised.
More broadly, our approach highlights the value of incorporating deep learning techniques in SHM to improve anomaly detection and traffic load estimation accuracy. With the increasing availability of data from civil infrastructures, we believe that our approach has significant potential in enhancing the safety and reliability of these structures.