Modern drug discovery is a time-consuming and complex process, often hindered by the large volume of molecular data and the complexity of molecular properties. Recent advances in machine learning have shown promise in automating drug discovery by predicting molecular properties through virtual screening. While graph neural networks and recurrent neural networks have demonstrated high accuracy in this domain, they are compute- and memory-intensive because of operations such as feature embeddings and deep convolutions. In this paper, the authors propose MoleHD, an alternative to neural network classifiers that is based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction. The method first transforms the SMILES representation of each molecule into a feature vector using SMILES-PE tokenizers pretrained on the ChEMBL database. HDC encoders then project these features into high-dimensional vectors that are used for training and inference. The authors evaluate MoleHD on 30 classification tasks from three widely used molecule datasets and compare it with 10 baseline methods, including six state-of-the-art neural network classifiers. The results show that MoleHD outperforms all baseline methods on average across the 30 classification tasks while significantly reducing computing costs. Notably, the paper is presented as the first HDC-based method for drug discovery, and its results could open a new path in drug discovery research. This work is currently under review for the NeurIPS 2021 conference.
Traditional machine learning algorithms such as random forest, support vector machine (SVM), k-nearest neighbors (KNN), and gradient boosting were the first to be applied in drug discovery, but these models often overlook deep and complex structural information, which leads to subpar performance in predicting molecular properties. More sophisticated deep learning techniques are therefore increasingly used to predict molecular properties from vast amounts of data, because they can exploit the complex structural information within molecules. Overall, this paper presents an HDC-based approach to drug discovery that encodes molecules as high-dimensional vectors, reducing computing costs while outperforming existing neural network classifiers on average, and thereby opening new avenues for more efficient prediction of molecular properties.
- Modern drug discovery is hindered by large volumes of molecular data and complicated molecular properties.
- Recent advancements in machine learning algorithms show promise in automating drug discovery through virtual screening.
- Graph neural networks and recurrent neural networks are accurate but computationally and memory-intensive.
- The authors propose MoleHD, an alternative approach based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction.
- MoleHD transforms the SMILES representation of molecules into feature vectors using SMILES-PE tokenizers pretrained on the ChEMBL database.
- HDC encoders project these features into high-dimensional vectors for training and inference.
- MoleHD outperforms all 10 baseline methods on average across 30 classification tasks while reducing computing costs.
- This paper presents the first HDC-based method for drug discovery, potentially paving the way for new research paths.
- The work is currently under review for the NeurIPS 2021 conference.
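The tokenization step in the list above can be illustrated with a minimal sketch. It is only a stand-in: the pattern `SMILES_TOKEN` and the helpers `tokenize`, `build_vocab`, and `to_feature_vector` are hypothetical names, and the atom-level regular expression is an assumed simplification. The pretrained SMILES-PE tokenizer described in the summary instead learns multi-atom substructure tokens from ChEMBL, which this sketch does not reproduce.

```python
import re

# Simplified atom-level SMILES pattern (an assumption for illustration; it is
# NOT the pretrained SMILES-PE tokenizer, which learns substructure tokens).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"           # bracket atoms such as [NH4+] or [C@@H]
    r"|Br|Cl"               # two-letter organic-subset atoms
    r"|%\d{2}"              # two-digit ring-closure labels
    r"|[A-Za-z]"            # single-letter atoms (aromatic ones are lowercase)
    r"|\d"                  # single-digit ring-closure labels
    r"|[=#\-+\\/().:~*$@]"  # bonds, branches, and other symbols
)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Guard against silently dropping characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


def build_vocab(token_lists):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_feature_vector(smiles, vocab):
    """Represent a molecule as the sequence of its token ids."""
    return [vocab[tok] for tok in tokenize(smiles)]


molecules = ["CCO", "c1ccccc1", "CC(=O)Cl"]  # ethanol, benzene, acetyl chloride
vocab = build_vocab(tokenize(s) for s in molecules)
features = [to_feature_vector(s, vocab) for s in molecules]
for smiles, feat in zip(molecules, features):
    print(smiles, feat)
```

The integer ids here play the role of the feature vectors that a downstream HDC encoder would consume; a learned tokenizer would typically yield fewer, more chemically meaningful tokens per molecule.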
Key points
1. Drug discovery is difficult because there are large amounts of data and complicated properties to consider.
2. New machine learning algorithms can help with drug discovery by using virtual screening.
3. Some accurate algorithms are slow and use a lot of memory.
4. MoleHD is a different approach that uses brain-inspired computing to predict molecular properties.
5. MoleHD transforms molecules into very long (high-dimensional) vectors that it uses for training and testing.
Definitions
- Drug discovery: The process of finding new medicines or treatments for diseases.
- Machine learning algorithms: Computer programs that can learn from data and make predictions or decisions.
- Virtual screening: Using computational methods to evaluate large libraries of candidate molecules and pick out the ones most likely to have the desired properties before any laboratory testing.
- Molecular properties: Characteristics or qualities of molecules, such as their size, shape, or chemical makeup.
- Hyperdimensional computing (HDC): A brain-inspired style of computing that represents data as very high-dimensional vectors (hypervectors) and processes them with simple element-wise operations.
- Feature vectors: Special representations of data used in machine learning that capture important characteristics or features of the data.
- Computing costs: The amount of time, memory, and resources needed to perform calculations on a computer system.
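To make the HDC definition above more concrete, the following sketch shows two properties that HDC methods generally rely on: independently drawn random hypervectors are nearly orthogonal, and a bundle (element-wise sum) stays similar to each vector it contains. The dimensionality `D = 10_000`, the bipolar (+1/-1) representation, and the helper names are assumptions chosen for illustration, not details taken from the paper.

```python
import math
import random

D = 10_000  # hypervector dimensionality (an assumed, commonly used value)
rng = random.Random(0)


def random_hv():
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bundle(*hvs):
    """Bundle hypervectors by element-wise addition."""
    return [sum(vals) for vals in zip(*hvs)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a, b, c, unrelated = random_hv(), random_hv(), random_hv(), random_hv()
bundled = bundle(a, b, c)

print("two random hypervectors:", round(cosine(a, b), 3))
print("bundle vs. a member:    ", round(cosine(bundled, a), 3))
print("bundle vs. a non-member:", round(cosine(bundled, unrelated), 3))
```

With three bundled vectors, the similarity to each member is close to 1/sqrt(3), while the similarity to an unrelated vector stays near zero. This gap is what lets a single summed vector act as a memory of its constituents.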
Exploring the Potential of Hyperdimensional Computing in Drug Discovery Research
Drug discovery is a complex and time-consuming process, often hindered by the large volume of molecular data and the complexity of molecular properties. To help automate it, machine learning algorithms have been used to predict molecular properties through virtual screening. While graph neural networks and recurrent neural networks have demonstrated high accuracy in this domain, they are compute- and memory-intensive because of operations such as feature embeddings and deep convolutions. To address these issues, the authors propose MoleHD, an alternative to neural network classifiers that predicts molecular properties using brain-inspired hyperdimensional computing (HDC).
Background on Machine Learning Algorithms for Drug Discovery
Traditional machine learning algorithms such as random forest, support vector machine (SVM), k-nearest neighbors (KNN), and gradient boosting have long been applied in drug discovery. However, these models often overlook deep and complex structural information, which leads to subpar performance in predicting molecular properties. More sophisticated deep learning techniques are therefore increasingly used to predict molecular properties from vast amounts of data, because they can exploit the complex structural information within molecules.
Introducing MoleHD: A Novel Approach for Molecular Property Prediction
MoleHD applies HDC to molecular property prediction. It first transforms the SMILES representation of each molecule into a feature vector using SMILES-PE tokenizers pretrained on the ChEMBL database. HDC encoders then project these features into high-dimensional vectors that are used for training and inference. The authors evaluated MoleHD on 30 classification tasks from three widely used molecule datasets, comparing it with 10 baseline methods, including six state-of-the-art neural network classifiers. The results show that MoleHD outperforms all baseline methods on average across the 30 classification tasks while significantly reducing computing costs, which suggests a new path for drug discovery research. This work is currently under review for the NeurIPS 2021 conference.
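The encode, train, and infer flow described above can be sketched in a few lines. This is a generic HDC classifier under stated assumptions, not the authors' implementation: tokens are single characters rather than SMILES-PE tokens, position is encoded by cyclic rotation, training is a single pass that sums encodings per class, and inference picks the class with the highest cosine similarity. All names (`encode`, `train`, `predict`, `item_memory`) are hypothetical, and the four training molecules with their aromatic/aliphatic labels are a toy task, not one of the paper's benchmarks.

```python
import math
import random

D = 10_000  # hypervector dimensionality (assumed value)
rng = random.Random(42)


def random_hv():
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def permute(hv, shift):
    """Cyclically rotate a hypervector; used here to mark token position."""
    shift %= len(hv)
    return hv[-shift:] + hv[:-shift]


def encode(tokens, item_memory):
    """Encode a token sequence as the sum of position-rotated token vectors."""
    acc = [0] * D
    for pos, tok in enumerate(tokens):
        if tok not in item_memory:  # unseen tokens get a fresh random vector
            item_memory[tok] = random_hv()
        for i, v in enumerate(permute(item_memory[tok], pos)):
            acc[i] += v
    return acc


def train(samples, item_memory):
    """Single-pass training: add each encoding to its class prototype."""
    classes = {}
    for tokens, label in samples:
        proto = classes.setdefault(label, [0] * D)
        for i, v in enumerate(encode(tokens, item_memory)):
            proto[i] += v
    return classes


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def predict(tokens, classes, item_memory):
    """Return the label whose prototype is most similar to the query."""
    query = encode(tokens, item_memory)
    return max(classes, key=lambda label: cosine(query, classes[label]))


# Toy task: aromatic vs. aliphatic molecules, tokenized character by character.
training = [
    (list("CCO"), "aliphatic"),       # ethanol
    (list("CCCO"), "aliphatic"),      # propanol
    (list("c1ccccc1"), "aromatic"),   # benzene
    (list("c1ccccc1C"), "aromatic"),  # toluene
]
item_memory = {}
classes = train(training, item_memory)

print(predict(list("CCCCO"), classes, item_memory))      # butanol
print(predict(list("c1ccccc1O"), classes, item_memory))  # phenol
```

Training and inference here involve only additions, rotations, and dot products, with no gradient computation, which gives a rough sense of where the reduced computing cost of an HDC approach can come from.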
Conclusion
This paper presents an approach to drug discovery based on brain-inspired hyperdimensional computing. By encoding molecules as high-dimensional vectors, it reduces computing costs while outperforming existing neural network classifiers on average, opening new avenues for more efficient and effective prediction of molecular properties in drug discovery research.