MoleHD: Automated Drug Discovery using Brain-Inspired Hyperdimensional Computing

AI-generated keywords: Drug Discovery, Machine Learning, Neural Network Classifiers, Hyperdimensional Computing, Molecular Properties

AI-generated Key Points

Modern drug discovery is hindered by large volumes of molecular data and complicated molecular properties.
Recent advancements in machine learning algorithms show promise in automating drug discovery through virtual screening.
Graph neural networks and recurrent neural networks are accurate but computationally and memory-intensive.
The authors propose MoleHD, an alternative approach based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction.
MoleHD transforms the SMILES representation of molecules into feature vectors using SMILE-PE tokenizers pretrained on the ChEMBL database.
HDC encoders project these features into high-dimensional vectors for training and inference.
MoleHD outperforms all 10 baseline methods on average across 30 classification tasks while reducing computing costs.
To the authors' knowledge, this paper presents the first HDC-based method for drug discovery, potentially paving the way for new research paths.
The work was under review for the NeurIPS 2021 conference at the time of arXiv submission.

Authors: Dongning Ma, Xun Jiao

arXiv: 2106.02894v1 (cs.NE)

Under review for NeurIPS 2021

License: CC BY-NC-SA 4.0

Abstract: Modern drug discovery is often time-consuming, complex and cost-ineffective due to the large volume of molecular data and complicated molecular properties. Recently, machine learning algorithms have shown promising results in virtual screening of automated drug discovery by predicting molecular properties. While emerging learning methods such as graph neural networks and recurrent neural networks exhibit high accuracy, they are also notoriously computation-intensive and memory-intensive with operations such as feature embeddings or deep convolutions. In this paper, we propose a viable alternative to neural network classifiers. We present MoleHD, a method based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction. We first transform the SMILES presentation of molecules into feature vectors by SMILE-PE tokenizers pretrained on the ChEMBL database. Then, we develop HDC encoders to project such features into high-dimensional vectors that are used for training and inference. We perform an extensive evaluation using 30 classification tasks from 3 widely-used molecule datasets and compare MoleHD with 10 baseline methods including 6 SOTA neural network classifiers. Results show that MoleHD is able to outperform all the baseline methods on average across 30 classification tasks with significantly reduced computing cost. To the best of our knowledge, we develop the first HDC-based method for drug discovery. The promising results presented in this paper can potentially lead to a novel path in drug discovery research.

Submitted to arXiv on 05 Jun. 2021

Results of the summarizing process for the arXiv paper: 2106.02894v1

Comprehensive Summary

Modern drug discovery is a time-consuming and complex process, often hindered by the large volume of molecular data and complicated molecular properties. Recent advances in machine learning have shown promise in automating drug discovery by predicting molecular properties through virtual screening. While graph neural networks and recurrent neural networks have demonstrated high accuracy in this domain, they are computation- and memory-intensive because of operations such as feature embeddings and deep convolutions.

In this paper, the authors propose MoleHD, an alternative to neural network classifiers that is based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction. The method first transforms the SMILES representation of molecules into feature vectors using SMILE-PE tokenizers pretrained on the ChEMBL database. HDC encoders then project these features into high-dimensional vectors that are used for training and inference.

To evaluate MoleHD, the authors use 30 classification tasks from three widely used molecule datasets and compare it with 10 baseline methods, including six state-of-the-art neural network classifiers. The results show that MoleHD outperforms all baseline methods on average across the 30 classification tasks at significantly reduced computing cost. To the authors' knowledge, this is the first HDC-based method for drug discovery, and the results could open a novel path in drug discovery research. The work was under review for the NeurIPS 2021 conference at the time of arXiv submission.
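The first step described above, turning a SMILES string into tokens that can be mapped to features, can be illustrated with a minimal sketch. It is only a stand-in: the paper uses a tokenizer pretrained on ChEMBL, whereas the regular expression below is a simplified, hand-written atom-level splitter, and the names `SMILES_TOKEN`, `tokenize`, and `to_ids` are hypothetical.

```python
import re
from typing import Dict, List

# Simplified atom-level pattern (an assumption, not the pretrained tokenizer):
# bracket atoms such as [NH4+], the two-letter halogens Br and Cl,
# two-digit ring closures such as %12, then any other single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into atom-level tokens (no learned merges)."""
    return SMILES_TOKEN.findall(smiles)


def to_ids(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map tokens to integer ids, growing the vocabulary as new tokens appear."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


vocab: Dict[str, int] = {}
tokens = tokenize("CC(=O)Cl")   # acetyl chloride
print(tokens)                    # ['C', 'C', '(', '=', 'O', ')', 'Cl']
print(to_ids(tokens, vocab))     # [0, 0, 1, 2, 3, 4, 5]
```

A pretrained tokenizer would typically produce longer, data-driven substrings instead of single atoms, but the output has the same shape: a sequence of discrete tokens that a later encoder can consume.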
Traditional machine learning algorithms such as random forest, support vector machine (SVM), k-nearest neighbors (KNN), and gradient boosting were the first to be applied in drug discovery. These models often overlook deep and complex structural information in molecules, which leads to subpar performance in predicting molecular properties, so more sophisticated deep learning techniques are increasingly used to exploit that structural information when learning from vast amounts of data.

Overall, this paper presents an approach to drug discovery based on brain-inspired hyperdimensional computing, which operates on high-dimensional vectors and reduces computing costs while outperforming, on average, the baseline methods it is compared against, including neural network classifiers. This opens up new avenues for more efficient and effective prediction of molecular properties in drug discovery research.

Layman's Summary

Key points
1. Drug discovery is difficult because there is a lot of data and many complicated properties to consider.
2. New machine learning algorithms can help with drug discovery through virtual screening.
3. Some accurate algorithms are slow and use a lot of memory.
4. MoleHD is a different approach that uses brain-inspired computing to predict molecular properties.
5. MoleHD transforms molecules into special vectors for training and testing.

Definitions
- Drug discovery: The process of finding new medicines or treatments for diseases.
- Machine learning algorithms: Computer programs that can learn from data and make predictions or decisions.
- Virtual screening: Using computer models to predict which candidate molecules are likely to work before making or testing them in the lab.
- Molecular properties: Characteristics or qualities of molecules, such as their size, shape, or chemical makeup.
- Hyperdimensional computing (HDC): A type of computing inspired by the human brain that represents and processes data as very long (high-dimensional) vectors.
- Feature vectors: Representations of data used in machine learning that capture its important characteristics or features.
- Computing costs: The amount of time, memory, and resources needed to perform calculations on a computer system.
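To make the definition of hyperdimensional computing more concrete, here is a small sketch of two properties such systems typically rely on: random high-dimensional vectors are almost orthogonal to each other, and adding ("bundling") several of them yields a vector that stays similar to each ingredient. The dimensionality of 10,000 and the bipolar {-1, +1} entries are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math
import random

D = 10_000                # hypervector dimensionality (an assumed, typical value)
rng = random.Random(0)    # fixed seed so the run is repeatable


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bundle(*hvs):
    """Element-wise sum: the result stays similar to every input."""
    return [sum(vals) for vals in zip(*hvs)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a, b, c = random_hv(), random_hv(), random_hv()
mix = bundle(a, b, c)

print(round(cosine(a, b), 3))    # close to 0: unrelated hypervectors are nearly orthogonal
print(round(cosine(mix, a), 3))  # roughly 1/sqrt(3): the bundle still resembles a
```

Because unrelated hypervectors barely overlap, many items can be superimposed in one vector and still be told apart by a simple similarity check, which is what keeps this style of computing cheap.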

Exploring the Potential of Hyperdimensional Computing in Drug Discovery Research

Drug discovery is a complex and time-consuming process, often hindered by the large volume of molecular data and complicated molecular properties. In an effort to automate drug discovery, machine learning algorithms have been used to predict molecular properties through virtual screening. While graph neural networks and recurrent neural networks have demonstrated high accuracy in this domain, they are computation- and memory-intensive because of operations such as feature embeddings and deep convolutions. To address these issues, the authors, Dongning Ma and Xun Jiao, propose MoleHD, an alternative to neural network classifiers that predicts molecular properties using brain-inspired hyperdimensional computing (HDC).

Background on Machine Learning Algorithms for Drug Discovery

Traditional machine learning algorithms such as random forest, support vector machine (SVM), k-nearest neighbors (KNN), and gradient boosting have long been applied in drug discovery. These models often overlook deep and complex structural information in molecules, which leads to subpar performance in predicting molecular properties. As a result, more sophisticated deep learning techniques are increasingly used to exploit that structural information when predicting molecular properties from vast amounts of data.

Introducing MoleHD: A Novel Approach for Molecular Property Prediction

MoleHD applies HDC to molecular property prediction. It first transforms the SMILES representation of molecules into feature vectors using SMILE-PE tokenizers pretrained on the ChEMBL database. HDC encoders then project these features into high-dimensional vectors that are used for training and inference. The authors conducted an extensive evaluation using 30 classification tasks from three widely used molecule datasets, comparing MoleHD with 10 baseline methods, including six state-of-the-art neural network classifiers. The results show that MoleHD outperforms all baseline methods on average across the 30 classification tasks while significantly reducing computing costs, which could open a novel path in drug discovery research. The work was under review for the NeurIPS 2021 conference at the time of arXiv submission.
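The encode, train, and infer steps described above can be sketched with a generic HDC classification recipe. This is not the paper's encoder: it assumes character-level tokens instead of the pretrained tokenizer, position encoding by cyclic rotation, single-pass training by summation, and cosine similarity for inference. All names (`token_hv`, `rotate`, `encode`, `train`, `predict`) are hypothetical, and the labels in the toy data are arbitrary placeholders rather than real assay results.

```python
import math
import random
from collections import defaultdict

D = 10_000                 # hypervector dimensionality (assumed)
rng = random.Random(42)
item_memory = {}           # token -> random bipolar hypervector, created on first use


def token_hv(token):
    """Look up (or create) the random hypervector assigned to a token."""
    if token not in item_memory:
        item_memory[token] = [rng.choice((-1, 1)) for _ in range(D)]
    return item_memory[token]


def rotate(hv, k):
    """Cyclic shift by k positions; marks where a token sits in the sequence."""
    k %= len(hv)
    return hv[-k:] + hv[:-k]


def encode(smiles):
    """Bundle position-rotated token hypervectors into one molecule hypervector."""
    acc = [0] * D
    # Character-level split is a stand-in tokenizer (it breaks 'Cl' into 'C', 'l').
    for pos, token in enumerate(smiles):
        shifted = rotate(token_hv(token), pos)
        acc = [x + y for x, y in zip(acc, shifted)]
    return acc


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train(samples):
    """Single-pass training: add each molecule hypervector into its class hypervector."""
    class_hvs = defaultdict(lambda: [0] * D)
    for smiles, label in samples:
        class_hvs[label] = [x + y for x, y in zip(class_hvs[label], encode(smiles))]
    return dict(class_hvs)


def predict(class_hvs, smiles):
    """Return the label whose class hypervector is most similar to the query."""
    query = encode(smiles)
    return max(class_hvs, key=lambda label: cosine(query, class_hvs[label]))


# Toy data: labels are arbitrary placeholders, not real measurements.
toy_train = [("CCO", 1), ("CCCO", 1), ("c1ccccc1", 0), ("c1ccccc1C", 0)]
model = train(toy_train)
print(predict(model, "CCCCO"))
```

Training here needs only additions and inference only a handful of similarity computations, with no gradient descent or stacked layers, which is the kind of simplicity that makes reduced computing cost plausible for HDC methods.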

Conclusion

This paper presents an innovative approach to drug discovery using brain-inspired hyperdimensional computing, which operates on high-dimensional vectors and reduces computing costs while outperforming, on average, the baseline methods it was compared against, including neural network classifiers. This opens up new avenues for more efficient and effective prediction of molecular properties, with significant potential impact on drug discovery research.

Created on 03 Jan. 2024
