MoleHD: Automated Drug Discovery using Brain-Inspired Hyperdimensional Computing

AI-generated keywords: Drug Discovery, Machine Learning, Neural Network Classifiers, Hyperdimensional Computing, Molecular Properties

AI-generated Key Points

  • Modern drug discovery is hindered by large volumes of molecular data and complicated molecular properties.
  • Recent advancements in machine learning algorithms show promise in automating drug discovery through virtual screening.
  • Graph neural networks and recurrent neural networks are accurate but computationally and memory-intensive.
  • The authors propose MoleHD, an alternative approach based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction.
  • MoleHD transforms molecules into feature vectors using SMILE-PE tokenizers pretrained on the ChEMBL database.
  • HDC encoders project these features into high-dimensional vectors for training and inference.
  • MoleHD outperforms all baseline methods on average across 30 classification tasks while reducing computing costs.
  • This paper presents the first HDC-based method for drug discovery, potentially paving the way for new research paths.
  • The work is under review for the NeurIPS 2021 conference.
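
The tokenization step named in the key points can be illustrated with a minimal sketch. It does not use the SMILE-PE tokenizer or the ChEMBL database; it swaps in a simple regular-expression, atom-level tokenizer as a stand-in, and the names `tokenize_smiles`, `build_vocab`, and `to_ids` are hypothetical helpers introduced only for this example.

```python
import re

# Simplified atom-level SMILES tokenizer (an assumption for illustration,
# NOT the pretrained SMILE-PE tokenizer mentioned above).
# Bracket atoms, the two-letter halogens Br/Cl, and %NN ring closures are
# kept as single tokens; everything else is split per character.
_SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens."""
    return _SMILES_TOKEN.findall(smiles)


def build_vocab(corpus):
    """Assign an integer id to every token, in order of first appearance."""
    vocab = {}
    for smiles in corpus:
        for token in tokenize_smiles(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


def to_ids(smiles, vocab):
    """Turn a SMILES string into a numeric feature vector of token ids."""
    return [vocab[token] for token in tokenize_smiles(smiles)]


corpus = ["CCO", "c1ccccc1Cl", "C[N+](C)(C)C"]
vocab = build_vocab(corpus)
print(tokenize_smiles("c1ccccc1Cl"))
print(to_ids("CCO", vocab))
```

A pretrained pair-encoding tokenizer would merge frequent substructures into longer tokens; the atom-level split here only shows the shape of the input and output.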

Authors: Dongning Ma, Xun Jiao

Under review for NeurIPS 2021
License: CC BY-NC-SA 4.0

Abstract: Modern drug discovery is often time-consuming, complex and cost-ineffective due to the large volume of molecular data and complicated molecular properties. Recently, machine learning algorithms have shown promising results in virtual screening of automated drug discovery by predicting molecular properties. While emerging learning methods such as graph neural networks and recurrent neural networks exhibit high accuracy, they are also notoriously computation-intensive and memory-intensive with operations such as feature embeddings or deep convolutions. In this paper, we propose a viable alternative to neural network classifiers. We present MoleHD, a method based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction. We first transform the SMILES presentation of molecules into feature vectors by SMILE-PE tokenizers pretrained on the ChEMBL database. Then, we develop HDC encoders to project such features into high-dimensional vectors that are used for training and inference. We perform an extensive evaluation using 30 classification tasks from 3 widely-used molecule datasets and compare MoleHD with 10 baseline methods including 6 SOTA neural network classifiers. Results show that MoleHD is able to outperform all the baseline methods on average across 30 classification tasks with significantly reduced computing cost. To the best of our knowledge, we develop the first HDC-based method for drug discovery. The promising results presented in this paper can potentially lead to a novel path in drug discovery research.
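
To make the idea of projecting token features into high-dimensional vectors concrete, here is a minimal sketch of one common HDC encoding scheme: each token gets a random bipolar hypervector, its position is bound by a cyclic rotation, and the results are bundled by element-wise addition followed by a sign function. This is an assumption for illustration, not necessarily the encoder used in MoleHD; the dimensionality `D`, the character-level tokens, and the function names are all hypothetical.

```python
import random

D = 2000  # hypervector dimensionality (assumed; HDC work often uses ~10,000)


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def permute(hv, shift):
    """Cyclically rotate a hypervector; used here to encode token position."""
    shift %= D
    return hv[-shift:] + hv[:-shift] if shift else list(hv)


def encode(tokens, item_memory, rng):
    """Bundle position-rotated token hypervectors into one bipolar vector."""
    acc = [0] * D
    for pos, token in enumerate(tokens):
        if token not in item_memory:
            item_memory[token] = random_hv(rng)
        rotated = permute(item_memory[token], pos)
        for i in range(D):
            acc[i] += rotated[i]
    # Binarize; ties (possible for even token counts) are resolved to +1.
    return [1 if v >= 0 else -1 for v in acc]


def cosine(a, b):
    """Cosine similarity of two bipolar vectors (each has norm sqrt(D))."""
    return sum(x * y for x, y in zip(a, b)) / D


rng = random.Random(0)
item_memory = {}
ethanol = encode(list("CCO"), item_memory, rng)
ethylamine = encode(list("CCN"), item_memory, rng)
benzene = encode(list("c1ccccc1"), item_memory, rng)
print("ethanol vs ethylamine:", cosine(ethanol, ethylamine))
print("ethanol vs benzene:   ", cosine(ethanol, benzene))
```

Strings that share tokens at the same positions end up with similar hypervectors, while unrelated strings are close to orthogonal, which is the property that similarity-based HDC classification relies on.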

Submitted to arXiv on 05 Jun. 2021

Results of the summarizing process for the arXiv paper: 2106.02894v1

Modern drug discovery is time-consuming and complex, hindered by the large volume of molecular data and by complicated molecular properties. Recent machine learning algorithms have shown promise in automating drug discovery by predicting molecular properties for virtual screening. Graph neural networks and recurrent neural networks achieve high accuracy in this domain, but they are computation- and memory-intensive because of operations such as feature embeddings and deep convolutions.

The authors propose MoleHD, an alternative to neural network classifiers that is based on brain-inspired hyperdimensional computing (HDC) for molecular property prediction. The method first transforms the SMILES representation of each molecule into a feature vector using SMILE-PE tokenizers pretrained on the ChEMBL database. HDC encoders then project these features into high-dimensional vectors that are used for training and inference.

To evaluate MoleHD, the authors use 30 classification tasks from three widely used molecule datasets and compare against 10 baseline methods, including six state-of-the-art neural network classifiers. MoleHD outperforms all baseline methods on average across the 30 classification tasks at significantly reduced computing cost. To the authors' knowledge, this is the first HDC-based method for drug discovery, and the results could open a new path in drug discovery research. The work is under review for the NeurIPS 2021 conference.

For context, traditional machine learning algorithms such as random forest, support vector machine (SVM), k-nearest neighbors (KNN), and gradient boosting were applied to drug discovery first, but these models often overlook deep and complex structural information, which leads to subpar performance in predicting molecular properties. More sophisticated deep learning techniques, which leverage the structural information within molecules, are increasingly used to predict molecular properties from large amounts of data. Against this background, the paper offers an HDC-based approach that reduces computing cost while outperforming, on average, the neural network classifiers it is compared with.
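
The summary states that the hypervectors are used for training and inference. A common HDC scheme for this, sketched below, builds one class hypervector per label by adding up the training hypervectors of that label and then predicts the label whose class hypervector is most similar to the query. This is a generic sketch under stated assumptions, not the authors' implementation: the data are synthetic noisy copies of two random prototypes rather than encoded molecules, and `D`, `flip_prob`, and all function names are hypothetical.

```python
import random

D = 2000  # hypervector dimensionality (assumed for this sketch)


def bundle(hvs):
    """Element-wise sum of a list of hypervectors."""
    acc = [0] * D
    for hv in hvs:
        for i, v in enumerate(hv):
            acc[i] += v
    return acc


def similarity(a, b):
    """Cosine similarity between two (not necessarily bipolar) vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples):
    """Build one class hypervector per label by bundling its samples."""
    by_label = {}
    for hv, label in samples:
        by_label.setdefault(label, []).append(hv)
    return {label: bundle(hvs) for label, hvs in by_label.items()}


def predict(model, hv):
    """Return the label whose class hypervector is most similar to hv."""
    return max(model, key=lambda label: similarity(model[label], hv))


def noisy_copy(proto, flip_prob, rng):
    """Synthetic sample: flip each entry of a prototype with flip_prob."""
    return [-v if rng.random() < flip_prob else v for v in proto]


rng = random.Random(0)
protos = {label: [rng.choice((-1, 1)) for _ in range(D)] for label in (0, 1)}
train_set = [(noisy_copy(protos[y], 0.3, rng), y) for y in (0, 1) for _ in range(10)]
test_set = [(noisy_copy(protos[y], 0.3, rng), y) for y in (0, 1) for _ in range(10)]

model = train(train_set)
accuracy = sum(predict(model, hv) == y for hv, y in test_set) / len(test_set)
print("accuracy on synthetic data:", accuracy)
```

Training here is a single pass of additions and inference is a handful of dot products, which illustrates why HDC classifiers tend to be cheap compared with deep neural networks.
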
Created on 03 Jan. 2024
