Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

AI-generated keywords: Deep Learning, Theory, Practice, Model Architecture, Curse of Dimensionality

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Thesis explores theoretical foundations of deep learning
  • Focus on relationship between architecture of deep learning models and data structures
  • Asks what drives the efficacy of deep learning algorithms and how they overcome the curse of dimensionality
  • Asks whether deep learning succeeds by learning relevant representations that exploit the structure of the data
  • Empirical approach combining experimental studies with physics-inspired toy models
  • Simplified models help understand complex behaviors in deep learning systems
  • Goal is to bridge gap between theory and practice in deep learning

Authors: Leonardo Petrini

PhD Thesis @ EPFL

Abstract: Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a series of computational layers. This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process. In particular, we ask: What drives the efficacy of deep learning algorithms and allows them to beat the so-called curse of dimensionality, i.e., the difficulty of generally learning functions in high dimensions due to the exponentially increasing need for data points with increased dimensionality? Is it their ability to learn relevant representations of the data by exploiting their structure? How do different architectures exploit different data structures? In order to address these questions, we push forward the idea that the structure of the data can be effectively characterized by its invariances, i.e., aspects that are irrelevant for the task at hand. Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models. These simplified models allow us to investigate and interpret the complex behaviors we observe in deep learning systems, offering insights into their inner workings, with the far-reaching goal of bridging the gap between theory and practice.
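
The exponential growth in data requirements mentioned in the abstract can be illustrated with a simple counting argument. The sketch below is not taken from the thesis: it assumes inputs live in the unit cube, that "covering" the input space means placing one sample in every cell of a regular grid, and that an invariance simply makes some coordinates irrelevant to the task. The function names `grid_points_needed` and `grid_points_with_invariance` are hypothetical and introduced only for this illustration.

```python
def grid_points_needed(dim: int, cells_per_axis: int) -> int:
    """Count the cells of a regular grid covering the unit cube [0, 1]^dim.

    Assumption for illustration: one sample per cell is needed to learn a
    generic function at this resolution, so the count grows as cells_per_axis**dim.
    """
    if dim < 0 or cells_per_axis < 1:
        raise ValueError("dim must be >= 0 and cells_per_axis must be >= 1")
    return cells_per_axis ** dim


def grid_points_with_invariance(dim: int, n_irrelevant: int, cells_per_axis: int) -> int:
    """Same count when the task is invariant to `n_irrelevant` coordinates.

    Assumption for illustration: a representation that discards the irrelevant
    coordinates only has to cover the remaining dim - n_irrelevant directions.
    """
    if not 0 <= n_irrelevant <= dim:
        raise ValueError("n_irrelevant must lie between 0 and dim")
    return grid_points_needed(dim - n_irrelevant, cells_per_axis)


if __name__ == "__main__":
    cells = 10  # resolution of 0.1 along each axis
    for dim in (1, 2, 10, 100):
        n = grid_points_needed(dim, cells)
        # len(str(n)) - 1 is the power of ten, since n is an exact power of 10
        print(f"d = {dim:>3}: 10^{len(str(n)) - 1} grid cells")

    # A task in 100 dimensions that is invariant to 97 of them
    reduced = grid_points_with_invariance(100, 97, cells)
    print(f"d = 100 with 97 irrelevant coordinates: {reduced} grid cells")
```

Under these assumptions, the number of samples is exponential in the input dimension but only in the number of task-relevant directions once the invariances are accounted for. Real invariances (for example to translations or deformations) are richer than discarding coordinates, so this is only a toy picture of the idea.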

Submitted to arXiv on 24 Oct. 2023

Results of the summarizing process for the arXiv paper: 2310.16154v1

This paper's license does not allow us to build upon its content, so the summary below was generated from the paper's metadata rather than the full article.

This thesis by Leonardo Petrini explores the theoretical foundations of deep learning, with a focus on understanding the relationship between the architecture of deep learning models and the inherent structures within the data they process. The author aims to address questions such as what drives the efficacy of deep learning algorithms and allows them to overcome the curse of dimensionality, which refers to the difficulty of learning functions in high dimensions due to the exponentially increasing need for data points. The author proposes that one reason for the success of deep learning is its ability to learn relevant representations of data by exploiting its structure.

To investigate these questions, Petrini takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models. These simplified models allow for a better understanding and interpretation of complex behaviors observed in deep learning systems. By studying these behaviors, Petrini aims to bridge the gap between theory and practice in deep learning.

Recent advancements in artificial intelligence, particularly in domains such as natural language processing and computer vision, have largely been attributed to deep learning models. Deep learning arguably surpasses traditional approaches by learning relevant features from raw data through multiple computational layers. Petrini's research focuses on understanding why deep learning algorithms are so effective and how they overcome challenges posed by high-dimensional data. The author suggests that one key factor is their ability to learn relevant representations of data by leveraging its underlying structure. Different architectures may exploit different aspects of data structures.

To explore these ideas, Petrini uses an empirical approach that combines experimental studies with physics-inspired toy models. These simplified models help shed light on complex behaviors observed in deep learning systems and provide insights into their inner workings. The ultimate goal is to bridge the gap between theory and practice in order to enhance our understanding and application of deep learning techniques.

In summary, this thesis delves into the theoretical foundations of deep learning by examining how model architecture relates to inherent structures within processed data. It seeks answers regarding what makes deep learning algorithms effective and how they overcome the curse of dimensionality. The research methodology combines empirical studies with physics-inspired toy models to gain insights into the inner workings of deep learning systems, with the aim of bridging the gap between theory and practice in this field.
Created on 04 Dec. 2023
