An Adaptive Tangent Feature Perspective of Neural Networks
AI-generated Key Points
- Authors propose a framework for understanding feature learning in neural networks
- Linear models in tangent feature space are studied
- Features can be transformed during training and linear transformations of features are considered
- Joint optimization problem over parameters and transformations with a bilinear interpolation constraint is formulated
- Specialized analysis on neural network structures provides insights into how features and kernel function change
- Theoretical observations are verified in real neural networks on a simple regression problem
- Adaptive feature implementation of tangent feature classification evaluated on MNIST and CIFAR-10 datasets
- Results show that the adaptive feature model has an order of magnitude lower sample complexity than the fixed tangent feature model
- Framework offers a way to understand feature adaptivity in neural networks and gives insight into how features and kernel functions evolve during training
- Further research is needed to fully characterize real neural networks and to understand the extent of adaptivity in practice
Authors: Daniel LeJeune, Sina Alemohammad
Abstract: In order to better understand feature learning in neural networks, we propose a framework for understanding linear models in tangent feature space where the features are allowed to be transformed during training. We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear interpolation constraint. We show that this optimization problem has an equivalent linearly constrained optimization with structured regularization that encourages approximately low-rank solutions. Specializing to neural network structure, we gain insights into how the features and thus the kernel function change, providing additional nuance to the phenomenon of kernel alignment when the target function is poorly represented using tangent features. In addition to verifying our theoretical observations in real neural networks on a simple regression problem, we empirically show that an adaptive feature implementation of tangent feature classification has an order of magnitude lower sample complexity than the fixed tangent feature model on MNIST and CIFAR-10.
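The abstract's phrases "tangent feature space" and "bilinear interpolation constraint" can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than the authors' code or notation: the toy two-layer tanh network, the names Phi, M, beta, and theta, and the random values. It computes tangent features as the gradient of the network output with respect to its parameters at initialization, applies a linear transformation M to those features, and shows that predicting with transformed features and coefficients beta gives the same result as predicting with the original features and the single vector theta = M beta. That identity is one way to read why a constraint that is bilinear in (M, beta) can be restated as a linear constraint.

```python
import math
import random

def model(x, params):
    """Toy network f(x) = sum_j a_j * tanh(w_j * x + b_j) (hypothetical example)."""
    return sum(a * math.tanh(w * x + b) for (w, b, a) in params)

def tangent_features(x, params):
    """Gradient of model(x, .) with respect to every parameter, flattened."""
    feats = []
    for (w, b, a) in params:
        h = math.tanh(w * x + b)
        dh = 1.0 - h * h                       # derivative of tanh
        feats.extend([a * dh * x, a * dh, h])  # d/dw, d/db, d/da
    return feats

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    """Compute M v for a matrix stored as a list of rows."""
    return [dot(row, v) for row in M]

def transform_features(phi, M):
    """Compute M^T phi: the features after a linear transformation M."""
    return [sum(M[i][j] * phi[i] for i in range(len(phi)))
            for j in range(len(M[0]))]

def adaptive_predict(phi, M, beta):
    """Prediction (M^T phi) . beta, which is bilinear in (M, beta)."""
    return dot(transform_features(phi, M), beta)

random.seed(0)
# Assumed initialization: two hidden units, so six parameters in total.
params0 = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
           for _ in range(2)]
xs = [-1.0, 0.0, 0.5, 2.0]
Phi = [tangent_features(x, params0) for x in xs]  # one feature row per input
p = len(Phi[0])

# Arbitrary transformation and coefficients, chosen only for illustration.
M = [[random.gauss(0, 1) for _ in range(p)] for _ in range(p)]
beta = [random.gauss(0, 1) for _ in range(p)]

theta = matvec(M, beta)                                   # effective parameter
bilinear = [adaptive_predict(phi, M, beta) for phi in Phi]
linear = [dot(phi, theta) for phi in Phi]                 # same predictions
```

The sketch only checks the algebraic identity; it does not implement the structured regularization or the training procedure described in the abstract.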