PIMNet: A Parallel, Iterative and Mimicking Network for Scene Text Recognition

AI-generated keywords: Scene Text Recognition, Encoder-Decoder Framework, Attention Mechanism, Parallel Decoding, Mimicking Learning

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The field of scene text recognition has seen a surge in interest due to its wide range of applications
  • Advanced methods utilize autoregressive models with attention mechanisms for sequential text generation
  • Non-autoregressive models offer faster inference times but sacrifice accuracy compared to autoregressive models
  • PIMNet combines a parallel attention mechanism with iterative generation to balance speed and accuracy
  • During training, PIMNet adds an autoregressive decoder alongside the parallel decoder, which improves accuracy without requiring pre-training
  • Extensive experiments demonstrate the effectiveness and efficiency of PIMNet in achieving competitive performance with fast inference times
  • Code for PIMNet is available at https://github.com/Pay20Y/PIMNet for further exploration and implementation

Authors: Zhi Qiao, Yu Zhou, Jin Wei, Wei Wang, Yuan Zhang, Ning Jiang, Hongbin Wang, Weiping Wang

Accepted by ACM MM 2021

Abstract: Nowadays, scene text recognition has attracted more and more attention due to its various applications. Most state-of-the-art methods adopt an encoder-decoder framework with attention mechanism, which generates text autoregressively from left to right. Despite the convincing performance, the speed is limited because of the one-by-one decoding strategy. As opposed to autoregressive models, non-autoregressive models predict the results in parallel with a much shorter inference time, but the accuracy falls behind the autoregressive counterpart considerably. In this paper, we propose a Parallel, Iterative and Mimicking Network (PIMNet) to balance accuracy and efficiency. Specifically, PIMNet adopts a parallel attention mechanism to predict the text faster and an iterative generation mechanism to make the predictions more accurate. In each iteration, the context information is fully explored. To improve learning of the hidden layer, we exploit the mimicking learning in the training phase, where an additional autoregressive decoder is adopted and the parallel decoder mimics the autoregressive decoder with fitting outputs of the hidden layer. With the shared backbone between the two decoders, the proposed PIMNet can be trained end-to-end without pre-training. During inference, the branch of the autoregressive decoder is removed for a faster speed. Extensive experiments on public benchmarks demonstrate the effectiveness and efficiency of PIMNet. Our code will be available at https://github.com/Pay20Y/PIMNet.
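The abstract names parallel prediction and iterative generation but does not spell out the decoding schedule. The following is a minimal sketch of one plausible scheme, assuming that each pass predicts every position in parallel and commits only the most confident ones, so that later passes can use them as context. The names `iterative_parallel_decode`, `toy_predict` and `MASK`, the confidence values, and the commit rule are all hypothetical illustrations, not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # hypothetical placeholder for a position not yet committed

# A predictor sees the partially decoded sequence and returns, for every
# position at once, a (character, confidence) pair.
Predictor = Callable[[List[Optional[str]]], List[Tuple[str, float]]]


def iterative_parallel_decode(predict: Predictor, length: int,
                              iterations: int) -> List[Optional[str]]:
    """Decode `length` characters in at most `iterations` parallel passes.

    Assumption: each pass commits the most confident masked positions and
    leaves the rest masked for the next pass.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    tokens: List[Optional[str]] = [MASK] * length
    per_pass = -(-length // iterations)  # ceiling division
    for _ in range(iterations):
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        if not masked:
            break
        preds = predict(tokens)  # one parallel pass over all positions
        # Rank the still-masked positions, most confident first.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:per_pass]:
            tokens[i] = preds[i][0]
    return tokens


def toy_predict(tokens: List[Optional[str]]) -> List[Tuple[str, float]]:
    """Stand-in for a real model: always proposes the characters of "HELLO".

    Confidence rises with the number of committed characters, imitating
    the benefit of extra context in later iterations.
    """
    target = "HELLO"
    base = [0.9, 0.4, 0.6, 0.5, 0.8]
    known = sum(t is not MASK for t in tokens)
    return [(ch, min(1.0, b + 0.05 * known)) for ch, b in zip(target, base)]


decoded = iterative_parallel_decode(toy_predict, length=5, iterations=3)
print("".join(decoded))
```

With 5 positions and 3 iterations, the sketch commits two positions per pass, so the number of model calls depends on the iteration count rather than on the text length, which is where the speed advantage over one-by-one decoding would come from.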

Submitted to arXiv on 09 Sep. 2021


Results of the summarizing process for the arXiv paper: 2109.04145v1


The field of scene text recognition has seen a surge in interest in recent years due to its wide range of applications. Many advanced methods currently use an encoder-decoder framework with an attention mechanism to generate text sequentially from left to right. While these approaches have demonstrated impressive performance, their speed is limited by the one-by-one decoding strategy they employ. Non-autoregressive models, on the other hand, offer faster inference by predicting results in parallel, but their accuracy falls considerably behind that of their autoregressive counterparts.

To address this trade-off between accuracy and efficiency, the paper introduces the Parallel, Iterative and Mimicking Network (PIMNet). PIMNet uses a parallel attention mechanism for faster text prediction and an iterative generation mechanism to make the predictions more accurate, with context information fully explored in each iteration. A key element of PIMNet is its use of mimicking learning during training: an additional autoregressive decoder is added alongside the parallel decoder, and the parallel decoder mimics it by fitting the outputs of its hidden layer. Because the two decoders share a backbone, PIMNet can be trained end-to-end without pre-training. During inference, the autoregressive decoder branch is removed for faster speed.

Extensive experiments on public benchmarks demonstrate the effectiveness and efficiency of PIMNet, which achieves competitive performance while maintaining fast inference times. The authors state that their code will be available at https://github.com/Pay20Y/PIMNet. The architecture holds promise for advancing scene text recognition and its real-world applications.
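The mimicking step described in the paper's abstract can be illustrated with a small sketch of a combined training loss. The abstract only says that the parallel decoder fits the hidden-layer outputs of the autoregressive decoder. The mean squared error distance, the weighting factor `weight`, and the function names `mimicking_loss` and `total_loss` are assumptions made here for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def mimicking_loss(parallel_hidden: Sequence[Vector],
                   autoregressive_hidden: Sequence[Vector]) -> float:
    """Mean squared error between per-position hidden vectors.

    Assumption: MSE is used as the distance. In a real framework the
    autoregressive side would typically be treated as a fixed target.
    """
    if len(parallel_hidden) != len(autoregressive_hidden):
        raise ValueError("sequences must have the same length")
    total, count = 0.0, 0
    for p_vec, a_vec in zip(parallel_hidden, autoregressive_hidden):
        if len(p_vec) != len(a_vec):
            raise ValueError("hidden vectors must have the same size")
        for p, a in zip(p_vec, a_vec):
            total += (p - a) ** 2
            count += 1
    return total / count if count else 0.0


def total_loss(parallel_ce: float, autoregressive_ce: float,
               mimic: float, weight: float = 1.0) -> float:
    """Combine both recognition losses with the weighted mimicking term.

    Both decoders sit on a shared backbone, so a single sum like this
    is what allows end-to-end training in one stage.
    """
    return parallel_ce + autoregressive_ce + weight * mimic


# Toy hidden states: one position, two features.
mimic = mimicking_loss([[1.0, 2.0]], [[1.0, 4.0]])
print(mimic, total_loss(0.5, 0.25, mimic, weight=0.5))
```

At inference time only the parallel decoder would be evaluated, so the autoregressive and mimicking terms affect training cost but not decoding speed.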
Created on 04 Aug. 2024

