XLNet: Generalized Autoregressive Pretraining for Language Understanding

Résumés déjà disponibles dans d'autres langues : en

Auteurs : Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

arXiv: 1906.08237v1 - DOI (cs.CL)

Pretrained models and code are available at https://github.com/zihangdai/xlnet

Résumé : With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.

Soumis à arXiv le 19 Jui. 2019

Posez des questions sur cet article à notre assistant IA

Vous pouvez aussi discutez avec plusieurs papiers à la fois ici.

Instructions pour utiliser l'assistant IA ?

Résultats du processus de synthèse de l'article arXiv : 1906.08237v1

Résumé Complet
Points clés
Résumé vulgarisé
Article de blog

Le résumé n'est pas encore prêt

Les points clés ne sont pas encore prêts

Le résumé vulgarisé n'est pas encore prêt

L'article de blog n'est pas encore prêt

Créé le 17 Sep. 2023

Disponible dans d'autres langues : en

Évaluez la qualité du contenu généré par l'IA en votant

Note : 0

Le résumé précédent a été créé il y a plus d'un an et peut être réexécuté (si nécessaire) en cliquant sur le bouton Exécuter ci-dessous.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Posez des questions sur cet article à notre assistant IA

Résultats du processus de synthèse de l'article arXiv : 1906.08237v1

Articles similaires résumés avec nos outils d'IA