FLAN-T5
Overview
FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models. It is an enhanced version of T5 that has been fine-tuned on a mixture of tasks.
One can use the FLAN-T5 weights directly, without fine-tuning the model:
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
>>> inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Pour a cup of bolognese into a large bowl and add the pasta']
FLAN-T5 includes the same improvements as T5 version 1.1 (see here for the full details of the model’s improvements).
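Because FLAN-T5 reuses the T5 v1.1 architecture, its checkpoints carry the same configuration settings, such as the gated-GELU feed-forward projection and untied input/output embeddings. A minimal sketch, assuming the checkpoint’s config exposes the feed_forward_proj and tie_word_embeddings fields as T5 v1.1 checkpoints do:
>>> from transformers import AutoConfig

>>> # Load only the configuration to inspect the T5 v1.1-style settings
>>> # carried over into FLAN-T5 (gated-GELU feed-forward, untied embeddings).
>>> config = AutoConfig.from_pretrained("google/flan-t5-small")
>>> config.feed_forward_proj
'gated-gelu'
>>> config.tie_word_embeddings
False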
Google has released the following variants:
- google/flan-t5-small
- google/flan-t5-base
- google/flan-t5-large
- google/flan-t5-xl
- google/flan-t5-xxl
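The larger variants can be loaded the same way as in the example above. As a minimal sketch (assuming torch is installed and, for device_map="auto", the accelerate package as well), google/flan-t5-xl could be loaded in half precision to reduce memory usage:
>>> import torch
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> # Load a larger variant in float16 and let Accelerate place the weights
>>> # across the available devices (requires the accelerate package).
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "google/flan-t5-xl", torch_dtype=torch.float16, device_map="auto"
... )

>>> inputs = tokenizer("Translate to German: How old are you?", return_tensors="pt").to(model.device)
>>> outputs = model.generate(**inputs, max_new_tokens=20)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
Loading in float16 with device_map="auto" lets Accelerate dispatch the weights across the available GPUs and CPU, which is usually necessary for the xl and xxl checkpoints.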
One can refer to T5’s documentation page for all tips, code examples and notebooks, as well as the FLAN-T5 model card for more details regarding the training and evaluation of the model.
The original checkpoints can be found here.