Title: Large Language Models Are State-of-the-Art Evaluators of Translation Quality
Source: https://arxiv.org/pdf/2302.14520

Summary: This paper introduces GEMBA (GPT Estimation Metric Based Assessment), a GPT-based metric for translation quality assessment that achieves state-of-the-art accuracy. GEMBA works both with and without a human reference translation, so it can serve as a reference-based metric or as a reference-free quality estimation method. The research shows that large language models (LLMs) at the scale of GPT-3.5 and above can follow zero-shot prompts for this task, with GPT-4 performing best.

Evaluated on the WMT22 Metrics shared task data, GEMBA with GPT-4 (GEMBA-GPT4-DA) set a new state of the art in system-level pairwise accuracy for reference-based assessment (89.8%) and strongly outperformed other reference-less metrics in quality estimation mode (87.6%). The study experimented with four prompt variants: Direct Assessment (DA), Scalar Quality Metrics (SQM), Stars Ranking, and Quality Classes. The least constrained templates often yielded the best results.

While GEMBA performs exceptionally well at the system level, its segment-level scores are slightly behind the top-performing metrics, potentially because its scores are discrete and therefore produce frequent ties. The authors have publicly released their code and prompt templates to encourage external validation and reproducibility. The work shows that pre-trained, generative LLMs can be used for automated machine translation evaluation.
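The headline numbers above are system-level pairwise accuracies: the share of system pairs that the metric ranks in the same order as human judgment. As a rough illustration only, the idea can be sketched in Python as below. This is not the authors' released code; the function name, the system names, and all scores are hypothetical, and counting a tie on either side as a disagreement is an assumption of this sketch.

```python
from itertools import combinations


def pairwise_accuracy(human, metric):
    """Fraction of system pairs where metric and human agree on the ranking.

    `human` and `metric` map a system name to its system-level score.
    Assumption of this sketch: a tie on either side counts as a disagreement.
    """
    if set(human) != set(metric):
        raise ValueError("both score tables must cover the same systems")

    agree = 0
    total = 0
    for a, b in combinations(sorted(human), 2):
        human_delta = human[a] - human[b]
        metric_delta = metric[a] - metric[b]
        total += 1
        # Same sign of the two differences means the same ranking of a and b.
        if human_delta * metric_delta > 0:
            agree += 1

    if total == 0:
        raise ValueError("need at least two systems to form a pair")
    return agree / total


# Hypothetical scores for four made-up systems (not data from the paper).
human_scores = {"sysA": 82.0, "sysB": 78.5, "sysC": 71.0, "sysD": 69.0}
metric_scores = {"sysA": 90.0, "sysB": 85.0, "sysC": 88.0, "sysD": 70.0}

accuracy = pairwise_accuracy(human_scores, metric_scores)
print(f"pairwise accuracy: {accuracy:.3f}")
```

In this toy example the metric swaps sysB and sysC relative to the human ranking, so five of the six system pairs agree.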