
LLM Translation Benchmark: Polish → Vietnamese (2026)

A comparison of machine translation quality for Polish → Vietnamese across four systems: GPT 5.4, Claude 4.6 Sonnet, Google Translate, and DeepL. Research conducted by 100 AT.

Polish → Vietnamese · 4 systems · 2 judges

Methodology

Each of the 4 translation systems translated 4 source texts from Polish to Vietnamese. Two judges (GPT 5.4 XHigh and Claude 4.6 Sonnet) scored the translations on three criteria (Accuracy, Fluency, and Style), each on a 1–10 scale. The table shows each judge's averaged score per system and the final average across both judges.

Results: Polish → Vietnamese

Average scores (1–10 scale) given by 2 judges to 4 translation systems on Polish-Vietnamese texts.

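| Translation System     | GPT 5.4 (Judge) | Claude 4.6 (Judge) | Final Average |
|------------------------|-----------------|--------------------|---------------|
| GPT 5.4 (XHigh)        | 8.90            | 9.17               | 9.04          |
| Claude 4.6 Sonnet      | 9.30            | 8.92               | 9.11          |
| Google Translate (ref) | 7.40            | 7.75               | 7.58          |
| DeepL (ref)            | 8.10            | 7.83               | 7.97          |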
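
The Final Average column can be checked against the two judge columns. The sketch below assumes that the final score is the plain mean of the two per-judge averages, rounded half-up to two decimals; the source does not state the rounding rule, and the names `JUDGE_SCORES`, `final_average`, and `FINAL` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-judge averages copied from the results table:
# (GPT 5.4 as judge, Claude 4.6 as judge).
JUDGE_SCORES = {
    "GPT 5.4 (XHigh)": (Decimal("8.90"), Decimal("9.17")),
    "Claude 4.6 Sonnet": (Decimal("9.30"), Decimal("8.92")),
    "Google Translate": (Decimal("7.40"), Decimal("7.75")),
    "DeepL": (Decimal("8.10"), Decimal("7.83")),
}


def final_average(scores):
    """Mean of the judge scores, rounded half-up to two decimals (assumed rule)."""
    mean = sum(scores, Decimal("0")) / len(scores)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


FINAL = {system: final_average(scores) for system, scores in JUDGE_SCORES.items()}

# Print systems from highest to lowest final average.
for system, score in sorted(FINAL.items(), key=lambda item: item[1], reverse=True):
    print(f"{system}: {score}")
```

`Decimal` is used instead of `float` so that midpoint values such as 9.035 round predictably; under this assumption the computed values match the published column.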

Key Findings

  1. AI models (Claude 4.6 Sonnet: 9.11, GPT 5.4: 9.04) both score above 9 and outperform the traditional translation engines, neither of which reaches 8.

  2. Google Translate (7.58) scores lowest on the PL→VI pair, particularly in style and naturalness.

  3. DeepL (7.97) falls in between: better than Google Translate but still more than a point behind the AI models.

  4. The gap between the best AI model (Claude 4.6 Sonnet, 9.11) and the best traditional engine (DeepL, 7.97) is 1.14 points, a substantial quality difference.
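
The gap quoted in the findings follows directly from the final averages. This is a minimal sketch that groups the systems into AI models and traditional engines as the findings do; the variable names are illustrative, not part of the benchmark.

```python
from decimal import Decimal

# Final averages copied from the results table.
AI_MODELS = {
    "Claude 4.6 Sonnet": Decimal("9.11"),
    "GPT 5.4 (XHigh)": Decimal("9.04"),
}
TRADITIONAL = {
    "DeepL": Decimal("7.97"),
    "Google Translate": Decimal("7.58"),
}

# Pick the top-scoring system in each group and take the difference.
best_ai = max(AI_MODELS, key=AI_MODELS.get)
best_traditional = max(TRADITIONAL, key=TRADITIONAL.get)
gap = AI_MODELS[best_ai] - TRADITIONAL[best_traditional]

print(f"{best_ai} vs {best_traditional}: gap of {gap} points")
# prints "Claude 4.6 Sonnet vs DeepL: gap of 1.14 points"
```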