Facebook’s TransCoder AI ‘Bests’ Commercial Rivals Translating Between Code Languages
Facebook AI has created TransCoder, a system that translates code between programming languages.
Researchers at Facebook say they’ve developed a new system called a neural transcompiler, capable of converting code from one high-level programming language, such as Java, Python, or C++, into another.
The system is unsupervised, which means it seeks previously undetected patterns in data sets without guiding labels and with a minimal degree of human supervision, reports VentureBeat.
Notably, it reportedly outperforms the rule-based approaches that other systems use for code translation by a “significant” margin.
“TransCoder can easily be generalized to any programming language, does not require any expert knowledge, and outperforms commercial solutions by a large margin,” wrote the coauthors of the preprint study. “Our results suggest that a lot of mistakes made by the model could easily be fixed by adding simple constraints to the decoder to ensure that the generated functions are syntactically correct, or by using dedicated architectures.”
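As a rough illustration of the kind of syntactic check the coauthors allude to, the sketch below keeps only candidate Python outputs that parse. It is a post-hoc filter rather than a constraint built into the decoder, and the helper name `first_valid_candidate` and the candidate strings are hypothetical, not taken from TransCoder.

```python
import ast
from typing import Iterable, Optional


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python code."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def first_valid_candidate(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that parses, or None if none do.

    Hypothetical helper: assumes candidates arrive ordered best-first,
    for example by model score.
    """
    for candidate in candidates:
        if is_valid_python(candidate):
            return candidate
    return None


# Made-up candidate translations, ordered best-first.
candidates = [
    "def add(a, b):\n    return a + b)",  # stray closing parenthesis
    "def add(a, b):\n    return a + b",
]
chosen = first_valid_candidate(candidates)
```

A check like this only guarantees that the output parses; it says nothing about whether the translated function behaves like the original.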
Moving an existing codebase to a modern and more efficient language like C++ or Java takes serious expertise in both the source and target languages, and it is typically a pricey process. Commonwealth Bank of Australia spent roughly $750 million over five years to convert its platform from COBOL to Java. Transcompilers help here in principle, since they remove the need to rewrite code from scratch, but they are difficult to build because different languages have differing syntax and rely on distinct platform APIs, variable types, and standard-library functions, reports VentureBeat.
Called TransCoder, Facebook’s new system can translate between Java, C++, and Python — completing difficult tasks without the supervision such projects typically require. The new system is first initialized with cross-lingual language model pretraining — a process that maps partial code expressions whose meanings overlap to identical representations independent of programming language.
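Cross-lingual language model pretraining is commonly built on a masked-token objective: some tokens are hidden, and the model learns to predict them from context, with code from every language drawing on one shared vocabulary. The toy sketch below shows only the masking step. The whitespace tokenization, the 15% masking rate, and the `mask_tokens` helper are illustrative assumptions, not details reported about TransCoder.

```python
import random
from typing import List, Tuple

MASK = "<mask>"  # placeholder token; the name is an assumption


def mask_tokens(
    tokens: List[str], rate: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[Tuple[int, str]]]:
    """Replace a random subset of tokens with MASK.

    Returns the masked sequence and a list of (position, original token)
    pairs that a model would be trained to predict.
    """
    rng = random.Random(seed)
    masked: List[str] = []
    targets: List[Tuple[int, str]] = []
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked.append(MASK)
            targets.append((i, tok))
        else:
            masked.append(tok)
    return masked, targets


# The same loop in two languages shares many tokens ("for", "i", "n", "+=").
python_tokens = "for i in range ( n ) : total += i".split()
cpp_tokens = "for ( int i = 0 ; i < n ; i ++ ) total += i ;".split()

masked_py, targets_py = mask_tokens(python_tokens)
masked_cpp, targets_cpp = mask_tokens(cpp_tokens)
shared = set(python_tokens) & set(cpp_tokens)
```

Tokens that appear in both snippets, collected in `shared`, hint at why a single model trained this way can end up with similar representations for equivalent code in different languages.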