JavaScript is not enabled in your browser
Objexx Engineering

Why is YAML Slow?    |    Objexx Labs

Posted 2012/02/20

We are using YAML style input files for a number of applications because they are easy to read and edit outside of a GUI. We were dismayed to find that loading times were quite slow for files of modest size (5K lines, 200KB) when using stock YAML parsers such as yaml-cpp and libyaml.

Digging into the specification and stock implementations reveals why parsing YAML is slow: YAML supports many features and syntax variations that complicate parsing and can require super-linear look ahead processing. Here's an example from the YAML 1.2 specification:

!!map {
  ? !!str "implicit block key"
  : !!seq [
    !!map {
      ? !!str "implicit flow key"
      : !!str "value",

Clearly, the readability and elegance of YAML has been lost along with good performance.

Realizing that we do not want or need syntactic support for higher level capabilities (since we are building that into our file structure semantics where it belongs) we found that very simple and fast (linear time) parsers could be written (in C++ and Python for our purposes) for this simplified YAML. These custom parsers are now in use in ObjexxSISAME and a Python-based wrapper developed as part of the reengineering/modernization of a legacy FEM application.