Typical Arguments


For as long as I've followed this debate, I've noticed the same arguments tend to be perpetuated, and whilst there is merit to some, others are either flawed, short-sighted or invalid, making it difficult to reach any sort of logical conclusion.

HTML is a Markup Language not a Programming Language

Many will use this as the default argument because HTML stands for Hyper Text Markup Language. The default assumption is that HTML is only for "marking up" textual content, such as headings, paragraphs, quotes, etc. In my opinion, this is a very naive and short-sighted view of what HTML is. When you consider the logical structure of a document, including things like navigation bars, side bars, sections, articles, images, forms and multimedia, you soon realise that HTML's feature set is incredibly rich, and in the context of this argument, people tend to overlook that.

More importantly, this simply isn't a valuable distinction to make. Markup languages and programming languages are not mutually exclusive. HTML and other markup languages such as XML are descended from SGML (Standard Generalized Markup Language), which is a set of rules for defining markup languages and document structures. XSLT (Extensible Stylesheet Language Transformations) is a declarative, XML-based markup language used to transform XML documents into other documents, typically XML, HTML or plain text. Whilst XSLT was originally designed as a special-purpose language for XML transformation, it is Turing-complete, making it theoretically capable of arbitrary computation. Therefore, markup languages can be programming languages.

HTML is not Turing complete

Whilst this is absolutely true, I believe that this is the typical dividing line between the two camps. If you believe that programming languages must be Turing-complete, or in other words, capable of arbitrary computation, then no, HTML is not a programming language. However, if you believe that programming is the activity of writing a schedule or sequence of instructions that a machine then carries out automatically, then HTML does qualify as a programming language, but only to a very limited degree.

Additionally, there are many examples of technology that we might consider programmable, even though they lack the ability for arbitrary computation; for example, lathes, looms, sewing machines, 3D printers, even VCRs (if you're old enough to remember those). Such devices can be programmed (or perhaps scheduled or configured) to automate a particular task, but not necessarily any task.
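
To make that distinction concrete, here is a minimal JavaScript sketch of a machine that is programmable without being capable of arbitrary computation. The instruction names (record, wait, stop) and the runSchedule function are hypothetical, loosely modelled on a VCR timer and chosen purely for illustration. The machine runs a fixed list of instructions from top to bottom, with no branching, looping or variables.

```javascript
// Hypothetical VCR-style machine: every instruction name below is an
// assumption made for this sketch, not a real device API.
const handlers = new Map([
    ['record', (step) => `recording channel ${step.channel} for ${step.minutes} minutes`],
    ['wait', (step) => `waiting ${step.minutes} minutes`],
    ['stop', () => 'stopped'],
]);

function runSchedule(schedule) {
    // Each instruction is looked up and executed exactly once, in order.
    // There are no branches, loops or variables, so the machine can be
    // programmed but cannot perform arbitrary computation.
    return schedule.map((step) => {
        const handler = handlers.get(step.op);
        if (!handler) {
            throw new Error(`Unknown instruction: ${step.op}`);
        }
        return handler(step);
    });
}

const log = runSchedule([
    { op: 'wait', minutes: 30 },
    { op: 'record', channel: 4, minutes: 60 },
    { op: 'stop' },
]);

console.log(log.join('\n'));
```

The schedule is data, yet writing it is still an act of telling a machine what to do and in what order, which is the weaker sense of "programming" described above.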

If HTML is a Programming Language then so are XML and JSON

There is a very distinct difference between them. For the sake of this argument we will omit JSON and focus on XML, since it's a sibling of HTML. XML provides a well-defined syntax for structuring documents and data for storage and exchange, but beyond requiring a single root element, it defines no elements of its own and attaches no meaning to the ones you invent. Its interpretation therefore tends to be application-specific. This is where HTML is different.

HTML has a well-defined and widely understood set of elements that not only define the logical structure of a document, but also carry semantics that clearly describe their meaning to the browser and to the developer. In contrast to XML, web browsers provide a common interpretation of HTML in order to render elements correctly, consistently and independently of the underlying platform.

Arguably, this could be considered equivalent to a virtual machine, albeit a very limited one. HTML elements can therefore be thought of as very high-level, declarative, and functional abstractions, which are interpreted by a web browser, and subsequently executed by the underlying platform or computer. Let's take a look at some examples...

When you declare a <h1> element, the browser renders a heading, normally in big, bold text.

<h1>Hello, World!</h1>

Hello, World!

When you declare a <p> element, the browser renders a paragraph of regular text.

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</p>

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

When you declare a <hr> element, the browser renders a horizontal rule.

<hr>

When you declare a <button> element, the browser renders a button.

<button>Click me</button>

Therefore, HTML does not burden the developer with how these elements should be rendered, but instead allows the developer to declare what should be rendered. Ultimately, rendering is the responsibility of the web browser, which will inevitably call functions in the underlying graphics sub-system to render the Document Object Model (DOM) to the screen. One might even consider these functions to be re-programmable, or at least parameterised, since you can use CSS to modify the default style for any element, thus affecting how they are rendered.
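
The idea of declaring what to render, while the how is supplied elsewhere and can be re-parameterised, can be sketched in a few lines of JavaScript. This is only an analogy, not how a browser is implemented: the render function, the defaultStyles table and its values are all hypothetical stand-ins for a browser's rendering code and built-in stylesheet.

```javascript
// Hypothetical default styles, standing in for a browser's built-in
// stylesheet. The values are illustrative only.
const defaultStyles = {
    h1: { fontSize: '2em', fontWeight: 'bold' },
    p: { fontSize: '1em', fontWeight: 'normal' },
};

// The caller declares *what* to render (a tag and some text). This
// function decides *how*, merging any author overrides over the
// defaults in the same spirit as CSS overriding a default style.
function render(element, authorStyles = {}) {
    const style = { ...defaultStyles[element.tag], ...authorStyles[element.tag] };
    return { text: element.text, style };
}

const heading = { tag: 'h1', text: 'Hello, World!' };

const plain = render(heading);
const styled = render(heading, { h1: { fontWeight: 'normal' } });

console.log(plain.style);  // the default heading style
console.log(styled.style); // the same element, re-parameterised by the author
```

The declaration of the heading never changes between the two calls; only the parameters handed to the renderer do.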

If HTML is a Programming Language then so is a Word Document

Once again, there are some significant differences between writing HTML and writing Word documents. The first and most obvious is that Word does not provide a well-defined and widely understood language for defining the logical structure or describing the content of a document. If you really wanted to, I'm sure you could painstakingly learn the underlying XML syntax, write it all out by hand and then package it all into a Word document, but that would be utter madness when there is an application that will do it all for you. There are of course applications that allow you to visually design HTML, such as Dreamweaver or Webflow, but these applications are not widely considered the norm, and the markup that they produce can often be a hideous, unoptimised mess. In any case, those applications could arguably be considered examples of visual programming, because they produce code in a well-defined and widely understood language, whereas Word does not.

More importantly, whilst the underlying data format for modern Word documents is XML, its syntax is verbose and oriented towards formatting rather than semantics. For example, if you want a heading in a Word document, the format does not provide a specific, semantic element for it. A heading is an ordinary paragraph with formatting attached, either directly, by increasing the font size and making it bold, or through a named paragraph style. The equivalent of this in HTML would be to use paragraphs for everything and style them with CSS, but in doing so, you would lose or invalidate the semantic meaning of those elements and the content that they contain. Once again let's take a look at some examples...

As above, when you declare a <h1> element, the browser renders a heading, normally in big, bold text.

<h1>Hello, World!</h1>

Hello, World!

You can achieve the same visual result by using a paragraph <p> element and applying CSS, but in doing so you incur a loss of semantic meaning about what that element is intended to convey. Note that the <p> element is styled with Bootstrap's .h1 class, which gives it the same appearance that Bootstrap gives a <h1> element.

<p class="h1">Hello, World!</p>

Hello, World!

The latter is effectively what happens in a Word document. The following block of XML is how Word applies formatting to style a heading.

<w:p w14:paraId="2A3A39D5" w14:textId="53D87734" w:rsidR="00287C96" w:rsidRPr="0027454A" w:rsidRDefault="0027454A">
    <w:pPr>
        <w:rPr>
            <w:sz w:val="40"/>
            <w:szCs w:val="40"/>
        </w:rPr>
    </w:pPr>
    <w:r w:rsidRPr="0027454A">
        <w:rPr>
            <w:sz w:val="40"/>
            <w:szCs w:val="40"/>
        </w:rPr>
        <w:t>Hello, World!</w:t>
    </w:r>
</w:p>
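
The practical cost of that lost meaning, for the <p class="h1"> example and for Word's formatted paragraph alike, can be sketched in JavaScript. The extractHeadings function below is hypothetical, and it uses a regular expression only to stay self-contained; real consumers of HTML such as browsers, screen readers and search engines use a proper parser. It shows that software looking for headings finds the <h1> but not the visually identical paragraph.

```javascript
// Hypothetical helper: collects the level and text of <h1> to <h6> elements.
// A regular expression is not a real HTML parser; it is used here only
// to keep the sketch self-contained.
function extractHeadings(html) {
    const pattern = /<h([1-6])\b[^>]*>(.*?)<\/h\1>/gis;
    return Array.from(html.matchAll(pattern), (match) => ({
        level: Number(match[1]),
        text: match[2].trim(),
    }));
}

const semantic = extractHeadings('<h1>Hello, World!</h1>');
const styledOnly = extractHeadings('<p class="h1">Hello, World!</p>');

console.log(semantic);   // one heading found
console.log(styledOnly); // no headings found
```

Both fragments look the same on screen, but only the first one tells a machine that the text is a heading.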

If HTML is a Programming Language then we wouldn't need JavaScript

Whilst JavaScript has seen immense popularity over recent years, let's not forget its origins. JavaScript was invented in 1995 by Brendan Eich, who at the time was working for Netscape. It took 10 days to write, and even today, it shows! The question I tend to ask is, why was there a need for another language that the browser could execute, instead of just adding Turing-complete logic directly to HTML?

Firstly, adding Turing-complete syntax to HTML would have resulted in a proprietary markup language. To be fair, the 90s were a bit wild west when it came to programming/computer languages and standardisation; just ask any developer who's had to maintain websites that had to support Internet Explorer. It took Microsoft a long time to get on board with the W3C standards. In any case, adding syntax for arbitrary computation directly into HTML would probably have resulted in support only via Netscape at the time.

Secondly, have you ever considered what a program written in HTML might look like? The syntax would be hideously verbose! Not only that, but without intelligent IDEs or editors, developers would not easily be able to distinguish between presentation and logic.

In order to demonstrate why HTML would not be a suitable candidate for a Turing-complete programming language, let's take a look at a simple example. The following JavaScript code defines a recursive function that produces terms of the Fibonacci sequence.

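// Recursively appends the next term (a + b) to `array` until it holds `count` terms.
function fibonacci(a, b, array, count) {
    if (array.length < count) {
        array.push(a + b);
        fibonacci(b, a + b, array, count);
    }

    // Every recursive call mutates the same array, so the outermost call returns the full result.
    return array;
}

fibonacci(0, 1, [], 10); // [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]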

Now, let's take a look at what that might look like in some sort of proprietary HTML syntax.

<program>
    <function name="fibonacci">
        <params>
            <param name="a" type="int" />
            <param name="b" type="int" />
            <param name="array" type="array:int" />
            <param name="count" type="int" />
        </params>
        <body>
            <if left="array.length" right="count" operator="lt">
                <call function="array.push">
                    <arguments>
                        <argument>
                            <binaryexpression left="a" right="b" operator="plus" />
                        </argument>
                    </arguments>
                </call>
                <call function="fibonacci">
                    <arguments>
                        <argument identifier="b" />
                        <argument>
                            <binaryexpression left="a" right="b" operator="plus" />
                        </argument>
                        <argument identifier="array" />
                        <argument identifier="count" />
                    </arguments>
                </call>
            </if>
            <return identifier="array" />
        </body>
    </function>
    <function entrypoint="true">
        <call function="fibonacci">
            <arguments>
                <argument literal="0" />
                <argument literal="1" />
                <argument literal="[]" />
                <argument literal="10" />
            </arguments>
        </call>
    </function>
</program>

Would you really want to write programs in a language that verbose?

In actual fact, compilers work with something that looks similar to this; almost certainly not in a markup language, and probably not in any way presentable, but whilst parsing source code, a compiler will build a structure called an Abstract Syntax Tree (AST). The following example shows the AST for the Fibonacci code above, parsed using a tool called Esprima and shown here as XML. It's even more verbose, but not exactly a million miles away from the example above.

<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <type>Program</type>
  <body>
    <type>FunctionDeclaration</type>
    <id>
      <type>Identifier</type>
      <name>fibonacci</name>
    </id>
    <params>
      <type>Identifier</type>
      <name>a</name>
    </params>
    <params>
      <type>Identifier</type>
      <name>b</name>
    </params>
    <params>
      <type>Identifier</type>
      <name>array</name>
    </params>
    <params>
      <type>Identifier</type>
      <name>count</name>
    </params>
    <body>
      <type>BlockStatement</type>
      <body>
        <type>IfStatement</type>
        <test>
          <type>BinaryExpression</type>
          <operator>&lt;</operator>
          <left>
            <type>MemberExpression</type>
            <computed>false</computed>
            <object>
              <type>Identifier</type>
              <name>array</name>
            </object>
            <property>
              <type>Identifier</type>
              <name>length</name>
            </property>
          </left>
          <right>
            <type>Identifier</type>
            <name>count</name>
          </right>
        </test>
        <consequent>
          <type>BlockStatement</type>
          <body>
            <type>ExpressionStatement</type>
            <expression>
              <type>CallExpression</type>
              <callee>
                <type>MemberExpression</type>
                <computed>false</computed>
                <object>
                  <type>Identifier</type>
                  <name>array</name>
                </object>
                <property>
                  <type>Identifier</type>
                  <name>push</name>
                </property>
              </callee>
              <arguments>
                <type>BinaryExpression</type>
                <operator>+</operator>
                <left>
                  <type>Identifier</type>
                  <name>a</name>
                </left>
                <right>
                  <type>Identifier</type>
                  <name>b</name>
                </right>
              </arguments>
            </expression>
          </body>
          <body>
            <type>ExpressionStatement</type>
            <expression>
              <type>CallExpression</type>
              <callee>
                <type>Identifier</type>
                <name>fibonacci</name>
              </callee>
              <arguments>
                <type>Identifier</type>
                <name>b</name>
              </arguments>
              <arguments>
                <type>BinaryExpression</type>
                <operator>+</operator>
                <left>
                  <type>Identifier</type>
                  <name>a</name>
                </left>
                <right>
                  <type>Identifier</type>
                  <name>b</name>
                </right>
              </arguments>
              <arguments>
                <type>Identifier</type>
                <name>array</name>
              </arguments>
              <arguments>
                <type>Identifier</type>
                <name>count</name>
              </arguments>
            </expression>
          </body>
        </consequent>
        <alternate/>
      </body>
      <body>
        <type>ReturnStatement</type>
        <argument>
          <type>Identifier</type>
          <name>array</name>
        </argument>
      </body>
    </body>
    <generator>false</generator>
    <expression>false</expression>
    <async>false</async>
  </body>
  <body>
    <type>ExpressionStatement</type>
    <expression>
      <type>CallExpression</type>
      <callee>
        <type>Identifier</type>
        <name>fibonacci</name>
      </callee>
      <arguments>
        <type>Literal</type>
        <value>0</value>
        <raw>0</raw>
      </arguments>
      <arguments>
        <type>Literal</type>
        <value>1</value>
        <raw>1</raw>
      </arguments>
      <arguments>
        <type>ArrayExpression</type>
        <elements/>
      </arguments>
      <arguments>
        <type>Literal</type>
        <value>10</value>
        <raw>10</raw>
      </arguments>
    </expression>
  </body>
  <sourceType>script</sourceType>
</root>

In conclusion, rather than stating "if HTML is a Programming Language then we wouldn't need JavaScript", try to understand the rationale for why HTML does not contain markup for arbitrary computation. Even if it did, the markup would be so verbose that JavaScript, or something like it, would have replaced at least the logical aspects of HTML anyway, leaving developers to focus on just the presentational aspects, which is exactly how HTML is used today.