====== Conversion from (La)TeX to HTML ======
Translating LaTeX documents (partially or fully) to HTML is a difficult problem,
primarily because the two document formats address very different needs:
TeX is intended to produce statically laid out documents with fixed dimensions,
ultimately representing ink on paper. HTML, on the other hand, assumes
a variety of differently sized and scaled screens and consequently prefers
to express layouts in more abstract terms, the typesetting of which is ultimately
left to the browser to interpret, ideally responsively: the document layout
should adapt to different screen sizes, ranging from 8K desktop monitors
to cell phone screens.
This means that there is no single “correct” way to convert TeX to HTML. Rather,
there are many choices to be made, most notably which aspects of the static,
fixed-dimension layout described by the TeX code to preserve, and which to discard
in favour of leaving them up to the rendering engine. This explains
the plurality of existing converters.
Naturally, many LaTeX macros are somewhat aligned with tags in HTML;
for example, sectioning macros (''\chapter'', ''\section'', etc.) correspond to
''%%<h1>%%'', ''%%<h2>%%'', etc.; the ''{itemize}'' and ''{enumerate}''
environments and the ''\item'' macro correspond to ''%%<ul>%%'', ''%%<ol>%%'' and ''%%<li>%%'',
respectively; and so on. Most converters therefore opt for the reasonable strategy
of mapping common LaTeX macros directly to their closest HTML relatives,
with little or no use of (simple) CSS, effectively focusing on preserving
the //document semantics// of the constructs used (e.g. “paragraph”,
“section heading”, “unordered list”). In many situations, this is the natural approach
to pursue, especially if we can reasonably assume that the document sources
to be converted are sufficiently “uniform”, so that we can provide a similarly uniform
CSS style sheet to style them, and this is largely the way existing converters work.
To name just a few:
* [[https://dlmf.nist.gov/LaTeXML|LaTeXML]] focuses strongly on the semantics, using XML as the primary output format and heuristically determining an author’s intended semantics of everything from text paragraphs (definitions, examples, theorems, etc.) down to the meaning of individual symbols in mathematical formulae. It has achieved great success with ar5iv.org, which hosts HTML documents generated from TeX sources available on [[https://arxiv.org/|arxiv.org]].
* [[https://tug.org/tex4ht/|TeX4ht]] focuses on plain HTML as output with minimal styling, going as far as to (optionally) replace the ''\LaTeX'' macro by the plain ASCII string “LaTeX”.
* [[https://pandoc.org/|Pandoc]] largely focuses on the most important macros and environments that have analogues in all of its supported document formats (e.g. TeX, Markdown, HTML, or docx), so that it can convert between any two of them.
* [[https://mathjax.org/|MathJax]] focuses exclusively on macros for mathematical formulae and symbols, allowing TeX syntax to be used directly in HTML documents; JavaScript subsequently replaces it with the intended presentation.
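The direct-mapping strategy described above can be illustrated with a minimal sketch. It is not how any of the listed converters is implemented: the two lookup tables, the ''convert'' function and the line-by-line parsing are assumptions made purely for illustration, and only a tiny LaTeX subset (one construct per line) is handled.

```python
import html
import re

# Hypothetical lookup tables (illustration only): each LaTeX construct is
# mapped to its closest HTML relative, without any styling information.
SECTION_TAGS = {"chapter": "h1", "section": "h2", "subsection": "h3"}
LIST_TAGS = {"itemize": "ul", "enumerate": "ol"}


def convert(tex: str) -> str:
    """Translate a tiny LaTeX subset to HTML, one construct per line."""
    out = []
    for line in tex.splitlines():
        line = line.strip()
        if not line:
            continue
        # Sectioning macros become heading tags.
        m = re.fullmatch(r"\\(chapter|section|subsection)\{(.*)\}", line)
        if m:
            tag = SECTION_TAGS[m.group(1)]
            out.append(f"<{tag}>{html.escape(m.group(2))}</{tag}>")
            continue
        # List environments become opening or closing list tags.
        m = re.fullmatch(r"\\(begin|end)\{(itemize|enumerate)\}", line)
        if m:
            slash = "/" if m.group(1) == "end" else ""
            out.append(f"<{slash}{LIST_TAGS[m.group(2)]}>")
            continue
        # \item becomes a list item.
        m = re.fullmatch(r"\\item\s+(.*)", line)
        if m:
            out.append(f"<li>{html.escape(m.group(1))}</li>")
            continue
        # Everything else is treated as a plain paragraph.
        out.append(f"<p>{html.escape(line)}</p>")
    return "\n".join(out)


sample = r"""
\section{Tools}
\begin{itemize}
\item LaTeXML
\item Pandoc
\end{itemize}
"""
print(convert(sample))
```

Note that the output carries only the document semantics (heading, unordered list, list item); all layout decisions are left to the browser and to whatever style sheet is supplied.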
However, the approach described above has notable drawbacks:
Firstly, it requires special treatment of LaTeX macros that plain TeX
would expand into primitives, and the number of LaTeX macros is
virtually unlimited: CTAN currently holds a collection of 6399 packages,
a number that keeps growing, which get updated regularly, and authors can add
their own macros at any point. Supporting even just the former is a never-ending task,
and providing direct HTML translations for the latter is impossible.
This is made worse by the very real and ubiquitous practice among LaTeX users
of copy-pasting and reusing various macro definitions and preambles assembled
from StackOverflow, friends and colleagues, and handed down for (by now literally)
generations, even in situations where (unbeknownst to them) “official” packages
with better solutions (possibly supported by HTML converters) exist.
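To make this drawback concrete, the following sketch checks a source fragment against a fixed table of supported macros. The ''KNOWN_MACROS'' set and the ''unsupported_macros'' function are hypothetical and do not reflect the macro coverage of any real converter; the regular expressions are also a simplification that only recognises ''\newcommand''-style definitions.

```python
import re

# Hypothetical set of macros that an imaginary converter maps directly to HTML.
KNOWN_MACROS = {"chapter", "section", "item", "begin", "end", "emph", "textbf"}

# Any control word: a backslash followed by letters.
MACRO = re.compile(r"\\([A-Za-z]+)")
# Simplified pattern for \newcommand{\name} and \renewcommand{\name}.
DEFINITION = re.compile(r"\\(?:re)?newcommand\*?\s*\{?\\([A-Za-z]+)\}?")


def unsupported_macros(tex: str) -> dict[str, str]:
    """Return the macros without a direct mapping, labelled by their origin."""
    user_defined = set(DEFINITION.findall(tex))
    report = {}
    for name in MACRO.findall(tex):
        if name in KNOWN_MACROS or name in ("newcommand", "renewcommand"):
            continue
        if name in user_defined:
            report[name] = "user-defined"
        else:
            report[name] = "package or kernel"
    return report


sample = r"""
\newcommand{\todo}[1]{\textbf{TODO: #1}}
\section{Intro}
\todo{cite} See \cref{fig:a}.
"""
print(unsupported_macros(sample))
```

In this sample, ''\cref'' could in principle be added to the table by the converter's maintainers, whereas ''\todo'' exists only in this one document, so no table prepared in advance can contain it.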
-----
//Sources://
* [[doi>10.47397/tb/44-2/tb137mueller-primitives|Dennis Müller]]
{{htmlmetatags>metatag-keywords=(LaTeX,conversion,convertir du LaTeX en HTML,Pandoc)
metatag-og:title=(Conversion from (La)TeX to HTML)
metatag-og:site_name=(FAQ LaTeX francophone)
}}