Fix missing reply text when message body is parsed as HTML in [linkedom](https://github.com/WebReflection/linkedom) (SSR).

- [`linkedom`](https://github.com/WebReflection/linkedom) is being used https://github.com/matrix-org/matrix-public-archive to server-side render (SSR) Hydrogen (`hydrogen-view-sdk`)
 - This is being fixed by using a explicit HTML wrapper boilerplate with `DOMParser` to get a matching result in the browser and `linkedom`.

Currently `parseHTML` is only used for HTML content bodies in events. Events with replies have content bodies that look like `<mx-reply>Hello</mx-reply> What's up` so they're parsed as HTML to strip out the `<mx-reply>` part.

Before | After
---  |  ---
![](https://user-images.githubusercontent.com/558581/153692011-2f0e7114-fcb4-481f-b217-49f461b1740a.png) | ![](https://user-images.githubusercontent.com/558581/153692016-52582fdb-abd9-439d-9dce-3f04da6959db.png)

Before:
```js
// Browser (Chrome, Firefox)
new DOMParser().parseFromString(`<div>foo</div>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'

// `linkedom` 
new DOMParser().parseFromString(`<div>foo</div>`, "text/html").body.outerHTML;
// '<body></body>'
```

After (consistent matching output):

```js
// Browser (Chrome, Firefox)
new DOMParser().parseFromString(`<!DOCTYPE html><html><body><div>foo</div></body></html>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'

// `linkedom`
new DOMParser().parseFromString(`<!DOCTYPE html><html><body><div>foo</div></body></html>`, "text/html").body.outerHTML;
// '<body><div>foo</div></body>'
```

`linkedom` goal is to be close to the current DOM standard, but [not too close](https://github.com/WebReflection/linkedom#faq). Focused on the streamlined cases for server-side rendering (SSR).

Here is some context around getting `DOMParser` to interpret things better. The conclusion was to only support the explicit standard cases with a `<html><body></body></html>` specified instead of adding the magic HTML document creation and massaging that the browser does.

 - https://github.com/WebReflection/linkedom/issues/106
 - https://github.com/WebReflection/linkedom/pull/108

 ---

Part of https://github.com/vector-im/hydrogen-web/pull/653 to support server-side rendering Hydrogen for the [`matrix-public-archive`](https://github.com/matrix-org/matrix-public-archive) project.
This commit is contained in:
Eric Eastwood 2022-02-11 19:47:24 -06:00
parent 75e2618f70
commit dfed04166e

View file

@ -64,6 +64,6 @@ export function parseHTML(html) {
// If DOMPurify uses DOMParser, can't we just get the built tree from it
// instead of re-parsing?
const sanitized = DOMPurify.sanitize(html, sanitizeConfig);
const bodyNode = new DOMParser().parseFromString(sanitized, "text/html").body;
const bodyNode = new DOMParser().parseFromString(`<!DOCTYPE html><html><body>${sanitized}</body></html>`, "text/html").body;
return new HTMLParseResult(bodyNode);
}