Boldly link where no one has linked before: Text Fragments

Fragment Identifiers

Chrome 80 was a big release. It contained a number of highly anticipated features like
ECMAScript Modules in Web Workers,
nullish coalescing,
optional chaining, and more. The release was, as usual,
announced through a
blog post on the
Chromium blog. You can see an excerpt of the blog post in the screenshot below.

Chromium blog post with red boxes around elements with an id attribute.

You are probably asking yourself what all the red boxes mean. They are the result of running the
following snippet in DevTools. It highlights all elements that have an id attribute.

document.querySelectorAll('[id]').forEach((el) => {
  el.style.border = 'solid 2px red';
});

I can place a deep link to any element highlighted with a red box thanks to the
fragment identifier
which I then use in the hash of the
page’s URL. Assuming I wanted to deep link to the Give us feedback in our
Product Forums
box in the
aside, I could do so by handcrafting the URL
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1.
As you can see in the Elements panel of the Developer Tools, the element in question has an id
attribute with the value HTML1.

Dev Tools showing the id of an element.

If I parse this URL with JavaScript’s URL() constructor, the different components are revealed.
Notice the hash property with the value #HTML1.

new URL('https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1');
/* Creates a new `URL` object
URL {
  hash: "#HTML1"
  host: "blog.chromium.org"
  hostname: "blog.chromium.org"
  href: "https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1"
  origin: "https://blog.chromium.org"
  password: ""
  pathname: "/2019/12/chrome-80-content-indexing-es-modules.html"
  port: ""
  protocol: "https:"
  search: ""
  searchParams: URLSearchParams {}
  username: ""
}
*/
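
As a small illustration of the hash property, the sketch below pulls the element fragment out of a URL string. The helper name getElementFragment is my own invention for this example and not part of any API.

```javascript
// Hypothetical helper: returns the fragment identifier of a URL without
// the leading "#", or an empty string when the URL has no fragment.
function getElementFragment(href) {
  const { hash } = new URL(href);
  return decodeURIComponent(hash.slice(1));
}

const id = getElementFragment(
  'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1'
);
console.log(id); // "HTML1"

// In a page context, the matching element could then be looked up with
// document.getElementById(id).
```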

The fact that I had to open the Developer Tools to find the id of an element, though, speaks
volumes about how unlikely it is that the author of the blog post meant this particular section of
the page to be linked to.

What if I want to link to something without an id? Say I want to link to the ECMAScript Modules
in Web Workers
heading. As you can see in the screenshot below, the <h1> in question does not
have an id attribute, meaning there is no way I can link to this heading. This is the problem that
Text Fragments solve.

Dev Tools showing a heading without an id.

Text Fragments

The Text Fragments proposal adds support for
specifying a text snippet in the URL hash. When navigating to a URL with such a text fragment, the
user agent can emphasize and/or bring it to the user’s attention.

Browser compatibility

The Text Fragments feature is supported in version 80 and beyond of Chromium-based browsers. At the
time of writing, Safari and Firefox have not publicly signaled an intent to implement the feature.
See Related links for pointers to the Safari and Firefox discussions.

Success: These links previously did not work when served across
client-side redirects
that some common services like Twitter use. This issue was tracked as
crbug.com/1055455 and is now fixed. Regular
HTTP redirects always
worked fine.

For security reasons, the feature requires links to be opened in a
noopener context.
Therefore, make sure to include
rel="noopener" in your
<a> anchor markup or add
noopener to your
Window.open() list of window functionality features.
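
For the Window.open() case, a minimal sketch could look like the following. The function name openTextFragmentLink and its win parameter are assumptions made for this example; the parameter only exists so that the function can be exercised outside of a browser.

```javascript
// Hypothetical helper: opens a text fragment link in a noopener context.
// `win` defaults to the global object, which is `window` in a browser.
function openTextFragmentLink(url, win = globalThis) {
  // The third argument is the list of window features.
  return win.open(url, '_blank', 'noopener');
}

// Usage in a browser:
// openTextFragmentLink(
//   'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers'
// );
```

The equivalent in markup is an anchor that carries rel="noopener".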

textStart

In its simplest form, the syntax of Text Fragments is as follows: The hash symbol # followed by
:~:text= and finally textStart, which represents the
percent-encoded
text I want to link to.

#:~:text=textStart

For example, say that I want to link to the ECMAScript Modules in Web Workers heading in the
blog post announcing features in Chrome 80,
the URL in this case would be:

https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers

The text fragment is emphasized like this.
If you click the link in a supporting browser like Chrome, the text fragment is highlighted and
scrolls into view:

Text fragment scrolled into view and highlighted.
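
If you want to generate such a link in code, a minimal sketch might look like this. The helper names are hypothetical. One detail worth noting is that encodeURIComponent() leaves the dash untouched, so the sketch encodes it by hand, since the dash has a special meaning in the text directive syntax.

```javascript
// Percent-encodes a value for use inside a text directive.
// encodeURIComponent() does not encode "-", so it is replaced manually.
function encodeTextDirectiveValue(value) {
  return encodeURIComponent(value).replace(/-/g, '%2D');
}

// Hypothetical helper: appends a textStart-only text fragment to a page URL,
// dropping any fragment the URL already had.
function buildTextFragmentUrl(pageUrl, textStart) {
  const base = pageUrl.split('#')[0];
  return `${base}#:~:text=${encodeTextDirectiveValue(textStart)}`;
}

const link = buildTextFragmentUrl(
  'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html',
  'ECMAScript Modules in Web Workers'
);
console.log(link);
// https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers
```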

textStart and textEnd

Now what if I want to link to the entire section titled ECMAScript Modules in Web Workers, not
just its heading? Percent-encoding the entire text of the section would make the resulting URL
impracticably long.

Luckily there is a better way. Rather than the entire text, I can frame the desired text using the
textStart,textEnd syntax. Therefore, I specify a couple of percent-encoded words at the beginning
of the desired text, and a couple of percent-encoded words at the end of the desired text, separated
by a comma ,.

That looks like this:

https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers,ES%20Modules%20in%20Web%20Workers..

For textStart, I have ECMAScript%20Modules%20in%20Web%20Workers, then a comma , followed
by ES%20Modules%20in%20Web%20Workers. as textEnd. When you click through on a supporting browser
like Chrome, the whole section is highlighted and scrolled into view:

Text fragment scrolled into view and highlighted.

Now you may wonder about my choice of textStart and textEnd. Actually, the slightly shorter URL
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules,Web%20Workers.
with only two words on each side would have worked, too. Compare textStart and textEnd with the
previous values.

If I take it one step further and now use only one word for both textStart and textEnd, you can
see that I am in trouble. The URL
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript,Workers.
is even shorter now, but the highlighted text fragment is no longer the originally desired one. The
highlighting stops at the first occurrence of the word Workers., which is correct, but not what I
intended to highlight. The problem is that the desired section is not uniquely identified by the
current one-word textStart and textEnd values:

Non-intended text fragment scrolled into view and highlighted.
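
To get a feeling for whether a candidate value is ambiguous, you could count how often it occurs in the page text. The sketch below is a rough approximation under my own assumptions: it works on a plain string, compares case-insensitively, and ignores the details of how a browser actually matches text. The sample text is made up for this example.

```javascript
// Counts case-insensitive, non-overlapping occurrences of `needle`.
function countOccurrences(haystack, needle) {
  const text = haystack.toLowerCase();
  const term = needle.toLowerCase();
  if (term === '') return 0;
  let count = 0;
  let index = text.indexOf(term);
  while (index !== -1) {
    count += 1;
    index = text.indexOf(term, index + term.length);
  }
  return count;
}

// Made-up sample text, loosely modeled on the section discussed above.
const sample =
  'ECMAScript Modules in Web Workers. Web Workers have been around for ' +
  'years. Chrome 80 adds ES Modules in Web Workers.';

console.log(countOccurrences(sample, 'Workers.')); // 2, ambiguous
console.log(countOccurrences(sample, 'ES Modules in Web Workers.')); // 1, unique
```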

prefix- and -suffix

Using long enough values for textStart and textEnd is one solution for obtaining a unique link.
In some situations, however, this is not possible. On a side note, why did I choose the
Chrome 80 release blog post as my example? The answer is that in this release Text Fragments
were introduced:

Blog post text: Text URL Fragments.
    Users or authors can now link to a specific portion of a page
    using a text fragment provided in a URL.
    When the page is loaded, the browser highlights the text and scrolls the fragment into view.
    For example, the URL below loads a wiki page for 'Cat'
    and scrolls to the content listed in the `text` parameter.
Text Fragments announcement blog post excerpt.

Notice how in the screenshot above the word "text" appears four times. The fourth occurrence is
written in a green code font. If I wanted to link to this particular word, I would set textStart
to text. Since the word "text" is, well, only one word, there cannot be a textEnd. What now? The
URL
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=text
matches at the first occurrence of the word "Text" already in the heading:

Text Fragment matching at the first occurrence of “Text”.

Caution: Note that text fragment matching is case-insensitive.

Luckily there is a solution. In cases like this, I can specify a prefix- and a -suffix. The
word before the green code font "text" is "the", and the word after is "parameter". None of the
other three occurrences of the word "text" has the same surrounding words. Armed with this
knowledge, I can tweak the previous URL and add the prefix- and the -suffix. Like the other
parameters, they, too, need to be percent-encoded and can contain more than one word.
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=the-,text,-parameter.
To allow the parser to clearly identify the prefix- and the -suffix, they need to be separated
from the textStart and the optional textEnd with a dash -.

Text Fragment matching at the desired occurrence of “text”.
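
Putting the pieces together, a text directive with optional prefix-, textEnd, and -suffix could be assembled as sketched below. The function names and the shape of the options object are assumptions for illustration.

```javascript
// Percent-encodes a value, including "-", which encodeURIComponent() skips.
function encodeTextDirectiveValue(value) {
  return encodeURIComponent(value).replace(/-/g, '%2D');
}

// Hypothetical helper: assembles one text directive from its parts.
// Only textStart is required; prefix, textEnd, and suffix are optional.
function buildTextDirective({ prefix, textStart, textEnd, suffix }) {
  const parts = [];
  if (prefix) parts.push(`${encodeTextDirectiveValue(prefix)}-`);
  parts.push(encodeTextDirectiveValue(textStart));
  if (textEnd) parts.push(encodeTextDirectiveValue(textEnd));
  if (suffix) parts.push(`-${encodeTextDirectiveValue(suffix)}`);
  return `text=${parts.join(',')}`;
}

const directive = buildTextDirective({
  prefix: 'the',
  textStart: 'text',
  suffix: 'parameter',
});
console.log(`#:~:${directive}`); // #:~:text=the-,text,-parameter
```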

The full syntax

The full syntax of Text Fragments is shown below. (Square brackets indicate an optional parameter.)
The values for all parameters need to be percent-encoded. This is especially important for the dash
-, ampersand &, and comma , characters, so they are not being interpreted as part of the text
directive syntax.

#:~:text=[prefix-,]textStart[,textEnd][,-suffix]

Each of prefix-, textStart, textEnd, and -suffix will only match text within a single
block-level element,
but full textStart,textEnd ranges can span multiple blocks. For example,
:~:text=The quick,lazy dog will fail to match in the following example, because the starting
string "The quick" does not appear within a single, uninterrupted block-level element:

<div>
  The
  <div></div>
  quick brown fox
</div>
<div>jumped over the lazy dog</div>

It does, however, match in this example:

<div>The quick brown fox</div>
<div>jumped over the lazy dog</div>
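
To make the full syntax more tangible, here is a sketch of how a single text directive could be taken apart again. This is my own simplified reading of the syntax and not the parsing algorithm from the specification; parseTextDirective is a hypothetical name. It relies on literal dashes and commas in the values being percent-encoded, as described above.

```javascript
// Hypothetical helper: splits one text directive such as
// "text=the-,text,-parameter" into its decoded parts.
// Returns null when the directive does not fit the syntax.
function parseTextDirective(directive) {
  const tokens = directive.replace(/^text=/, '').split(',');
  const result = {};
  // A leading token that ends with "-" is the prefix.
  if (tokens.length > 1 && tokens[0].endsWith('-')) {
    result.prefix = decodeURIComponent(tokens.shift().slice(0, -1));
  }
  // A trailing token that starts with "-" is the suffix.
  if (tokens.length > 1 && tokens[tokens.length - 1].startsWith('-')) {
    result.suffix = decodeURIComponent(tokens.pop().slice(1));
  }
  // What remains must be textStart, optionally followed by textEnd.
  if (tokens.length < 1 || tokens.length > 2 || tokens[0] === '') return null;
  result.textStart = decodeURIComponent(tokens[0]);
  if (tokens.length === 2) result.textEnd = decodeURIComponent(tokens[1]);
  return result;
}

const parsed = parseTextDirective('text=the-,text,-parameter');
console.log(parsed.prefix, parsed.textStart, parsed.suffix); // the text parameter
```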

Creating Text Fragment URLs with a browser extension

Creating Text Fragment URLs by hand is tedious, especially when it comes to making sure they are
unique. If you really want to, the specification has some tips and lists the exact
steps for generating Text Fragment URLs.
We provide an open-source browser extension called
Link to Text Fragment that lets you
link to any text by selecting it, and then clicking "Copy Link to Selected Text" in the context
menu. This extension is available for the following browsers:


Link to Text Fragment browser extension.

Multiple text fragments in one URL

Note that multiple text fragments can appear in one URL. The particular text fragments need to be
separated by an ampersand character &. Here is an example link with three text fragments:
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=Text%20URL%20Fragments&text=text,-parameter&text=:~:text=On%20islands,%20birds%20can%20contribute%20as%20much%20as%2060%25%20of%20a%20cat's%20diet.

Three text fragments in one URL.
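
A sketch of how several text directives could be combined programmatically follows. The helper names are hypothetical, and for brevity each snippet is treated as a textStart-only directive.

```javascript
// Percent-encodes a value, including "-", which encodeURIComponent() skips.
function encodeTextDirectiveValue(value) {
  return encodeURIComponent(value).replace(/-/g, '%2D');
}

// Hypothetical helper: builds one URL that highlights several snippets.
function buildMultiTextFragmentUrl(pageUrl, snippets) {
  const base = pageUrl.split('#')[0];
  const directives = snippets.map(
    (snippet) => `text=${encodeTextDirectiveValue(snippet)}`
  );
  // The individual text directives are separated by an ampersand.
  return `${base}#:~:${directives.join('&')}`;
}

console.log(
  buildMultiTextFragmentUrl('https://example.com/page', [
    'Text URL Fragments',
    'On islands, birds',
  ])
);
// https://example.com/page#:~:text=Text%20URL%20Fragments&text=On%20islands%2C%20birds
```

Because the comma inside the second snippet is percent-encoded, it is not mistaken for the separator between textStart and textEnd.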

Mixing element and text fragments

Traditional element fragments can be combined with text fragments. It is perfectly fine to have both
in the same URL, for example, to provide a meaningful fallback in case the original text on the page
changes, so that the text fragment does not match anymore. The URL
https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1:~:text=Give%20us%20feedback%20in%20our%20Product%20Forums.
linking to the Give us feedback in our
Product Forums
section
contains both an element fragment (HTML1), as well as a text fragment
(text=Give%20us%20feedback%20in%20our%20Product%20Forums.):

Linking with both element fragment and text fragment.

The fragment directive

There is one element of the syntax I have not explained yet: the fragment directive :~:. To avoid
compatibility issues with existing URL element fragments as shown above, the
Text Fragments specification introduces the fragment
directive. The fragment directive is a portion of the URL fragment delimited by the code sequence
:~:. It is reserved for user agent instructions, such as text=, and is stripped from the URL
during loading so that author scripts cannot directly interact with it. User agent instructions are
also called directives. In the concrete case, text= is therefore called a text directive.
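
The role of the :~: delimiter can be sketched with a small helper that separates the element fragment from the fragment directive. It operates on a raw URL string, since the browser strips the directive from the URL that scripts get to see. The name splitFragment and the shape of the returned object are assumptions for this example.

```javascript
// Hypothetical helper: splits the fragment of a raw URL string into the
// element fragment and the fragment directive (the part after ":~:").
function splitFragment(href) {
  const hashIndex = href.indexOf('#');
  if (hashIndex === -1) return { elementFragment: '', directive: '' };
  const fragment = href.slice(hashIndex + 1);
  const delimiterIndex = fragment.indexOf(':~:');
  if (delimiterIndex === -1) return { elementFragment: fragment, directive: '' };
  return {
    elementFragment: fragment.slice(0, delimiterIndex),
    directive: fragment.slice(delimiterIndex + ':~:'.length),
  };
}

const parts = splitFragment(
  'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1:~:text=Give%20us%20feedback'
);
console.log(parts.elementFragment); // HTML1
console.log(parts.directive); // text=Give%20us%20feedback
```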

Feature detection

To detect support, test for the read-only fragmentDirective property on document. The fragment
directive is a mechanism for URLs to specify instructions directed to the browser rather than the
document. It is meant to avoid direct interaction with author script, so that future user agent
instructions can be added without fear of introducing breaking changes to existing content. One
potential example of such future additions could be translation hints.

if ('fragmentDirective' in document) {
  // Text Fragments is supported.
}

From Chrome 80 to Chrome 85, the fragmentDirective property was defined on
Location.prototype. For details on this change, see
WICG/scroll-to-text-fragment#130.

Feature detection is mainly intended for cases where links are dynamically generated (for example by
search engines) to avoid serving text fragments links to browsers that do not support them.
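
For dynamically generated links, the check could be wrapped as in the sketch below. The helper names and the idea of passing the document in as a parameter are my own assumptions; the parameter simply makes the logic easy to try outside of a browser.

```javascript
// Returns true when the given document object exposes fragmentDirective.
function supportsTextFragments(doc) {
  return typeof doc === 'object' && doc !== null && 'fragmentDirective' in doc;
}

// Hypothetical helper: picks the text fragment link for supporting
// browsers and falls back to the plain link otherwise.
function chooseLink(plainUrl, textFragmentUrl, doc) {
  return supportsTextFragments(doc) ? textFragmentUrl : plainUrl;
}

// Usage in a browser, passing the global document:
// const href = chooseLink(
//   'https://example.com/article',
//   'https://example.com/article#:~:text=some%20text',
//   document
// );
```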

Styling text fragments

By default, browsers style text fragments the same way they style
mark (typically black on yellow,
the CSS system colors
for mark). The user-agent stylesheet contains CSS that looks like this:

:root::target-text {
  color: MarkText;
  background: Mark;
}

As you can see, the browser exposes a pseudo-element
::target-text that you can use to
customize the applied highlighting. For example, you could design your text fragments to be black
text on a red background. As always, be sure to
check the color contrast
so your override styling does not cause accessibility issues and make sure the highlighting actually
visually stands out from the rest of the content.

:root::target-text {
  color: black;
  background-color: red;
}

Polyfillability

The Text Fragments feature can be polyfilled to some extent. We provide a
polyfill, which is used internally by
the extension, that implements the functionality in JavaScript for browsers that do not
provide built-in support for Text Fragments.

Security

Text fragment directives are invoked only on full (non-same-page) navigations that are the result of
a
user activation.
Additionally, navigations originating from a different origin than the destination will require the
navigation to take place in a
noopener context, such
that the destination page is known to be sufficiently isolated. Text fragment directives are only
applied to the main frame. This means that text will not be searched inside iframes, and iframe
navigation will not invoke a text fragment.

Privacy

It is important that implementations of the Text Fragments specification do not leak whether a text
fragment was found on a page or not. While element fragments are fully under the control of the
original page author, text fragments can be created by anyone. Remember how in my example above
there was no way to link to the ECMAScript Modules in Web Workers heading, since the <h1> did
not have an id, but how anyone, including me, could just link to anywhere by carefully crafting
the text fragment?

Imagine I ran an evil ad network evil-ads.example.com. Further imagine that in one of my ad
iframes I dynamically created a hidden cross-origin iframe to dating.example.com with a Text
Fragment URL
dating.example.com#:~:text=Log%20Out
once the user interacts with the ad. If the text "Log Out" is found, I know the victim is currently
logged in to dating.example.com, which I could use for user profiling. Since a naive Text
Fragments implementation might decide that a successful match should cause a focus switch, on
evil-ads.example.com I could listen for the blur event and thus know when a match occurred. In
Chrome, we have implemented Text Fragments in such a way that the above scenario cannot happen.

Another attack might be to exploit network traffic based on scroll position. Assume I had access to
network traffic logs of my victim, for example, as the admin of a company intranet. Now imagine there
existed a long human resources document What to Do If You Suffer From… and then a list of
conditions like burn out, anxiety, etc. I could place a tracking pixel next to each item on the
list. If I then determine that loading the document temporally co-occurs with the loading of the
tracking pixel next to, say, the burn out item, I can then, as the intranet admin, determine that
an employee has clicked through on a text fragment link with :~:text=burn%20out that the employee
may have assumed was confidential and not visible to anyone. Since this example is somewhat
contrived to begin with and since its exploitation requires very specific preconditions to be met,
the Chrome security team evaluated the risk of implementing scroll on navigation to be manageable.
Other user agents may decide to show a manual scroll UI element instead.

For sites that still wish to opt out, we have proposed a
Document Policy
header value that they can send, so user agents will not process Text Fragment URLs. Since Document
Policy is not yet shipped, we are running an
origin trial to
apply this policy as an intermediate solution. The
ForceLoadAtTop origin
trial is running from Chrome version 83 to 85.

Document-Policy: force-load-at-top
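
On the server side, sending this header could be sketched as follows. The helper name is hypothetical, and the sketch only assumes a response object with a Node.js-style setHeader(name, value) method.

```javascript
// Hypothetical helper: adds the opt-out header to a response object that
// exposes a Node.js-style setHeader(name, value) method.
function disableTextFragmentScrolling(response) {
  response.setHeader('Document-Policy', 'force-load-at-top');
  return response;
}

// Usage with Node's built-in http module:
// http.createServer((request, response) => {
//   disableTextFragmentScrolling(response);
//   response.end('Hello');
// });
```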

Disabling text fragments

The easiest way to disable the feature is to use an extension that can inject HTTP response
headers, for example,
ModHeader
(not a Google product), to insert a response (not request) header as follows:

Document-Policy: force-load-at-top

Another, more involved, way to opt out is by using the enterprise setting
ScrollToTextFragmentEnabled.
To do this on macOS, paste the command below in the terminal.

defaults write com.google.Chrome ScrollToTextFragmentEnabled -bool false

On Windows, follow the documentation on the
Google Chrome Enterprise Help support
site.

Warning: Please only try this when you know what you are doing.

For some searches, the search engine Google provides a quick answer or summary with a content
snippet from a relevant website. These featured snippets are most likely to show up when a search
is in the form of a question. Clicking a featured snippet takes the user directly to the featured
snippet text on the source web page. This works thanks to automatically created Text Fragments URLs.

Google search engine results page showing a featured snippet. The status bar shows the Text Fragments URL.
After clicking through, the relevant section of the page is scrolled into view.

Conclusion

Text Fragment URLs are a powerful feature for linking to arbitrary text on webpages. The scholarly
community can use it to provide highly accurate citation or reference links. Search engines can use
it to deeplink to text results on pages. Social networking sites can use it to let users share
specific passages of a webpage rather than inaccessible screenshots. I hope you start
using Text Fragment URLs
and find them as useful as I do. Be sure to install the
Link to Text Fragment browser
extension.

Acknowledgements

Text Fragments was implemented and specified by Nick Burris and
David Bokan, with contributions from
Grant Wang. Thanks to Joe Medley for
the thorough review of this article. Hero image by Greg Rakozy on
Unsplash.


APA
Thomas Steiner | Sciencx (2024-03-28T18:37:52+00:00) » Boldly link where no one has linked before: Text Fragments. Retrieved from https://www.scien.cx/2020/06/17/boldly-link-where-no-one-has-linked-before-text-fragments/.