Adding Pagefind Search to a Static Eleventy Site

Search is an essential feature on a website, even if we properly structure it with navigation, a sitemap, a tagging system, etc. But how do you set it up on static content hosting?

🌐 Introduction

⚡ Static content hosting for the win!

I’m a big fan of static websites. They’re amazing for many reasons, including:

Near-zero server maintenance
Very limited security risks
Reduced infrastructure costs
Excellent performance

I even gave a talk on the subject titled “1 HTML page to serve 1 million URLs with static web content” (see here).

🕵️ Search on static websites

For a long time, when we wanted to offer a search function on a static site, we had very few options. All solutions involved using an external service like:

Embedded Google Search
A search form for DuckDuckGo with a site filter
Integrating a search engine like Algolia

All these work very well. But for the sake of independence and especially data privacy protection, it’s better to use your own solution.

Recently, I migrated my blog to a static site built with Eleventy using the eleventy-duo template. But I was missing a search function that didn't rely on an external service^[1].

💡 Statically hosted full-text search

Then Philippe Charrière shared a post from Axel Rauschmayer about full-text search (FTS) for static content. Pagefind allows you to index content and perform searches on a static website. This solution is, of course, also statically hosted^[2].

After this long introduction, it's time to dive into how to install Pagefind on a static site built with a generator like Eleventy. The integration takes less than 15 minutes, but I will also detail how to customize it.

🧭 Pagefind on your static website

🛠️ Installing Pagefind

Installation is straightforward with npm:

npm install --save-dev pagefind

📚 Building the Index with Pagefind CLI

The default behavior works well, but you can also choose which content Pagefind indexes. To do this, add the data-pagefind-body attribute in the content templates. For example, in my src/layouts/post.njk:

<article class="post" data-pagefind-body>
...
</article>

By using this attribute, Pagefind only indexes what you explicitly indicate. You can also ask it to exclude certain HTML fragments from indexing using the data-pagefind-ignore attribute.

At this point, you can test the indexing with the command npx -y pagefind --site public^[3]:

Running Pagefind v1.3.0 (Extended)
Running from: "/home/myuser/projects/mywebsite"
Source:       "public"
Output:       "public/pagefind"

[Walking source directory]
Found 270 files matching **/*.{html}

[Parsing files]
Found a data-pagefind-body element on the site.
↳ Ignoring pages without this tag.

[Reading languages]
Discovered 1 language: en

[Building search indexes]
Total:
  Indexed 1 language
  Indexed 211 pages
  Indexed 34504 words
  Indexed 3 filters
  Indexed 0 sorts

Finished in 1.192 seconds

It's better to automate this step, though. To do so, tell Eleventy to run Pagefind's CLI after generating the static HTML content. This is done in the .eleventy.js file by adding this step:

const { execSync } = require('child_process');

module.exports = function(eleventyConfig) {
  // Run the Pagefind CLI once Eleventy has finished writing the site.
  eleventyConfig.on('eleventy.after', () => {
    execSync('npx -y pagefind --site public', { encoding: 'utf-8' });
  });
};

I also updated the scripts section of the package.json with the build:index script to force index generation on demand:

  "scripts": {
    "dev": "run-p dev:*",
    "start": "eleventy --serve",
    "build": "run-s clean build:*",
    "dev:assets": "env ELEVENTY_ENV=development webpack --mode production --watch",
    "dev:site": "env ELEVENTY_ENV=development eleventy --serve",
    "build:assets": "webpack --mode production",
    "build:site": "env ELEVENTY_ENV=production eleventy",
    "build:index": "npx -y pagefind --site public",
    "clean": "rm -rf ./public"
  },

This way, reindexing is possible by running npm run build:index or npm run build.

In a CI/CD pipeline, we always aim to optimize task execution time. That’s why I modified the original command with the --glob option, as explained in this article, to only index article files: npx -y pagefind --site public --glob "posts/**/*.html". The result is the same, but Pagefind won’t try to index all the other files on the website.
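To keep the hook and the npm script from drifting apart, the command can be assembled in one place. Here is a minimal sketch, not the setup used on my blog: `pagefindArgs` and `runPagefind` are hypothetical helper names, and it assumes a POSIX environment where `npx` can be executed directly.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: returns the npx arguments for a Pagefind run.
// `site` is the generated output directory; `glob` is optional and maps to --glob.
function pagefindArgs({ site = 'public', glob } = {}) {
  const args = ['-y', 'pagefind', '--site', site];
  if (glob) args.push('--glob', glob);
  return args;
}

// Hypothetical helper: runs the CLI and forwards its output to the terminal.
// No shell is involved, so the glob reaches Pagefind unexpanded and needs no quoting.
function runPagefind(options) {
  execFileSync('npx', pagefindArgs(options), { stdio: 'inherit' });
}

// Example: index only the posts, as in the command above.
// runPagefind({ glob: 'posts/**/*.html' });
```

Passing the arguments as an array avoids shell quoting issues with the `**` pattern.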

Note that it’s currently not possible to reindex content in development mode. Pagefind does not yet support --watch mode, and Eleventy stores content in memory in development mode, so it doesn’t regenerate the static content used by Pagefind.

🔍 Adding the Search with Pagefind UI component

To add the search field to a page, include this code:

<div id="search" class="search"></div>
<link href="/pagefind/pagefind-ui.css" rel="stylesheet">
<script src="/pagefind/pagefind-ui.js" onload="new PagefindUI({ element: '#search', showImages: true, showSubResults: true, autofocus: true });"></script>

The autofocus didn’t work for me at first, so I added this script to force it:

<script>
document.addEventListener("DOMContentLoaded", function() {
    var searchInput = document.querySelector(".pagefind-ui__search-input");
    if (searchInput) {
        searchInput.focus();
    }
});
</script>

You can also customize the appearance of Pagefind. Personally, I specified the website’s font and adjusted the UI component’s size:

.pagefind-ui {
    --pagefind-ui-font: Iowan Old Style, Palatino Linotype, Palatino, Georgia, serif;
    --pagefind-ui-scale: 0.9;
}

Here’s a screenshot of the search page:

PageFind UI

🚀 Going Further

I won't go into detail, but it's possible to customize content indexing or search display. Pagefind also allows you to add filters to the results. For example, to filter results by tag and publication year (for articles), I added the data-pagefind-filter attribute in my template:


<a href="{{ tagUrl | url }}" class="post-tag" data-pagefind-filter="Tag">#{{ tag }}</a>

You can imagine adding other filters like authors, page types, difficulty levels, etc. This article provides filter examples.

🔗 Search Links

By default, Pagefind doesn't allow you to initialize a search from URL parameters. Yet this kind of feature is quite common when using static content with JavaScript^[4]. Based on this article, I extended the search initialization to support both search text and filters:

<script src="/pagefind/pagefind-ui.js"></script>
<script>
  // Split the URL query string into the search term (`q`) and the filters (every other key).
  const getQueryParams = () => {
      const params = new URLSearchParams(window.location.search);
      return [...params.entries()].reduce((acc, [key, value]) => {
          if (key === 'q') {
              acc.searchTerm = value;
          } else {
              // A key may appear several times, so collect its values in an array.
              acc.filters[key] = acc.filters[key] || [];
              acc.filters[key].push(value);
          }
          return acc;
      }, { searchTerm: null, filters: {} });
  };

  // Apply the filters first, then run the search.
  const initializePagefindSearch = (search) => {
      const { searchTerm, filters } = getQueryParams();
      if (Object.keys(filters).length > 0) search.triggerFilters(filters);
      if (searchTerm) search.triggerSearch(searchTerm);
  };

  document.addEventListener("DOMContentLoaded", function() {
      initializePagefindSearch(new PagefindUI({
          element: '#search',
          showImages: false,
          showSubResults: true,
          autofocus: true,
          pageSize: 10
      }));
  });
</script>

This way, you can create links directly to a search, such as all articles about Devoxx France with the git label in 2024: /search?q=Devoxx+France&Tag=git&Year=2024
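Such links can also be generated rather than written by hand. Below is a minimal sketch of the reverse operation, building the URL from a search term and filters. `buildSearchUrl` is a hypothetical helper, and it assumes the search page lives at /search and uses the `q` parameter as in the script above.

```javascript
// Hypothetical helper: builds a link to the search page.
// `filters` maps a filter name to a single value or an array of values.
function buildSearchUrl(basePath, searchTerm, filters = {}) {
  const params = new URLSearchParams();
  if (searchTerm) params.append('q', searchTerm);
  for (const [name, values] of Object.entries(filters)) {
    // [].concat accepts both a single value and an array of values.
    for (const value of [].concat(values)) {
      params.append(name, value);
    }
  }
  const query = params.toString();
  return query ? `${basePath}?${query}` : basePath;
}

console.log(buildSearchUrl('/search', 'Devoxx France', { Tag: 'git', Year: '2024' }));
// prints "/search?q=Devoxx+France&Tag=git&Year=2024"
```

Repeating a key, for example `{ Tag: ['git', 'eleventy'] }`, produces `Tag=git&Tag=eleventy`, which the parsing script collects into an array.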

🏆 Conclusion

Pagefind’s strengths are threefold:

Simple and quick to integrate, it’s up and running in less than 15 minutes.
Search results are relevant.
Only requires a static content server with minimal network impact on the user^[5].

It’s worth mentioning that the Pagefind documentation is very complete while remaining simple and accessible.

While researching Pagefind, I discovered that the French State's design system template for websites also uses Eleventy and Pagefind.

Unfortunately, this solution doesn't work with my single-page application (SPA) websites, because I don't generate HTML documents on the server; instead, my APIs are served as static JSON content. I still need to test Pagefind on a site with many more pages^[6].

I hope this article has shed some light on how Pagefind works and the possibilities it offers. Don't hesitate to share your experience of search on statically hosted websites.

[1] I had noticed Lunr, but it requires more integration effort and may cause performance (network) issues at scale.
[2] There's WASM under the hood (hence Philippe's share) and an optimization of index loading with chunks. It's a technique I had already noticed during my experiments with SQLite and the static hosting of a database.
[3] My static site generator (SSG) is configured to generate static content in the public directory. You'll need to replace it with the correct directory name.
[4] This is even the main topic of one of my talks, “1 HTML page to serve 1 million URLs with static web content”.
[5] I personally set up Pagefind on a dedicated search page. Some have implemented lazy loading for their own component using Pagefind's API. This allows it to display on all pages without extra downloads if the search isn't used.
[6] The project mentions sites with around 10,000 pages. I'd love to try it on a site with 1,000,000 pages. 🤩