Syntax highlighting for Tree-sitter

If you haven’t read the previous post, you might want to check it out here.

This post explores how to write tree-sitter queries for a custom grammar and enable syntax highlighting within Neovim. If you don’t know how to make a tree-sitter grammar or want a primer, click here.

This guide focuses on Neovim. The parts about writing queries apply to other editors too, but you will have to adapt the Neovim configuration part.

neovim configuration

You can set up Neovim to parse your local tree-sitter grammar by adding these lines to your config.

-- Somewhere in your config
local parser_config = require("nvim-treesitter.parsers").get_parser_configs()

parser_config.hotshot = {
    install_info = {
        -- url = "~/code/other-projects/tree-sitter-hotshot",
        url = "<path-to-your-tree-sitter-folder>",
        files = { "src/parser.c" }
    },
    -- The Vim filetype registered below, not the file extension
    filetype = "hotshot"
}

-- Map the `.ht` extension to the `hotshot` filetype
vim.filetype.add({ extension = { ht = 'hotshot' } })

After loading the config, you can run this in Neovim.

:TSInstall hotshot

This installs the parser for the grammar.

It doesn’t install the tree-sitter queries; it only makes Neovim recognize the grammar. We will cover the queries later in the post.

tree-sitter queries

Tree-sitter has an excellent query syntax inspired by Scheme. This query syntax is a large part of why tree-sitter is so effective at its job: a parse tree is of no use until you do something with it.

Queries live in the queries directory at the root of your tree-sitter-<language> directory.

  • Inside this directory the highlights.scm file provides syntax highlights.
  • Other features like indentation/text-objects can be defined inside the queries directory.

For reference, here is how my directory is laid out.

.
├── bindings
│   ├── c
│   ├── .... (some other files)
│   └── swift
├── grammar.js
├── .... (some other files)
├── programs
│   ├── arithmetic.ht
│   ├── booleans.ht
│   ├── .... (some other files)
│   └── variables.ht
├── queries
│   └── highlights.scm    <== The `highlights.scm`
└── src
    ├── grammar.json
    ├── node-types.json
    ├── parser.c
    └── tree_sitter

11 directories, 31 files

In this highlights.scm you will write all the queries for highlighting.

To query you use the following syntax.

(node-type optional_name: (subquery-to-match) @tag)

For example, to match an integer and tag it with number, you can write:

;; In 'queries/highlights.scm'
(integer) @number

This matches every integer node produced by your tree-sitter grammar and tags it with @number.

But what are tags?


Tree-sitter tags

Tags (the tree-sitter documentation calls them captures) are names, either user-defined or predefined, attached to the nodes a query matches. They can later be used for any operation on the matched nodes.

In this case, they are the predefined syntax targets recognized by Neovim and most editors.

Neovim predefines many tags, but I haven’t found a single place where all of them are listed. The tags used here are taken from existing language queries, like the ones for Go.


A query can also match several targets at once. For example, this tags all the keywords.

;; In 'queries/highlights.scm'
[
 "let"
 "while"
 "fn"
 "cond"
 "lambda"
 ] @keyword

Other simple targets, like booleans and strings, are matched the same way.

;; In 'queries/highlights.scm'
(boolean) @constant
(integer) @number
(string) @string

For variables, Neovim provides three targets:

  • @type for type declarations
  • @function for function names
  • @property for classes etc.

For example, to match the name of a function in a function call, we use a subquery that matches the fname inside an fcall. The name: prefix is optional; it restricts the match to the child stored in the name field of the fcall.

;; In 'queries/highlights.scm'
(fcall name: (fname) @function)

The fname rule inside the fcall rule is matched and tagged with @function.

The other identifiers are matched with the appropriate tags in the same way.

;; In 'queries/highlights.scm'
(fcall name: (fname) @function)

(fdec name: (identifier) @function)

(lambda argument: (identifier) @type)

(fdec argument: (identifier) @type)

(let name: (identifier) @property)

Other targets like punctuation and comments also need to be matched. Here is the code for matching them.

;; In 'queries/highlights.scm'
(comments) @comment

(operator) @operator


")" @punctuation
"(" @punctuation
"{" @punctuation
"}" @punctuation

Those are all the queries you need for a small language like hotshot. Here’s the highlights.scm file in its entirety.

;; `queries/highlights.scm`.
(fcall name: (fname) @function)

(fdec name: (identifier) @function)

(lambda argument: (identifier) @type)

(fdec argument: (identifier) @type)

(let name: (identifier) @property)

(boolean) @constant
(integer) @number
(string) @string

[
 "let"
 "while"
 "fn"
 "cond"
 "lambda"
 ] @keyword

(comments) @comment

(operator) @operator


")" @punctuation
"(" @punctuation
"{" @punctuation
"}" @punctuation

Now’s the time to embed these into Neovim.

Integration with neovim

Neovim doesn’t install the queries when you install the tree-sitter grammar (at least for local installations). You need to manually symlink the queries directory.

# ln -s takes the existing directory first, then the link to create
# ln -s ~/code/other-projects/tree-sitter-hotshot/queries ~/.config/nvim/queries/hotshot
$ ln -s <your-queries-directory> ~/.config/nvim/queries/hotshot

You need to make the ~/.config/nvim/queries directory if it doesn’t exist already.
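The two steps can be sketched as one small shell helper. The link_queries function and its arguments are made up for this illustration (they are not part of Neovim or nvim-treesitter), and it assumes your Neovim config lives under ~/.config/nvim unless you pass another directory.

```shell
# Hypothetical helper: link a grammar's queries directory into Neovim's
# config so that Neovim can find highlights.scm for the language.
# Usage: link_queries <language> <grammar-dir> [nvim-config-dir]
link_queries() {
    lang="$1"
    nvim_dir="${3:-${XDG_CONFIG_HOME:-$HOME/.config}/nvim}"

    # Resolve the grammar directory to an absolute path.
    grammar_dir="$(CDPATH= cd -- "$2" 2>/dev/null && pwd)" || {
        echo "grammar directory not found: $2" >&2
        return 1
    }

    if [ ! -d "$grammar_dir/queries" ]; then
        echo "no queries directory in $grammar_dir" >&2
        return 1
    fi

    # Create <nvim-config>/queries if it does not exist yet.
    mkdir -p "$nvim_dir/queries"

    # Existing directory first, then the link to create.
    ln -sfn "$grammar_dir/queries" "$nvim_dir/queries/$lang"
}
```

With the layout from this post, the call would be link_queries hotshot ~/code/other-projects/tree-sitter-hotshot. The grammar path is made absolute first because a relative symlink target is resolved from the link's own directory, not from where you ran the command.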

Now the syntax highlighting should work.

tips and tricks

  • You can use tree-sitter highlight to test and highlight code while writing the queries.

    It requires the tree-sitter config to be initialized, but if you are developing for a complex language, it is useful.

  • Develop the queries incrementally: start with the simple rules and then move to the bigger ones.

    Once you know your editor understands the queries, you can be confident about complex queries.
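A minimal sketch of the first tip, assuming the tree-sitter CLI is installed, that you run it from the grammar's root directory, and that the CLI can map .ht files to your grammar through its own configuration. preview_highlight is a hypothetical wrapper name, not a CLI command.

```shell
# One-time setup mentioned in the tip: create the CLI's config file.
#   tree-sitter init-config

# Hypothetical wrapper: highlight the given files in the terminal,
# failing early with a clear message if the CLI is missing.
preview_highlight() {
    if ! command -v tree-sitter >/dev/null 2>&1; then
        echo "tree-sitter CLI not found on PATH" >&2
        return 127
    fi
    tree-sitter highlight "$@"
}

# Example, run from the grammar's root directory:
#   preview_highlight programs/arithmetic.ht
```

If the output comes back uncolored, the usual suspects are a capture name the CLI's theme does not know or a grammar the CLI could not associate with the file, so it is worth checking both before blaming the query.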

help

The documentation for tree-sitter is spotty and hard to understand. Here are some references where you might find further help.

  • In Neovim, :help treesitter-query lists the highlight groups to target (the @comment, @property tags)

  • The official documentation by tree-sitter, available here

  • The official tree-sitter grammars for Go, Rust and Python. These are a little hard to understand but might provide insights into real-world applications.