Syntax highlighting for Tree-sitter
If you haven’t read the previous post, you might want to check it out here.
This post explores how to write tree-sitter queries for a custom grammar and enable syntax highlighting within Neovim.
If you don’t know how to make a tree-sitter grammar or want a primer, click here.
This guide is focused on Neovim. You can still refer to it for syntax highlighting in other editors, but you will have to adapt the Neovim configuration part.
neovim configuration
You can set up Neovim to parse your local tree-sitter grammar by adding these lines to your config.
-- Somewhere in your config
local parser_config = require("nvim-treesitter.parsers").get_parser_configs()

parser_config.hotshot = {
  install_info = {
    -- url = "~/code/other-projects/tree-sitter-hotshot",
    url = "<path-to-your-tree-sitter-folder>",
    files = { "src/parser.c" },
  },
  filetype = "ht",
}

-- Map the `.ht` extension to the `hotshot` filetype
vim.filetype.add({ extension = { ht = 'hotshot' } })
After loading the config, you can run this in Neovim.
:TSInstall hotshot
This installs the parser for the corresponding grammar.
It doesn’t install the tree-sitter queries; it only makes Neovim recognize the grammar. We will cover the queries part later in the post.
tree-sitter queries
Tree-sitter has an excellent query syntax inspired by Scheme.
This query syntax is a big part of what makes tree-sitter so good at its job.
A parse tree is of little use until you do something with it.
You write the queries inside the queries directory inside your tree-sitter-<language> directory.
- Inside this directory, the highlights.scm file provides syntax highlights.
- Other features like indentation and text objects can also be defined inside the queries directory.
For reference, here is how my directory is laid out.
.
├── bindings
│   ├── c
│   ├── .... (some other files)
│   └── swift
├── grammar.js
├── .... (some other files)
├── programs
│   ├── arithmetic.ht
│   ├── booleans.ht
│   ├── .... (some other files)
│   └── variables.ht
├── queries
│   └── highlights.scm <== The `highlights.scm`
└── src
    ├── grammar.json
    ├── node-types.json
    ├── parser.c
    └── tree_sitter

11 directories, 31 files
In this highlights.scm you will write all the queries for highlighting.
To query, you use the following syntax.
(node-type optional_name: (subquery-to-match) @tag)
For example, to match an integer with the tag number, you write the integer node type in parentheses followed by @number.
This matches every integer rule within your tree-sitter grammar and tags it with @number.
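Going by the complete highlights.scm at the end of this post, that query is a single line:

```scheme
;; In 'queries/highlights.scm'
;; Tag every `integer` node in the parse tree as @number.
(integer) @number
```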
But what are tags?
Tree-sitter tags
Tags (tree-sitter’s own documentation calls them captures) are names, either user-defined or predefined, that mark whatever a query matches. They can later be used for any operation on the matching elements.
In this case, they are the predefined syntax targets that Neovim and most editors understand.
Neovim predefines many tags, but I haven’t found a single place where all of them are listed. The tags used here are taken from existing language queries, like the ones for Go.
A query can also contain multiple matching targets. For example, all the keywords are listed inside one pair of square brackets and tagged together.
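As a sketch taken from the complete highlights.scm at the end of this post, the keyword query looks roughly like this. The quoted strings match anonymous nodes, which assumes the grammar spells these keywords as string literals:

```scheme
;; In 'queries/highlights.scm'
;; A bracketed list is an alternation: a node matching any one of
;; these entries is tagged as @keyword.
[
  "let"
  "while"
  "fn"
  "cond"
  "lambda"
] @keyword
```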
Other simple targets, like booleans and strings, are matched the same way as integers.
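Again going by the complete highlights.scm at the end of this post, these take one line each:

```scheme
;; In 'queries/highlights.scm'
;; Booleans are tagged as constants, strings as strings.
(boolean) @constant
(string) @string
```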
For variables, Neovim provides 3 targets.
- @type for type declarations
- @function for function names
- @property for classes etc.
For example, to match the name of a function in a function call, we need a subquery that matches the fname inside a fcall.
We can optionally prefix the subquery with a field name, here name.
The fname rule inside the fcall rule is matched and tagged with @function.
The other identifiers are matched with appropriate tags in the same way.
;; In 'queries/highlights.scm'
(fcall name: (fname) @function)

(fdec name: (identifier) @function)

(lambda argument: (identifier) @type)

(fdec argument: (identifier) @type)

(let name: (identifier) @property)
Other targets like punctuation and comments also need to be matched.
Here is the code for matching them.
;; In 'queries/highlights.scm'
(comments) @comment

(operator) @operator


")" @punctuation
"(" @punctuation
"{" @punctuation
"}" @punctuation
These are all the queries you need for a small language like hotshot. Here’s the highlights.scm file in its entirety.
;; `queries/highlights.scm`.
(fcall name: (fname) @function)

(fdec name: (identifier) @function)

(lambda argument: (identifier) @type)

(fdec argument: (identifier) @type)

(let name: (identifier) @property)

(boolean) @constant
(integer) @number
(string) @string

[
  "let"
  "while"
  "fn"
  "cond"
  "lambda"
] @keyword

(comments) @comment

(operator) @operator


")" @punctuation
"(" @punctuation
"{" @punctuation
"}" @punctuation
Now it’s time to hook these into Neovim.
Integration with neovim
Neovim doesn’t install the queries when you install the tree-sitter grammar (at least for a local installation).
You need to manually symlink the queries directory.
# `ln -s` takes the existing directory first, then the name of the link to create
# ln -s ~/code/other-projects/tree-sitter-hotshot/queries ~/.config/nvim/queries/hotshot
$ ln -s <your-queries-directory> ~/.config/nvim/queries/hotshot
You need to create the ~/.config/nvim/queries directory if it doesn’t already exist.
Now the syntax highlighting should work.
tips and tricks
- You can use tree-sitter highlight to test and highlight code while writing the queries. It requires the tree-sitter config to be initialized, but if you are developing for a complex language, it is useful.
- Develop the queries incrementally: start with the simple rules and then move to the bigger ones. Once you know your editor understands the queries, you can be confident about complex queries.
help
The documentation for tree-sitter is spotty and hard to understand. Here are some references where you might find further help.
- In Neovim, :help treesitter-query lists the highlight groups to target (the @comment, @property tags).
- The official documentation by tree-sitter, available here.
- The official tree-sitter grammars for Go, Rust and Python. These are a little hard to understand but might provide insights into real-world applications.