Config spec feature creep
Configuration starts simple: a few keys and values. Then you want to group values and nest them, and somebody asks for simple expressions, conditionals, variables …
Any sufficiently complicated configuration system contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a programming language.
Greenspun’s Tenth Rule, applied to config
You gradually end up with a programming language that was never designed as one. YAML got templating layered on top of it, Nginx added if, and Terraform invented HCL.
Helm is a pretty bad case: two languages (YAML and Go templates), two escaping contexts, indentation rules that sometimes conflict, and a preprocessor step.
{{- if eq .Values.env "production" }}
- path: /admin
  backend: admin-svc
{{- end }}
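HCL avoids much of Helm’s complexity and duality, but it still exposes a fixed set of language constructs to your config, which may be too many or too few. For example, there is no if block, so a conditional block has to be written as a dynamic block whose for_each list has either one element or none: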
dynamic "route" {
  for_each = var.env == "production" ? [1] : []
  content { path = "/admin" }
}
Bring a full language in
Another method is to just embed a scripting language such as Lua or Python. Now the config can do everything, and then some.
if os.getenv("ENV") == "production" then
  add_route("/admin", admin_handler)
end
A full language usually also means os.execute(), io.open(), and require(). We can remove os, io, and require before handing it untrusted code, but then we are just blacklisting.
Blacklisting prevents known dangers, but there are always unknown unknowns, and you can’t prevent what you don’t know you should.
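To make the difference concrete, here is a minimal Go sketch of both models applied to a table of host functions. Everything in it is hypothetical and only for illustration: the capability type, the function names, and especially debug.hook, which stands in for the dangerous feature nobody remembered to ban.

```go
package main

import "fmt"

// capability is a hypothetical stand-in for a function the host exposes to config code.
type capability func(arg string) string

// fullLanguage returns everything an imaginary embedded language ships with.
func fullLanguage() map[string]capability {
	return map[string]capability{
		"upper":      func(s string) string { return "upper(" + s + ")" },
		"os.execute": func(s string) string { return "ran " + s },
		"io.open":    func(s string) string { return "opened " + s },
		"debug.hook": func(s string) string { return "hooked " + s }, // the one nobody thought about
	}
}

// denylist starts from everything and removes the names we know are dangerous.
func denylist(all map[string]capability, banned ...string) map[string]capability {
	out := make(map[string]capability, len(all))
	for name, fn := range all {
		out[name] = fn
	}
	for _, name := range banned {
		delete(out, name)
	}
	return out
}

// allowlist starts from nothing and copies over only the names we grant.
func allowlist(all map[string]capability, granted ...string) map[string]capability {
	out := make(map[string]capability, len(granted))
	for _, name := range granted {
		if fn, ok := all[name]; ok {
			out[name] = fn
		}
	}
	return out
}

func main() {
	all := fullLanguage()
	denied := denylist(all, "os.execute", "io.open")
	allowed := allowlist(all, "upper")

	_, leaked := denied["debug.hook"]
	_, reachable := allowed["debug.hook"]
	fmt.Println("denylist still exposes debug.hook:", leaked) // true
	fmt.Println("allowlist exposes debug.hook:", reachable)   // false
}
```

The denylist is only as good as our knowledge of what the language contains. The allowlist fails closed: anything we forgot about simply isn't there.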
Whitelist with Rye
Quick preview
Rye takes the opposite approach: start with no language features at all, then explicitly whitelist capabilities.
Here’s a quick preview:
// Go side: grant exactly two operations
evaldo.RegisterBuiltinsFilter(ps, []string{"_++", "os/cwd?"})
On the Go side, we register just two built-in functions: _++ and os/cwd? (cwd? is the current-working-directory built-in defined inside the os context).
; Config side: use them
docs: os/cwd? ++ "/docs"
This is the entire vocabulary. Everything else was never given, so no other word is defined. If you try to read a file:
Read %my-secrets
; Error: Word `Read` not found
This isn’t really the place for it, but if it bothers you, check the links to see why there is an underscore prefix when referencing ++ [1] and why Read is capitalized [2].
What is Rye
Rye is a general-purpose language written in pure Go (no CGO); you can also import it as a Go library.
Rye is a homoiconic language and every active word is just a function. Every active word is added at the library level. There is no if, fn, or loop behaviour hardcoded into the evaluator.
So the language on its own can only load source into blocks of Rye values and assign values to words, because assignment (through different word types like set-words and mod-words) is part of its syntax. Nothing else is built in.
What about Starlark
Starlark was built for exactly this. It’s a mature solution and it brings a lot to the table. We are still talking about concepts here, and these are the differences. Starlark gives you if, for, and def unconditionally, and you can’t take them away, while Rye has no reserved forms: words like if are just functions you choose to register. Starlark’s modules are more all-or-nothing, while Rye lets you grant _+ but not _*. Starlark is much more mature but a niche language; Rye is still in development, but it strives to be a general-purpose language.
Example: Markdown serving web-server
We will make a Go web server that reads markdown, converts it to HTML, and serves it over HTTP. Rye is used for the config file.
Step 1 - The minimal server (~50 lines + validation)
Two dependencies: goldmark for markdown rendering and rye for the config.
package main

import (
	"fmt"
	"html/template"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"

	"github.com/refaktor/rye/env"
	"github.com/refaktor/rye/evaldo"
	"github.com/refaktor/rye/loader"
	"github.com/yuin/goldmark"
)

func safeMarkdownPath(baseDir, slug string) (string, error) {
	/* full code: https://github.com/refaktor/rye/blob/main/examples/whitelist-config-with-rye/step1-minimal/main.go */
}

func main() {
	raw, err := os.ReadFile("config.rye")
	if err != nil {
		log.Fatalf("failed to read config: %v", err)
	}

	ps := env.NewProgramState()
	blk := loader.LoadString(string(raw), false, ps)
	// Parse errors
	if errorObj, ok := blk.(env.Error); ok {
		log.Fatalf("parse error: %s", errorObj.Message)
	}

	evaldo.EvalBlock(ps, blk.(env.Block))
	// Runtime errors
	if ps.ErrorFlag {
		log.Fatalf("runtime error: %s", ps.Res.Print(*ps.Idx))
	}

	// Read the configured values, falling back to defaults
	port := ps.Ctx.GetStringOr("port", ps.Idx, "8080")
	dir := ps.Ctx.GetStringOr("docs-dir", ps.Idx, "docs")

	tpl := template.Must(template.New("").Parse(
		`<html><body>{{.}}</body></html>`))

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		slug := strings.TrimPrefix(r.URL.Path, "/")
		path, err := safeMarkdownPath(dir, slug)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		md, err := os.ReadFile(path)
		if err != nil {
			http.NotFound(w, r)
			return
		}
		var buf strings.Builder
		goldmark.Convert(md, &buf)
		tpl.Execute(w, template.HTML(buf.String()))
	})

	fmt.Printf("Serving on port %s\n", port)
	// ListenAndServe only returns on failure (e.g. an invalid or busy port)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
And the config:
port: "3000"
docs-dir: "content"
The config looks like YAML, but it’s normal Rye code. Rye doesn’t care about indentation or newlines, but it does require whitespace between all tokens (parentheses included).
There is no builtin registration call in the Go code. That’s deliberate. Without registering any builtins, the evaluator has nothing to call.
The config file is just data notation: bind a value to a word with :, and that’s all.
No functions, conditions, arithmetic. The person writing config.rye can declare values, and nothing else.
This is the zero capability baseline. We add capability one explicit registration at a time.
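The listing above leaves out the body of safeMarkdownPath and points to the repository instead. The following is not that code. It is only a sketch of the kind of check such a helper might do, under a few assumptions of mine: slugs use forward slashes, files end in .md, and an empty slug maps to index.md. The name safeMarkdownPathSketch is made up to keep the two apart.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errBadSlug = errors.New("invalid path")

// safeMarkdownPathSketch maps a URL slug to a .md file under baseDir and
// rejects anything that could point outside of it.
func safeMarkdownPathSketch(baseDir, slug string) (string, error) {
	if slug == "" {
		slug = "index" // assumption: the root URL serves index.md
	}
	// Backslashes and NUL bytes have no business in a URL slug.
	if strings.ContainsAny(slug, "\\\x00") {
		return "", errBadSlug
	}
	// Reject empty, "." and ".." segments instead of trying to normalize them.
	for _, seg := range strings.Split(slug, "/") {
		if seg == "" || seg == "." || seg == ".." {
			return "", errBadSlug
		}
	}
	return filepath.Join(baseDir, filepath.FromSlash(slug)+".md"), nil
}

func main() {
	for _, slug := range []string{"guide/intro", "", "../secrets", "a//b"} {
		p, err := safeMarkdownPathSketch("content", slug)
		fmt.Printf("%q -> %q, %v\n", slug, p, err)
	}
}
```

Rejecting suspicious segments outright is stricter than cleaning the path, but it keeps the rule easy to reason about. Note that a check like this does not follow symlinks, so a symlink inside the docs directory could still lead elsewhere.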
Step 2 - Basic computation (+2 lines of Go)
evaldo.RegisterBuiltinsFilter(ps, []string{"_*", "_+"})
Now the config can compute derived values:
port: "3000"
docs-dir: "content"
cache-max-age: 60 * 60 * 24 ; one day in seconds
max-body-kb: 10 * 1024 ; 10 MB expressed in kB
Two words registered. Two operations available, nothing else is there.
Step 3 - Reading from the environment (+6 lines of Go)
This time we create and register a custom built-in, get-env, which returns false when a variable isn’t set. We also register Rye’s standard any combinator.
ps.RegisterBuiltin("get-env", 1, "get-env key",
	func(ps *env.ProgramState, a0, a1, a2, a3, a4 env.Object) env.Object {
		// os.Getenv returns "" for both unset and empty variables,
		// so an empty value is treated the same as a missing one.
		if v := os.Getenv(a0.(env.String).Value); v != "" {
			return *env.NewString(v)
		}
		return *env.NewBoolean(false)
	})
evaldo.RegisterBuiltinsFilter(ps, []string{"any"})
any evaluates expressions in a block and returns the first result that is not false.
port: any { get-env "PORT" "3000" }
docs-dir: any { get-env "DOCS_DIR" "content" }
cache-max-age: 60 * 60 * 24
max-body-kb: 10 * 1024
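If you think in Go, the fallback chain any { get-env "PORT" "3000" } behaves roughly like the helper below. This is a plain-Go analogy and not how Rye implements any; firstSet is a name I made up, and the empty string plays the role that false plays in the Rye version.

```go
package main

import (
	"fmt"
	"os"
)

// firstSet returns the first non-empty candidate, mirroring how the config
// picks the first value that is not false. The bool reports whether any
// candidate qualified.
func firstSet(candidates ...string) (string, bool) {
	for _, c := range candidates {
		if c != "" {
			return c, true
		}
	}
	return "", false
}

func main() {
	// Same shape as the config line: try the environment, then the default.
	port, _ := firstSet(os.Getenv("PORT"), "3000")
	docsDir, _ := firstSet(os.Getenv("DOCS_DIR"), "content")
	fmt.Println("port:", port)
	fmt.Println("docs-dir:", docsDir)
}
```

Like the get-env built-in, this treats a variable that is set to an empty string the same as one that is missing. If that distinction matters, os.LookupEnv reports it separately.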
Step 4 - The config registers its own HTTP routes (+14 lines of Go)
So far the config has only produced values. Now it starts actively configuring the server. We use a Go map and register a custom built-in that adds to the map.
routes := map[string]string{}
ps.RegisterBuiltin("route", 2, "Defines a route",
	func(ps *env.ProgramState, a0, a1, a2, a3, a4 env.Object) env.Object {
		routes[a0.(env.String).Value] = a1.(env.String).Value
		return env.Void{}
	})
Then after evaluating the config, we wire up the collected routes:
for prefix, dir := range routes {
	dir := dir
	http.Handle(prefix+"/", http.StripPrefix(prefix,
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			slug := strings.TrimPrefix(r.URL.Path, "/")
			path, err := safeMarkdownPath(dir, slug)
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			md, err := os.ReadFile(path)
			if err != nil {
				http.NotFound(w, r)
				return
			}
			var buf strings.Builder
			goldmark.Convert(md, &buf)
			tpl.Execute(w, template.HTML(buf.String()))
		})))
}
We also register Rye’s if and _= so the config can have conditional routes:
evaldo.RegisterBuiltinsFilter(ps, []string{"if", "_="})
Now this is possible:
port: any { get-env "PORT" "3000" }
docs-dir: "content"
cache-max-age: 60 * 60 * 24
route "/blog" "posts"
route "/docs" "docs"
if ( get-env "DEBUG" ) = "1" {
    route "/drafts" "drafts"
}
The /drafts route only exists when DEBUG=1. The config author is making runtime decisions about server structure using Rye’s if function.
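To see what the prefix wiring does to a request path, here is a small standalone sketch that uses only the standard library. It builds a mux the same way as the loop above (pattern prefix+"/", wrapped in http.StripPrefix), but the handler just echoes which file it would look for. newMux and get are helper names I invented for the sketch, and it uses its own ServeMux instead of the default one so it can be run in isolation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newMux registers each route as a subtree pattern and strips the prefix,
// so the handler only sees the part of the path after it.
func newMux(routes map[string]string) *http.ServeMux {
	mux := http.NewServeMux()
	for prefix, dir := range routes {
		dir := dir
		mux.Handle(prefix+"/", http.StripPrefix(prefix,
			http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
				slug := strings.TrimPrefix(r.URL.Path, "/")
				// Echo the lookup instead of reading a file.
				fmt.Fprintf(w, "%s/%s.md", dir, slug)
			})))
	}
	return mux
}

// get sends an in-memory GET request and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux(map[string]string{"/blog": "posts", "/docs": "docs"})
	for _, target := range []string{"/blog/hello", "/docs/setup", "/drafts/wip"} {
		code, body := get(mux, target)
		fmt.Printf("%s -> %d %q\n", target, code, body)
	}
}
```

A request for /blog/hello arrives at the handler as /hello and resolves to posts/hello.md, while /drafts/wip gets a 404 because no such route was registered. That is the behaviour the conditional route in the config switches on and off.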
Step 5 - The config injects logic into request handling (+6 lines of Go)
Up to now the config ran once at startup. Now it defines a function that the Go runtime will call on every request. For this we register fn, a function that creates functions.
evaldo.RegisterBuiltinsFilter(ps, []string{"fn", "replace", "capitalize", "str"})
In the HTTP handler:
title := slug
if fn, ok := ps.Ctx.GetFunction("page-title", ps.Idx); ok {
	evaldo.CallFunctionArgsN(fn, ps, ps.Ctx, *env.NewString(slug))
	if s, ok := ps.Res.(env.String); ok {
		title = s.Value
	}
}
ps.Ctx.GetFunction looks up a word in the config’s context. evaldo.CallFunctionArgsN invokes it with the program state, so the function sees the context it was defined in. The result comes back in ps.Res, which the handler type-asserts to env.String; if the function is missing or returns something else, the title stays equal to the slug.
page-title: fn { slug } {
    slug .replace "-" " " |capitalize
}
So our config now has custom functions that the parent Go app uses.
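One thing to keep in mind: Go's net/http serves each request in its own goroutine, and the snippet above passes the result back through the shared ps.Res field. Two requests calling page-title at the same time could overwrite each other's result. Whether Rye offers its own mechanism for this is not covered here, so the sketch below assumes the simplest fix, a mutex around the call. The evaluator type is a hypothetical stand-in for the shared program state, not Rye's API.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// evaluator is a hypothetical stand-in for shared interpreter state.
// Like ps in the handler, it hands results back through a field.
type evaluator struct {
	res string
}

// call plays the role of invoking the config's page-title function.
func (e *evaluator) call(slug string) {
	e.res = strings.ReplaceAll(slug, "-", " ")
}

// guarded serializes every call into the evaluator.
type guarded struct {
	mu sync.Mutex
	ev *evaluator
}

// pageTitle runs the callback and reads the result while still holding the
// lock, so another goroutine cannot overwrite it in between.
func (g *guarded) pageTitle(slug string) string {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.ev.call(slug)
	return g.ev.res
}

func main() {
	g := &guarded{ev: &evaluator{}}
	titles := make([]string, 100)
	var wg sync.WaitGroup
	for i := range titles {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			titles[i] = g.pageTitle(fmt.Sprintf("page-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(titles[0], "|", titles[99]) // page 0 | page 99
}
```

The lock makes config callbacks run one at a time, which is fine for something as cheap as a title function. For heavier callbacks you would want to measure before settling on this.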
Step 6 - Debugging the config live (+2 lines of Go)
Config files aren’t usually known for good debugging. If we are lucky we can print or log some value. Rye has a helpful function probe and even a console (REPL), so we can do better.
evaldo.RegisterBuiltinsFilter(ps, []string{"probe", "enter-console"})
// Execution limits for safety
ps.MaxCallDepth = 50
ps.MaxOps = 10_000
probe prints a value (with type information) and passes it through:
port: probe any { get-env "PORT" "3000" }
; prints [String 3000] when the env variable isn't set
enter-console is the useful one. Your process drops into a live REPL with the full current context. If your config is failing in production, you don’t just get a log line, you get a terminal:
[enter-console: after-routes]
> lc ; list context
port [String 3000]
docs-dir [String content]
route [Builtin(2): Defines a route]
page-title [Function(1)]
> probe port
[String 3000]
> port:: "8080"
> [Ctrl-c]
Changes made at the prompt take effect when you exit. The rest of the config continues evaluating with the modified context. Remove the enter-console line when done.
What we ended up with
| Step | Go lines | Config capability |
|---|---|---|
| 1 | ~80 | Static values - pure data notation |
| 2 | +2 | Arithmetic, derived values |
| 3 | +6 | Env vars with any { } fallbacks |
| 4 | +14 | Routes, if, conditional logic |
| 5 | +6 | User-defined fn callbacks into request handling |
| 6 | +2 | probe + live REPL debugging |
A note on execution limits
Beyond controlling which words exist, you can also cap how much work the evaluator is allowed to do:
ps.MaxCallDepth = 50 // stop if recursion exceeds 50 frames
ps.MaxOps = 10_000 // stop after 10k expression evaluations
Both default to zero (unlimited). Set them before EvalBlock, and any runaway config, such as infinite recursion, returns an error instead of spinning forever.
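How Rye enforces these limits internally is not shown in this article. As an illustration of the idea only, here is a toy Go version: a counter that every evaluation step has to pass through, and a depth check on every call. All names in it (limits, step, enter, leave, countdown) are invented for the sketch; only the zero-means-unlimited convention is taken from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errBudget = errors.New("operation budget exhausted")
	errDepth  = errors.New("call depth exceeded")
)

// limits tracks how much work has been done. Zero means unlimited.
type limits struct {
	maxOps, maxDepth int
	ops, depth       int
}

// step accounts for one evaluated expression.
func (l *limits) step() error {
	l.ops++
	if l.maxOps > 0 && l.ops > l.maxOps {
		return errBudget
	}
	return nil
}

// enter accounts for one function call; leave undoes it on return.
func (l *limits) enter() error {
	if l.maxDepth > 0 && l.depth >= l.maxDepth {
		return errDepth
	}
	l.depth++
	return nil
}

func (l *limits) leave() { l.depth-- }

// countdown is a toy recursive "config function" that calls itself n times.
func countdown(l *limits, n int) error {
	if err := l.enter(); err != nil {
		return err
	}
	defer l.leave()
	if err := l.step(); err != nil {
		return err
	}
	if n == 0 {
		return nil
	}
	return countdown(l, n-1)
}

func main() {
	l := &limits{maxOps: 10_000, maxDepth: 50}
	fmt.Println(countdown(l, 10))   // <nil>
	fmt.Println(countdown(l, 1000)) // call depth exceeded

	flat := &limits{maxOps: 100} // no depth limit, small op budget
	fmt.Println(countdown(flat, 1000)) // operation budget exhausted
}
```

The important property is that the limits live in the host, outside the config's reach: the config author has no word that could raise them.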
Recap
The blacklist model (embed Lua, then remove os and io) is risky by construction, because you only block what you already know about. In our Rye examples we started from zero, with just INI-level config, and kept adding only what we decided to add. By step 5 the config author can even define functions that run live during request handling.
I’m not saying to dump all other approaches and go with Rye. It’s still a work-in-progress language, but I hope I’ve shown that the concept could be a good one, and I hope I’ve documented how to embed Rye into Go apps, for config or more.
Warning: This is capability control and an exploration, not a security sandbox or something you should go and use today. Rye code still runs inside your process. If you expose unsafe builtins, the config gains those capabilities.
You can find all examples in full, ready to be executed, in Rye’s examples folder: examples/whitelist-config-with-rye