.TH SELF 7
.SH NAME
design-for-exploitation \- Designing for Exploitation: How Meta-Programming Leads to Safer Code
.SH AUTHOR
Artyom Bologov
.SH SYNOPSIS
You should use meta-programming abilities of your technology as much as possible, and you should allow your users to influence your program by exposing programming languages to them.
.SH TEXT
.P
I've just finished watching a wonderful
.UR https://youtu.be/qAiqKHG6uYM?t=19950
Weird Machines: Exploiting Turing-Completeness
.UE
talk by Pedro Castilho.
He was talking about accidental Turing-completeness
and how computer technology is weird machines all the way down.
This talk had security and program design takeaways
that every programmer should remember. I'll cite just two of these:
.P
* Every program should have the least power possible in order to execute its function.
.P
* [...] consider input for any program really as being a [...] programming language.
.P
.P
These sound pretty obvious, and yet every day there are dozens of RCEs revealed
in different software products. Why?
Because it's not rewarding to follow low-priority security department commandments,
and it's just too easy to copy the code from StackOverflow
or use JS eval() to read a number from user input.
.P
How does one fight it, then, if not with more security department commandments?
With mindsets and tools.
Pedro Castilho says to use the least power possible
and to treat input as untrusted code? I say:
.P
* You should use meta-programming abilities of your technology
(I mostly mean programming languages here) as much as possible, and
.P
* You should allow your users to influence your program
by exposing programming languages to them.
.P
.P
But wait, this is like the two recommendations above,
but inverted to make them meaninglessly dangerous, right?
Yes... And no.
Buckle up, I'm going to explain why
designing for exploitation leads to safer code.
.P
Disclaimer: the meaning of meta-programming used in this post
is not the typical one seen in communities like Rust and C++
(a way to abstract repetitive/generic code using a fancy built-in preprocessor),
and not even the one commonly used in the Lisp community
(a way to generate/alter the code running in the image, using the code itself).
.P
The meaning of meta-programming that you have to bear with
for the duration of this post is:
the ability of the programming language—and a program written in it—to
access the parsing, evaluation, compilation,
and other infrastructural functions that the programming language itself
uses internally.
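.P
To make this definition concrete, here is a minimal sketch in Common Lisp,
the language used in the rest of this post.
It relies only on standard functions:
\fBread-from-string\fP for parsing, \fBcompile\fP for compilation,
and \fBeval\fP for evaluation.
The input strings and the variable names are made up for illustration.
.P
.in +4n
.EX
;; Parsing: turn text into a data structure, without running anything.
(defparameter *form* (read-from-string "(lambda (x) (* x x))"))

;; Compilation: hand the parsed form to the compiler the language itself uses.
(defparameter *square* (compile nil *form*))

;; Evaluation: run a parsed form in the current environment.
(eval (read-from-string "(+ 1 2)")) ; returns 3
(funcall *square* 7)                ; returns 49
.EE
.in
.P
All three steps are ordinary function calls,
which is what makes them available as building blocks (and as things to restrict).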
.SH (read), eval(), #.(e(xp(l(o))it))!
.P
First things first: you should use meta-programming abilities of your technology as much as possible.
.P
There's a mantra in the JavaScript community:
.UR "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/eval#never_use_eval!"
never use eval()
.UE
[...]
.P
.in +4n
.EX
[...]html (symbol-value variable))))))))
.EE
.in
.P
— A redacted definition of describe-variable command
.P
While it may be non-obvious, we are actually passing raw Lisp symbols to this function
and using \fBsymbol-value\fP (a function that finds the symbol value in the current environment by name),
processing its result and injecting the corresponding HTML into the page.
Is this possibly exploitable?
Well, maybe ¯\e_(ツ)_/¯ Can you come up with a way to exploit these URLs?
I believe in you.
.P
But before you get to filing dozens of CVEs
for every internal page in Nyxt, I'll spice it up:
we use Lisp compiler-native code parsing facilities
(namely, \fBread\fP and \fBread-from-string\fP)
on the internal page URLs.
If you know a thing or two about Common Lisp,
you'll immediately scream in terror, because you remember
.P
.in +4n
.EX
#.(this '(sharp-period syntax)
(that allows evaluating
#|arbitrary code|# #(at read time)))
.EE
.in
.P
— Read-time evaluation syntax in Common Lisp
.P
But you also hold yourself together and refrain from screaming out loud,
because you remember that there's a small code snippet
(and a \fBsafe-read-from-string\fP function
abstracting it in the de-facto standard UIOP library)
that saves this seemingly lost cause:
.P
.in +4n
.EX
(let ((*read-eval* nil))
(read-from-string
"#.(this '(sharp-period syntax)
(that allows evaluating
#|arbitrary code|# #(at read time)))"))
;; Evaluation aborted on #
.EE
.in
.P
— Safe reading from untrusted input
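.P
The same idea can be sketched in a reusable form.
The function name \fBread-untrusted\fP is made up for this example,
and this is neither Nyxt's nor UIOP's actual code:
it binds \fB*read-eval*\fP to NIL
and turns the resulting reader error into an ordinary return value.
.P
.in +4n
.EX
(defun read-untrusted (string)
  "Parse one Lisp form from STRING with read-time evaluation disabled.
Return the form, or NIL and the signaled condition if the input is rejected."
  (handler-case
      ;; WITH-STANDARD-IO-SYNTAX resets the reader state (and sets
      ;; *READ-EVAL* back to T), so the NIL binding has to go inside it.
      (with-standard-io-syntax
        (let ((*read-eval* nil))
          (values (read-from-string string) nil)))
    (error (condition)
      (values nil condition))))

(read-untrusted "(1 2 3)")   ; returns (1 2 3) and NIL
(read-untrusted "#.(+ 1 2)") ; returns NIL and a READER-ERROR condition
.EE
.in
.P
Reading is still not entirely free of effects (it interns symbols, for one),
so this sketch only closes the read-time evaluation hole.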
.P
This CL developer's train of thought is an example of the effect
that using meta-programming has on a person:
you are aware of the power your language has, the dangers it poses,
and the ways to mitigate them.
It's not the opaque JS
\fBeval()\fP that provides you with unlimited power.
Common Lisp meta-programming allows you to limit the power it has.
.P
Meta-programming, from this perspective,
is a tool quite like loggers and database daemons,
and it can be integrated into your design documents just as loggers and databases can.
.P
And that's what I meant by using meta-programming abilities of your technology:
use the things provided by your language,
if it's at least a bit more self-aware than JavaScript.
You'll have lots of problems solved by the compiler / interpreter / transpiler
without the need to prove
.UR "https://en.wikipedia.org/wiki/Greenspun's_tenth_rule"
Greenspun's Tenth Rule
.UE
yet again,
and your code will be safe from accidental eval() injections 😃
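.P
The "read a number from user input" case from the introduction
can be sketched along the same lines.
This is an illustration under my own assumptions:
\fBread-number\fP is a hypothetical name,
and accepting any Lisp number (integers, ratios, floats) is an arbitrary choice.
.P
.in +4n
.EX
(defun read-number (string)
  "Return the number written at the start of STRING, or NIL for anything else."
  (let ((form (ignore-errors
               (with-standard-io-syntax
                 ;; No read-time evaluation: the reader only builds data.
                 (let ((*read-eval* nil))
                   (read-from-string string))))))
    ;; The parsed form is never evaluated, only checked for being a literal.
    (and (numberp form) form)))

(read-number "42")                ; returns 42
(read-number "(launch-missiles)") ; returns NIL
(read-number "#.(+ 1 2)")         ; returns NIL
.EE
.in
.P
The language's own parser does the work,
and the only power left to the input is to spell a number.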
.P
I am aware that there are many languages
that don't have meta-programming facilities exposed to the programmer.
And I have no particular suggestions for programmers using those.
Except maybe moving to Lisp and enjoying the programmer-friendly
and secure meta-languages you can build in less than an hour.
.SH Embeddable Weird Machines
.P
Now that we've established that language implementations are our best friends
providing us with power and structure,
how about making things Turing-complete by design? I mean it.
.P
You should allow your users to change your program by exposing programming languages to them.
.P
This point may make you uneasy,
because the immediate thought would be
"so I have to write a parser, analyzer, compiler and an environment for it..."
No, you don't have to!
Remember the previous point:
use the tools provided by your programming language and expose those to your users.
.P
Here, by exposing programming languages,
I don't necessarily mean making users write code.
I rather mean exposing a Turing-complete way to customize your system.
Be it Scratch-like code blocks, Notion Relations,
or HTML templates from MySpace: all of them
expose an (almost) Turing-complete language to their users,
letting users shape their experience within the allocated boundaries,
with the result compiled to a programming-language-ish form.
.P
Again, an example from Nyxt.
When I stumbled across the problem of managing per-website user configuration,
my initial idea was inspired by ad-blocking host lists:
simply store a match pattern for a website,
and store all the settings associated with it on the same line:
.P
.in +4n
.EX
((match-domain "paulgraham.com") :excluded (:force-https-mode :reading-line-mode))
.EE
.in
.P
— Nyxt auto-rules format example
.P
Then, when I implemented the first prototype of this feature
(called auto-mode and destined to be renamed to auto-rules
in Nyxt 3.* after a major refactoring we've done),
I realized that \fB(match-domain "paulgraham.com")\fP
actually looks a lot like a function call.
Why not implement it as a function call
required to return a boolean value,
and allow users to write arbitrary code in there?
It was a win-win decision both for extensibility and security, because
.P
* Using raw function calls allows arbitrarily
fine-grained website settings,
even beyond what we could have anticipated in Nyxt core.
.P
* I've avoided writing incomplete parsers for an arbitrary rule syntax,
avoiding the dangers those pose.
.P
* The list of rules is stored on the user's filesystem
and is only likely to do harm when one willingly modifies it to do harm.
In a sense, we're relying on the OS mechanisms,
instead of re-implementing those ourselves.
Now, given the restriction of the thing to the filesystem,
.P
* The condition-then-modes structure of these rules
is restrictive enough to make this feature
both easy to learn and hard to exploit.
.P
.P
With these design decisions, one can now write rules like:
.P
.in +4n
.EX
((lambda (url) (string= (quri.uri:uri-path url) "/index")) :excluded (:dark-mode))
.EE
.in
.P
— Use of arbitrary lambda in place of auto-rule match pattern
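.P
How such rules could be applied can be sketched in a few lines.
This is not Nyxt's implementation:
\fBrule-predicate\fP, \fBexcluded-modes\fP and \fB*rules*\fP are hypothetical names,
\fBmatch-domain\fP here is a naive suffix check standing in for the real matcher,
and the conditions receive a plain host string instead of a URL object
to keep the example free of dependencies.
.P
.in +4n
.EX
;; Hypothetical matcher: a naive suffix check, not Nyxt's own MATCH-DOMAIN.
(defun match-domain (domain)
  "Return a predicate that is true for host strings ending in DOMAIN."
  (lambda (host)
    (let ((start (- (length host) (length domain))))
      (and (>= start 0)
           (string-equal domain host :start2 start)))))

(defun rule-predicate (rule)
  "Turn the condition at the head of RULE into a one-argument function.
The rule comes from the user's own file, so EVAL is a deliberate choice."
  (let ((condition (first rule)))
    (if (functionp condition)
        condition
        (eval condition))))

(defun excluded-modes (rules host)
  "Collect the :excluded modes of every rule whose condition accepts HOST."
  (loop for rule in rules
        when (funcall (rule-predicate rule) host)
          append (getf (rest rule) :excluded)))

(defparameter *rules*
  (list '((match-domain "paulgraham.com") :excluded (:reading-line-mode))
        '((lambda (host) (string= host "localhost")) :excluded (:dark-mode))))

(excluded-modes *rules* "www.paulgraham.com") ; returns (:READING-LINE-MODE)
(excluded-modes *rules* "localhost")          ; returns (:DARK-MODE)
.EE
.in
.P
Both the named matcher and the raw lambda go through the same \fBeval\fP call,
so the rule format needs no parser of its own;
all a rule can yield is a yes or no for a given page, plus a list of modes.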
.P
This overpowered feature is used by a reasonable fraction of the community
and there were
.UR https://discourse.atlas.engineer/t/tor-v3-onion-address-auto-mode-rule/395
mentions
.UE
and
.UR https://discourse.atlas.engineer/t/disable-javascript-by-default/509
suggestions
.UE
on how to improve it further on our forum,
which may be the best marker of success for such a niche feature.
.P
Auto-mode, being a restricted yet still Turing-complete programming sub-system of Nyxt, is not a vulnerability. It's consciously designed as a restricted (in this case, by the filesystem and the rule structure) meta-programming tool. If you consciously design a feature to be Turing-complete, you make yourself aware of the abilities it has and restrict them.
.P
Designing for Exploitation, you avoid discovering that your tool is more Turing-complete than you anticipated after thousands of your users are already pwned...
.SH Takeaways
.P
If you only have one thing to take away from this post, take this one:
using meta-programming tools for everyday tasks
makes these tasks both easier to complete and more secure,
because of the psychological and technical restrictions
one puts on the code and themselves when being aware of
and using a sufficiently good meta-programming system.
.P
Designing for Exploitation, you stay aware of the huge security risks
that accidental Turing-completeness poses.
Designing for Exploitation,
you shape your software in such a way as to leverage
the most powerful yet domain-specific parts of your programming language.
.P
You don't simply put eval() everywhere: you
also wrap it in a *read-eval* binding
to only allow literal values to be parsed.
Or, if it's not only literal values,
you restrict the effects of this pet Turing machine to a single file,
website, or window.
This is the thing that unites my suggestions with Pedro Castilho's theses cited above:
.P
* By using the meta-programming facilities your programming language has,
you're both using and restricting its most exploitable part:
the meta-programming facilities themselves.
.P
* By allowing the users to have a programming language
as their input/configuration, you save yourself from the realization
that their input already is a sloppily evaluated programming language!
.P
.P
Now that you understand how embracing meta-programming
can actually make your code safer,
go out there and make some software that's Designed for Exploitation ;)
.SH Acknowledgements
.P
This post wouldn't be as good as it (arguably) is without the help of
.P
* Pierre Neidhardt, who helped me polish the phrasing and pointed at some of my assertions about Lisp that were somewhat exaggerated.
And do watch
.UR https://www.youtube.com/watch?t=4520&v=qAiqKHG6uYM
his talk on GambiConf
.UE
too,
it's a nice showcase of what we've done in Nyxt :)
.P
* Vasily Gerasimov, who helped me realize that
I actually need the paragraph about fighting human laziness
with proper mindsets and tools, instead of administrative imperatives.
.P
* Milana Faizulina, who took a non-programmer stance,
and whose questions made this post (supposedly)
more understandable to a more general audience.
.P
.P
I am the only person to blame for mistakes and inconsistencies you may find in this post, though.
I hope there are none left, but one can only hope :)
.SH COPYRIGHT
.UR https://creativecommons.org/licenses/by/4.0
CC-BY 4.0
.UE
2022-2026 by Artyom Bologov (aartaka,)
.UR https://codeberg.org/aartaka/pages/commit/a91befa
with one commit remixing Claude-generated code
.UE
.
Any and all opinions listed here are my own and not representative of my employers; future, past and present.