Mentirinhas #2203
Fábio Coala 07/02/2025 | Source: Mentirinhas
Julia Evans 05/02/2025 | Source: Julia Evans
A few weeks ago I ran a terminal survey (you can read the results here) and at the end I asked:
What’s the most frustrating thing about using the terminal for you?
1600 people answered, and I decided to spend a few days categorizing all the responses. Along the way I learned that classifying qualitative data is not easy but I gave it my best shot. I ended up building a custom tool to make it faster to categorize everything.
As with all of my surveys the methodology isn’t particularly scientific. I just posted the survey to Mastodon and Twitter, ran it for a couple of days, and got answers from whoever happened to see it and felt like responding.
I think it's worth keeping in mind while reading these comments that they aren't coming from total beginners.
Here are the categories of frustrations! The number in brackets is the number of people with that frustration. I’m mostly writing this up for myself because I’m trying to write a zine about the terminal and I wanted to get a sense for what people are having trouble with.
People talked about struggles remembering:
One example comment:
There are just so many little “trivia” details to remember for full functionality. Even after all these years I’ll sometimes forget where it’s 2 or 1 for stderr, or forget which is which for > and >>.
People talked about struggling with switching systems (for example home/work computer or when SSHing) and running into:
as well as differences inside the same system, like pagers not being consistent with each other (git diff pagers, other pagers).
One example comment:
I got used to fish and vi mode which are not available when I ssh into servers, containers.
Lots of problems with color, like:
This comment felt relatable to me:
Getting my terminal theme configured in a reasonable way between the terminal emulator and fish (I did this years ago and remember it being tedious and fiddly and now feel like I’m locked into my current theme because it works and I dread touching any of that configuration ever again).
Half of the comments on keyboard shortcuts were about how on Linux/Windows, the keyboard shortcut to copy/paste in the terminal is different from in the rest of the OS.
Some other issues with keyboard shortcuts other than copy/paste:
- pressing Ctrl-W in a browser-based terminal and closing the window
- limited modifier keys (only Ctrl-Shift-, no Super, no Hyper; lots of ctrl- shortcuts aren't possible, like Ctrl-,)
- conflicts with other shortcuts (something else using Ctrl+left arrow for something else)

Aside from “the keyboard shortcut for copy and paste is different”, there were a lot of OTHER issues with copy and paste, like:
There were lots of comments about this, which all came down to the same basic complaint – it’s hard to discover useful tools or features! This comment kind of summed it all up:
How difficult it is to learn independently. Most of what I know is an assorted collection of stuff I’ve been told by random people over the years.
A lot of comments about it generally having a steep learning curve. A couple of example comments:
After 15 years of using it, I’m not much faster than using it than I was 5 or maybe even 10 years ago.
and
That I know I could make my life easier by learning more about the shortcuts and commands and configuring the terminal but I don’t spend the time because it feels overwhelming.
Some issues with shell history:
One example comment:
It wasted a lot of time until I figured it out and still annoys me that “history” on zsh has such a small buffer; I have to type “history 0” to get any useful length of history.
People talked about:
Here’s a representative comment:
Finding good examples and docs. Man pages often not enough, have to wade through stack overflow
A few issues with scrollback:
One example comment:
When resizing the terminal (in particular: making it narrower) leads to broken rewrapping of the scrollback content because the commands formatted their output based on the terminal window width.
Lots of comments about how the terminal feels hampered by legacy decisions and how users often end up needing to learn implementation details that feel very esoteric. One example comment:
Most of the legacy cruft, it would be great to have a green field implementation of the CLI interface.
Lots of complaints about POSIX shell scripting. There’s a general feeling that shell scripting is difficult but also that switching to a different less standard scripting language (fish, nushell, etc) brings its own problems.
Shell scripting. My tolerance to ditch a shell script and go to a scripting language is pretty low. It’s just too messy and powerful. Screwing up can be costly so I don’t even bother.
Some more issues that were mentioned at least 10 times:
- accidentally breaking the terminal (pressing Ctrl-S, cating a binary, etc)

There were also 122 answers to the effect of “nothing really” or “only that I can’t do EVERYTHING in the terminal”
One example comment:
Think I’ve found work arounds for most/all frustrations
I’m not going to make a lot of commentary on these results, but here are a couple of categories that feel related to me:
Trying to categorize all these results in a reasonable way really gave me an appreciation for social science researchers’ skills.
Fábio Coala 04/02/2025 | Source: Mentirinhas
The post Cícero #92 appeared first on Mentirinhas.
Anonymous 04/02/2025 | Source: Irrational Exuberance
Entering 2025, I decided to spend some time exploring the topic of agents. I started reading Anthropic’s Building effective agents, followed by Chip Huyen’s AI Engineering. I kicked off a major workstream at work on using agents, and I also decided to do a personal experiment of sorts. This is a general commentary on building that project.
What I wanted to build was a simple chat interface where I could write prompts, select models,
and have the model use tools as appropriate.
My side goal was to build this using Cursor and generally avoid writing code directly as much
as possible, but I found that generally slower than writing code in emacs while relying
on 4o-mini
to provide working examples to pull from.
Similarly, while I initially envisioned building this in fullstack TypeScript via Cursor, I ultimately bailed into a stack that I'm more comfortable with, and ended up using Python3, FastAPI, PostgreSQL, and SQLAlchemy with the async psycopg3 driver. It's been a… while… since I started a brand new Python project, and I used this one as an opportunity to get comfortable with Python3's async/await mechanisms as well as Python3's typing with mypy. Finally, I also wanted to experiment with Tailwind, and ended up using TailwindUI's components to build the site.
The working version supports everything I wanted: creating chats with models, and allowing those models
to use function calling to use tools that I provide. The models are allowed to call any number of tools
in pursuit of the problem they are solving.
The tool usage is the most interesting part here for sure.
The simplest tool I created was a get_temperature
tool that provided a fake temperature for your
location. This allowed me to ask questions like “What should I wear tomorrow in San Francisco, CA?”
and get a useful response.
The code to add this function to my project was pretty straightforward, just three lines of Python and 25 lines of metadata to pass to the OpenAI API.
def tool_get_current_weather(location: str|None=None, format: str|None=None) -> str:
    "Simple proof of concept tool."
    temp = random.randint(40, 90) if format == 'fahrenheit' else random.randint(10, 25)
    return f"It's going to be {temp} degrees {format} tomorrow."

FUNCTION_REGISTRY['get_current_weather'] = tool_get_current_weather

TOOL_USAGE_REGISTRY['get_current_weather'] = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the users location.",
                },
            },
            "required": ["location", "format"],
        },
    }
}
After getting this tool, the next tool I added was a simple URL retriever tool, which allowed the agent to grab a URL and use the content of that URL in its prompt.
The implementation for this tool was similarly quite simple.
def tool_get_url(url: str|None=None) -> str:
    if url is None:
        return ''
    url = str(url)
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    content = soup.find('main') or soup.find('article') or soup.body
    if not content:
        return str(response.content)
    markdown = markdownify(str(content), heading_style="ATX").strip()
    return str(markdown)

FUNCTION_REGISTRY['get_url'] = tool_get_url

TOOL_USAGE_REGISTRY['get_url'] = {
    "type": "function",
    "function": {
        "name": "get_url",
        "description": "Retrieve the contents of a website via its URL.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The complete URL, including protocol to retrieve. For example: \"https://lethain.com\"",
                }
            },
            "required": ["url"],
        },
    }
}
What’s pretty amazing is how much power you can add to your agent by adding such a trivial tool as retrieving a URL. You can similarly imagine adding tools for retrieving and commenting on Github pull requests and so on, which could allow a very simple agent tool like this to become quite useful.
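To make the wiring concrete, here is a minimal sketch, assuming the OpenAI Python SDK and the two registries above, of how a tool-calling loop could dispatch the model's function calls; the helper name, model choice, and loop structure are my own illustration rather than the post's actual code.

import json

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def run_with_tools(messages: list[dict], model: str = "gpt-4o-mini") -> str:
    """Hypothetical helper: call the model and dispatch tool calls until it answers."""
    while True:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=list(TOOL_USAGE_REGISTRY.values()),
        )
        message = response.choices[0].message
        if not message.tool_calls:
            # No more tool calls; the model produced its final answer.
            return message.content
        messages.append(message)
        for call in message.tool_calls:
            # Look up the registered Python function and feed its result back in.
            func = FUNCTION_REGISTRY[call.function.name]
            result = func(**json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result,
            })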
Working on this project gave me a moderately compelling view of a near-term future where most engineers have a simple application like this running that they can pipe events into from various systems (email, text, Github pull requests, calendars, etc), create triggers that map events to templates that feed into prompts, and execute those prompts with tool-aware agents.
Combine that with the ability for other agents to register themselves with you and expose the tools that they have access to (e.g. scheduling an event with the tool's owner), and a bunch of interesting things become very accessible with a very modest amount of effort:
None of these are exactly lifesaving, but each is somewhat useful, and I imagine there are many more fairly obvious ideas that become easy once you have the necessary scaffolding to make this sort of thing easy.
Altogether, I think I am convinced at this point that agents, using current foundational models, are going to create a number of very interesting experiences that improve our day-to-day lives in small ways that are, in aggregate, pretty transformational. I'm less convinced that this is the way all software should work going forward, but more thoughts on that over time. (A bunch of fun experiments happening at work, but early days on those.)
Armin Ronacher 04/02/2025 | Source: Armin Ronacher's Thoughts and Writings
I recently wrote about dependencies in Rust. The feedback, both within and outside the Rust community, was very different. A lot of people, particularly some of those I greatly admire, expressed support. The Rust community, on the other hand, was very dismissive on Reddit and Lobsters.
Last time, I focused on the terminal_size crate, but I also want to show you a different one that I came across once more: rand. It has a similarly out-of-whack value-to-dependency ratio, but in a slightly different way. More than terminal_size, you are quite likely to use it: if, for instance, you want to generate a random UUID, the uuid crate will depend on it. Due to its nature, it also has a high security exposure.
I don't want to frame this as “rand is a bad crate”. It's not a bad crate at all! It is, however, a crate that does not appear very concerned about how many dependencies it has, and I want to put this in perspective: of all the dependencies and lines of code it pulls in, how many does it actually use?
As the name implies, the rand crate is capable of calculating random numbers. The crate itself has seen a fair bit of churn: for instance 0.9 broke backwards compatibility with 0.8. So, as someone who used that crate, I did what a responsible developer is supposed to do, and upgraded the dependency. After all, I don't want to be the reason there are two versions of rand in the dependency tree. After the upgrade, I was surprised how fat that dependency tree has become over the last nine months.
Today, this is what the dependency tree looks like for the default feature set on macOS and Linux:
x v0.1.0 (/private/tmp/x)
└── rand v0.9.0
    ├── rand_chacha v0.9.0
    │   ├── ppv-lite86 v0.2.20
    │   │   └── zerocopy v0.7.35
    │   │       ├── byteorder v1.5.0
    │   │       └── zerocopy-derive v0.7.35 (proc-macro)
    │   │           ├── proc-macro2 v1.0.93
    │   │           │   └── unicode-ident v1.0.16
    │   │           ├── quote v1.0.38
    │   │           │   └── proc-macro2 v1.0.93 (*)
    │   │           └── syn v2.0.98
    │   │               ├── proc-macro2 v1.0.93 (*)
    │   │               ├── quote v1.0.38 (*)
    │   │               └── unicode-ident v1.0.16
    │   └── rand_core v0.9.0
    │       ├── getrandom v0.3.1
    │       │   ├── cfg-if v1.0.0
    │       │   └── libc v0.2.169
    │       └── zerocopy v0.8.14
    ├── rand_core v0.9.0 (*)
    └── zerocopy v0.8.14
About a year ago, it looked like this:
x v0.1.0 (/private/tmp/x)
└── rand v0.8.5
    ├── libc v0.2.169
    ├── rand_chacha v0.3.1
    │   ├── ppv-lite86 v0.2.17
    │   └── rand_core v0.6.4
    │       └── getrandom v0.2.10
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.169
    └── rand_core v0.6.4 (*)
Not perfect, but better.
So, let's investigate what all these dependencies do. The current version pulls in quite a lot.
First there is the question of getting access to the system RNG. On Linux and Mac it uses libc; for Windows it uses the pretty heavy Microsoft crates (windows-targets). The irony is that the Rust standard library already implements a way to get a good seed from the system, but it does not expose it. Well, not really at least. There is a crate called fastrand which does not have any dependencies and which seeds itself by funneling out seeds from the stdlib via the hasher system. That looks a bit like this:
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
fn random_seed() -> u64 {
RandomState::new().build_hasher().finish()
}
Now obviously that's a hack, but it will work because the hashmap's hasher is randomly seeded from good sources. There is a single-dependency crate too which can read from the system's entropy source and that's getrandom. So there at least could be a world where rand only depends on that.
If you want to audit the entire dependency chain, you end up with maintainers that form eight distinct groups:
If I also cared about WASM targets, I'd have to consider even more dependencies.
So let's vendor it. How much code is there? After removing all tests, we end up with 29 individual crates vendored taking up 62MB disk space. Tokei reports 209,150 lines of code.
Now this is a bit misleading, because as is often the case, most of this is within windows-*. But how much of windows-* does getrandom need? A single function:
extern "system" fn ProcessPrng(pbdata: *mut u8, cbdata: usize) -> i32
For that single function (and the information about which DLL it needs to link against), we are compiling and downloading megabytes of windows-targets. Longer term this might not be necessary, but today it is.
On Unix, it's harder to avoid libc because it tries multiple APIs. These are mostly single-function APIs, but some non-portable constants make libc difficult to avoid.
Beyond the platform dependencies, what else is there?
All of these are great crates, but do I need all of this just to generate a random number?
Then there are compile times. How long does it take to compile? 4.3 seconds on my high-end M1 Max. A lot of dependencies block each other, particularly the part that waits for the derives to finish.
Only after all the code generation has finished does the rest make meaningful progress. In total, a release build produces 36MB of compiler artifacts. 12 months ago, it took just under 2 seconds.
The Rust developer community on Reddit doesn't seem very concerned. The main sentiment is that rand now uses less unsafe, so that's benefit enough. While the total amount of unsafe probably did not go down, that moved unsafe is now in a common crate written by people who know how to use unsafe (zerocopy). There is also the sentiment that all of this doesn't matter anyway, because we will all soon depend on zerocopy everywhere, as more and more dependencies are switching over to it.
Maybe this points to Rust not having a large enough standard library. Perhaps features like terminal size detection and random number generation should be included. That at least is what people pointed out on Twitter.
We already treat crates like regex, rand, and serde as if they were part of the standard library. The difference is that I can trust the standard library as a whole: it comes from a single set of authors, making auditing easier. If these external but almost-standard crates were more cautious about dependencies and made it more of a goal to be auditable, we would all benefit.
Or maybe this is just how Rust works now. That would make me quite sad.
Update: it looks like there is some appetite in rand to improve on this.
Edit: This post originally incorrectly said that getrandom depends on windows-sys. That is incorrect, it only depends on windows-targets.
Augusto Campos 03/02/2025 | Source: TRILUX
As much as vicious pleonasm impoverishes discourse, the lists of pleonasms that circulate around frequently seem to be put together by amateur grammarians short on imagination, who assume that there is only one finishing touch and that it always closes out the work, and that the number of repetitions and returns is always limited to one.

And that there is no gradation of loudness in a scream, that nothing can be faced from its flanks, that there are no distant relationships, nor halves that differ from each other, that every mayor is a municipal administrator, and that every link is equally connectable.
The article "Nem tudo que parece é pleonasmo" was originally published on the TRILUX website, by Augusto Campos.
Junior Capela 03/02/2025 | Source: Datassette
Mark your calendar: on Saturday, February 8, you have a date with the Vale region's Gamer/Retrogamer/Geek/Nerd community.

There will be plenty of attractions, such as a Cosplay Contest and Parade, Tournaments, Arcade Machines, Indie Games, Shops, Special Guests, and much more.

And best of all: it's FREE!!!

Date: Saturday, 08/02/2025
Time: from 12:00 to 19:30
Location: Praça Bahia de São Salvador - Jardim Vale do Sol, SJC

Many exhibitors, and lots of fun...
Augusto Campos 03/02/2025 | Source: TRILUX
Since last year I have been thinking about reducing my technological dependence on hosting and online backup services based abroad, and the way the international situation has been unfolding at the start of 2025 has put me in a bit more of a hurry about it.

My concern is not so much about where the data is hosted, but about the growing risk of suddenly losing access to it due to something outside my control, driven by international issues or by the shifting positions of companies in this industry.

This weekend's thread about Oracle cancelling someone's account without any notice, removing his access to all of his data, and without even saying why, made me even more eager to have my data and my routines under my own control.

My plan to set up a personal file-cloud server has become more serious, and I am moving to take advantage of the server I already have, adding storage, sharing, and backup capabilities to it so that it becomes my primary strategy (keeping the online services, for now, as a secondary, complementary option).

I will probably narrate the steps here, so stay tuned: they are already underway.

The article "Migrando meus arquivos para um servidor ao meu alcance" was originally published on the TRILUX website, by Augusto Campos.
Eli Bendersky 03/02/2025 | Source: Eli Bendersky's website
Spend enough time looking at Python programs and packages for machine learning, and you'll notice that the "JIT decorator" pattern is pretty popular. For example, this JAX snippet:
import jax.numpy as jnp
import jax
@jax.jit
def add(a, b):
return jnp.add(a, b)
# Use "add" as a regular Python function
... = add(...)
Or the Triton language for writing GPU kernels directly in Python:
import triton
import triton.language as tl
@triton.jit
def add_kernel(x_ptr,
y_ptr,
output_ptr,
n_elements,
BLOCK_SIZE: tl.constexpr):
pid = tl.program_id(axis=0)
block_start = pid * BLOCK_SIZE
offsets = block_start + tl.arange(0, BLOCK_SIZE)
mask = offsets < n_elements
x = tl.load(x_ptr + offsets, mask=mask)
y = tl.load(y_ptr + offsets, mask=mask)
output = x + y
tl.store(output_ptr + offsets, output, mask=mask)
In both cases, the function decorated with jit doesn't get executed by the Python interpreter in the normal sense. Instead, the code inside is more like a DSL (Domain Specific Language) processed by a special purpose compiler built into the library (JAX or Triton). Another way to think about it is that Python is used as a meta language to describe computations.
In this post I will describe some implementation strategies used by libraries to make this possible.
The goal is to explain how different kinds of jit decorators work by using a simplified, educational example that implements several approaches from scratch. All the approaches featured in this post will be using this flow:
These are the steps that happen when a Python function wrapped with our educational jit decorator is called:
Steps (2) and (3) use llvmlite; I've written about llvmlite before, see this post and also the pykaleidoscope project. For an introduction to JIT compilation, be sure to read this and maybe also the series of posts starting here.
First, let's look at the Expr IR. Here we'll make a big simplification - only supporting functions that define a single expression, e.g.:
def expr2(a, b, c, d):
return (a + d) * (10 - c) + b + d / c
Naturally, this can be easily generalized - after all, LLVM IR can be used to express fully general computations.
Here are the Expr data structures:
from dataclasses import dataclass
from enum import Enum

class Expr:
pass
@dataclass
class ConstantExpr(Expr):
value: float
@dataclass
class VarExpr(Expr):
name: str
arg_idx: int
class Op(Enum):
ADD = "+"
SUB = "-"
MUL = "*"
DIV = "/"
@dataclass
class BinOpExpr(Expr):
left: Expr
right: Expr
op: Op
To convert an Expr into LLVM IR and JIT-execute it, we'll use this function:
from ctypes import CFUNCTYPE, c_double

import llvmlite.binding as llvm

def llvm_jit_evaluate(expr: Expr, *args: float) -> float:
"""Use LLVM JIT to evaluate the given expression with *args.
expr is an instance of Expr. *args are the arguments to the expression, each
a float. The arguments must match the arguments the expression expects.
Returns the result of evaluating the expression.
"""
llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()
llvm.initialize_native_asmparser()
cg = _LLVMCodeGenerator()
modref = llvm.parse_assembly(str(cg.codegen(expr, len(args))))
target = llvm.Target.from_default_triple()
target_machine = target.create_target_machine()
with llvm.create_mcjit_compiler(modref, target_machine) as ee:
ee.finalize_object()
cfptr = ee.get_function_address("func")
cfunc = CFUNCTYPE(c_double, *([c_double] * len(args)))(cfptr)
return cfunc(*args)
It uses the _LLVMCodeGenerator class to actually generate LLVM IR from Expr. This process is straightforward and covered extensively in the resources I linked to earlier; take a look at the full code here.
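To give a rough feel for the shape of that class, here is a minimal sketch of what an _LLVMCodeGenerator could look like using llvmlite's IR builder; the method names and structure are my own illustration, not the repository's actual code:

from llvmlite import ir

class _LLVMCodeGenerator:
    def __init__(self):
        self.module = ir.Module()

    def codegen(self, expr: Expr, num_args: int) -> ir.Module:
        # Declare "func" as: double func(double, double, ...)
        fnty = ir.FunctionType(ir.DoubleType(), [ir.DoubleType()] * num_args)
        func = ir.Function(self.module, fnty, name="func")
        builder = ir.IRBuilder(func.append_basic_block(name="entry"))
        builder.ret(self._codegen_expr(expr, builder, func.args))
        return self.module

    def _codegen_expr(self, expr: Expr, builder: ir.IRBuilder, args):
        match expr:
            case ConstantExpr(value):
                return ir.Constant(ir.DoubleType(), value)
            case VarExpr(_, arg_idx):
                return args[arg_idx]
            case BinOpExpr(left, right, op):
                lhs = self._codegen_expr(left, builder, args)
                rhs = self._codegen_expr(right, builder, args)
                match op:
                    case Op.ADD: return builder.fadd(lhs, rhs)
                    case Op.SUB: return builder.fsub(lhs, rhs)
                    case Op.MUL: return builder.fmul(lhs, rhs)
                    case Op.DIV: return builder.fdiv(lhs, rhs)
        raise ValueError(f"Unsupported expression: {expr}")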
My goal with this architecture is to make things simple, but not too simple. On one hand, there are several simplifications: only single expressions are supported, there's a very limited set of operators, etc. It's very easy to extend this! On the other hand, we could have just trivially evaluated the Expr without resorting to LLVM IR; I do want to show a more complete compilation pipeline, though, to demonstrate that an arbitrary amount of complexity can be hidden behind these simple interfaces.
With these building blocks in hand, we can review the strategies used by jit decorators to convert Python functions into Exprs.
Python comes with powerful code reflection and introspection capabilities out of the box. Here's the astjit decorator:
def astjit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise ASTJITError("Keyword arguments are not supported")
source = inspect.getsource(func)
tree = ast.parse(source)
emitter = _ExprCodeEmitter()
emitter.visit(tree)
return llvm_jit_evaluate(emitter.return_expr, *args)
return wrapper
This is a standard Python decorator. It takes a function and returns another function that will be used in its place (functools.wraps ensures that function attributes like the name and docstring of the wrapper match the wrapped function).
Here's how it's used:
from astjit import astjit
@astjit
def some_expr(a, b, c):
return b / (a + 2) - c * (b - a)
print(some_expr(2, 16, 3))
After astjit is applied to some_expr, what some_expr holds is the wrapper. When some_expr(2, 16, 3) is called, the wrapper is invoked with *args = [2, 16, 3].
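Before looking at the emitter, it can help to see roughly what the wrapper is working with; here's a quick way to peek at the parsed AST, using an undecorated copy of the function purely for illustration (the exact dump format varies slightly between Python versions):

import ast
import inspect

def plain_expr(a, b, c):
    return b / (a + 2) - c * (b - a)

# Print the AST that inspect.getsource + ast.parse produce for this function.
print(ast.dump(ast.parse(inspect.getsource(plain_expr)), indent=2))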
The wrapper obtains the AST of the wrapped function, and then uses _ExprCodeEmitter to convert this AST into an Expr:
class _ExprCodeEmitter(ast.NodeVisitor):
def __init__(self):
self.args = []
self.return_expr = None
self.op_map = {
ast.Add: Op.ADD,
ast.Sub: Op.SUB,
ast.Mult: Op.MUL,
ast.Div: Op.DIV,
}
def visit_FunctionDef(self, node):
self.args = [arg.arg for arg in node.args.args]
if len(node.body) != 1 or not isinstance(node.body[0], ast.Return):
raise ASTJITError("Function must consist of a single return statement")
self.visit(node.body[0])
def visit_Return(self, node):
self.return_expr = self.visit(node.value)
def visit_Name(self, node):
try:
idx = self.args.index(node.id)
except ValueError:
raise ASTJITError(f"Unknown variable {node.id}")
return VarExpr(node.id, idx)
def visit_Constant(self, node):
return ConstantExpr(node.value)
def visit_BinOp(self, node):
left = self.visit(node.left)
right = self.visit(node.right)
try:
op = self.op_map[type(node.op)]
return BinOpExpr(left, right, op)
except KeyError:
raise ASTJITError(f"Unsupported operator {node.op}")
When _ExprCodeEmitter finishes visiting the AST it's given, its return_expr field will contain the Expr representing the function's return value. The wrapper then invokes llvm_jit_evaluate with this Expr.
Note how our decorator interjects into the regular Python execution process. When some_expr is called, instead of the standard Python compilation and execution process (code is compiled into bytecode, which is then executed by the VM), we translate its code to our own representation and emit LLVM from it, and then JIT execute the LLVM IR. While it seems kinda pointless in this artificial example, in reality this means we can execute the function's code in any way we like.
This approach is almost exactly how the Triton language works. The body of a function decorated with @triton.jit gets parsed to a Python AST, which then - through a series of internal IRs - ends up in LLVM IR; this in turn is lowered to PTX by the NVPTX LLVM backend. Then, the code runs on a GPU using a standard CUDA pipeline.
Naturally, the subset of Python that can be compiled down to a GPU is limited; but it's sufficient to run performant kernels, in a language that's much friendlier than CUDA and - more importantly - lives in the same file with the "host" part written in regular Python. For example, if you want testing and debugging, you can run Triton in "interpreter mode" which will just run the same kernels locally on a CPU.
Note that Triton lets us import names from the triton.language package and use them inside kernels; these serve as the intrinsics for the language - special calls the compiler handles directly.
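For context on how such a kernel is typically invoked, here's a launch sketch along the lines of the standard Triton vector-add tutorial; it assumes a CUDA-capable GPU and the torch package, and the wrapper name and block size are illustrative:

import torch
import triton

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    output = torch.empty_like(x)
    n_elements = output.numel()
    # Launch one program instance per BLOCK_SIZE-sized chunk of the input.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, output, n_elements, BLOCK_SIZE=1024)
    return output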
Python is a fairly complicated language with a lot of features. Therefore, if our JIT has to support some large portion of Python semantics, it may make sense to leverage more of Python's own compiler. Concretely, we can have it compile the wrapped function all the way to bytecode, and start our translation from there.
Here's the bytecodejit decorator that does just this [1]:
def bytecodejit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise BytecodeJITError("Keyword arguments are not supported")
expr = _emit_exprcode(func)
return llvm_jit_evaluate(expr, *args)
return wrapper
def _emit_exprcode(func):
bc = func.__code__
stack = []
for inst in dis.get_instructions(func):
match inst.opname:
case "LOAD_FAST":
idx = inst.arg
stack.append(VarExpr(bc.co_varnames[idx], idx))
case "LOAD_CONST":
stack.append(ConstantExpr(inst.argval))
case "BINARY_OP":
right = stack.pop()
left = stack.pop()
match inst.argrepr:
case "+":
stack.append(BinOpExpr(left, right, Op.ADD))
case "-":
stack.append(BinOpExpr(left, right, Op.SUB))
case "*":
stack.append(BinOpExpr(left, right, Op.MUL))
case "/":
stack.append(BinOpExpr(left, right, Op.DIV))
case _:
raise BytecodeJITError(f"Unsupported operator {inst.argval}")
case "RETURN_VALUE":
if len(stack) != 1:
raise BytecodeJITError("Invalid stack state")
return stack.pop()
case "RESUME" | "CACHE":
# Skip nops
pass
case _:
raise BytecodeJITError(f"Unsupported opcode {inst.opname}")
The Python VM is a stack machine; so we emulate a stack to convert the function's bytecode to Expr IR (a bit like an RPN evaluator). As before, we then use our llvm_jit_evaluate utility function to lower Expr to LLVM IR and JIT execute it.
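For intuition, this is roughly the instruction stream the translator walks over for the running example (shown with an undecorated copy of the function; opcode names and extra no-ops vary between CPython versions, so the comment below is approximate):

import dis

def plain_expr(a, b, c):
    return b / (a + 2) - c * (b - a)

dis.dis(plain_expr)
# Roughly: RESUME, LOAD_FAST b, LOAD_FAST a, LOAD_CONST 2, BINARY_OP (+),
# BINARY_OP (/), LOAD_FAST c, LOAD_FAST b, LOAD_FAST a, BINARY_OP (-),
# BINARY_OP (*), BINARY_OP (-), RETURN_VALUE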
Using this JIT is as simple as the previous one - just swap astjit for bytecodejit:
from bytecodejit import bytecodejit
@bytecodejit
def some_expr(a, b, c):
return b / (a + 2) - c * (b - a)
print(some_expr(2, 16, 3))
Numba is a compiler for Python itself. The idea is that you can speed up specific functions in your code by slapping a numba.njit decorator on them. What happens next is similar in spirit to our simple bytecodejit, but of course much more complicated because it supports a very large portion of Python semantics.
Numba uses the Python compiler to emit bytecode, just as we did; it then converts it into its own IR, and then to LLVM using llvmlite [2].
By starting with the bytecode, Numba makes its life easier (no need to rewrite the entire Python compiler). On the other hand, it also makes some analyses harder, because by the time we're in bytecode, a lot of semantic information existing in higher-level representations is lost. For example, Numba has to sweat a bit to recover control flow information from the bytecode (by running it through a special interpreter first).
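As a tiny usage sketch (assuming numba is installed; the function itself is just an illustration), this is what the decorator pattern looks like from the user's side:

import numba

@numba.njit
def total(n):
    # Compiled to machine code by Numba on the first call.
    acc = 0.0
    for i in range(n):
        acc += i * 0.5
    return acc

print(total(1_000_000))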
The two approaches we've seen so far are similar in many ways - both rely on Python's introspection capabilities to compile the source code of the JIT-ed function to some extent (one to AST, the other all the way to bytecode), and then work on this lowered representation.
The tracing strategy is very different. It doesn't analyze the source code of the wrapped function at all - instead, it traces its execution by means of specially-boxed arguments, leveraging overloaded operators and functions, and then works on the generated trace.
The code implementing this for our simple demo is surprisingly compact:
def tracejit(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
if kwargs:
raise TraceJITError("Keyword arguments are not supported")
argspec = inspect.getfullargspec(func)
argboxes = []
for i, arg in enumerate(args):
if i >= len(argspec.args):
raise TraceJITError("Too many arguments")
argboxes.append(_Box(VarExpr(argspec.args[i], i)))
out_box = func(*argboxes)
return llvm_jit_evaluate(out_box.expr, *args)
return wrapper
Each runtime argument of the wrapped function is assigned a VarExpr, and that is placed in a _Box, a placeholder class which lets us do operator overloading:
@dataclass
class _Box:
expr: Expr
_Box.__add__ = _Box.__radd__ = _register_binary_op(Op.ADD)
_Box.__sub__ = _register_binary_op(Op.SUB)
_Box.__rsub__ = _register_binary_op(Op.SUB, reverse=True)
_Box.__mul__ = _Box.__rmul__ = _register_binary_op(Op.MUL)
_Box.__truediv__ = _register_binary_op(Op.DIV)
_Box.__rtruediv__ = _register_binary_op(Op.DIV, reverse=True)
The remaining key function is _register_binary_op:
def _register_binary_op(opcode, reverse=False):
"""Registers a binary opcode for Boxes.
If reverse is True, the operation is registered as arg2 <op> arg1,
instead of arg1 <op> arg2.
"""
def _op(arg1, arg2):
if reverse:
arg1, arg2 = arg2, arg1
box1 = arg1 if isinstance(arg1, _Box) else _Box(ConstantExpr(arg1))
box2 = arg2 if isinstance(arg2, _Box) else _Box(ConstantExpr(arg2))
return _Box(BinOpExpr(box1.expr, box2.expr, opcode))
return _op
To understand how this works, consider this trivial example:
@tracejit
def add(a, b):
return a + b
print(add(1, 2))
After the decorated function is defined, add holds the wrapper function defined inside tracejit. When add(1, 2) is called, the wrapper runs:
This might be a little mind-bending at first, because there are two different executions that happen:
This tracing approach has some interesting characteristics. Since we don't have to analyze the source of the wrapped functions but only trace through the execution, we can "magically" support a much richer set of programs, e.g.:
@tracejit
def use_locals(a, b, c):
x = a + 2
y = b - a
z = c * x
return y / x - z
print(use_locals(2, 8, 11))
This just works with our basic tracejit. Since Python variables are placeholders (references) for values, our tracing step is oblivious to them - it follows the flow of values. Another example:
@tracejit
def use_loop(a, b, c):
result = 0
for i in range(1, 11):
result += i
return result + b * c
print(use_loop(10, 2, 3))
This also just works! The created Expr will be a long chain of BinOpExpr additions of i's runtime values through the loop, added to the BinOpExpr for b * c.
This last example also leads us to a limitation of the tracing approach; the loop cannot be data-dependent - it cannot depend on the function's arguments, because the tracing step has no concept of runtime values and wouldn't know how many iterations to run through; or at least, it doesn't know this unless we want to perform the tracing run for every runtime execution [4].
The tracing approach is useful in several domains, most notably automatic differentiation (AD). For a slightly deeper taste, check out my radgrad project.
The JAX ML framework uses a tracing approach very similar to the one described here. The first code sample in this post shows the JAX notation. JAX cleverly wraps Numpy with its own version which is traced (similar to our _Box, but JAX calls these boxes "tracers"), letting you write regular-feeling Numpy code that can be JIT optimized and executed on accelerators like GPUs and TPUs via XLA. JAX's tracer builds up an underlying IR (called jaxpr) which can then be emitted to XLA ops and passed to XLA for further lowering and execution.
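A handy way to see JAX's tracers at work is jax.make_jaxpr, which returns the jaxpr recorded for a function; the example function here is mine, and the textual output in the comment is approximate:

import jax
import jax.numpy as jnp

def f(a, b):
    return jnp.sin(a) * b + 2.0

print(jax.make_jaxpr(f)(1.0, 2.0))
# Something like:
# { lambda ; a:f32[] b:f32[]. let
#     c:f32[] = sin a
#     d:f32[] = mul c b
#     e:f32[] = add d 2.0
#   in (e,) }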
For a fairly deep overview of how JAX works, I recommend reading the autodidax doc.
As mentioned earlier, JAX has some limitations with things like data-dependent control flow in native Python. This won't work, because there's control flow that depends on a runtime value (count):
import jax
@jax.jit
def sum_datadep(a, b, count):
total = a
for i in range(count):
total += b
return total
print(sum_datadep(10, 3, 3))
When sum_datadep is executed, JAX will throw an exception, saying something like:
This concrete value was not available in Python because it depends on the value of the argument count.
As a remedy, JAX has its own built-in intrinsics from the jax.lax package. Here's the example rewritten in a way that actually works:
import jax
from jax import lax
@jax.jit
def sum_datadep_fori(a, b, count):
def body(i, total):
return total + b
return lax.fori_loop(0, count, body, a)
fori_loop (and many other built-ins in the lax package) is something JAX can trace through, generating a corresponding XLA operation (XLA has support for While loops, to which this lax.fori_loop can be lowered).
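A quick usage check of the rewritten function (the result comes back as a JAX array, and the first call also triggers compilation):

print(sum_datadep_fori(10, 3, 3))  # 10 + 3 * 3 -> 19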
The tracing approach has clear benefits for JAX as well; because it only cares about the flow of values, it can handle arbitrarily complicated Python code, as long as the flow of values can be traced. Just like the local variables and data-independent loops shown earlier, but also things like closures. This makes meta-programming and templating easy.
The full code for this post is available on GitHub.
[1] Once again, this is a very simplified example. A more realistic translator would have to support many, many more Python bytecode instructions.
[2] In fact, llvmlite itself is a Numba sub-project and is maintained by the Numba team, for which I'm grateful!
[3] For a fun exercise, try adding constant folding to the wrapped _op: when both its arguments are constants (not boxes), instead of placing each in a _Box(ConstantExpr(...)), it could perform the mathematical operation on them and return a single constant box. This is a common optimization in compilers!
[4] In all the JIT approaches shown in this post, the expectation is that compilation happens once, but the compiled function can be executed many times (perhaps in a loop). This means that the compilation step cannot depend on the runtime values of the function's arguments, because it has no access to them. You could say that it does, but that's just for the very first time the function is run (in the tracing approach); it has no way of knowing their values the next times the function will run. JAX has some provisions for cases where a function is invoked with a small set of runtime values and we want to separately JIT each of them.
Nicole Tietz-Sokolskaya 03/02/2025 | Source: ntietz.com blog
Last year, my coworker Suzanne and I got a case study accepted into an ethnography conference! She's an anthropologist by training, and I'm a software engineer. How'd we wind up there?
The short version is I asked her what I thought was an interesting question, and she thought it was an interesting question, and we wound up doing research. We have a lot more to go on the original question—we're studying the relationship between the values we hold as engineers and our documentation practices—and this case study is the first step in that direction.
The title of the case study is a mouthful: "Foundations of Practice: An Ethnographic Exploration of Software Documentation and its Cultural Implications for an Agile Startup". I mean, it's intended for a more academic audience, and in a field that's not software engineering. But what we studied is relevant for software engineers, so let's talk about it here in more familiar language! If you want to read the full case study, it's open access.
I'll start with the recommendations we provided, share a little bit about our methodology, and finish us out with a little about where we can go from here.
The case study concluded with recommendations. This is an area I'd like us to do a lot more work in, because recommendations based on a case study are based on one example, and they'd obviously be much stronger based on more data! But for the conference submission, they were a necessary component. I think that the specific recommendations we gave hold up based on my experience (I'm biased, so take this as it is).
These were our core recommendations.
Start with high-level documentation. If you don't have any docs today, or all your docs are just in a bad way, the biggest bang-for-buck is with your high-level systems docs and process documentation. Don't have a system architecture diagram? Make it! Don't have a CI process diagram? Make it! These help you get your bearings on everything quickly, and they're also usually much faster to make than detailed documentation of more focused pieces.
Use design reviews! Here, "design review" means putting your architecture changes—or any proposal for a major code change—through a review process analogous to a code review. This is one of the things we implemented at my company over four years ago, and it's incredibly useful. It makes us think more deeply on our designs, so we make fewer errors in the design phase of any given feature. And it makes the design visible to the whole team, so we get much better knowledge transfer. If I could drop these into every software company, I would. They're that useful.
Think about your audience. This is just plain good writing advice, and it bears repeating. Not every kind of documentation has the same audience, and if you don't take that into account you'll write bad docs that your end user won't be able to use effectively. You may use too much jargon for the lay user, or too little jargon for efficient communication with the engineers working on a feature.
Maintain the docs, updating them as part of your workflow. I think we all know that docs should be updated. The recommendation here is that updating docs should be part of your workflow. If you run into something that's out of date, go ahead and fix it. If someone asks you a question because it wasn't in a given doc, go ahead and add it there.
Document test plans. QA processes are woefully underappreciated in software engineering. I don't quite understand why QA engineering is so looked down upon, because y'all, that stuff is hard. At any rate, testing? That's something you should document! Assuming you run tests before you release (you do, don't you?) then you need to have some organization and documentation around what those tests are, why they're there, etc. This needs to be shared between both QA and product engineers since if the test fails, it's probably, you know, the product engineer's change that broke it—so the better those docs are, the faster they can find the issue.
Make onboarding docs. We know that good onboarding is important for bringing new team members up to productive contribution quickly. And we know that documentation is a big part of that. So focus on your documentation as part of your onboarding strategy.
Run post-mortems and retrospectives. Mistakes will happen, and when they do? It's imperative that we go back through why they happened and what changes (if any) are merited to our processes as a result. Maybe there's work to be done to prevent similar incidents, or there's something we could do to detect this quicker in the future. Whatever you find, also update your docs: point to the post-mortems in the design docs that led to these decisions so that if someone reads those, they'll also see the longer-term impacts of those decisions.
Encourage collaboration on documentation. You can do this by involving multiple team members in creating and reviewing design docs. This is a tool for mentoring and professional development. It also improves all the documentation quality. If you involve team members at all levels (not just your senior engineers!), you'll be able to ensure the docs and processes meet everyone's actual needs.
Prioritize critical, complex areas for docs. You can't document everything. We mentioned earlier that you want to start with a high-level documentation overview, but you also want to focus your efforts on things which are critical or particularly complex. This will pay off by reducing the difficulty of understanding these areas, reducing rates of defects and speeding up development.
Eliminate ineffective docs, focus on friction points. If your team says that documentation isn't serving a useful purpose, trust them and get rid of it. Change what you do to do something more effective. Focus your documentation efforts on resolving friction points between teams or in the development process.
Our research comprised two main pieces: a literature review and interviews with our engineering team.
We did the literature review first, to get our bearings. And, honestly, I was hoping to find answers to what we ultimately were setting out to answer. I'd rather not have to do this research to get our answers! The existing papers gave us a good grounding. I won't repeat too much here—you can read the section in the case study if you're interested.
The bulk of the new research we did was through interviews, and analysis of those interviews1. We interviewed nine of our software engineers who were present at a critical transition in the company's history, when we implemented a major new documentation process as a result of transitioning to remote work. Each interview was recorded with the consent of the participants. We then transcribed them and did qualitative coding to extract themes. We were looking for some pre-defined themes (collaboration, learning, and knowledge sharing) and we were also interested in emergent themes.
From there, we were able to extract out five values that are represented by documentation practices at the company. These values (collaboration, learning, continuous improvement, accountability, and technical excellence) are exhibited through our documentation practices. They are also reinforced by the documentation practices, so it's not wholly a causal relationship one way or the other.
At this point we were able to take the results of the interviews and combine that with the literature review to figure out recommendations based on our case study. These are, of course, not universal. And they're to some extent conjecture: by the nature of a case study, we're not able to ground this in data across multiple companies. But they're a launching off point for teams in need and for future research.
Ultimately, we want to drive toward answering the bigger questions we have around software engineering and documentation. One of those is: a lot of software engineers say their documentation is bad, and we also say that we value good documentation, so why do we not make the documentation better? Why are we not able to live out what we value there? Or do we not value documentation?
To answer these questions to a satisfying end, we'll need to do a larger study. That means we'll need to recruit participants across many companies, and we'll need to design a study that can cut to the heart of these issues.
I'm not sure when we'll have the time or energy to approach that, but I know we've both been completely sucked in by these questions. So we'll get there eventually!
One of the most interesting questions Suzanne got when presenting our research was: how did you get permission to do this? The question surprised me, because I hadn't thought that other companies would make this really hard, but her peers at the conference certainly had a hard time doing similar projects! The answer to this is a few things: we didn't ask for budget (nothing needs approval there), our engineers volunteered their time for it (they were interested, too!), and I'm on the leadership team and was able to just talk to the other company leaders to clear it.
Amit Patel 02/02/2025 | Source: Red Blob Games: latest blog posts
I know a lot of people hated the Flash Player web plugin, but I found it to be quite useful for my experiments. It gave me vector graphics in the browser so that I could share demos without asking people to download an executable from me. And it ran long before SVG / HTML5 was widely available in browsers. I had been porting some of my old Flash code to Javascript, but that takes time that I could be instead spending on new projects. So I'm glad to see that the Ruffle Flash emulator has made so much progress on ActionScript 3:
Some of the things I resurrected:
Back when I initially got interested in making interactive tutorials (2007), HTML5 wasn't around. Java applets and Flash applets were the best choices to run in a web browser, and I found Java was the respectable but slow/clunky choice, whereas Flash was the fast/lightweight choice, but didn't get any respect. ActionScript 3 was a decent programming language. Think of it like TypeScript + JSX but ten years ahead of its time, and based on the ECMAscript standard. It had type checking, classes, modules, etc. The Flash graphics system offered 2D vector graphics, 2D bitmap graphics, and 3D graphics, and ways to combine all three in a fine-grained way. That's something I can't easily do in HTML5.
Many of the interactive parts of my pages, including the ones about pathfinding, hexagons, and procedural map generation, have their origins in experiments I did in Flash. I was quite glad that Ruffle made some of these work again.
But while looking at the Polygon Map Generator article, I realized I haven't updated it since 2010. It has lots of references to the ActionScript 3 source code. I think ActionScript was nice, but Flash is dead, so nobody's using ActionScript anymore. I decided to remove specific references to ActionScript code, and instead point to either descriptions of the algorithms or JavaScript/TypeScript code.
I also took the opportunity to update some of the text based on what I've learned since then. A big one is that the article was meant to describe what I did and not what you should do, but I didn't convey that well. I made specific decisions based on the game design, and those decisions may not be right for another project. In each section where I made such a decision, I added alternative decisions that I've used or seen in other projects.
I used to think of my pages as something I wrote once and then published. I'm trying instead to think of them as living documents that I update as I find better ways of explaining things. Updating the Flash parts of my site led me to revisit and update some of my older pages.
Ken Shirriff 01/02/2025 | Source: Ken Shirriff's blog
What is the origin of the word "mainframe", referring to a large, complex computer? Most sources agree that the term is related to the frames that held early computers, but the details are vague.1 It turns out that the history is more interesting and complicated than you'd expect.
Based on my research, the earliest computer to use the term "main frame" was the IBM 701 computer (1952), which consisted of boxes called "frames." The 701 system consisted of two power frames, a power distribution frame, an electrostatic storage frame, a drum frame, tape frames, and most importantly a main frame. The IBM 701's main frame is shown in the documentation below.2
The meaning of "mainframe" has evolved, shifting from being a part of a computer to being a type of computer. For decades, "mainframe" referred to the physical box of the computer; unlike modern usage, this "mainframe" could be a minicomputer or even microcomputer. Simultaneously, "mainframe" was a synonym for "central processing unit." In the 1970s, the modern meaning started to develop—a large, powerful computer for transaction processing or business applications—but it took decades for this meaning to replace the earlier ones. In this article, I'll examine the history of these shifting meanings in detail.
Early computers used a variety of mounting and packaging techniques including panels, cabinets, racks, and bays.3 This packaging made it very difficult to install or move a computer, often requiring cranes or the removal of walls.4 To avoid these problems, the designers of the IBM 701 computer came up with an innovative packaging technique. This computer was constructed as individual units that would pass through a standard doorway, would fit on a standard elevator, and could be transported with normal trucking or aircraft facilities.7 These units were built from a metal frame with covers attached, so each unit was called a frame. The frames were named according to their function, such as the power frames and the tape frame. Naturally, the main part of the computer was called the main frame.
The IBM 701's internal documentation used "main frame" frequently to indicate the main box of the computer, alongside "power frame", "core frame", and so forth. For instance, each component in the schematics was labeled with its location in the computer, "MF" for the main frame.6 Externally, however, IBM documentation described the parts of the 701 computer as units rather than frames.5
The term "main frame" was used by a few other computers in the 1950s.8 For instance, the JOHNNIAC Progress Report (August 8, 1952) mentions that "the main frame for the JOHNNIAC is ready to receive registers" and they could test the arithmetic unit "in the JOHNNIAC main frame in October."10 An article on the RAND Computer in 1953 stated that "The main frame is completed and partially wired." The main body of a computer called ERMA is labeled "main frame" in the 1955 Proceedings of the Eastern Computer Conference.9
The progression of the word "main frame" can be seen in reports from the Ballistics Research Lab (BRL) that list almost all the computers in the United States. In the 1955 BRL report, most computers were built from cabinets or racks; the phrase "main frame" was only used with the IBM 650, 701, and 704. By 1961, the BRL report shows "main frame" appearing in descriptions of the IBM 702, 705, 709, and 650 RAMAC, as well as the Univac FILE 0, FILE I, RCA 501, READIX, and Teleregister Telefile. This shows that the use of "main frame" was increasing, but still mostly an IBM term.
In modern usage, mainframes are distinct from minicomputers or microcomputers. But until the 1980s, the word "mainframe" could also mean the main physical part of a minicomputer or microcomputer. For instance, a "minicomputer mainframe" was not a powerful minicomputer, but simply the main part of a minicomputer.13 For example, the PDP-11 is an iconic minicomputer, but DEC discussed its "mainframe."14. Similarly, the desktop-sized HP 2115A and Varian Data 620i computers also had mainframes.15 As late as 1981, the book Mini and Microcomputers mentioned "a minicomputer mainframe."
Even microcomputers had a mainframe: the cover of Radio Electronics (1978, above) stated, "Own your own Personal Computer: Mainframes for Hobbyists", using the definition below. An article "Introduction to Personal Computers" in Radio Electronics (Mar 1979) uses a similar meaning: "The first choice you will have to make is the mainframe or actual enclosure that the computer will sit in." The popular hobbyist magazine BYTE also used "mainframe" to describe a microprocessor's box in the 1970s and early 1980s16. BYTE sometimes used the word "mainframe" both to describe a large IBM computer and to describe a home computer box in the same issue, illustrating that the two distinct meanings coexisted.
Words often change meaning through metonymy, where a word takes on the meaning of something closely associated with the original meaning. Through this process, "main frame" shifted from the physical frame (as a box) to the functional contents of the frame, specifically the central processing unit.17
The earliest instance that I could find of the "main frame" being equated with the central processing unit was in 1955. Survey of Data Processors stated: "The central processing unit is known by other names; the arithmetic and ligical [sic] unit, the main frame, the computer, etc. but we shall refer to it, usually, as the central processing unit." A similar definition appeared in Radio Electronics (June 1957, p37): "These arithmetic operations are performed in what is called the arithmetic unit of the machine, also sometimes referred to as the 'main frame.'"
The US Department of Agriculture's Glossary of ADP Terminology (1960) uses the definition: "MAIN FRAME - The central processor of the computer system. It contains the main memory, arithmetic unit and special register groups." I'll mention that "special register groups" is nonsense that was repeated for years.18 This definition was reused and extended in the government's Automatic Data Processing Glossary, published in 1962 "for use as an authoritative reference by all officials and employees of the executive branch of the Government" (below). This definition was reused in many other places, notably the Oxford English Dictionary.19
By the early 1980s, defining a mainframe as the CPU had become obsolete. IBM stated that "mainframe" was a deprecated term for "processing unit" in the Vocabulary for Data Processing, Telecommunications, and Office Systems (1981); the American National Dictionary for Information Processing Systems (1982) was similar. Computers and Business Information Processing (1983) bluntly stated: "According to the official definition, 'mainframe' and 'CPU' are synonyms. Nobody uses the word mainframe that way."
Rather than defining the mainframe as the CPU, some dictionaries defined the mainframe in opposition to the "peripherals", the computer's I/O devices. The two definitions are essentially the same, but have a different focus.20 One example is the IFIP-ICC Vocabulary of Information Processing (1966) which defined "central processor" and "main frame" as "that part of an automatic data processing system which is not considered as peripheral equipment." Computer Dictionary (1982) had the definition "main frame—The fundamental portion of a computer, i.e. the portion that contains the CPU and control elements of a computer system, as contrasted with peripheral or remote devices usually of an input-output or memory nature."
One reason for this definition was that computer usage was billed for mainframe time, while other tasks such as printing results could save money by taking place directly on the peripherals without using the mainframe itself.21 A second reason was that the mainframe vs. peripheral split mirrored the composition of the computer industry, especially in the late 1960s and 1970s. Computer systems were built by a handful of companies, led by IBM. Compatible I/O devices and memory were built by many other companies that could sell them at a lower cost than IBM.22 Publications about the computer industry needed convenient terms to describe these two industry sectors, and they often used "mainframe manufacturers" and "peripheral manufacturers."
An interesting linguistic shift is from "main frame" as two independent words to a compound word: either hyphenated "main-frame" or the single word "mainframe." This indicates the change from "main frame" being a type of frame to "mainframe" being a new concept. The earliest instance of hyphenated "main-frame" that I found was from 1959 in IBM Information Retrieval Systems Conference. "Mainframe" as a single, non-hyphenated word appears the same year in Datamation, mentioning the mainframe of the NEAC2201 computer. In 1962, the IBM 7090 Installation Instructions refer to a "Mainframe Diag[nostic] and Reliability Program." (Curiously, the document also uses "main frame" as two words in several places.) The 1962 book Information Retrieval Management discusses how much computer time document queries can take: "A run of 100 or more machine questions may require two to five minutes of mainframe time." This shows that by 1962, "main frame" had semantically shifted to a new word, "mainframe."
So far, I've shown how "mainframe" started as a physical frame in the computer, and then was generalized to describe the CPU. But how did "mainframe" change from being part of a computer to being a class of computers? This was a gradual process, largely happening in the mid-1970s as the rise of the minicomputer and microcomputer created a need for a word to describe large computers.
Although microcomputers, minicomputers, and mainframes are now viewed as distinct categories, this was not the case at first. For instance, a 1966 computer buyer's guide lumps together computers ranging from desk-sized to 70,000 square feet.23 Around 1968, however, the term "minicomputer" was created to describe small computers. The story is that the head of DEC in England created the term, inspired by the miniskirt and the Mini Minor car.24 While minicomputers had a specific name, larger computers did not.25
Gradually in the 1970s "mainframe" came to be a separate category, distinct from "minicomputer."26 27 An early example is Datamation (1970), describing systems of various sizes: "mainframe, minicomputer, data logger, converters, readers and sorters, terminals." The influential business report EDP first split mainframes from minicomputers in 1972.28 The line between minicomputers and mainframes was controversial, with articles such as "Distinction Helpful for Minis, Mainframes" and "Micro, Mini, or Mainframe? Confusion persists" (1981) attempting to clarify the issue.29
With the development of the microprocessor, computers became categorized as mainframes, minicomputers or microcomputers. For instance, a 1975 Computerworld article discussed how the minicomputer competes against the microcomputer and mainframes. Adam Osborne's An Introduction to Microcomputers (1977) described computers as divided into mainframes, minicomputers, and microcomputers by price, power, and size. He pointed out the large overlap between categories and avoided specific definitions, stating that "A minicomputer is a minicomputer, and a mainframe is a mainframe, because that is what the manufacturer calls it."32
In the late 1980s, computer industry dictionaries started defining a mainframe as a large computer, often explicitly contrasted with a minicomputer or microcomputer. By 1990, they mentioned the networked aspects of mainframes.33
Even though IBM is almost synonymous with "mainframe" now, IBM avoided marketing use of the word for many years, preferring terms such as "general-purpose computer."35 IBM's book Planning a Computer System (1962) repeatedly referred to "general-purpose computers" and "large-scale computers", but never used the word "mainframe."34 The announcement of the revolutionary System/360 (1964) didn't use the word "mainframe"; it was called a general-purpose computer system. The announcement of the System/370 (1970) discussed "medium- and large-scale systems." The System/32 introduction (1977) said, "System/32 is a general purpose computer..." The 1982 announcement of the 3084, IBM's most powerful computer at the time, called it a "large scale processor" not a mainframe.
IBM started using "mainframe" as a marketing term in the mid-1980s. For example, the 3270 PC Guide (1986) refers to "IBM mainframe computers." An IBM 9370 Information System brochure (c. 1986) says the system was "designed to provide mainframe power." IBM's brochure for the 3090 processor (1987) called them "advanced general-purpose computers" but also mentioned "mainframe computers." A System 390 brochure (c. 1990) discussed "entry into the mainframe class." The 1990 announcement of the ES/9000 called them "the most powerful mainframe systems the company has ever offered."
By 2000, IBM had enthusiastically adopted the mainframe label: the z900 announcement used the word "mainframe" six times, calling it the "reinvented mainframe." In 2003, IBM announced "The Mainframe Charter", describing IBM's "mainframe values" and "mainframe strategy." Now, IBM has retroactively applied the name "mainframe" to their large computers going back to 1959 (link), (link).
While "mainframe" was a relatively obscure computer term for many years, it became widespread in the 1980s. The Google Ngram graph below shows the popularity of "microcomputer", "minicomputer", and "mainframe" in books.36 The terms became popular during the late 1970s and 1980s. The popularity of "minicomputer" and "microcomputer" roughly mirrored the development of these classes of computers. Unexpectedly, even though mainframes were the earliest computers, the term "mainframe" peaked later than the other types of computers.
I studied many old dictionaries to see when the word "mainframe" showed up and how they defined it. To summarize, "mainframe" started to appear in dictionaries in the late 1970s, first defining the mainframe in opposition to peripherals or as the CPU. In the 1980s, the definition gradually changed to the modern definition, with a mainframe distinguished as a large, fast, and often centralized system. These definitions were roughly a decade behind industry usage, which switched to the modern meaning in the 1970s.
The word didn't appear in older dictionaries, such as the Random House College Dictionary (1968) and Merriam-Webster (1974). The earliest definition I could find was in the supplement to Webster's International Dictionary (1976): "a computer and esp. the computer itself and its cabinet as distinguished from peripheral devices connected with it." Similar definitions appeared in Webster's New Collegiate Dictionary (1976, 1980).
A CPU-based definition appeared in Random House College Dictionary (1980): "the device within a computer which contains the central control and arithmetic units, responsible for the essential control and computational functions. Also called central processing unit." The Random House Dictionary (1978, 1988 printing) was similar. The American Heritage Dictionary (1982, 1985) combined the CPU and peripheral approaches: "mainframe. The central processing unit of a computer exclusive of peripheral and remote devices."
The modern definition as a large computer appeared alongside the old definition in Webster's Ninth New Collegiate Dictionary (1983): "mainframe (1964): a computer with its cabinet and internal circuits; also: a large fast computer that can handle multiple tasks concurrently." Only the modern definition appears in The New Merriam-Webster Dictionary (1989): "large fast computer", while Webster's Unabridged Dictionary of the English Language (1989) has: "mainframe. a large high-speed computer with greater storage capacity than a minicomputer, often serving as the central unit in a system of smaller computers. [MAIN + FRAME]." Random House Webster's College Dictionary (1991) and Random House College Dictionary (2001) had similar definitions.
The Oxford English Dictionary is the principal historical dictionary, so it is interesting to see its view. The 1989 OED gave historical definitions as well as defining mainframe as "any large or general-purpose computer, esp. one supporting numerous peripherals or subordinate computers." It has seven historical examples from 1964 to 1984; the earliest is the 1964 Honeywell Glossary. It quotes a 1970 Dictionary of Computers as saying that the word "Originally implied the main framework of a central processing unit on which the arithmetic unit and associated logic circuits were mounted, but now used colloquially to refer to the central processor itself." The OED also cited a Hewlett-Packard ad from 1974 that used the word "mainframe", but I consider this a mistake as the usage is completely different.15
A look at encyclopedias shows that the word "mainframe" started appearing in discussions of computers in the early 1980s, later than in dictionaries. At the beginning of the 1980s, many encyclopedias focused on large computers, without using the word "mainframe", for instance, The Concise Encyclopedia of the Sciences (1980) and World Book (1980). The word "mainframe" started to appear in supplements such as Britannica Book of the Year (1980) and World Book Year Book (1981), at the same time as they started discussing microcomputers. Soon encyclopedias were using the word "mainframe", for example, Funk & Wagnalls Encyclopedia (1983), Encyclopedia Americana (1983), and World Book (1984). By 1986, even the Doubleday Children's Almanac showed a "mainframe computer."
I examined old newspapers to track the usage of the word "mainframe." The graph below shows the usage of "mainframe" in newspapers. The curve shows a rise in popularity through the 1980s and a steep drop in the late 1990s. The newspaper graph roughly matches the book graph above, although newspapers show a much steeper drop in the late 1990s. Perhaps mainframes aren't in the news anymore, but people still write books about them.
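For readers curious how counts like these are assembled: given an exported list of newspaper hits, the per-year tally behind a graph like this can be produced with a few lines of code. This is only a minimal sketch under assumed inputs; the file name ("mainframe_hits.csv") and its "date" column are hypothetical stand-ins, not the actual archive export used here.

    import csv
    from collections import Counter

    # Hypothetical input: one row per newspaper hit, with an ISO "date" column.
    counts = Counter()
    with open("mainframe_hits.csv", newline="") as f:
        for row in csv.DictReader(f):
            year = row["date"][:4]        # "1985-03-12" -> "1985"
            counts[year] += 1

    # Print a crude text histogram of hits per year.
    for year in sorted(counts):
        print(year, "#" * counts[year])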
The first newspaper appearances were in classified ads seeking employees, for instance, a 1960 ad in the San Francisco Examiner for people "to monitor and control main-frame operations of electronic computers...and to operate peripheral equipment..." and a (sexist) 1966 ad in the Philadelphia Inquirer for "men with Digital Computer Bkgrnd [sic] (Peripheral or Mainframe)."37
By 1970, "mainframe" started to appear in news articles, for example, "The computer can't work without the mainframe unit." By 1971, the usage increased with phrases such as "mainframe central processor" and "'main-frame' computer manufacturers". 1972 had usages such as "the mainframe or central processing unit is the heart of any computer, and does all the calculations". A 1975 article explained "'Mainframe' is the industry's word for the computer itself, as opposed to associated items such as printers, which are referred to as 'peripherals.'" By 1980, minicomputers and microcomputers were appearing: "All hardware categories-mainframes, minicomputers, microcomputers, and terminals" and "The mainframe and the minis are interconnected."
By 1985, the mainframe was a type of computer, not just the CPU: "These days it's tough to even define 'mainframe'. One definition is that it has for its electronic brain a central processor unit (CPU) that can handle at least 32 bits of information at once. ... A better distinction is that mainframes have numerous processors so they can work on several jobs at once." Articles also discussed "the micro's challenge to the mainframe" and asked, "buy a mainframe, rather than a mini?"
By 1990, descriptions of mainframes became florid: "huge machines laboring away in glass-walled rooms", "the big burner which carries the whole computing load for an organization", "behemoth data crunchers", "the room-size machines that dominated computing until the 1980s", "the giant workhorses that form the nucleus of many data-processing centers", "But it is not raw central-processing-power that makes a mainframe a mainframe. Mainframe computers command their much higher prices because they have much more sophisticated input/output systems."
After extensive searches through archival documents, I found usages of the term "main frame" dating back to 1952, much earlier than previously reported. In particular, the introduction of frames to package the IBM 701 computer led to the use of the word "main frame" for that computer and later ones. The term went through various shades of meaning and remained fairly obscure for many years. In the mid-1970s, the term started describing a large computer, essentially its modern meaning. In the 1980s, the term escaped the computer industry and appeared in dictionaries, encyclopedias, and newspapers. After peaking in the 1990s, the term declined in usage (tracking the decline in mainframe computers), but the term and the mainframe computer both survive.
Two factors drove the popularity of the word "mainframe" in the 1980s with its current meaning of a large computer. First, the terms "microcomputer" and "minicomputer" led to linguistic pressure for a parallel term for large computers. For instance, the business press needed a word to describe IBM and other large computer manufacturers. While "server" is the modern term, "mainframe" easily filled the role back then and was nicely alliterative with "microcomputer" and "minicomputer."38
Second, up until the 1980s, the prototype meaning for "computer" was a large mainframe, typically IBM.39 But as millions of home computers were sold in the early 1980s, the prototypical "computer" shifted to smaller machines. This left a need for a term for large computers, and "mainframe" filled that need. In other words, if you were talking about a large computer in the 1970s, you could say "computer" and people would assume you meant a mainframe. But if you said "computer" in the 1980s, you needed to clarify if it was a large computer.
The word "mainframe" is almost 75 years old and both the computer and the word have gone through extensive changes in this time. The "death of the mainframe" has been proclaimed for well over 30 years but mainframes are still hanging on. Who knows what meaning "mainframe" will have in another 75 years?
Follow me on Bluesky (@righto.com) or RSS. (I'm no longer on Twitter.) Thanks to the Computer History Museum and archivist Sara Lott for access to many documents.
The Computer History Museum states: "Why are they called “Mainframes”? Nobody knows for sure. There was no mainframe “inventor” who coined the term. Probably “main frame” originally referred to the frames (designed for telephone switches) holding processor circuits and main memory, separate from racks or cabinets holding other components. Over time, main frame became mainframe and came to mean 'big computer.'" (Based on my research, I don't think telephone switches have any connection to computer mainframes.)
Several sources explain that the mainframe is named after the frame used to construct the computer. The Jargon File has a long discussion, describing the term as "originally referring to the cabinet containing the central processor unit or ‘main frame’." Ken Uston's Illustrated Guide to the IBM PC (1984) has the definition "MAIN FRAME A large, high-capacity computer, so named because the CPU of this kind of computer used to be mounted on a frame." IBM states that mainframe "Originally referred to the central processing unit of a large computer, which occupied the largest or central frame (rack)." The Microsoft Computer Dictionary (2002) states that the name mainframe "is derived from 'main frame', the cabinet originally used to house the processing unit of such computers." Some discussions of the origin of the word "mainframe" are here, here, here, here, and here.
The phrase "main frame" in non-computer contexts has a very old but irrelevant history, describing many things that have a frame. For example, it appears in thousands of patents from the 1800s, including drills, saws, a meat cutter, a cider mill, printing presses, and corn planters. This shows that it was natural to use the phrase "main frame" when describing something constructed from frames. Telephony uses a Main distribution frame or "main frame" for wiring, going back to 1902. Some people claim that the computer use of "mainframe" is related to the telephony use, but I don't think they are related. In particular, a telephone main distribution frame looks nothing like a computer mainframe. Moreover, the computer use and the telephony use developed separately; if the computer use started in, say, Bell Labs, a connection would be more plausible.
IBM patents with "main frame" include a scale (1922), a card sorter (1927), a card duplicator (1929), and a card-based accounting machine (1930). IBM's incidental uses of "main frame" are probably unrelated to modern usage, but they are a reminder that punch card data processing started decades before the modern computer. ↩
It is unclear why the IBM 701 installation manual is dated August 27, 1952 but the drawing is dated 1953. I assume the drawing was updated after the manual was originally produced. ↩
This footnote will survey the construction techniques of some early computers; the key point is that building a computer on frames was not an obvious technique. ENIAC (1945), the famous early vacuum tube computer, was constructed from 40 panels forming three walls filling a room (ref, ref). EDVAC (1949) was built from large cabinets or panels (ref) while ORDVAC and CLADIC (1949) were built on racks (ref). One of the first commercial computers, UNIVAC 1 (1951), had a "Central Computer" organized as bays, divided into three sections, with tube "chassis" plugged in (ref ). The Raytheon computer (1951) and Moore School Automatic Computer (1952) (ref) were built from racks. The MONROBOT VI (1955) was described as constructed from the "conventional rack-panel-cabinet form" (ref). ↩
The size and construction of early computers often made it difficult to install or move them. The early computer ENIAC required 9 months to move from Philadelphia to the Aberdeen Proving Ground. For this move, the wall of the Moore School in Philadelphia had to be partially demolished so ENIAC's main panels could be removed. In 1959, moving the SWAC computer required disassembly of the computer and removing one wall of the building (ref). When moving the early computer JOHNNIAC to a different site, the builders discovered the computer was too big for the elevator. They had to raise the computer up the elevator shaft without the elevator (ref). This illustrates the benefits of building a computer from moveable frames. ↩
The IBM 701's main frame was called the Electronic Analytical Control Unit in external documentation. ↩
The 701 installation manual (1952) has a frame arrangement diagram showing the dimensions of the various frames, along with a drawing of the main frame, and power usage of the various frames. Service documentation (1953) refers to "main frame adjustments" (page 74). The 700 Series Data Processing Systems Component Circuits document (1955-1959) lists various types of frames in its abbreviation list (below).
When repairing an IBM 701, it was important to know which frame held which components, so "main frame" appeared throughout the engineering documents. For instance, in the schematics, each module was labeled with its location; "MF" stands for "main frame."
The "main frame" terminology was used in discussions with customers. For example, notes from a meeting with IBM (April 8, 1952) mention "E. S. [Electrostatic] Memory 15 feet from main frame" and list "main frame" as one of the seven items obtained for the $15,000/month rental cost. ↩
For more information on how the IBM 701 was designed to fit on elevators and through doorways, see Building IBM: Shaping an Industry and Technology page 170, and The Interface: IBM and the Transformation of Corporate Design page 69. This is also mentioned in "Engineering Description of the IBM Type 701 Computer", Proceedings of the IRE Oct 1953, page 1285. ↩
Many early systems used "central computer" to describe the main part of the computer, perhaps more commonly than "main frame." An early example is the "central computer" of the Elecom 125 (1954). The Digital Computer Newsletter (Apr 1955) used "central computer" several times to describe the processor of SEAC. The 1961 BRL report shows "central computer" being used by Univac II, Univac 1107, Univac File 0, DYSEAC and RCA Series 300. The MIT TX-2 Technical Manual (1961) uses "central computer" very frequently. The NAREC glossary (1962) defined "central computer. That part of a computer housed in the main frame." ↩
This footnote lists some other early computers that used the term "main frame." The October 1956 Digital Computer Newsletter mentions the "main frame" of the IBM NORC. Digital Computer Newsletter (Jan 1959) discusses using a RAMAC disk drive to reduce "main frame processing time." This document also mentions the IBM 709 "main frame." The IBM 704 documentation (1958) says "Each DC voltage is distributed to the main frame..." (IBM 736 reference manual) and "Check the air filters in each main frame unit and replace when dirty." (704 Central Processing Unit).
The July 1962 Digital Computer Newsletter discusses the LEO III computer: "It has been built on the modular principle with the main frame, individual blocks of storage, and input and output channels all physically separate." The article also mentions that the new computer is more compact with "a reduction of two cabinets for housing the main frame."
The IBM 7040 (1964) and IBM 7090 (1962) were constructed from multiple frames, including the processing unit called the "main frame."11 Machines in IBM's System/360 line (1964) were built from frames; some models had a main frame, power frame, wall frame, and so forth, while other models simply numbered the frames sequentially.12 ↩
The 1952 JOHNNIAC progress report is quoted in The History of the JOHNNIAC. This memorandum was dated August 8, 1952, so it is the earliest citation that I found. The June 1953 memorandum also used the term, stating, "The main frame is complete." ↩
A detailed description of IBM's frame-based computer packaging is in Standard Module System Component Circuits pages 6-9. This describes the SMS-based packaging used in the IBM 709x computers, the IBM 1401, and related systems as of 1960. ↩
IBM System/360 computers could have many frames, so they were usually given sequential numbers. The Model 85, for instance, had 12 frames for the processor and four megabytes of memory in 18 frames (at over 1000 pounds each). Some of the frames had descriptive names, though. The Model 40 had a main frame (CPU main frame, CPU frame), a main storage logic frame, a power supply frame, and a wall frame. The Model 50 had a CPU frame, power frame, and main storage frame. The Model 75 had a main frame (consisting of multiple physical frames), storage frames, channel frames, central processing frames, and a maintenance console frame. The compact Model 30 consisted of a single frame, so the documentation refers to the "frame", not the "main frame." For more information on frames in the System/360, see 360 Physical Planning. The Architecture of the IBM System/360 paper refers to the "main-frame hardware." ↩
A few more examples that discuss the minicomputer's mainframe, its physical box: A 1970 article discusses the mainframe of a minicomputer (as opposed to the peripherals) and contrasts minicomputers with large scale computers. A 1971 article on minicomputers discusses "minicomputer mainframes." Computerworld (Jan 28, 1970, p59) discusses minicomputer purchases: "The actual mainframe is not the major cost of the system to the user." Modern Data (1973) mentions minicomputer mainframes several times. ↩
DEC documents refer to the PDP-11 minicomputer as a mainframe. The PDP-11 Conventions manual (1970) defined: "Processor: A unit of a computing system that includes the circuits controlling the interpretation and execution of instructions. The processor does not include the Unibus, core memory, interface, or peripheral devices. The term 'main frame' is sometimes used but this term refers to all components (processor, memory, power supply) in the basic mounting box." In 1976, DEC published the PDP-11 Mainframe Troubleshooting Guide. The PDP-11 mainframe is also mentioned in Computerworld (1977). ↩
Test equipment manufacturers started using the term "main frame" (and later "mainframe") around 1962, to describe an oscilloscope or other test equipment that would accept plug-in modules. I suspect this is related to the use of "mainframe" to describe a computer's box, but it could be independent. Hewlett-Packard even used the term to describe a solderless breadboard, the 5035 Logic Lab. The Oxford English Dictionary (1989) used HP's 1974 ad for the Logic Lab as its earliest citation of mainframe as a single word. It appears that the OED confused this use of "mainframe" with the computer use.
In the 1980s, the use of "mainframe" to describe the box holding a microcomputer started to conflict with "mainframe" as a large computer. For example, Radio Electronics (October 1982) started using the short-lived term "micro-mainframe" instead of "mainframe" for a microcomputer's enclosure. By 1985, Byte magazine had largely switched to the modern usage of "mainframe." But even as late as 1987, a review of the Apple IIGS described one of the system's components as the '"mainframe" (i.e. the actual system box)'. ↩
Definitions of "central processing unit" disagreed as to whether storage was part of the CPU, part of the main frame, or something separate. This was largely a consequence of the physical construction of early computers. Smaller computers had memory in the same frame as the processor, while larger computers often had separate storage frames for memory. Other computers had some memory with the processor and some external. Thus, the "main frame" might or might not contain memory, and this ambiguity carried over to definitions of CPU. (In modern usage, the CPU consists of the arithmetic/logic unit (ALU) and control circuitry, but excludes memory.) ↩
Many definitions of mainframe or CPU mention "special register groups", an obscure feature specific to the Honeywell 800 computer (1959). (Processors have registers, special registers are common, and some processors have register groups, but only the Honeywell 800 had "special register groups.") However, computer dictionaries kept using this phrase for decades, even though it doesn't make sense for other computers. I wrote a blog post about special register groups here. ↩
This footnote provides more examples of "mainframe" being defined as the CPU. The Data Processing Equipment Encyclopedia (1961) had a similar definition: "Main Frame: The main part of the computer, i.e. the arithmetic or logic unit; the central processing unit." The 1967 IBM 360 operator's guide defined: "The main frame - the central processing unit and main storage." The Department of the Navy's ADP Glossary (1970): "Central processing unit: A unit of a computer that includes the circuits controlling the interpretation and execution of instructions. Synonymous with main frame." This was a popular definition, originally from the ISO, used by IBM (1979) among others. Funk & Wagnalls Dictionary of Data Processing Terms (1970) defined: "main frame: The basic or essential portion of an assembly of hardware, in particular, the central processing unit of a computer." The American National Standard Vocabulary for Information Processing (1970) defined: "central processing unit: A unit of a computer that includes the circuits controlling the interpretation and execution of instructions. Synonymous with main frame." ↩
Both the mainframe vs. peripheral definition and the mainframe as CPU definition made it unclear exactly what components of the computer were included in the mainframe. It's clear that the arithmetic-logic unit and the processor control circuitry were included, while I/O devices were excluded, but some components such as memory were in a gray area. It's also unclear if the power supply and I/O interfaces (channels) are part of the mainframe. These distinctions were ignored in almost all of the uses of "mainframe" that I saw.
An unusual definition in a Goddard Space Center document (1965, below) partitioned equipment into the "main frame" (the electronic equipment), "peripheral equipment" (electromechanical components such as the printer and tape), and "middle ground equipment" (the I/O interfaces). The "middle ground" terminology here appears to be unique. Also note that computers are partitioned into "super speed", "large-scale", "medium-scale", and "small-scale."
This footnote gives some examples of using peripherals to save the cost of mainframe time. IBM 650 documentation (1956) describes how "Data written on tape by the 650 can be processed by the main frame of the 700 series systems." Univac II Marketing Material (1957) discusses various ways of reducing "main frame time" by, for instance, printing from tape off-line. The USAF Guide for auditing automatic data processing systems (1961) discusses how these "off line" operations make the most efficient use of "the more expensive main frame time." ↩
Peripheral manufacturers were companies that built tape drives, printers, and other devices that could be connected to a mainframe built by IBM or another company. The basis for the peripheral industry was antitrust action against IBM that led to the 1956 Consent Decree. Among other things, the consent decree forced IBM to provide reasonable patent licensing, which allowed other firms to build "plug-compatible" peripherals. The introduction of the System/360 in 1964 produced a large market for peripherals and IBM's large profit margins left plenty of room for other companies. ↩
Computers and Automation, March 1965, categorized computers into five classes, from "Teeny systems" (such as the IBM 360/20) renting for $2000/month, through Small, Medium, and Large systems, up to "Family or Economy Size Systems" (such as the IBM 360/92) renting for $75,000 per month. ↩
The term "minicomputer" was supposedly invented by John Leng, head of DEC's England operations. In the 1960s, he sent back a sales report: "Here is the latest minicomputer activity in the land of miniskirts as I drive around in my Mini Minor", which led to the term becoming popular at DEC. This story is described in The Ultimate Entrepreneur: The Story of Ken Olsen and Digital Equipment Corporation (1988). I'd trust the story more if I could find a reference that wasn't 20 years after the fact. ↩
For instance, Computers and Automation (1971) discussed the role of the minicomputer as compared to "larger computers." A 1975 minicomputer report compared minicomputers to their "general-purpose cousins." ↩
This footnote provides more on the split between minicomputers and mainframes. In 1971, Modern Data Products, Systems, Services contained "... will offer mainframe, minicomputer, and peripheral manufacturers a design, manufacturing, and production facility...." Standard & Poor's Industry Surveys (1972) mentions "mainframes, minicomputers, and IBM-compatible peripherals." Computerworld (1975) refers to "mainframe and minicomputer systems manufacturers."
The 1974 textbook "Information Systems: Technology, Economics, Applications" couldn't decide if mainframes were a part of the computer or a type of computer separate from minicomputers, saying: "Computer mainframes include the CPU and main memory, and in some usages of the term, the controllers, channels, and secondary storage and I/O devices such as tape drives, disks, terminals, card readers, printers, and so forth. However, the equipment for storage and I/O are usually called peripheral devices. Computer mainframes are usually thought of as medium to large scale, rather than mini-computers."
Studying U.S. Industrial Outlook reports provides another perspective over time. U.S. Industrial Outlook 1969 divides computers into small, medium-size, and large-scale. Mainframe manufacturers are in opposition to peripheral manufacturers. The same mainframe vs. peripherals opposition appears in U.S. Industrial Outlook 1970 and U.S. Industrial Outlook 1971. The 1971 report also discusses minicomputer manufacturers entering the "maxicomputer market."30 The 1973 report mentions "large computers, minicomputers, and peripherals." U.S. Industrial Outlook 1976 states, "The distinction between mainframe computers, minis, micros, and also accounting machines and calculators should merge into a spectrum." By 1977, the market was separated into "general purpose mainframe computers", "minicomputers and small business computers" and "microprocessors."
Family Computing Magazine (1984) had a "Dictionary of Computer Terms Made Simple." It explained that "A Digital computer is either a "mainframe", a "mini", or a "micro." Forty years ago, large mainframes were the only size that a computer could be. They are still the largest size, and can handle more than 100,000,000 instructions per second. PER SECOND! [...] Mainframes are also called general-purpose computers." ↩
In 1974, Congress held antitrust hearings into IBM. The thousand-page report provides a detailed snapshot of the meanings of "mainframe" at the time. For instance, a market analysis report from IDC illustrates the difficulty of defining mainframes and minicomputers in this era (p4952). The "Mainframe Manufacturers" section splits the market into "general-purpose computers" and "dedicated application computers" including "all the so-called minicomputers." Although this section discusses minicomputers, the emphasis is on the manufacturers of traditional mainframes. A second "Plug-Compatible Manufacturers" section discusses companies that manufactured only peripherals. But there's also a separate "Minicomputers" section that focuses on minicomputers (along with microcomputers "which are simply microprocessor-based minicomputers"). My interpretation of this report is the terminology is in the process of moving from "mainframe vs. peripheral" to "mainframe vs. minicomputer." The statement from Research Shareholders Management (p5416) on the other hand discusses IBM and the five other mainframe companies; they classify minicomputer manufacturers separately. (p5425) p5426 mentions "mainframes, small business computers, industrial minicomputers, terminals, communications equipment, and minicomputers." Economist Ralph Miller mentions the central processing unit "(the so-called 'mainframe')" (p5621) and then contrasts independent peripheral manufacturers with mainframe manufacturers (p5622). The Computer Industry Alliance refers to mainframes and peripherals in multiple places, and "shifting the location of a controller from peripheral to mainframe", as well as "the central processing unit (mainframe)" p5099. On page 5290, "IBM on trial: Monopoly tends to corrupt", from Harper's (May 1974), mentions peripherals compatible with "IBM mainframe units—or, as they are called, central processing computers." ↩
The influential business newsletter EDP provides an interesting view on the struggle to separate the minicomputer market from larger computers. Through 1968, they included minicomputers in the "general-purpose computer" category. But in 1969, they split "general-purpose computers" into "Group A, General Purpose Digital Computers" and "Group B, Dedicated Application Digital Computers." These categories roughly corresponded to larger computers and minicomputers, on the (dubious) assumption that minicomputers were used for a "dedicated application." The important thing to note is that in 1969 they did not use the term "mainframe" for the first category, even though with the modern definition it's the obvious term to use. At the time, EDP used "mainframe manufacturer" or "mainframer"31 to refer to companies that manufactured computers (including minicomputers), as opposed to manufacturers of peripherals. In 1972, EDP first mentioned mainframes and minicomputers as distinct types. In 1973, "microcomputer" was added to the categories. As the 1970s progressed, the separation between minicomputers and mainframes became common. However, the transition was not completely smooth; 1973 included a reference to "mainframe shipments (including minicomputers)."
To be specific, the EDP Industry Report (Nov. 28, 1969) gave the following definitions of the two groups of computers:
Group A—General Purpose Digital Computers: These comprise the bulk of the computers that have been listed in the Census previously. They are character or byte oriented except in the case of the large-scale scientific machines, which have 36, 48, or 60-bit words. The predominant portion (60% to 80%) of these computers is rented, usually for $2,000 a month or more. Higher level languages such as Fortran, Cobol, or PL/1 are the primary means by which users program these computers.
Group B—Dedicated Application Digital Computers: This group of computers includes the "mini's" (purchase price below $25,000), the "midi's" ($25,000 to $50,000), and certain larger systems usually designed or used for one dedicated application such as process control, data acquisition, etc. The characteristics of this group are that the computers are usually word oriented (8, 12, 16, or 24-bits per word), the predominant number (70% to 100%) are purchased, and assembly language (at times Fortran) is the predominant means of programming. This type of computer is often sold to an original equipment manufacturer (OEM) for further system integration and resale to the final user.
These definitions strike me as rather arbitrary. ↩
In 1981 Computerworld had articles trying to clarify the distinctions between microcomputers, minicomputers, superminicomputers, and mainframes, as the systems started to overlap. One article, "Distinction Helpful for Minis, Mainframes", said that minicomputers were generally interactive, while mainframes made good batch machines and network hosts. Microcomputers had up to 512 KB of memory; minis were 16-bit machines with 512 KB to 4 MB of memory, costing up to $100,000. Superminis were 16- to 32-bit machines with 4 MB to 8 MB of memory, costing up to $200,000 but with less memory bandwidth than mainframes. Finally, mainframes were 32-bit machines with more than 8 MB of memory, costing over $200,000. Another article, "Micro, Mini, or Mainframe? Confusion persists", described a microcomputer as using an 8-bit architecture and having fewer peripherals, while a minicomputer had a 16-bit architecture and 48 KB to 1 MB of memory. ↩
The miniskirt in the mid-1960s was shortly followed by the midiskirt and maxiskirt. These terms led to the parallel construction of the terms minicomputer, midicomputer, and maxicomputer.
The New York Times had a long article Maxi Computers Face Mini Conflict (April 5, 1970) explicitly making the parallel: "Mini vs. Maxi, the reigning issue in the glamorous world of fashion, is strangely enough also a major point of contention in the definitely unsexy realm of computers."
Although midicomputer and maxicomputer terminology didn't catch on the way minicomputer did, they still had significant use (example, midicomputer examples, maxicomputer examples).
The miniskirt/minicomputer parallel was done with varying degrees of sexism. One example is Electronic Design News (1969): "A minicomputer. Like the miniskirt, the small general-purpose computer presents the same basic commodity in a more appealing way." ↩
Linguistically, one indication that a new word has become integrated in the language is when it can be extended to form additional new words. One example is the formation of "mainframers", referring to companies that build mainframes. This word was moderately popular in the 1970s to 1990s. It was even used by the Department of Justice in their 1975 action against IBM where they described the companies in the systems market as the "mainframe companies" or "mainframers." The word is still used today, but usually refers to people with mainframe skills. Other linguistic extensions of "mainframe" include mainframing, unmainframe, mainframed, nonmainframe, and postmainframe. ↩
More examples of the split between microcomputers and mainframes: Softwide Magazine (1978) describes "BASIC versions for micro, mini and mainframe computers." MSC, a disk system manufacturer, had drives "used with many microcomputer, minicomputer, and mainframe processor types" (1980). ↩
Some examples of computer dictionaries referring to mainframes as a size category: Illustrated Dictionary of Microcomputer Terminology (1978) defines "mainframe" as "(1) The heart of a computer system, which includes the CPU and ALU. (2) A large computer, as opposed to a mini or micro." A Dictionary of Minicomputing and Microcomputing (1982) includes the definition of "mainframe" as "A high-speed computer that is larger, faster, and more expensive than the high-end minicomputers. The boundary between a small mainframe and a large mini is fuzzy indeed." The National Bureau of Standards Future Information Technology (1984) defined: "Mainframe is a term used to designate a medium and large scale CPU." The New American Computer Dictionary (1985) defined "mainframe" as "(1) Specifically, the rack(s) holding the central processing unit and the memory of a large computer. (2) More generally, any large computer. 'We have two mainframes and several minis.'" The 1990 ANSI Dictionary for Information Systems (ANSI X3.172-1990) defined: mainframe. A large computer, usually one to which other computers are connected in order to share its resources and computing power. Microsoft Press Computer Dictionary (1991) defined "mainframe computer" as "A high-level computer designed for the most intensive computational tasks. Mainframe computers are often shared by multiple users connected to the computer via terminals." ISO 2382 (1993) defines a mainframe as "a computer, usually in a computer center, with extensive capabilities and resources to which other computers may be connected so that they can share facilities."
The Microsoft Computer Dictionary (2002) had an amusingly critical definition of mainframe: "A type of large computer system (in the past often water-cooled), the primary data processing resource for many large businesses and organizations. Some mainframe operating systems and solutions are over 40 years old and have the capacity to store year values only as two digits." ↩
IBM's book Planning a Computer System (1962) describes how the Stretch computer's circuitry was assembled into frames, with the CPU consisting of 18 frames. The picture below shows how a "frame" was, in fact, constructed from a metal frame.
The term "general-purpose computer" is probably worthy of investigation since it was used in a variety of ways. It is one of those phrases that seems obvious until you think about it more closely. On the one hand, a computer such as the Apollo Guidance Computer can be considered general purpose because it runs a variety of programs, even though the computer was designed for one specific mission. On the other hand, minicomputers were often contrasted with "general-purpose computers" because customers would buy a minicomputer for a specific application, unlike a mainframe which would be used for a variety of applications. ↩
The n-gram graph is from the Google Books Ngram Viewer. The curves on the graph should be taken with a grain of salt. First, the usage of words in published books is likely to lag behind "real world" usage. Second, the number of usages in the data set is small, especially at the beginning. Nonetheless, the n-gram graph generally agrees with what I've seen looking at documents directly. ↩
More examples of "mainframe" in want ads: A 1966 ad from Western Union in The Arizona Republic looking for experience "in a systems engineering capacity dealing with both mainframe and peripherals." A 1968 ad in The Minneapolis Star for an engineer with knowledge of "mainframe and peripheral hardware." A 1968 ad from SDS in The Los Angeles Times for an engineer to design "circuits for computer mainframes and peripheral equipment." A 1968 ad in Fort Lauderdale News for "Computer mainframe and peripheral logic design." A 1972 ad in The Los Angeles Times saying "Mainframe or peripheral [experience] highly desired." In most of these ads, the mainframe was in contrast to the peripherals. ↩
A related factor is the development of remote connections from a microcomputer to a mainframe in the 1980s. This led to the need for a word to describe the remote computer, rather than saying "I connected my home computer to the other computer." See the many books and articles on connecting "micro to mainframe." ↩
To see how the prototypical meaning of "computer" changed in the 1980s, I examined the "Computer" article in encyclopedias from that time. The 1980 Concise Encyclopedia of the Sciences discusses a large system with punched-card input. In 1980, the World Book article focused on mainframe systems, starting with a photo of an IBM System/360 Model 40 mainframe. But in the 1981 supplement and the 1984 encyclopedia, the World Book article opened with a handheld computer game, a desktop computer, and a "large-scale computer." The article described microcomputers, minicomputers, and mainframes. Funk & Wagnalls Encyclopedia (1983) was in the middle of the transition; the article focused on large computers and had photos of IBM machines, but mentioned that future growth is expected in microcomputers. By 1994, the World Book article's main focus was the personal computer, although the mainframe still had a few paragraphs and a photo. This is evidence that the prototypical meaning of "computer" underwent a dramatic shift in the early 1980s from a mainframe to a balance between small and large computers, and then to the personal computer. ↩
Anonymous 30/01/2025 | Source: Irrational Exuberance
While I frequently hear engineers bemoan a missing strategy, they rarely complete the thought by articulating why the missing strategy matters. Instead, it serves as more of a truism: the economy used to be better, children used to respect their parents, and engineering organizations used to have an engineering strategy.
This chapter starts by exploring something I believe quite strongly: there’s always an engineering strategy, even if there’s nothing written down. From there, we’ll discuss why strategy, especially written strategy, is such a valuable opportunity for organizations that take it seriously.
We’ll dig into:
By this chapter’s end, hopefully you will agree with me that strategy is an undertaking worth investing your–and your organization’s–time in.
This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts.
I’ve never worked somewhere where people didn’t claim there was no strategy. In many of those companies, they’d say there was no engineering strategy. Once I became an executive and was able to document and distribute an engineering strategy, accusations of missing strategy didn’t go away; they just shifted to focus on a missing product or company strategy.
This even happened at companies that definitively had engineering strategies, like Stripe in 2016, which had numerous pillars to a clear engineering strategy, such as:
Working there it was generally clear what the company’s engineering strategy was on any given topic. That said, it sometimes required asking around, and over time certain decisions became sufficiently contentious that it became hard to definitively answer what the strategy was. For example, the adoption of Ruby versus Java became contentious enough that I distributed a strategy attempting to mediate the disagreement, Magnitudes of exploration, although it wasn’t a particularly successful effort (for reasons that are obvious in hindsight, particularly the lack of any enforcement mechanism).
In the same sense that William Gibson said “The future is already here – it’s just not very evenly distributed,” there is always a strategy embedded into an organization’s decisions, although in many organizations that strategy is only visible to a small group, and may be quickly forgotten.
If you ever find yourself thinking that a strategy doesn’t exist, I’d encourage you to instead ask yourself where the strategy lives if you can’t find it. Once you do find it, you may also find that the strategy is quite ineffective, but I’ve simply never found that it doesn’t exist.
In “We are a product engineering company!”, we discuss Calm’s engineering strategy to address pervasive friction within the engineering team. The core of that strategy is clarifying how Calm makes major technology decisions, along with documenting the motivating goal steering those decisions: maximizing time and energy spent on creating their product.
That strategy reduced friction by eliminating the cause of ongoing debate. It was successful in resetting the team’s focus. It also caused several engineers to leave the company, because it was incompatible with their priorities. It’s easy to view that as a downside, but I don’t think it was. A documented strategy made it clear to everyone involved what sort of game we were playing and the rules for that game, and for the first time let them accurately decide if they wanted to be part of that game with the wider team.
Creating alignment is one of the ways that strategy makes an impact, but it’s certainly not the only way. Some of the ways that strategies support the organization are:
Concentrating company investment into a smaller space.
For example, deciding not to decompose a monolith allows you to invest the majority of your tooling efforts on one language, one test suite, and one deployment mechanism.
Many interesting properties are only available through universal adoption.
For example, moving to an “N-1 policy” on backfilled roles is a significant opportunity for managing costs, but only works if consistently adopted. As another example, many strategies for disaster recovery or multi-region are only viable if all infrastructure has a common configuration mechanism.
Focus execution on what truly matters.
For example, Uber’s service migration strategy allowed a four-engineer team to migrate a thousand services operated by two thousand engineers to a new provisioning and orchestration platform in less than a year. This was an extraordinarily difficult project, and it was only possible because the strategy kept execution focused on what actually mattered.
Creating a knowledge repository of how your organization thinks. Onboarding new hires, particularly senior new hires, is much more effective with documented strategy.
For example, most industry professionals today have a strongly held opinion on how to adopt large language models. New hires will have a strong opinion as well, but they’re unlikely to share your organization’s opinion unless there’s a clear document they can read to understand it.
There are some things that a strategy, even a cleverly written one, cannot do. However, it’s always been my experience that developing a strategy creates progress, even if the progress is understanding the inherent disagreement preventing agreement.
While good strategy can accomplish many things, it sometimes feels that inappropriate strategy is far more impactful. Of course, impactful in all the wrong ways. Digg V4 remains the most poorly considered strategy I’ve personally participated in. It was a complete rewrite of the Digg V3.5 codebase from a PHP monolith to a PHP frontend and a backend of a dozen Python services. It also moved the database from sharded MySQL to an early version of Cassandra. Perhaps worst of all, it replaced the nuanced algorithms developed over a decade with a hack implemented a few days before launch.
Although it’s likely Digg would have struggled to become profitable due to its reliance on search engine optimization for traffic, and Google’s frequently changing search algorithm of that era, the engineering strategy ensured we died fast rather than having an opportunity to dig our way out.
Importantly, it’s not just Digg. Almost every engineering organization you drill into will have its share of unused platform projects that captured decades of engineering years to the detriment of an important opportunity. A shocking number of senior leaders join new companies and initiate a grand migration that attempts to entirely rewrite the architecture, switch programming languages, or otherwise shift their new organization to resemble a prior organization where they understood things better.
Inappropriate versus bad
When I first wrote this section, I just labeled this sort of strategy as “bad.” The challenge with that term is that the same strategy might well be very effective in a different set of circumstances. For example, if Digg had been a three-person company with no revenue, rewriting from scratch could have been the right decision!
As a result, I’ve tried to prefer the term “inappropriate” rather than “bad” to avoid getting caught up on whether a given approach might work in other circumstances. Every approach undoubtedly works in some organization.
When I joined Carta, I noticed we had an inconsistent approach to a number of important problems. Teams had distinct standard kits for how they approached new projects. Adoption of existing internal platforms was inconsistent, as was decision making around funding new internal platforms. There was widespread agreement that we were decomposing our monolith, but no agreement on how we were doing it.
Coming into such a permissive strategy environment, with strong, differing perspectives on the ideal path forward, one of my first projects was writing down an explicit engineering strategy along with our newly formed Navigators team, itself a part of our new engineering strategy.
Navigators at Carta
As discussed in Navigators, we developed a program at Carta to explicitly name individual-contributor technical leaders to represent key parts of the engineering organization. This representative leadership group made it possible to iterate on strategy with a small team of about ten engineers that represented the entire organization, rather than take on the impossible task of negotiating with 400 engineers directly.
This written strategy made it possible to explicitly describe the problems we saw, and how we wanted to navigate those problems. Further, it was an artifact that we were able to iterate on in a small group, but then share widely for feedback from teams we might have missed.
After the initial publication, we shared it widely and talked about it frequently in engineering all-hands meetings. Then we came back to it each year, or when things stopped making much sense, and revised it. As an example, our initial strategy didn’t talk about artificial intelligence at all. A few months later, we extended it to mention a very conservative approach to using Large Language Models. Most recently, we’ve revised the artificial intelligence portion again, as we dive deeply into agentic workflows.
A lot of people have disagreed with parts of the strategy, which is great: one of the key benefits of a written strategy is that it’s possible to disagree with it precisely. From that disagreement, we’ve been able to evolve our strategy. Sometimes that’s because there’s new information, like the current rapid evolution of artificial intelligence practices; other times it’s because our initial approach could be improved, as in how we gated membership of the initial Navigators team.
New hires are able to disagree too, and do it from an informed place rather than coming across as attached to their prior company’s practices. In particular, they’re able to understand the historical thinking that motivated our decisions, even when that context is no longer obvious. At the time we paused decomposition of our monolith, there was significant friction in service provisioning, but that’s far less true today, which makes the decision seem a bit arbitrary. Only the written document can consistently communicate that context across a growing, shifting, and changing organization.
With oral history, what you believe is highly dependent on who you talk with, which shapes your view of history and the present. With written history, it’s far more possible to agree at scale, which is the prerequisite to growing at scale rather than isolating growth to small pockets of senior leadership.
We just finished talking about written strategy, and this book spends a lot of time on the topic, including a chapter on how to structure strategies to maximize readability. That emphasis isn’t just because of the positives created by written strategy, but also because of the damage unwritten strategy creates.
Vulnerable to misinterpretation.
Information flow in verbal organizations depends on an individual being in a given room for a decision, and then accurately repeating that information to the others who need it. However, it’s common to see those individuals fail to repeat that information elsewhere. Sometimes their interpretation is also faulty to some degree. Both of these create significant problems in operating strategy.
Two-headed organizations
Some years ago, I started moving towards a model where most engineering organizations I worked with have two leaders: one who’s a manager, and another who is a senior engineer. This was partially to ensure engineering context was included in senior decision making, but it was also to reduce communication errors.
Errors in point-to-point communication are so prevalent that the only solution I could find for folks who weren’t reading-oriented communicators was ensuring I had communicated strategy (and other updates) to at least two people.
Inconsistency across teams.
At one company I worked at, promotions to Staff-plus roles happened at a much higher rate in the infrastructure engineering organization than in product engineering. This created a constant drain out of product engineering to work on infrastructure-shaped problems, even if those problems weren’t particularly valuable to the business.
New leaders had no idea this informal policy existed, and they would routinely run into trouble in calibration discussions. They also weren’t aware they needed to go argue for a better policy. Worse, no one was sure if this was a real policy or not, so it was ultimately random whether this perspective was represented for any given promotion: sometimes good promotions would be blocked, sometimes borderline cases would be approved.
Inconsistency over time.
Implementing a new policy tends to be a mix of persistent and one-time actions. For example, let’s say you wanted to standardize all HTTP operations to use the same library across your codebase. You might add a linter check to reject known alternatives, and you’ll probably do a one-time pass across your codebase standardizing on that library.
However, two years later there are another three random HTTP libraries in your codebase, creeping into the cracks surrounding your linting. If the policy is written down, and a few people have read it, there are a number of ways this could nonetheless be prevented. If it’s not written down, it’s much less likely anyone will remember it, and much more likely they won’t remember the rationale well enough to argue about it.
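To make that concrete, here is a minimal sketch of such a lint check. It is only an illustration: the choice of Python, the banned-library list, and the idea of a single blessed client are assumptions for the example, not something from the policy above.
# minimal sketch of a lint check that flags non-blessed HTTP client imports;
# the banned list and the "one blessed client" rule are assumptions.
import ast
import pathlib
import sys

BANNED = {"urllib3", "httplib2", "aiohttp", "httpx"}  # example ban list

def banned_imports(path):
    """Return warnings for banned HTTP-library imports in one Python file."""
    tree = ast.parse(path.read_text(), filename=str(path))
    warnings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in BANNED:
                warnings.append(f"{path}: imports {root}, prefer the blessed client")
    return warnings

if __name__ == "__main__":
    problems = []
    for path in pathlib.Path(".").rglob("*.py"):
        problems.extend(banned_imports(path))
    print("\n".join(problems))
    sys.exit(1 if problems else 0)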
Hazard to new leadership.
When a new Staff-plus engineer or executive joins a company, it’s common to blame them for failing to understand the existing context behind decisions. That’s fair: a big part of senior leadership is uncovering and understanding context. It’s also unfair: explicit documentation of prior thinking would have made this much easier for them.
Every particularly bad new-leader onboarding I’ve seen has involved a new leader coming into a role that had sat unfilled, and that the new leader’s manager didn’t know how to do themselves. In those cases, success is entirely dependent on that new leader’s ability and interest in learning.
In most ways, the practice of documenting strategy has a lot in common with succession planning, where the full benefits accrue to the organization rather than to the individual doing it. It’s possible to maintain things when the original authors are present, but appreciating the value requires stepping outside yourself for a moment to value things that will matter most to the organization when you’re no longer a member.
Information herd immunity
A frequent objection to written strategy is that no one reads anything. There’s some truth to this: it’s extremely hard to get everyone in an organization to know something. However, I’ve never found that goal to be particularly important.
My view of information dispersal in an organization is the same as herd immunity: you don’t need everyone to know something, just enough people who know it that confusion doesn’t propagate too far.
So, it may be impossible for all engineers to know strategy details, but you certainly can have every Staff-plus engineer and engineering manager know those details.
While I believe that the largest benefits of strategy accrue to the organization, rather than the individual creating it, I also believe that strategy is an underrated avenue for self-development.
The ways that I’ve seen strategy support personal development are:
Creating strategy builds self-awareness.
Starting with a concrete example, I’ve worked with several engineers who viewed themselves as extremely senior, but frequently demanded that projects be implemented using new programming languages or technologies because they personally wanted to learn about the technology. Their internal strategy was clear–they wanted to work on something fun–but following the steps to build an engineering strategy would have produced a strategy that even they agreed didn’t make sense.
Strategy supports situational awareness in new environments.
Wardley mapping talks a lot about situational awareness as a prerequisite to good strategy: ensuring you understand the realities of your circumstances. Failing to do so is the most destructive mistake new senior engineering leaders make. By explicitly stating the diagnosis under which a strategy applied, you make it easier to debug why reusing a prior strategy in a new team or company might not work.
Strategy as your personal archive.
Just as documented strategy is institutional memory, it also serves as personal memory to understand the impact of your prior approaches. Each of us is an archivist of our prior work, pulling out the most valuable pieces to address the problem at hand. Over a long career, memory fades–and motivated reasoning creeps in–but explicit documentation doesn’t.
Indeed, part of the reason I started working on this book now rather than later is that I realized I was starting to forget the details of the strategy work I did earlier in my career. If I wanted to preserve the wisdom of that era, and ensure I didn’t have to relearn the same lessons in the future, I had to write it now.
We’ve covered why strategy can be a valuable learning mechanism for both your engineering organization and for you. We’ve shown how strategies have helped organizations deal with service migrations, monolith decomposition, and right-sizing backfilling. We’ve also discussed how inappropriate strategy contributed to Digg’s demise.
However, if I had to pick two things to emphasize as this chapter ends, it wouldn’t be any of those things. Rather, it would be two themes that I find are the most frequently ignored:
Discussions around topics like strategy often get caught up in high prestige activities like making controversial decisions, but the most effective strategists I’ve seen make more progress by actually performing the basics: writing things down, exploring widely to see how other companies solve the same problem, accepting feedback into their draft from folks who disagree with them. Strategy is useful, and doing strategy can be simple, too.
Armin Ronacher 30/01/2025 | Source: Armin Ronacher's Thoughts and Writings
After Musk's acquisition of Twitter, many people I respect and follow moved to Bluesky. I created an account there and made an honest attempt of making it my primary platform. Sadly, I found Bluesky to be surprisingly hostile towards AI content. There is an almost religious resistance to AI on there, at least in whatever corner of the platform I ended up in.
Despite these challenges, some individuals on both Twitter and Bluesky have shown genuine curiosity about my AI usage. In this post, I want to share how I use Large Language Models and why I find them so helpful.
Before moving on, I want to include an important disclaimer: I am by no means an expert in AI; I'm mostly an enthusiastic user. Absolutely nothing here is novel! What I do here is pretty boring, which to some degree is the point. I won't be covering the underlying technology or my programmatic experience. This is strictly about how I use AI as a “techy consumer”.
In addition, as you read through this article you will probably figure out rather quickly that the way I use AI — despite being used in part for content creation — does not really impact intellectual property much. That said, I'm curious and open to discussions about how we should be dealing with this problem. Particularly on Bluesky a lot of the negativity towards AI is related to watering down of copyrights and human creation. I don't know the answers to these things, but I think we need to have a productive dialog here rather than wishing for the technology to go away.
In short: AI makes me significantly more productive. I recently read Aaron Francis's Tweet about how he gets a week's worth of productivity out of a day now thanks to AI. I'm not anywhere close to that, but I use AI a lot now. It has become indispensable for me for both content writing as well as programming.
Moreover, a common misconception is that AI is still at the level it was when ChatGPT first launched. Many people tried it early, saw its limitations, and never returned. However, AI is evolving rapidly. If you haven't kept up, you risk drawing inaccurate conclusions based on outdated impressions. In some sense I believe people who haven't tried AI yet are in a better position to judge than the folks who used it two years ago and never returned.
I work with a variety of AI tools, mostly because of professional curiosity and to a smaller degree because each tool excels at something slightly different. Here are the ones I use most often:
It's probably worth mentioning that you can get most of these benefits from paying for just a single AI tool. I think as one expands their use, and especially as one gets better at writing prompts, the desire naturally grows to use more tooling. As for which models (and services) to use day to day, I don't have a particularly strong strategy and my preferences change quickly. For instance, after DeepSeek's R1 release I started exploring it quite a bit for programming — and it's doing a phenomenal job at it — and as of writing that's just a few days old.
If you want to run models locally, Apple Silicon machines currently offer some of the best “bang for your buck” in terms of performance, power usage, and money. With tailscale, I can even access my MacBook's Open WebUI interface from my phone, as long as it is powered on.
This is a bit of a plug for uv quite honestly. If you have it, it's crazy quick to get Open WebUI and llm running locally:
uv tool install -p python3.11 open-webui
uv tool install --with llm-ollama --with llm-claude-3 --with llm-deepseek llm
Afterwards you can launch open-webui and use the llm tool:
open-webui serve
llm keys set deepseek
llm -m deepseek-coder '# write timsort in python'
Functions you can import to the WebUI:
One frequent concern I hear is “you cannot trust LLMs” because they tend to hallucinate. I get this in particular when I explain that I frequently use this as a replacement for Google! However, I approach the risk of hallucination the same way I would when seeking advice from another human: people can be and routinely are wrong, and you learn to cross-check selectively.
I treat AI as I would a collaborator or a pretty good intern but I remain responsible for the final outcome. In this case the intern also happens to get better month by month as models improve. And a bit like a human, that digital intern has morals and wants to be argued with. Except, of course, that some of those AI interns don't want to talk about China, while others get a stroke if you talk about certain people. But regardless of how good they get, in the end, it's my fault and my fault alone if I do the wrong thing. I won't blame the AI and I need to spot check.
However, the logical conclusion of this is not that it's wrong all the time and you need to check everything, or that you cannot trust it at all. It's similar to how you engage in a technical discussion with others about a problem. I have seen more than one situation where the conventional wisdom in the room is just wrong for a few minutes, until someone points out that we had it wrong.
Another major advantage is that AI tools are relatively open. You can run models locally and integrate them with scripts. Even the famous OpenAI, which is not at all open, is much more open than a Google search is.
In other words, you can integrate it locally and nobody stops you. By contrast, many, many years ago I had a tool on my computer that let me issue web searches and extract text from the results. It stopped working so long ago that I almost forgot about it, because there is basically no competition in search and Google does not want me to use it like that.
For instance, you can create a simple script for grammar checking right from your command line:
#!/bin/sh
# use a local model when offline, a remote one when online
MODEL=phi4:latest
if ping -q -c1 google.com >/dev/null 2>&1; then
  MODEL=claude-3-5-sonnet-latest
fi
OLD_TEXT="$(cat)"
llm -m $MODEL "$OLD_TEXT" -s "fix spelling and grammar in the given text,
and reply with the improved text and no extra commentary.
Use double spacing."
This script can automatically switch between a local model (phi4 via Ollama) and a remote one (claude-3-5-sonnet-latest) based on internet connectivity. With a command like !llm-spell in Vim, I can fix up sentences with a single step.
Or you can manipulate the contents of the clipboard like this:
pbpaste | llm-spell | pbcopy && say "AI is done"
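If you would rather drive that same round-trip from Python instead of the shell, here is a rough sketch; it assumes macOS (for pbpaste, pbcopy, and say) and that the llm CLI shown above is on your PATH.
# rough sketch of the clipboard round-trip in Python; assumes macOS tools
# (pbpaste/pbcopy/say) and the llm CLI from above being installed.
import subprocess

PROMPT = ("fix spelling and grammar in the given text, "
          "and reply with the improved text and no extra commentary.")

def llm_spell(text, model="claude-3-5-sonnet-latest"):
    """Pipe text through the llm CLI with the fix-spelling system prompt."""
    result = subprocess.run(
        ["llm", "-m", model, text, "-s", PROMPT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

clipboard = subprocess.run(["pbpaste"], capture_output=True, text=True).stdout
subprocess.run(["pbcopy"], input=llm_spell(clipboard), text=True)
subprocess.run(["say", "AI is done"])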
I don't let AI write my articles. As a non-native speaker, mistakes and linguistic quirks are part of me. That said, I do rely on AI tools to help me write.
Often, after I have made my first pass through a page, I ask an LLM to read through it and give me comments. I have an Apple Note with various prompts I can use, or I just come up with what I need in the moment. The process is not particularly scripted out. I basically talk to the LLM to figure out what it thinks of the text, etc.
Here are some of the things I use AI for when writing:
This is probably the bulk of what I do and it falls into three major categories.
Now, it's not great at all languages. I find it performs exceptionally well at Python, somewhat okay at JavaScript, and horribly at Rust. However, particularly if I want to solve an algorithmic problem, I can often iterate in Python until I have what I want, then ask it to translate it to Rust. In my experience, at least, that's a decent way to get going quickly on some harder problems.
Pair programming is fantastic, but not always possible if you're programming alone at home. An LLM can serve as a stand-in discussion partner — a poor man's interlocutor. Thankfully it's actually not all that poor, because it's really good at telling you what stuff is called. This is most fun if you do it with voice mode, but it also works quite well in text.
Here is a motivating example from voice mode:
A: So, do you remember if you have, like, strcat, it always copies from the start, but if you keep concatenating to the same buffer, it's really inefficient. So, remember what the Big-O of that is if you do it in a loop, and wasn't there, like, a humorous name for this kind of algorithm?
AI: You’re thinking of the “Schlemiel the Painter’s algorithm.” It’s a humorous way of describing the inefficiency of repeatedly appending to a string using operations like strcat, which copies from the start each time. The Big O complexity for this operation is O(n^2) if done repeatedly in a loop, since each strcat operation takes time proportional to the current length of the string.
It's also fun and educational for my kids. They ask a lot of questions, and ChatGPT can tailor its explanations to their age level. If the explanation is still too advanced, I can immediately ask it to simplify further. If it was too dumbed down, I can make it go the other way. Now again, there is always a risk that it gets it wrong, but that is okay. Part of all of this is to convey to the children the idea that we need to validate it and that being critical is important.
What makes ChatGPT so enjoyable here is that it's able to keep the conversation going — it has state. Answered in large metric numbers? Can always ask it to scale to elephants or soccer fields.
ChatGPT is also incredibly helpful when having to work with multiple languages. For a recent example, my kids have Greek friends and we tried to understand the difference between some Greek words that came up. I have no idea how to write them, and Google Translate does not understand my attempts at pronouncing them either. However, ChatGPT does. If I ask it in voice mode what “pa-me-spee-tee” means in Greek, it knows what I tried to mumble and replies in a helpful manner.
Lastly the use on the go. Sometimes I'm just not in a position where I can actually write it down on a keyboard, but if I basically talk to ChatGPT, it can transcribe it down and it will also remove some of my “um's” and it will just be there and I can copy-paste it later and edit it. (To prove a point, I transcribed the previous sentence with the mobile app). This is a very different experience than transcribing on iOS natively. It understands enough context of what I'm babbling that it will start and stop transcribing. Now this is not perfect, but pretty damn good and incredibly helpful.
The multilingual aspect is particularly helpful because our family is multilingual. Being able to fluidly switch between German, Russian and English is such a refreshing experience.
I mentioned earlier that I don't let LLMs write my texts. I also don't use AI to make illustrations, though I have in the past. The reason is that there is a certain style that goes along with these illustrations, which is just incredibly off-putting. They are noticeably AI-generated, and typically quite low-quality. People have taken to calling these images “AI slop” and I personally respond really badly to it. When someone throws me a pull request, an email or a text message that is obviously AI-generated without disclosing this, I immediately have a very low opinion of them.
Slop, like hallucinations, is a problem, but only if you don't use your brain. Even the worst slop can be the foundation of really good content. I'm a horrible artist, but I can use Illustrator. Even an AI slop image can help me trace the person in the pose I wanted. Likewise, you can throw your notes into a document and let the AI imagine a story around them. You probably can't use that story right away, but you can use it as potential inspiration.
AI tools, at their best, feel less like disruptive and dark magic and more like a natural extension of the creative process as long as you see them as curious collaborators. My approach isn't about outsourcing thinking, but augmenting it: using LLMs to accelerate grunt work, untangle mental knots, and prototype ideas faster. Skepticism is healthy, but dismissing AI outright risks missing its potential as a multiplier for those willing to engage critically.
Osvaldo Santana 28/01/2025 | Source: osantana.me
I have always hated learning languages. I only make an effort to learn the basics of a language when there is truly no other way to survive without it. That is how I ended up learning enough English and Portuguese.
I remember that, in school, I hated Portuguese classes. The only part I really enjoyed was Literature. Everything else was pure torment.
But fate, with its peculiar sense of humor, made my two Portuguese teachers next-door neighbors. In even years I had classes with one of them; in odd years, with the other.
Since I was always stuck in remedial classes for that subject, my mother ended up paying for private lessons, with the teacher from the other year. The result? I had twice as many Portuguese lessons, with both neighbors, throughout the entire year.
Because of that, I ended up learning some “completely useless useful” things. For example, I know that every paroxytone ending in a rising diphthong used to be accented, but that, with the latest spelling reform, this is no longer true. Where “idéia” once carried an accent, it no longer does.
Another thing I learned, through a great deal of pain and suffering, was syntactic analysis: the part of grammar that dissects sentences and shows how they are composed, with their subjects and predicates, verbs and their “transitivities”, objects, and so on. I understood all of it, without ever grasping the most important thing: why was I learning it in the first place?
By that time I already knew how to program and had even started working professionally as a programmer (yes, well before the law allowed it). Even so, I could not “connect the dots” and see the relationship between grammar and programming.
Programming is communicating. And to communicate, we need a language shared by everyone involved in the process.
Over the years, computing has produced countless programming languages, each with its own characteristics and purposes. These languages ease communication between programmers and computers, and they also let programmers understand one another.
For example, any programmer who knows Python, together with any computer that has a Python interpreter, will be able to understand and run code written in that language.
Just like natural languages, programming languages have a grammar. They concern themselves with syntax, semantics, pragmatics, and other aspects that belong to the study of language. They also have their own lexicon and vocabulary which, even if not strictly under the umbrella of grammar, are part of the linguistic domain.
Python, like many other programming languages, is an imperative language. That means you write commands to instruct “how” things should happen. In this paradigm, the focus is on specific actions and steps.
On the other hand, languages that follow other paradigms (functional, declarative, logic, markup) are not so concerned with the “how”, but rather with “what” should be done.
Here we can draw an interesting parallel between the imperative paradigm in programming and the imperative mood in Portuguese: both deal with commands and orders. In Portuguese, the imperative conjugation of verbs is used to instruct or demand actions, which is exactly what we do when writing code in an imperative programming language.
If programming is communicating, and we use languages to convey something, could knowledge of the grammar of a natural language (Portuguese, English, German, etc.) help us improve our code?
I believe it can. Below, I list some practices I adopt when developing my projects.
Worth remembering: I am talking about software projects written in imperative programming languages, not about literary works. The suggestions I am about to present may improve your code, but they would probably be terrible choices for your short story, novel, or academic paper.
Writing simple sentences makes comprehension easier. The most basic sentence structure follows the pattern:
Subject - Verb - Complement
The subject can take many forms, but it is usually represented by a noun. In object-oriented languages, the subject is (almost) always an object. So a simple sentence in code could look something like:
objeto.verbo(complemento) # or...
objeto.verboComplemento()
Seems obvious, right? Yes, it is basic, but put this way it becomes clear that an object's methods should necessarily be verbs, and that the parameters passed to a method act as complements, along the lines of the syntactic analysis of verb transitivity:
Transitive verbs require a complement. Example: “She bought a book.” (complement: “a book”).
Intransitive verbs need no complement. Example: “She slept.”
If the verb used in a method's name is transitive, remember that it requires a complement for the action to make sense. This choice helps make the code more intuitive and understandable.
Notice that the example sentence contains only one verb. By keeping this structure, we also ensure that each method performs only one action:
1 verb → 1 action → 1 responsibility
Working this way helps us ensure that our methods respect the Single Responsibility Principle (SRP), which states that each method or function should have only one clear, well-defined responsibility.
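To make this concrete in code, here is a small illustrative sketch (the class and method names are mine, not the author's): transitive verbs become methods that demand a complement, intransitive verbs take none, and each method performs exactly one action.
# illustrative sketch: method names are verbs, each with a single action;
# the Customer class and its methods are invented for this example.
class Customer:
    def __init__(self, name):
        self.name = name
        self.books = []
        self.asleep = False

    def buy(self, book):
        # "buy" is transitive: the action only makes sense with a complement.
        self.books.append(book)

    def sleep(self):
        # "sleep" is intransitive: no complement is needed.
        self.asleep = True

ela = Customer("Ana")
ela.buy("um livro")  # She bought a book.
ela.sleep()          # She slept.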
Interfaces are a means of providing access to components of your system. There are several kinds of interfaces we can use, and one of them is the REST API, which uses the HTTP protocol.
HTTP is a stateless protocol: a client makes a request for a resource/document/object on a server, and the server returns a response to the client. The format of an HTTP request looks roughly like this:
METHOD /url
Header1: ...
HeaderN: ...
<payload>
Some of the available METHODs are GET, POST, PUT, PATCH, DELETE, etc. Notice that all of the methods are verbs, that is, they indicate actions requesting something from the server. Following our grammatical logic, we can infer that the /url is the complement (usually represented by a noun).
As mentioned earlier, it is important to keep sentences simple, that is, with only one verb per request. A good interface should follow this pattern:
GET /users/1
POST /users
DELETE /users/c0ffe
However, it is common to find requests that carry two verbs in the same sentence:
POST /contracts/1/remove
GET /documents/2/cancel
Notice how strange that sounds? Cases like this usually show up when the action involves more complex processes. For example, cancelling a contract may require several steps, and a simple DELETE /contracts/1 may not be enough. When that happens, a good practice is to turn the second verb into a noun, treating the process as an object.
For example, instead of using something like:
POST /contracts/1/cancel # verb + verb
I prefer to restructure it like this:
POST /contracts/1/cancellation # verb + noun
This approach not only makes the request more natural, it also better reflects the grammatical logic and keeps the interface consistent.
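As a rough sketch of what that can look like on the server side (Flask is used here purely as an example framework, and the route and handler names are assumptions rather than the author's code), the cancellation becomes a resource of its own:
# rough sketch: model "cancellation" as a resource you create with POST,
# instead of a second verb in the URL. Flask and the handler logic are
# illustrative assumptions.
from flask import Flask, jsonify

app = Flask(__name__)

@app.post("/contracts/<int:contract_id>/cancellation")
def create_cancellation(contract_id):
    # Cancelling may involve several steps (notifications, billing, etc.),
    # so we create a cancellation object rather than issuing a bare DELETE.
    cancellation = {"contract_id": contract_id, "status": "requested"}
    return jsonify(cancellation), 201

if __name__ == "__main__":
    app.run()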
Augusto Campos 27/01/2025 | Source: TRILUX
I am still not vegan, and maybe I never will be. But the growing awareness that there are so many nice people around me who are vegan makes me moderate, time and again, things I was about to say: choosing a better example, avoiding a particular metaphor, being less descriptive about some detail, and so on.
I am not complaining: it is a voluntary, positive moderation, and it costs me nothing.
Although it is aimed at not displeasing other people (whom I like a lot), I also think I have reached a point where saying the things I now leave unsaid is simply no longer part of my scope.
I have more things to say without needing to talk about those.
The article "Uma transição em andamento, quanto ao discurso" was originally published on TRILUX, Augusto Campos's website.
Augusto Campos 27/01/2025 | Source: TRILUX
"Não há o que fazer" é a resposta de quem já decidiu que não se importa que o seu grupo vai sofrer mais do que os grupos que se prepararam.
O artigo "Se preparar é melhor do que a alternativa" foi originalmente publicado no site TRILUX, de Augusto Campos.
Nicole Tietz-Sokolskaya 27/01/2025 | Source: ntietz.com blog
It's kind of dark times right now. And I'm definitely only talking about the days being short.
It's pretty dark out right now, since it's the winter in the northern hemisphere. Every year, I start to realize somewhere around January that I'm tired, really tired, and don't want to do anything. I'm not sure that this is seasonal affective disorder, but I did start to explore it and found that bright light makes me feel a lot better in the winter.
The problem is, I go through this discovery every year, it seems.
After I first learned that bright light can actually help you feel better in the winter, I got a Happy Light. That's the actual branding on it, and it might have helped, maybe? But it was really inconvenient to use.
I got one of the more affordable ones, which meant a small panel I had to sit uncomfortably close to. And I was supposed to use it almost first thing in the morning for 30 minutes. That's... really not going to fit in my routine, especially now that I have two young kids who are raring to go right at 7am, and they do not want to deal with mom sitting in front of a lamp that early.
So I just kind of stopped using it, and would poke at it in the drawer occasionally but not really do anything with it. Instead, I got a few more lamps in my office. These seemed to help, a little, but that might just be that I like my spaces well lit1.
Somewhere along the line I saw a post talking about winter energy and light. And the author suggested that indoor lights just have to be bright. They recommended some high-watt LED bulbs.
I went ahead and ordered some and... these are very bright. They're 100 watts of LEDs each, and I have two of them. I plugged them into these plug-in fixtures that have inline switches.
These seemed to really help me out a lot. On days I used them, I was more energetic and felt better. But I had a lot of days when I didn't use them, because using them was inconvenient.
The main problems were that I had to turn on two switches manually, and they were too bright to look at. Turning them on doesn't sound like a lot, but doing it every day when you're already struggling can be just enough friction to avoid doing it! The brightness was the biggest issue, because they were blinding to look at and cast this reverse shadow out of my office. The brightness was the point, though, so how do I deal with that?
Turning on the switches was easy. I put them on some smart outlets, set a schedule, and now they turn on and off at my predetermined times!
Then I had to make a shade for the lamps. My main goal here is to diffuse the light a bit, so I was thinking of a small wooden frame wrapped in a light white fabric. That would spread the light out and avoid having any spots that are too bright to look at.
I took measurements for where it was going to fit, then headed out to my workshop. Then I ripped a scrap board into 0.75"x0.75" stock. I ended up with just enough for the design.
This is what I ended up with, after it was glued up.
Here's a bonus detail shot of the extremely unnecessary joinery here. But hey, this joinery was incredibly satisfying, and the whole thing went together without any nails or screws. Joy makes it worthwhile.
And then with the fabric attached2, it ended up looking pretty nice. In its intended home, you can see that it does its job!
It's now not painful to look straight at the lamp. You wouldn't want to do it, but it won't hurt you at least!
Of course, we need these bright lights during the winter, during the dark days. But summer will come again.
Not all of us will see the long days. We'll lose some to the winter. But summer will come again, and we'll be able to go outside freely. Soak in the sun. Be ourselves. Dance in sunshowers.
It's hard to see that in the darkest of times. I'm writing this for myself as much as I'm writing it for you. But believe me: you are strong, and I am strong, and together, we will survive.
It seems like this winter is going to last a long time. And it probably will. But I'm not just talking about the winter, and the dark times are going to end, eventually.
Most places are very poorly lit and don't have enough lamps. You should have a lot of lamps in every room to make it as bright and cheery as possible! I refuse to believe otherwise, but my wife, a gremlin of the darkness3, does like rooms quite dim.
My wife helped with this step. She works with fabric a lot, and this was our first project combining our respective crafts. Certainly not our last!
She approved this phrasing.
Carson Gross 27/01/2025 | Source: > htmx - high power tools for html
I’m very excited to be able to interview @defunkt, the author of pjax, an early hypermedia-oriented javascript library that served as an inspiration for intercooler.js, which later became htmx. He’s done a few other things too, like co-founding GitHub, but in this interview I want to focus on pjax, how it came to be, what influenced it and what it in turn influenced.
Thank you for agreeing to an interview @defunkt!
Q: To begin with, why don’t you give the readers a bit of your background both professionally & technically:
I think I can sum up most of my technical background in two quick anecdotes:
For “show and tell” in 6th grade, I brought in a printout of a web page I had made - including its source code. I like to imagine that everyone was impressed.
Right after 7th grade, a bunch of rowdy high schoolers took me down to the local university during a Linux installfest and put Red Hat on my family’s old PC. That became my main computer for all of high school.
So pretty much from the start I was a web-slinging, UNIX-loving hippie.
In terms of coding, I started on QBasic using the IBM PC running OS/2 in my grandparents’ basement. Then I got deep into MUDs (and MUSHes and MUXes and MOOs…) which were written in C and usually included their own custom scripting language. Writing C was “hardcoding”, writing scripts was “softcoding”. I had no idea what I was doing in C, but I really liked the softcoding aspect.
The same rowdy high schoolers who introduced me to Linux gave me the O’Reilly camel book and told me to learn Perl. I did not enjoy it. But they also showed me php3, and suddenly it all came together: HTML combined with MUD-like softcoding. I was hooked.
I tried other things like ASP 3.0 and Visual Basic, but ultimately PHP was my jam for all of high school. I loved making dynamic webpages, and I loved Linux servers. My friends and I had a comedy website in high school that shall remain nameless, and I wrote the whole mysql/php backend myself before blogging software was popular. It was so much fun.
My first year of college I switched to Gentoo and became fascinated with their package manager, which was written in Python. You could write real Linux tools with it, which was amazing, but at the time the web story felt weak.
I bought the huge Python O’Reilly book and was making my way through it when, randomly, I discovered Ruby on Rails. It hit me like a bolt of lightning and suddenly my PHP and Python days were over.
At the same time, Web 2.0 had just been coined and JavaScript was, like, “Hey, everyone. I’ve been here all along.” So as I was learning Rails, I was also learning JavaScript. Rails had helpers to abstract the JS away, but I actually really liked the language (mostly) and wanted to learn it without relying on a framework or library.
The combination of administering my own Linux servers, writing backend code in Rails, and writing frontend code in JavaScript made me fall deeper in love with the web as a platform and exposed me to concepts like REST and HATEOAS. Which, as someone who had been writing HTML for over a decade, felt natural and made sense.
GitHub launched in 2008 powered by, surprise, Gentoo, Rails, and JavaScript. But due to GitHub’s position as not just a Rails community, but a collection of programming communities, I quickly evolved into a massive polyglot.
I went back and learned Python, competing in a few programming competitions like Django Dash and attending (and speaking) at different PyCons. I learned Objective-C and made Mac (and later iPhone) apps. I learned Scheme and Lisp, eventually switching to Emacs from Vim and writing tons of Emacs Lisp. I went back and learned what all the sigils mean in Perl. Then Lua, Java, C++, C, even C# - I wanted to try everything.
And I’m still that way today. I’ve written projects in Go, Rust, Haskell, OCaml, F#, all sorts of Lisps (Chicken Scheme, Clojure, Racket, Gambit), and more. I’ve written a dozen programming languages, including a few that can actually do something. Right now I’m learning Zig.
But I always go back to the web. It’s why I created the Atom text editor using web technologies, it’s why Electron exists, and it’s why I just cofounded the Ladybird Browser Initiative with Andreas Kling to develop the independent, open source Ladybird web browser.
Q: Can you give me the history of how pjax came to be?
It all starts with XMLHttpRequest, of course. Ajax. When I was growing up, walking to school both ways uphill in the snow, the web was simple: you clicked on a link and a new web page loaded. Nothing fancy. It was a thing of beauty, and it was good.
Then folks started building email clients and all sorts of application-like programs in HTML using <frames> and friends. It was not very beautiful, and not very good, but there was something there.
Luckily, in the mid-2000s, Gmail and Ajax changed things. Hotmail had been around for a while, but Gmail was fast. By updating content without a full page load using XMLHttpRequest, you could make a webpage that felt like a desktop application without resorting to frames or other chicanery. And while other sites had used Ajax before Gmail, Gmail became so popular that it really put this technique on the map.
Soon Ajax, along with the ability to add rounded corners to web pages, ushered in the era known as Web 2.0. By 2010, more and more web developers were pushing more and more of their code into JavaScript and loading dynamic content with Ajax. There was just one problem: in the original, good model of the web, each page had a unique URL that you could use to load its content in any context. This is one of the innovations of the web. When using Ajax, however, the URL doesn’t change. And even worse, it can’t be changed - not the part that gets read by the server, anyway. The web was broken.
As is tradition, developers created hacks to work around this limitation. The era of the #! began, pioneered by Ajax-heavy sites like Facebook and Twitter. Instead of http://twitter.com/htmx_org, you’d see http://twitter.com/#!/htmx_org in your browser’s URL bar when visiting someone’s profile. The # was traditionally used for anchor tags, to link to a sub-section within a full web page, and could be modified by JavaScript. These ancient web 2.0 developers took advantage of #’s malleability and started using it to represent permanent content that could be updated inline, much like a real URL. The only problem was that your server code never saw the # part of a URL when serving a request, so now you needed to start changing your backend architecture to make everything work.
Oh, and it was all very buggy. That was a problem too.
As an HTTP purist, I detested the #!. But I didn’t have a better way.
Time passed and lo, a solution appeared. One magical day, the #!s quietly disappeared from Facebook, replaced by good old fashioned URLs. Had they abandoned Web 2.0? No… they had found a better way.
The history.pushState() function, along with its sibling history.replaceState(), had been recently added to all major web browsers. Facebook quickly took advantage of this new API to update the full URL in your browser whenever changing content via Ajax, returning the web to its previous glory.
And so there it was: the Missing Link.
We had our solution, but now a new problem: GitHub was not an SPA, and I didn’t want it to be one. By 2011 I had been writing JavaScript for six years - more than enough time to know that too much JS is a terrible thing. The original GitHub Issue Tracker was a Gmail-style web application built entirely in JS, circa 2009. It was an awful experience for me, GitHub developers, and, ultimately, our users.
That said, I still believed Ajax could dramatically speed up a web page’s user interface and improve the overall experience. I just didn’t want to do it by writing lots of, or any, JavaScript. I liked the simple request/response paradigm that the web was built on.
Thus, Pjax was born. It sped up GitHub’s UI by loading new pages via Ajax instead of full page loads, correctly updating URLs while not requiring any JS beyond the Pjax library itself. Our developers could just tag a link with [data-pjax] and our backend application would then automatically render a page’s content without any layout, quickly getting you just the data you need without asking the browser to reload any JS or CSS or HTML that didn’t need to change. It also (mostly) worked with the back button, just like regular web pages, and it had a JS API if you did need to dip into the dark side and write something custom.
The first commit to Pjax was Feb 26, 2011 and it was released publicly in late March 2011, after we had been using it to power GitHub.com for some time.
Q: I recall it being a big deal in the rails community. Did the advent of turbolinks hurt adoption there?
My goal wasn’t really adoption of the library. If it was, I probably would have put in the work to decouple it from jQuery. At the time, I was deep in building GitHub and wasn’t the best steward of my many existing open source projects.
What I wanted instead was adoption of the idea - I wanted people to know about pushState(), and I wanted people to know there were ways to build websites other than just doing everything by hand in JavaScript. Rendering pages in whole or in part on the server was still viable, and could be sped up using modern techniques.
Turbolinks being created and integrated into Rails was amazing to see, and not entirely surprising. I was a huge fan of Sam Stephenson’s work even pre-GitHub, and we had very similar ideas about HTTP and the web. Part of my thinking was influenced by him and the Rails community, and part of what drew me to the Rails community was the shared ideas around what’s great about the web.
Besides being coupled to jQuery, pjax’s approach was quite limited. It was a simple library. I knew that other people could take it further, and I’m glad they did.
Q: How much “theory” was there to pjax? Did you think much about hypermedia, REST, etc. when you were building it? (I backed into the theory after I had built intercooler, curious how it went for you!)
Not much. It started by appending ?pjax=1 to every request, but before release we switched it to send an X-PJAX header instead. Very fancy.
Early GitHub developer Rick Olson (@technoweenie), also from the Rails community, was the person who introduced me to HATEOAS and drove that philosophy in GitHub’s API. So anything good about Pjax came from him and Josh Peek, another early Rails-er.
My focus was mostly on the user experience, the developer experience, and trying to stick to what made the web great.
Carson Gross 27/01/2025 | Source: > htmx - high power tools for html
I’m delighted to be able to interview Makinde Adeagbo, one of the creators of Primer, an hypermedia-oriented javascript library that was being used at Facebook around 2010.
Thank you for agreeing to an interview!
Q: To begin with, why don’t you give the readers a bit of your background both professionally & technically?
I’ve always been into tech. In high school, I used to build computers for friends and family. I took the computer science classes my high school offered and went on to study computer science in college. I was always amazed by the fact that I could build cool things—games, tools, etc.—with just a computer and an internet connection.
I was lucky enough to participate in Explore Microsoft, an internship that identifies underrepresented college freshmen and gives them a shot at working at Microsoft. After that experience, I was sold on software as my future. I later interned at Apple and Microsoft again. During college, I also worked at Facebook when the company was about 150 employees. It was an incredible experience where engineers had near-total freedom to build and contribute to the company’s growth. It was exactly what I needed early in my career, and I thrived. From there, I went on to work at Dropbox and Pinterest and also co-founded the nonprofit, /dev/color.
Q: Can you give me the history of how Primer came to be?
In 2010, the Facebook website was sloooow. This wasn’t the fault of any specific person—each engineer was adding features and, along the way, small amounts of JavaScript. However, we didn’t have a coherent system for sharing libraries or tracking how much JavaScript was being shipped with each page. Over time, this led to the 90th-percentile page load time ballooning to about 10 seconds! Midway through the year, reducing that load time by half became one of the company’s three top priorities. I was on a small team of engineers tasked with making it happen.
As we investigated where most of the JavaScript was coming from, we noticed the majority of it was performing simple tasks. These tasks involved either fetching additional data or markup from the server, or submitting a form and then receiving more markup to update the page. With limited time, we decided to build a small solution to abstract those patterns and reduce the amount of code needed on the page.
Tom Occhino and I built the first version of Primer and converted a few use cases ourselves to ensure it worked well. Once we were confident, we brought more engineers into the effort to scale it across the codebase.
Q: Primer & React were both created at Facebook. Was there any internal competition or discussion between the teams? What did that look like?
The two projects came from different eras, needs, and parts of the codebase. As far as I know, there was never any competition between them.
Primer worked well for the type of website we were building in 2010. A key part of its success was understanding that it wasn’t meant to handle every use case. It was an 80/20 solution, and we didn’t use it for particularly complex interactions (like the interface for creating a new post).
React emerged from a completely different challenge: the ads tools. Managing, composing, and tracking hundreds of ads required a highly involved, complex interface. I’m not sure if they ever attempted to use Primer for it, but it would have been a miserable experience. We didn’t have the terminology at the time, but this was a classic example of a single-page application needing purpose-built tools. The users of that site also had a very different profile from someone browsing their home feed or clicking through photos.
Q: Why do you think Primer ultimately failed at Facebook?
I don’t think there’s any single technical solution that has spanned 15 years in Facebook’s platform. The site’s needs evolve, technology changes, and the internet’s landscape shifts over time. Primer served the site well for its time and constraints, but eventually, the product demanded richer interactivity, which wasn’t what Primer was designed for.
Other tradeoffs also come into play: developer ease/speed, security, scalability. These priorities and tradeoffs change over time, especially as a company grows 10x in size.
More broadly, these things tend to work in cycles in the industry. Streamlined, fast solutions give way to richer, heavier tools, which eventually cycle back to streamlined and fast. I wouldn’t be surprised if something like Primer made a comeback at some point.
Q: How much “theory” was there to Primer? Did you think much about hypermedia, REST, etc., when you were building it?
Not much. Honestly, I was young and didn’t know a ton about the internet’s history or past research. I was drawn to the simplicity of the web’s underlying building blocks and thought it was fun to use those tools as they were designed. But, as always, the web is a layer cake of hacks and bandaids, so you have to be flexible.
Q: What were the most important technical lessons you took away from Primer?
Honestly, the biggest lessons were about people. Building a system like Primer is one thing, but for it to succeed, you have to train hundreds of engineers to use it. You have to teach them to think differently about building things, ask questions at the right time, and avoid going too far in the wrong direction. At the end of the day, even if the system is perfect, if engineers hate using it, it won’t succeed.
Carson Gross 27/01/2025 | Source: > htmx - high power tools for html
Mike Amundsen is a computer programmer, author, and speaker, and is one of the world’s leading experts on REST & hypermedia. He has been writing about REST and hypermedia since 2008 and has published two books on the ideas:
Mike agreed to do an interview with me on his view of the history of hypermedia and where things are today.
Q: The “standard” history of hypermedia is Vannevar Bush’s “As We May Think”, followed by Nelson introducing the term “hypermedia” in 1963, Engelbart’s “Mother of all Demos” in 1968, and then Berners-Lee creating The Web in 1990. Are there any other important points you see along the way?
I think starting the history of what I call the “modern web” with Bush makes a lot of sense. Primarily because you can directly link Bush to Engelbart to Nelson to Berners-Lee to Fielding. That’s more than half a century of scholarship, design, and implementation that we can study, learn from, and expand upon.
At the same time, I think there is an unsung hero in the hypermedia story; one that stretches back to the early 20th century. I am referring to the Belgian author and entrepreneur Paul Otlet. Otlet had a vision of a multimedia information system he named the “World Wide Network”. He saw how we could combine text, audio, and video into a mix of live and on-demand replay of content from around the world. He even envisioned a kind of multimedia workstation that supported searching, storing, and playing content in what is the earliest instance I can find of an understanding of what we call “streaming services” today.
To back all this up, he created a community of researchers who would read monographs, articles, and books, then summarize them to fit on a page or less. He then designed an identification system – much like our URI/URN/URLs today – and created a massive card catalog system to enable searching and collating the results into a package that could be shared – even by postal service – with recipients. He created web search by mail in the 1920s!
This was a man well ahead of his time that I’d like to see talked about more in hypermedia and information system circles.
Question: Why do you think that The Web won over other hypermedia systems (such as Xanadu)?
The short reason is, I think, that Xanadu was a much more detailed and specific way of thinking about linking documents, documenting provenance, and compensating authors. That’s a grand vision that was difficult to implement back in the 60s and 70s when Nelson was sharing his ideas.
There are, of course, lots of other factors. Berners-Lee’s vision was much smaller (he was trying to make it easy for CERN staff to share contact information!). Berners-Lee was, I think, much more pragmatic about the implementation details. He himself said he used existing tech (DNS, packet networking, etc.) to implement his ideas. That meant he attracted interest from lots of different communities (telephone, information systems, computing, networking, etc.).
I would also say here that I wish Wendy Hall’s Microcosm had gotten more traction than it did. Hall and her colleagues built an incredibly rich hypermedia system in the 90s and released it before Berners-Lee’s version of “the Web” was available. And Hall’s Microcosm held more closely to the way Bush, Engelbart, and Nelson thought hypermedia systems would be implemented – primarily by storing the hyperlinks in a separate “anchor document” instead of in the source document itself.
Question: What do you think of my essay “How did REST come to mean the opposite of REST”? Are there any points you disagree with in it?
I read that piece back in 2022 when you released it and enjoyed it. While I have nothing to quibble with, really, there are a few observations I can share.
I think I see most hypermedia developers/researchers go through a kind of cycle where you get exposed to “common” REST, then later learn of “Fielding’s REST” and then go back to the “common REST” world with your gained knowledge and try to get others on board; usually with only a small bit of success.
I know you like memes, so I’ll add mine here. This journey away from home, into expanded knowledge and the return to the mundane life you once led is – to me – just another example of Campbell’s Hero’s Journey<g>. I feel this so strongly that I created my own Hero’s Journey presentation to deliver at API conferences over the years.
On a more direct note, I think many readers of Fielding’s dissertation (those who actually read it) miss some key points. Fielding’s paper is about designing network architecture, not about REST. REST is offered as a real-world example, but it is just that; an example of his approach to information network design. There have been other designs from the same school (UC Irvine), including Justin Erenkrantz’s Computational REST (CREST) and Rohit Khare’s Asynchronous REST (A-REST). These were efforts that got the message of Fielding: “Let’s design networked software systems!”
But that is much more abstract work than most real-world developers need to deal with. They have to get code out the door and up and running quickly and consistently. Fielding’s work, he admitted, was on the “scale of decades” – a scale most developers are not paid to consider.
In the long run, I think it’s amazing that a PhD dissertation from almost a quarter-century ago has had such a strong influence on day-to-day developers. That’s pretty rare.
Question: Hyperview, the mobile hypermedia that Adam Stepinski created, was very explicitly based on your books. Have you looked at his system?
I have looked over Hyperview and like what I see. I must admit, however, that I don’t write mobile code anymore so I’ve not actually written any hyperview code myself. But I like it.
I talked to Adam in 2022 about Hyperview in general and was impressed with his thoughts. I’d like to see more people talking about and using the Hyperview approach.
Something I am pretty sure I mentioned to Adam at the time is that Hyperview reminds me of Wireless Markup Language (WML). This was another XML-based document model aimed at rendering early web content on feature phones (before smartphone technology). Another XML-based hypermedia domain-specific document format is VoiceXML. I still think there are great applications of hypermedia-based domain-specific markup languages (DSML) and would like to see more of them in use.
Question: It’s perhaps wishful thinking, but I feel there is a resurgence in interest in the ideas of hypermedia and REST (real REST.) Are you seeing this as well? Do you have a sense if businesses are starting to recognize the strengths of this approach?
I, myself, think there is a growth in hypermedia-inspired designs and implementations and I’m glad to see it. I think much of the work of APIs in general has been leading the market to start thinking about how to lower the barrier of entry for using and interoperating with remote, independent services. And the hypermedia control paradigm (the one you and your colleagues talk about in your paper “Hypermedia Controls: Feral to Formal”) offers an excellent way to do that.
I think the biggest hurdle to using more hypermedia in business was laid out pretty conclusively by Leonard Richardson several years ago. He helped build a powerful hypermedia-based book-sharing server and client system to support public libraries around the world. He noted that, in the library domain, each site is not a competitor but a partner. That means libraries are encouraged to make it easier to loan out books and interoperate with other libraries.
Most businesses operate on the opposite model. They typically succeed by creating barriers of entry and by hoarding assets, not sharing them. Hypermedia makes it easier to share and interact without the need of central control or other types of “gatekeeping.”
Having said that, I think a ripe territory for increased use of hypermedia to lower the bar and increase interaction is at the enterprise level in large organizations. Most big companies spend huge amounts of money building and rebuilding interfaces in order to improve their internal information system. I can’t help but think designing and implementing hypermedia-driven solutions would yield long-term savings, and near-term sustainable interoperability.
Question: Are there any concepts in hypermedia that you think we are sleeping on? Or, maybe said another way, some older ideas that are worth looking at again?
Well, as I just mentioned, I think hypermedia has a big role to play in the field of interoperability. And I think the API-era has, in some ways, distracted us from the power of hypermedia controls as a design element for service-to-service interactions.
While I think Nelson, Berners-Lee and others have done a great job of laying out the possibilities for human-to-machine interaction, I think we’ve lost sight of the possibilities hypermedia gives us for machine-to-machine interactions. I am surprised we don’t have more hypermedia-driven workflow systems available today.
And I think the rise in popularity of LLM-driven automation is another great opportunity to create hypermedia-based, composable services that can be “orchestrated” on the fly. I am worried that we'll get too tied up in trying to make generative AI systems look and act like human users and miss the chance to design hypermedia workflows built specifically to take advantage of the strengths of statistical language models.
I’ve seen some interesting things in this area including Zdenek Nemec’s Superface project which has been working on this hypermedia-driven workflow for several years.
I just think there are lots of opportunities to apply what we’ve learned from the last 100 years (when you include Otlet) of hypermedia thinking. And I’m looking forward to seeing what comes next.
Carson Gross 27/01/2025 | Source: htmx - high power tools for html
“Vendoring” software is a technique where you copy the source of another project directly into your own project.
It is an old technique that has been used since time immemorial in software development, but the term “vendoring” to describe it appears to have originated in the Ruby community.
Vendoring can be and is still used today. You can vendor htmx, for example, quite easily.
Assuming you have a /js/vendor directory in your project, you can just download the source into your own project like so:
curl https://raw.githubusercontent.com/bigskysoftware/htmx/refs/tags/v2.0.4/dist/htmx.min.js > /js/vendor/htmx-2.0.4.min.js
You then include the library in your head tag:
<script src="/js/vendor/htmx-2.0.4.min.js"></script>
And then you check the htmx source into your own source control repository. (I would even recommend considering using the non-minimized version, so you can better understand and debug the code.)
That’s it, that’s vendoring.
OK, great, so what are some strengths of vendoring libraries like this?
It turns out there are quite a few:
On the other hand, vendoring also has one massive drawback: there typically isn’t a good way to deal with what is called the transitive dependency problem.
If htmx had sub-dependencies, that is, other libraries that it depended on, then to vendor it properly you would have to start vendoring all those libraries as well. And if those dependencies had further dependencies, you’d need to install them as well… And on and on.
Worse, two dependencies might depend on the same library, and you’ll need to make sure you get the correct version of that library for everything to work.
This can get pretty difficult to deal with, but I want to make a paradoxical claim that this weakness (and, again, it’s a real one) is actually a strength in some way:
Because dealing with large numbers of dependencies is difficult, vendoring encourages a culture of independence.
You get more of what you make easy, and if you make dependencies easy, you get more of them. Making dependencies, especially transitive dependencies, more difficult would make them less common.
And, as we will see in a bit, maybe fewer dependencies isn’t such a bad thing.
That’s great and all, but there are significant drawbacks to vendoring, particularly the transitive dependency problem.
Modern software engineering uses dependency managers to deal with the dependencies of software projects. These tools allow you to specify your project's dependencies, typically via some sort of file. They will then install those dependencies, and resolve and manage all the other dependencies that are necessary for those dependencies to work.
One of the most widely used package managers is NPM: the Node Package Manager. Despite having no runtime dependencies, htmx uses NPM to specify 16 development dependencies. Development dependencies are dependencies that are necessary for developing htmx, but not for running it. You can see the dependencies at the bottom of the NPM package.json file for the project.
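To make that concrete, here is a generic, hypothetical sketch of what such a file looks like (the package names below are invented for illustration and are not htmx's actual development dependencies):
{
  "devDependencies": {
    "example-test-runner": "^9.1.0",
    "example-linter": "^8.4.0",
    "example-bundler": "^5.0.2"
  }
}
Running npm install against a file like this pulls those packages, plus everything they in turn depend on, into your node_modules folder.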
Dependency managers are a crucial part of modern software development and many developers today couldn’t imagine writing software without them.
So dependency managers solve the transitive dependency problem that vendoring has. But, as with everything in software engineering, there are tradeoffs associated with them. To see some of these tradeoffs, let’s take a look at the package-lock.json file in htmx.
NPM generates a package-lock.json file that contains the resolved transitive closure of dependencies for a project, with the concrete versions of those dependencies. This helps ensure that the same dependencies are used unless a user explicitly updates them.
If you take a look at the package-lock.json for htmx, you will find that the original 13 development dependencies have ballooned into a total of 411 dependencies when all is said and done.
htmx, it turns out, relies on a huge number of packages, despite priding itself on being relatively lean. In fact, the node_modules folder in htmx is a whopping 110 megabytes!
But, beyond this bloat there are deeper problems lurking in that mass of dependencies.
While writing this essay I found that htmx apparently depends on array.prototype.findlastindex, a polyfill for a JavaScript feature introduced in 2022.
Now, htmx 1.x is IE compatible, and I don’t want polyfills for anything: I want to write code that will work in IE without any additional library support. And yet a dangerous polyfill has snuck in via a chain of dependencies (htmx does not rely on it directly), one that would let me write code that would break in IE, as well as in other older browsers.
This polyfill may or may not be available when I run the htmx test suite (it’s hard to tell) but that’s the point: some dangerous code has snuck into my project without me even knowing it, due to the number and complexity of the (development) dependencies it has.
This demonstrates a significant cultural problem with dependency managers:
They tend to foster a culture of, well, dependency.
A spectacular example of this was the infamous left-pad incident, in which an engineer took down a widely used package and broke the build at companies like Facebook, PayPal, Netflix, etc.
That was a relatively innocuous, although splashy, issue, but a more serious concern is supply chain attacks, where a hostile entity is able to compromise a company via code injected unwittingly via dependencies.
The larger our dependency graph gets, the worse these problems get.
I’m not the only person thinking about our culture of dependency. Here’s what some other, smarter folks have to say about it:
Armin Ronacher, creator of Flask, recently said this on the ol’ twits:
The more I build software, the more I despise dependencies. I greatly prefer people copy/pasting stuff into their own code bases or re-implement it. Unfortunately the vibe of the time does not embrace that idea much. I need that vibe shift.
He also wrote a great blog post about his experience with package management in the Rust ecosystem:
It’s time to have a new perspective: we should give kudos to engineers who write a small function themselves instead of hooking in a transitive web of crates. We should be suspicious of big crate graphs. Celebrated are the minimal dependencies, the humble function that just quietly does the job, the code that doesn’t need to be touched for years because it was done right once.
Please go read it in full.
Back in 2021, Tom MacWright wrote this in Vendor by default:
But one thing that I do think is sort of unusual is: I’m vendoring a lot of stuff.
Vendoring, in the programming sense, means “copying the source code of another project into your project.” It’s in contrast to the practice of using dependencies, which would be adding another project’s name to your package.json file and having npm or yarn download and link it up for you.
I highly recommend reading his take on vendoring as well.
Some good news, if you are an open source developer and like the idea of vendoring, is that there is a simple way to make your software vendor-friendly: remove as many dependencies as you can.
DaisyUI, for example, has been in the process of removing their dependencies, going from 100 dependencies in version 3 to 0 in version 5.
There is also a set of htmx-adjacent projects that are taking vendoring seriously:
None of these JavaScript projects are available in NPM, and all of them recommend vendoring the software into your own project as the primary installation mechanism.
The last thing I want to briefly mention is a technology that combines both vendoring and dependency management: vendor-first dependency managers. I have never worked with one before, but I have been pointed to vend, a Common Lisp vendor-oriented package manager (with a great README), as well as Go’s vendoring option.
In writing this essay, I also came across vendorpull and git-vendor, both of which are small but interesting projects.
These all look like excellent tools, and it seems to me that there is an opportunity for some of them (and tools like them) to add additional functionality to address the traditional weaknesses of vendoring, for example:
With these additional features I wonder if vendor-first dependency managers could compete with “normal” dependency managers in modern software development, perhaps combining some of the benefits of both approaches.
Regardless, I hope that this essay has helped you think a bit more about dependencies and perhaps planted the idea that maybe your software could be a little less, well, dependent on dependencies.
Augusto Campos 26/01/2025 | Source: TRILUX
Every day my day starts out really well, because I make my breakfast on the balcony, which looks out over a permanent preservation area of untouched forest, full of birds, the occasional little monkey, and so on.
Today, even though it was Sunday, circumstances led me to wake up around 6 a.m. I went into the usual coffee-making routine, and apparently the little grandson visiting the upstairs neighbors had also gotten up early; he was alone on their balcony, doing what children who wake up before the adults of the house do: making fun noises.
So there I was, going about my routine, while the child imitated various animals, with pauses that were clearly meant to check whether they, over in the reserve, would answer.
Naturally I did not shirk my duty, and did what one does on such occasions: I waited for the child to imitate an easier animal, and answered as soon as he paused to listen. From then on, he kept imitating a wolf from over there, and I kept answering from over here, until the coffee was ready.
🐺📞🐺
By the end some adult was already awake, and I could hear the excited child telling them what was happening. He got himself a story!
The article "Dentro (do meu prédio) tem 2 lobos" was originally published on TRILUX, by Augusto Campos.
Anonymous 25/01/2025 | Source: is this it?
It’s the middle of summer and the golden tussock has overtaken its shorter agricultural cousin. Back down the Manuherikia Valley, bands of green grass run up the sides of the Dunstan Range. Productive Land they call it. Indeed it is, but visually that splash of green is at odds with the golden tussock or grand blue skies. That’s all south of the Home Hills, along with the letterboxes, farming equipment, and power lines. North of the Home Hills one gets taken back in time. There’s no longer the view over the wider Maniototo plain. Instead, the mountains close in and the St Bathans Range to the west and Hawkdun Range to the east become more prominent.
Karearea (falcons) sat perched on wooden fence posts taking a break from their usual soaring ways. I tried to spot them as we were driving so we could pull over and take a photograph. They always saw us first, and would take a short jump off the fence post before ascending into the towering skies above.
Cue an incredibly photogenic hut. Speculating here – an old musterers’ hut? Now maintained by the station owner for supplies and a place to keep warm when it snows. There were two dozen old oil drums spread about the place. My first thought was that it was just rubbish, keeping in tune with the skeletal remnants of fallen trees lying about. On closer inspection the oil drums had all been cleaned, tops cut off, a wooden shelf mounted lengthways in each barrel, laid on its side. Shelter for young stock or farm dogs during the winter. The rabbits don’t need any help.
The road is a horseshoe. We’d driven up the western side of the valley beneath the St Bathans Range and were now at the most northerly point, ready to cut east. It was sunset proper and the tussock covering the slopes of the Hawkduns reflected idealised light. The Hawkduns and their neighbouring ranges are free of scrub and trees. The shape of every ridge and alluvial fan is visible for all to see. In low-angled light or winter light these mountains show themselves. It’s the same characteristic of rocky alpine peaks covered in ice and snow, but with a very different form and mood.
The plan was to cross the valley and drive back down the valley beneath the Hawkduns. When we got to the Manuherikia river we saw the bridge had been washed out years ago. There wasn’t a ford to cross, so we turned around and drove back along the way we came. We reached the hut around blue hour. The last glow of western orange light picked up on the corrugated iron hut and surrounding tussocks, lighting it enough for the final photographs of the evening.
The strong and dynamic contrast of blue and orange turns more muted. With just silhouette, not shape, the Hawkduns retreat as a subject of the frame and the hut in its austere environment is just there.
Photographs shot on the Hasselblad 500cm with Portra 400 film.
Augusto Campos 24/01/2025 | Source: TRILUX
This product (which doesn’t interest me in the least) belonged to the gamer niche until recently: it exists to preserve a phone’s performance during games that demand heavy processing and used to make the device overheat, triggering its own protection settings, which throttle processing speed until the phone returns to normal temperature.
HOWEVER, I mentioned this AliExpress phenomenon to two people close to me, and both told me: “gosh, my phone has been shutting down from overheating during normal use, when I’m walking down the street around midday.”
IN OTHER WORDS: the gamer niche may have been where this product originated, but what is driving its demand now is the dystopia itself, since we now routinely see daily temperatures above the limits of the typical operating conditions of just a few years ago.
Signs, there will be signs.
The article "O aquecimento global não manda aviso, mas haverá sinais" was originally published on TRILUX, by Augusto Campos.
Armin Ronacher 24/01/2025 | Source: Armin Ronacher's Thoughts and Writings
Another day, another rant about dependencies from me. This time I will ask that we start and support a vibe shift when it comes to dependencies.
You're probably familiar with the concept of “dependency churn.” It's that never-ending treadmill of updates, patches, audits, and transitive dependencies that we as developers love to casually install in the name of productivity. Who doesn't enjoy waiting for yet another cargo upgrade just so you can get that fix for a bug you don't even have?
It's a plague in most ecosystems with good packaging solutions. JavaScript and Rust are particularly badly affected by that. A brand new Tokio project drags in 28 crates, a new Rocket project balloons that to 172, and a little template engine like MiniJinja can exist with just a single dependency — while its CLI variant slurps up 142.
If that doesn't sound like a big deal, let's consider terminal_size. It is a crate that does exactly what its name suggests: it figures out your terminal dimensions. The underlying APIs it uses have effectively been stable since the earliest days of computing terminals—what, 50 years or so? And yet, for one function, terminal-size manages to introduce three or four additional crates, depending on your operating system. That triggers a whole chain reaction, so you end up compiling thousands of other functions just to figure out if your terminal is 80x25 or 120x40. That crate has had 26 releases. My own version of it, which I have stuck away in a project from 10 years ago, still works without a single update. Because, shocker: nothing about figuring out terminal sizes has changed. [1]
So why does terminal-size have so many updates if it's so stable? Because it's built on top of platform abstraction libraries that constantly churn, it needs to update to avoid code duplication and blowing up compile times even more.
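To make it concrete, a minimal sketch of such a dependency-free lookup (UNIX-only, leaning on nothing but libc, see [1]) can look something like this; treat it as an illustration rather than the exact code from that old project:
use std::mem::MaybeUninit;

/// Illustrative sketch: UNIX-only terminal size lookup returning (columns, rows),
/// or None if stdout is not attached to a terminal.
pub fn terminal_size() -> Option<(u16, u16)> {
    let mut ws = MaybeUninit::<libc::winsize>::uninit();
    // TIOCGWINSZ asks the tty driver for the window size of the given file descriptor.
    let rc = unsafe { libc::ioctl(libc::STDOUT_FILENO, libc::TIOCGWINSZ, ws.as_mut_ptr()) };
    if rc == 0 {
        let ws = unsafe { ws.assume_init() };
        Some((ws.ws_col, ws.ws_row))
    } else {
        None
    }
}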
But “big supply chain” will tell you that you must do it this way. Don't you dare copy-paste that function into your library. Or don't you dare use “unsafe” yourself. You're not qualified enough to write unsafe code; let the platform abstraction architects do that. Otherwise someone will slap you. There are entire companies making a living off supplying you with the tools needed to deal with your dependency mess. In the name of security, we're pushed to having dependencies and keeping them up to date, despite most of those dependencies being the primary source of security problems.
The goal of code in many ways should be to be written in a way that it does not need updates. It should eventually achieve some level of stability. In the Rust ecosystem stable code is punished. If you have a perfectly working dependency but you have a somewhat inactive bug tracker, RUSTSEC will come by and give you a chunk rating.
But there is a simpler path. You write code yourself. Sure, it's more work up front, but once it's written, it's done. No new crates, no waiting for upstream authors to fix that edge case. If it's broken for you, you fix it yourself. Code that works doesn't necessarily need the maintenance treadmill. Your code has a corner case? Who cares. This is that vibe shift we need in the Rust world: celebrating fewer dependencies rather than more.
We're at a point in most ecosystems where pulling in libraries is not just the default action, it's seen positively: “Look how modular and composable my code is!” Actually, it might just be a symptom of never wanting to type out more than a few lines.
Now, some will make the argument that it takes so much time to write all of this. It's 2025, and it's faster for me to have ChatGPT or Cursor whip up a dependency-free implementation of these common functions than it is for me to start figuring out a dependency. And it makes sense: for many such small functions the maintenance overhead is tiny, and much lower than actually dealing with the constant upgrading of dependencies. The code is just a few lines, and you also get the benefit of no longer needing to compile thousands of lines of other people's code for a single function.
But let's face it: corporate code review culture has also infected Open Source software. Companies are more likely to reward engineers than scold them for pulling in that new “shiny library” that solves the problem they never actually had. That creates problems, so dependabot and friends were born. Today I just dread getting dependabot pull requests on my projects, but I have to accept it. I'm part of an ecosystem with my stuff, and that ecosystem is all about churn, churn, churn. In companies you can also keep entire internal engineering teams busy with vendoring dependencies, internal audits and upgrading things throughout the company.
Fighting this fight is incredibly hard! Every new hire has been trained on the idea that dependencies are great, that code reuse is great. That having old code sitting around is a sign of bad engineering culture.
It's also hard to fight this in Open Source. Years ago I wrote sha1-smol which originally was just called sha1. It became the standard crate to calculate SHA1 hashes. Eventually I was pressured to donate that package name to rust-crypto and to depend on the rest of the crypto ecosystem as it was so established. If you want to use the new sha1 crate, you get to enjoy 10 dependencies. But there was just no way around it, because that name in the registry is precious and people also wanted to have trait compatibility. It feels tiring to be the only person in a conversation pushing to keep the churn down and dependencies low.
It's time to have a new perspective: we should give kudos to engineers who write a small function themselves instead of hooking in a transitive web of crates. We should be suspicious of big crate graphs. Celebrated are the minimal dependencies, the humble function that just quietly does the job, the code that doesn't need to be touched for years because it was done right once.
And sure, it's not black and white. There are the important libraries that solve hard problems. Graphics libraries that abstract over complex drivers, implementations of protocols like HTTP and QUIC. I won't be able to get rid of tokio and I have no desire to. But when you end up using one function, but you compile hundreds, some alarm bell should go off.
We need that vibe shift. To celebrate building it yourself when it's appropriate to do so. To give credit to library authors who build low to no-dependency Open Source libraries.
For instance minijinja celebrates it in the readme:
$ cargo tree
minimal v0.1.0 (examples/minimal)
└── minijinja v2.6.0 (minijinja)
    └── serde v1.0.144
And it has a PR to eventually get rid of the last dependency. And sometime this year I will make it my goal to go ahead proudly and trim down all that fat in my projects.
[1] Disclaimer: you will need one dependency for UNIX: libc. That's because Rust does not expose the platform's libc constants to you, and they are not standardized. That, however, is such a common and lightweight dependency that you won't be able to avoid it anyway.
Anonymous 23/01/2025 | Source: Irrational Exuberance
In my career, the majority of the strategy work I’ve done has been in non-executive roles, things like Uber’s service migration. Joining Calm was my first executive role, where I was able to not just propose, but also mandate, strategy.
As at almost all startups, the engineering team was scattered when I joined. Was our most important work creating more scalable infrastructure? Was our greatest risk the failure to adopt leading programming languages? How did we rescue the stuck service decomposition initiative?
This strategy is where the engineering team and I aligned after numerous rounds of iteration, debate, and inevitably some disagreement. As a strategy, it’s both basic and also unambiguous about what we valued, and I believe it’s a reasonably good starting point for any low scalability-complexity consumer product.
This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts.
To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore, then Diagnose and so on. Relative to the default structure, this document has one tweak, folding the Operation section in with Policy.
More detail on this structure in Making a readable Engineering Strategy document.
Our new policies, and the mechanisms to operate them are:
We are a product engineering company. Users write in every day to tell us that our product has changed their lives for the better. Our technical infrastructure doesn’t get many user letters–and this is unlikely to change going forward as our infrastructure is relatively low-scale and low-complexity. Rather than attempting to change that, we want to devote the absolute maximum possible attention to product engineering.
We exclusively adopt new technologies to create valuable product capabilities. We believe our technology stack as it exists today can solve the majority of our current and future product roadmaps. In the rare case where we adopt a new technology, we do so because a product capability is inherently impossible without adopting a new technology.
We do not adopt new technologies for other reasons. For example, we would not adopt a new technology because someone is interested in learning about it. Nor would we adopt a technology because it is 30% better suited to a task.
We write all code in the monolith. It has been ambiguous if new code (especially new application code) should be written in our JavaScript monolith, or if all new code must be written in a new service outside of the monolith. This is no longer ambiguous: all new code must be written in the monolith.
In the rare case that there is a functional requirement that makes writing in the monolith implausible, then you should seek an exception as described below.
Exceptions are granted by the CTO, and must be in writing. The above policies are deliberately restrictive. Sometimes they may be wrong, and we will make exceptions to them. However, each exception should be deliberate and grounded in concrete problems we are aligned both on solving and how we solve them. If we all scatter towards our preferred solution, then we’ll create negative leverage for Calm rather than serving as the engine that advances our product.
All exceptions must be written. If they are not written, then you should operate as if it has not been granted. Our goal is to avoid ambiguity around whether an exception has, or has not, been approved. If there’s no written record that the CTO approved it, then it’s not approved.
Proving the point about exceptions, there are two confirmed exceptions to the above strategy:
We are incrementally migrating to TypeScript. We have found that static typing can prevent a number of our user-facing bugs. TypeScript provides a clean, incremental migration path for our JavaScript codebase, and we aim to migrate the entirety over the next six months.
Our Web engineering team is leading this migration.
We are evaluating Postgres Aurora as our primary database. Many of our recent production incidents are caused by index scans for tables with high write velocity such as tracking customer logins. We believe Aurora will perform better under these workloads.
Our Infrastructure engineering team is leading this initiative.
The current state of our engineering organization:
Our product is not limited by missing infrastructure capabilities. Reviewing our roadmap, there’s nothing that we are trying to build today or over the next year that is constrained by our technical infrastructure.
Our uptime, stability and latency are OK but not great. We have semi-frequent stability and latency issues in our application, all of which are caused by one of two issues. First, deploying new code with a missing index because it performed well enough in a test environment. Second, writes to a small number of extremely large, skinny tables have become expensive in combination with scans over those tables’ indexes.
Our infrastructure team is split between supporting monolith and service workflows. One way to measure technical debt is to understand how much time the team is spending propping up the current infrastructure. Today, that is meaningful but not overwhelming work for our team of three infrastructure engineers supporting 30 product engineers.
However, we are finding infrastructure engineers increasingly pulled into debugging incidents for components moved out of the central monolith into our service architecture. This is partially due to increased inherent complexity, but it is more due to the lack of monitoring and the ambiguous accountability exposed by services’ production incidents.
Our product and executive stakeholders experience us as competing factions. Engineering exists to build and operate software in the company. Part of that is being easy to work with. We should not necessarily support every ask from Product if we believe they are misaligned with Engineering’s goals (e.g. maintaining security), but it should generally provide a consistent perspective across our team.
Today, our stakeholders believe they will get radically different answers to basic questions of capabilities and approach depending on who they ask. If they try to get a group of engineers to agree on an approach, they often find we derail into debate about approach rather than articulating a clear point of view that allows the conversation to move forward.
We’re arguing a particularly large amount about adopting new technologies and rewrites. Most of our disagreements stem around adopting new technologies or rewriting existing components into new technology stacks. For example, can we extend this feature or do we have to migrate it to a service before extending it? Can we add this to our database or should we move it into a new Redis cache instead? Is JavaScript a sufficient programming language, or do we need to rewrite this functionality in Go?
This is particularly relevant to next steps around the ongoing services migration, which has been in-flight for over a year, but is yet to move any core production code.
We are spending more time on infrastructure and platform work than product work. This is the combination of all the above issues, from the stability issues we are encountering in our database design, to the lack of engineering alignment on execution. This places us at odds with stakeholder expectation that we are predominantly focused on new product development.
Calm is a mobile application that guides users to build and maintain either a meditation or sleep habit. Recommendations and guidance across content is individual to the user, but the content is shared across all customers and is amenable to caching on a content delivery network (CDN). As long as the CDN is available, the mobile application can operate despite inability to access servers (e.g. the application remains usable from a user’s perspective, even if the non-CDN production infrastructure is unreachable).
In 2010, enabling a product of this complexity would have required significant bespoke infrastructure, along with likely maintaining a physical presence in a series of datacenters to run your software. In 2020, comparable applications are generally moving towards maintaining as little internal infrastructure as possible. This perspective is summarized effectively in Intercom’s Run Less Software and Dan McKinley’s Choose Boring Technology.
New companies founded in this space view essentially all infrastructure as a commodity bought off your cloud provider. This even extends to areas of innovation, such as machine learning, where the training infrastructure is typically run on an offering like AWS Bedrock, and the model infrastructure is provided by Anthropic or OpenAI.
Brett Cannon 23/01/2025 | Source: Tall, Snarky Canadian
When I was about to go on paternity leave, the Gleam programming language reached 1.0. It's such a small language that I was able to learn it over the span of two days. I tried to use it to convert a GitHub Action from JavaScript to Gleam, but I ran into issues due to Gleam wanting to be the top of the language stack instead of the bottom. As such I ended up learning and using ReScript. But I still liked Gleam and wanted to try writing something in it, so over the winter holidays I did another project with it from scratch.
First and foremost, their statement about community on their homepage spoke to me:
As a community, we want to be friendly too. People from around the world, of all backgrounds, genders, and experience levels are welcome and respected equally. See our community code of conduct for more.
Black lives matter. Trans rights are human rights. No nazi bullsh*t.
Secondly, the language is very small and tightly designed which I always appreciate (Python's "it fits your brain" slogan has always been one of my favourite tag lines for the language).
Third, it's a typed, functional, immutable language that is impure. I find that a nice balance of practicality while trying to write code that is as reliable as possible, knowing that if you get past the compiler you're probably doing pretty well (which is good for projects you are not going to work on often but do have the time to put in the extra effort upfront to deal with typing and such).
Fourth, it compiles to either Erlang or JavaScript. Both have their (unique) uses which I appreciate (and in my case the latter is important).
Fifth, it has Lustre. While I liked Elm and loved TEA (The Elm Architecture), I did find Elm's lack of FFI restrictive. Lustre with Gleam fixes those issues.
And finally, my friend Dusty is a fan.
I decided I wanted to create a website to help someone choose a coding font. When I was looking for one a while back I created screenshots of code samples which were anonymous so that I could choose one without undue influence (I ended up with MonoLisa). I figured it would be a fun project to create a site that did what I wish I had when choosing a font: a tournament bracket for fonts where you entered example text and then have fonts battle it out until you had a winner. This seemed like a great fit for Lustre and Gleam since it would be all client-side and have some interaction.
Overall, the language was a pleasure to work with. While the functional typing occasionally felt tedious, I knew there was benefit to it if I wanted things to work in the long term with as little worry as possible that I had a bug in my code. The language was nice and small, so I didn't have any issue keeping it in my head while I coded (most of my documentation reading was for the standard library). And it was powerful enough with Lustre for me to need less than 200 lines of Gleam to make it all work (plus less than 90 lines of static HTML and CSS).
I'm a Python fan, and so all the curly braces weren't my favourite thing. I know it's for familiarity reasons, and it's not going to cause me to avoid the language in the future, but I wouldn't have minded less syntax to denote structure.
The other thing is having to specify a type's name twice for the name to be usable as both the type and the constructor for a single record.
pub type Thingy {
Thingy(...)
}
Once again, it's very minor, but it's something I had to learn, and typing the name twice always felt unnecessary and like a typo waiting to happen for the compiler to catch. Having some shorthand like pub record Thingy(...) to represent the same thing would be nice.
I would love to have a WebAssembly/WASI and Python back-end for Gleam to go along with the Erlang and JavaScript ones. I have notes on writing a Python back-end and Dusty did a prototype. Unfortunately I don't think the Gleam compiler – which is written in Rust – is explicitly designed for adding more back-ends, so I'm not sure if any of this will ever come to pass.
I'm happy with Gleam! I'm interested in trying it with Erlang and the BEAM somehow, although my next project for that realm is with Elixir because Phoenix LiveView is a perfect fit for that project (I suspect there's something in Gleam to compete with Phoenix LiveView, but I do want to learn Elixir). But I definitely don't regret learning Gleam and I am still motivated enough to be working my way through Exercism's Gleam track.
Anonymous 23/01/2025 | Source: is this it?
Hawkduns.
Early evening in Ida Valley.
Sunset in the Ida Valley.
Kokonga Hut, at the foot of Mt Buster.
Foot of Mt Buster.
Below, above Falls Dam.
On Raggedy Range.
Sutton at dawn.
Cambrian Common Forest.
Up the Manuherikia Valley.
Photographs shot on the Hasselblad 500cm with Portra 400 film.
Ken Shirriff 21/01/2025 | Source: Ken Shirriff's blog
Intel released the powerful Pentium processor in 1993, establishing a long-running brand of processors. Earlier, I wrote about the ROM in the Pentium's floating point unit that holds constants such as π. In this post, I'll look at some interesting circuits associated with this ROM. In particular, the circuitry is implemented in BiCMOS, a process that combines bipolar transistors with standard CMOS logic.
The photo below shows the Pentium's thumbnail-sized silicon die under a microscope. I've labeled the main functional blocks; the floating point unit is in the lower right with the constant ROM highlighted at the bottom. The various parts of the floating point unit form horizontal stripes. Data buses run vertically through the floating point unit, moving values around the unit.
The diagram below shows how the circuitry in this post forms part of the Pentium. Zooming in to the bottom of the chip shows the constant ROM, holding 86-bit words: at the left, the exponent section provides 18 bits. At the right, the wider significand section provides 68 bits. Below that, the diagram zooms in on the subject of this article: one of the 86 identical multiplexer/driver circuits that provides the output from the ROM. As you can see, this circuit is a microscopic speck in the chip.
In this section, I'll show how the Pentium is constructed from layers. The bottom layer of the chip consists of transistors fabricated on the silicon die. Regions of silicon are doped with impurities to change the electrical properties; these regions appear pinkish in the photo below, compared to the grayish undoped silicon. Thin polysilicon wiring is formed on top of the silicon. Where a polysilicon line crosses doped silicon, a transistor is formed; the polysilicon creates the transistor's gate. Most of these transistors are NMOS and PMOS transistors, but there is a bipolar transistor near the upper right, the large box-like structure. The dark circles are contacts, regions where the metal layer above is connected to the polysilicon or silicon to wire the circuits together.
The Pentium has three layers of metal wiring. The photo below shows the bottom layer, called M1. For the most part, this layer of metal connects the transistors into various circuits, providing wiring over a short distance. The photos in this section show the same region of the chip, so you can match up features between the photos. For instance, the contacts below (black circles) match the black circles above, showing how this metal layer connects to the silicon and polysilicon circuits. You can see some of the silicon and polysilicon in this image, but most of it is hidden by the metal.
The M2 metal layer (below) sits above the M1 wiring. In this part of the chip, the M2 wires are horizontal. The thicker lines are power and ground. (Because they are thicker, they have lower resistance and can provide the necessary current to the underlying circuitry.) The thinner lines are control signals. The floating point unit is structured so functional blocks are horizontal, while data is transmitted vertically. Thus, a horizontal wire can supply a control signal to all the bits in a functional block.
The M3 layer is the top metal layer in the Pentium. It is thicker, so it is better suited for the chip's main power and ground lines as well as long-distance bus wiring. In the photo below, the wide line on the left provides power, while the wide line on the right provides ground. The power and ground are distributed through wiring in the M2 and M1 layers until they are connected to the underlying transistors. At the top of the photo, vertical bus lines are visible; these extend for long distances through the floating point unit. Notice the slightly longer line, fourth from the right. This line provides one bit of data from the ROM, provided by the circuitry described below. The dot near the bottom is a via, connecting this line to a short wire in M2, connected to a wire in M1, connected to the silicon of the output transistors.
The simplified schematic below shows the circuit that I reverse-engineered. This circuit is repeated 86 times, once for each bit in the ROM's word. You might expect the ROM to provide a single 86-bit word. However, to make the layout work better, the ROM provides eight words in parallel. Thus, the circuitry must select one of the eight words with a multiplexer. In particular, each of the 86 circuits has an 8-to-1 multiplexer to select one bit out of the eight. This bit is then stored in a latch. Finally, a high-current driver amplifies the signal so it can be sent through a bus, traveling to a destination halfway across the floating point unit.
I'll provide a quick review of MOS transistors before I explain the circuitry in detail. CMOS circuitry uses two types of transistors—PMOS and NMOS—which are similar but also opposites. A PMOS transistor is turned on by a low signal on the gate, while an NMOS transistor is turned on by a high signal on the gate; the PMOS symbol has an inversion bubble on the gate. A PMOS transistor works best when pulling its output high, while an NMOS transistor works best when pulling its output low. CMOS circuitry normally uses the two types of MOS transistors in a Complementary fashion to implement logic gates, working together. What makes the circuits below interesting is that they often use NMOS and PMOS transistors independently.
The detailed schematic below shows the circuitry at the transistor and inverter level. I'll go through each of the components in the remainder of this post.
The ROM is constructed as a grid: at each grid point, the ROM can have a transistor for a 0 bit, or no transistor for a 1 bit. Thus, the data is represented by the transistor pattern. The ROM holds 304 constants so there are 304 potential transistors associated with each bit of the output word. These transistors are organized in a 38×8 grid. To select a word from the ROM, a select line activates one group of eight potential transistors. Each transistor is connected to ground, so the transistor (if present) will pull the associated line low, for a 0 bit. Note that the ROM itself consists of only NMOS transistors, making it half the size of a truly CMOS implementation. For more information on the structure and contents of the ROM, see my earlier article.
A ROM transistor can pull a line low for a 0 bit, but how does the line get pulled high for a 1 bit? This is accomplished by a precharge transistor on each line. Before a read from the ROM, the precharge transistors are all activated, pulling the lines high. If a ROM transistor is present on the line, the line will next be pulled low, but otherwise it will remain high due to the capacitance on the line.
Next, the multiplexer above selects one of the 8 lines, depending on which word is being accessed. The multiplexer consists of eight transistors. One transistor is activated by a select line, allowing the ROM's signal to pass through. The other seven transistors are in the off state, blocking those ROM signals. Thus, the multiplexer selects one of the 8 bits from the ROM.
The circuit below is the "keeper." As explained above, each ROM line is charged high before reading the ROM. However, this charge can fade away. The job of the keeper is to keep the multiplexer's output high until it is pulled low. This is implemented by an inverter connected to a PMOS transistor. If the signal on the line is high, the PMOS transistor will turn on, pulling the line high. (Note that a PMOS transistor is turned on by a low signal, thus the inverter.) If the ROM pulls the line low, the transistor will turn off and stop pulling the line high. This transistor is very weak, so it is easily overpowered by the signal from the ROM. The transistor on the left ensures that the line is high at the start of the cycle.
The diagram below shows the transistors for the keeper. The two transistors on the left implement a standard CMOS inverter. On the right, note the weak transistor that holds the line high. You might notice that the weak transistor looks larger and wonder why that makes the transistor weak rather than strong. The explanation is that the transistor is large in the "wrong" dimension. The current capacity of an MOS transistor is proportional to the width/length ratio of its gate. (Width is usually the long dimension and length is usually the skinny dimension.) The weak transistor's length is much larger than the other transistors, so the W/L ratio is smaller and the transistor is weaker. (You can think of the transistor's gate as a bridge between its two sides. A wide bridge with many lanes lets lots of traffic through. However, a long, single-lane bridge will slow down the traffic.)
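For reference, the textbook first-order (square-law) MOSFET model, which is standard background rather than something stated in this article, puts the saturation drain current at roughly:
$$ I_D \approx \tfrac{1}{2}\,\mu C_{ox}\,\frac{W}{L}\,(V_{GS} - V_{th})^2 $$
The point is simply that the current scales with the W/L ratio, so stretching the gate length L is a direct way to make a transistor weak.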
Next, we come to the latch, which remembers the value read from the ROM. This latch will read its input when the load signal is high. When the load signal goes low, the latch will hold its value. Conceptually, the latch is implemented with the circuit below. A multiplexer selects the lower input when the load signal is active, passing the latch input through to the (inverted) output. But when the load signal goes low, the multiplexer will select the top input, which is feedback of the value in the latch. This signal will cycle through the inverters and the multiplexer, holding the value until a new value is loaded. The inverters are required because the multiplexer itself doesn't provide any amplification; the signal would rapidly die out if not amplified by the inverters.
The multiplexer is implemented with two CMOS switches, one to select each multiplexer input. Each switch is a pair of PMOS and NMOS transistors that turn on together, allowing a signal to pass through. (See the bottom two transistors below.)[1] The upper circuit is trickier. Conceptually, it is an inverter feeding into the multiplexer's CMOS switch. However, the order is switched so the switch feeds into the inverter. The result is not-exactly-a-switch and not-exactly-an-inverter, but the result is the same. You can also view it as an inverter with power and ground that gets cut off when not selected. I suspect this implementation uses slightly less power than the straightforward implementation.
The most unusual circuit is the BiCMOS driver. By adding a few extra processing steps to the regular CMOS manufacturing process, bipolar (NPN and PNP) transistors can be created. The Pentium extensively used BiCMOS circuits since they reduced signal delays by up to 35%. Intel also used BiCMOS for the Pentium Pro, Pentium II, Pentium III, and Xeon processors. However, as chip voltages dropped, the benefit from bipolar transistors dropped too and BiCMOS was eventually abandoned.
In the Pentium, BiCMOS drivers are used when signals must travel a long distance across the chip. (In this case, the ROM output travels about halfway up the floating point unit.) These long wires have a lot of capacitance so a high-current driver circuit is needed and the NPN transistor provides extra "oomph."
The diagram below shows how the driver is implemented. The NPN transistor is the large boxy structure in the upper right. When the base (B) is pulled high, current flows from the collector (C), pulling the emitter (E) high and thus rapidly pulling the output high. The remainder of the circuit consists of three inverters, each composed of PMOS and NMOS transistors. When a polysilicon line crosses doped silicon, it creates a transistor gate, so each crossing corresponds to a transistor. The inverters use multiple transistors in parallel to provide more current; the transistor sources and/or drains overlap to make the circuitry more compact.
One interesting thing about this circuit is that each inverter is carefully designed to provide the desired current, with a different current for a high output versus a low output. The first inverter (purple boxes) has two PMOS transistors and two NMOS transistors, so it is a regular inverter, balanced for high and low outputs. (This inverter is conceptually part of the latch.) The second inverter (yellow boxes) has three large PMOS transistors and one smaller NMOS transistor, so it has more ability to pull the output high than low. This transistor turns on the NPN transistor by providing a high signal to the base, so it needs more current in the high state. The third inverter (green boxes) has one weak PMOS transistor and seven NMOS transistors, so it can pull its output low strongly, but can barely pull its output high. This transistor pulls the ROM output line low, so it needs enough current to drive the entire bus line. But this transistor doesn't need to pull the output high—that's the job of the NPN transistor—so the PMOS transistor can be weak. The construction of the weak transistor is similar to the keeper's weak transistor; its gate length is much larger than the other transistors, so it provides less current.
The diagram below shows how the functional blocks are arranged in the complete circuit, from the ROM at the bottom to the output at the top. The floating point unit is constructed with a constant width for each bit—38.5 µm—so the circuitry is designed to fit into this width. The layout of this circuitry was hand-optimized to fit as tightly as possible. In comparison, much of the Pentium's circuitry was arranged by software using a standard-cell approach, which is much easier to design but not as dense. Since each bit in the floating point unit is repeated many times, hand-optimization paid off here.
This circuit contains 47 transistors. Since it is duplicated once for each bit, it has 4042 transistors in total, a tiny fraction of the Pentium's 3.1 million transistors. In comparison, the MOS 6502 processor has about 3500-4500 transistors, depending on how you count. In other words, the circuit to select a word from the Pentium's ROM is about as complex as the entire 6502 processor. This illustrates the dramatic growth in processor complexity described by Moore's law.
I plan to write more about the Pentium so follow me on Bluesky (@righto.com) or RSS for updates. (I'm no longer on Twitter.) You might enjoy reading about the Pentium Navajo rug.
[1] The 8-to-1 multiplexer and the latch's multiplexer use different switch implementations: the first is built from NMOS transistors while the second is built from paired PMOS and NMOS transistors. The reason is that NMOS transistors are better at pulling signals low, while PMOS transistors are better at pulling signals high. Combining the transistors creates a switch that passes low and high signals efficiently, which is useful in the latch. The 8-to-1 multiplexer, however, only needs to pull signals low (due to the precharging), so the NMOS-only multiplexer works in this role. (Note that early NMOS processors like the 6502 and 8086 built multiplexers and pass-transistor logic out of solely NMOS. This illustrates that you can use NMOS-only switches with both logic levels, but performance is better if you add PMOS transistors.)
Augusto Campos 20/01/2025 | Source: TRILUX
Who would have thought that a document template and a daily reminder to fill it in would improve my life?
We have already closed out the second week of 2025, and I have managed, every single day, to keep up the plan I adopted at the turn of the year: taking daily notes on my ideas, thoughts, and accomplishments each day, for myself, not for sharing.
In this case, my notes are not about productivity; they are about building perspective on how things evolve (allowing weekly reviews of my personal progress, and so on).
This has given me more perspective on my days and created a daily opportunity for reflection, which keeps proving valuable for noticing that something that seemed like an isolated event at the time was actually connected to things that had been taking shape on previous days, a connection I would not otherwise have made. Highly recommended!
My diary is written in free, loose sentences (on most days the entry even ends with a proper little chronicle), but having a document template pre-filled with a few topic prompts ("how I slept", "how the day started", "where I went", etc.) helps pull back into memory, at the end of the day, the contexts that bring up the ideas, feelings, accomplishments, and other topics about which I have something more free-form to record.
Of course, the idea of keeping a diary is nothing new, and, even though I was torn between doing it on paper or on the computer, I chose to write mine using the Daily Notes feature of the Obsidian app, which I already use for my other notes, and which has proven to fit my style (half chronicle, half bullet journal) of daily note-taking.
Using Obsidian's document template feature, I set up the base text for my daily notes; each day the note is created automatically (empty, but with the dates and links updated), and I start filling it in.
At first I feared that defining a set of topics in advance would box me in, but that wasn't the case at all: the result is a good, fluid account, even on the days when I barely stray from the template and limit myself to filling in the basic questions, without expanding.
When I am REALLY inspired, which happens several times a week, the entry ends with a chronicle, like the example above, capturing something that was present for me that day.
But the document template I created for myself in Obsidian also has the opposite dimension, completely objective in how it is filled in, which will someday let me chart the year's evolution in graphs (for now I am just accumulating the data): it is a scoreboard for the day, across several dimensions, filled in objectively but only after the reflection prompted by filling in the rest, which is textual:
Since then, drawing on my experience with RPG character sheets, the Scoreboard area has evolved to gain the following new description: "Mark an X next to the option that best describes it (no column should have more than 3 marks)."
In short, and in conclusion: it was an easy habit to adopt, there are many ways to do it, I found one that works well for me, and I recommend that anyone interested look for the form that suits them best, but do it, because it is worth it!
The article "Querido diário, quero registrar meu progresso todos os dias em 2025" was originally published on TRILUX, by Augusto Campos.
Nicole Tietz-Sokolskaya 20/01/2025 | Source: ntietz.com blog
Recently, a reader wrote to me and asked about my writing process and burnout. They had an image in their head that I could sit down at a computer and type up a full post on a given topic, but were unsure if that's the right approach when they start blogging. And they were concerned about how to keep things sustainable for themselves, too. I started to write back to them, but decided to ship my answer for everyone else, as well.
Now, to be clear: this is my writing process. There are many other very valid approaches, such as what Gabriella wrote. But I think outlining my process here will at least help break some of the mystique around writing, because I certainly do not just sit down and plunk out a post.
Well, it often looks like that if you're an outside observer1. But that's only what's visible.
The very first piece of any blog post for me is an idea, some inspiration, something I want to share. These ideas come from a variety of places. Some of them are from conversations with friends or coworkers. Others come from problems I'm solving in my own code. Or they come from things I'm reading. Very rarely, they come from just thinking about things, but those are other sources with the connection hidden by the passage of time.
After an idea comes to me, I try to capture it quickly. If I don't, it's probably gone! I store these ideas in a note in Obsidian, which currently has 162 ideas in it2. I format this file as a bulleted list, with minimal categorization so that the friction for adding something new is as low as possible. When I revisit this list to plan things to write (I do this every few weeks), I'll remove things which are either completed or which I'm confident I am not interested in anymore.
Ideas which are promoted from "want to write" to "definitely going to write" then get moved into my primary task tracking system, LunaTask. When I start typing an article, I move it from "later" or "next" to "in progress". But in reality, I start writing an article far before I start typing it.
See, the thing is, I'm almost always thinking about topics I'm going to write about. Maybe this is an element of having ADHD, or maybe it's from my obsessive interests and my deep diving into things I'm curious about. But it started happening more when I started writing on a schedule. By writing on a schedule, having to publish weekly, I've gotten in the habit of thinking about things so that I can hit the ground running when I sit down to type.
Once I've picked out an idea, then it sits in the back of my head and some background processing happens. Usually I'm not crystal clear on what I want to say on a topic, so that background process churns through it. After figuring out what I want to say, I move onto a little more active thought on how I want to say it and structure it. This happens while I'm on a run or a walk, while I'm on coffee breaks, between exercises I'm working on for my music lessons. There's a lot of background processing between snatches of active effort. Piece by piece, it crystallizes in my mind and I figure out what I want to say and how to say it.
Then, finally, I sit at the keyboard. And this is where it really looks like I just sit down and write posts in one quick sitting. That's because by then, the ideas are fully formed and it's a matter of writing the words out—and after a few years of that, it's something I got pretty quick at.
Take this blog post for example. I received the initial email on January 1st, and it sat in my inbox for a few days before I could read it and digest it. I started to write a reply, but realized it would work as a blog post. So it ended up in background processing for a week, then I started actively thinking about what I wanted to write. At that point, I created a blog post file with section headings as an outline.
This post ended up being delayed a bit, since I had to practice content-driven development for a project that has a self-imposed deadline. But the core writing (though not the typing) was done in the week after opening the original email. And then when I finally sat down to write it, it was two 20-minute sessions of typing out the words3.
After the first draft of a post is written, there's a lot of detail work remaining.
This draft is usually in the approximate shape of the final post, but it may be missing sections or have too much in some. I'll take this time to do a quick read over it by myself. Here I'm not checking for spelling errors, but checking if the structure is good and if it needs major work, or just copy edits.
If I'm unsure of some aspects of a post, I ask for feedback. I'll reach out to one or two friends who I think would have useful feedback and would enjoy reading it early, and ask them if they can read it. This request comes with specific feedback requests, since those are easier to fulfill than a broad "what do you think?" These are something along the lines of "was it clear? what parts stuck out to you positively or negatively? is there anything you found unclear or inaccurate?" Once people get me feedback, I add an acknowledgement for them (with explicit consent for if/how to share their name and where to link).
After I'm more sure of the structure and content of a post, I'll go through and do copy edits. I read through it, with a spellchecker enabled4, and fix any misspellings and improve the wording of sentences.
Then it's just the final details. I write up the blurb that goes in my newsletter and on social media (Mastodon, Bluesky, and LinkedIn, currently). I make sure the date is correct, the title is a decent one, and the slug mostly matches the title. And I make sure I did do things like acknowledgements and spellchecking.
Here's the full checklist I use for every post before publishing.
- [ ] Edited?
- [ ] Newsletter blurb written?
- [ ] Spellchecked?
- [ ] Is the date correct?
- [ ] Is the title updated?
- [ ] Does the slug match the title?
- [ ] Are tags set?
- [ ] Draft setting removed?
- [ ] Are acknowledgements added?
Once that checklist is complete, it's ready to go!
That means that on Monday morning, I just have to run make deploy and it'll build and push out the article.
I've been writing at least one post a week since September 2022—over two years at this point, and about 200,000 words. I've avoided burning out on it, so far. That's an interesting thing, since I've burned out on personal projects in the past. What makes this different?
One aspect is that it's something that's very enjoyable for me. I like writing, and the process of it. I've always liked writing, and having an outlet for it is highly motivating for me.
I thrive in structure, so the self-imposed structure of publishing every week is helpful. This structure makes me abandon perfection and find smaller pieces that I can publish in a shorter amount of time, instead of clinging to a big idea and shipping all of it at once, or never. This is where content-driven development comes in for me, and the consistent schedule also leads me to create inspiration.
I also don't put pressure on myself besides the deadline. That's the only immovable thing. Everything else—length, content, even quality5—is flexible. Some of my earliest posts from 2022 are of the form "here's what I did this week!" which got me into a consistent rhythm.
The most important thing for me is to keep it fun. If I'm not finding something fun, I just move on to a different post or a different topic. I have some posts that would be good but have sat in my backlog for over a year, waiting to be written. Whenever I sit down to write those, my brain decides to find a bunch of other, better, posts to write instead!
So far, it's been fun and motivating to write weekly. If that ever stops, and it becomes a job and something I dread, I'll change the format to be something that works better for me instead!
If this is what it looks like then hey, stop peeking through my window.
I keep them in a bulleted list, each starting with -, so I can find the count via:
grep "\s*-" 'Blog post ideas.md' | wc -l
Drafting documents is where I feel like my typing speed is actually a real benefit. On a typing test this evening, I got 99% accuracy at 120wpm. This lets me get through a lot of words quickly. This is a big plus for me both as a writer and as a principal engineer.
I leave my spellchecker disabled the vast majority of the time, since the squiggles are distracting! It's not a big deal to misspell things most of the time, and I'd rather be able to focus on getting words out than noticing each mistake as I make it—talk about discouraging!
That said, I think it's hard to judge your own quality in the moment. As writers, we can be very critical of ourselves. We think we wrote something boring but others find it interesting. Or something we think is quite clever, others find uninspiring. Sometimes, when I push out something I think is lazy and quick and low-quality, I'm surprised to find a very eager reception.
Anonymous 20/01/2025 | Source: Yoshua Wuyts — Blog
At the end of my last post I mentioned that one of the main issues with the IntoIterator trait is that it's kind of a pain to write. I wasn't around when it was first introduced, but it's not hard to tell that the original authors intended for Iterator to be the primary interface with IntoIterator being an additional convenience.
This didn't quite turn out to be the case though, and it's common practice to use IntoIterator in both bounds and impls. In the Rust 2024 edition we're changing Rust's range type to implement IntoIterator rather than Iterator.1
1 And for example in Swift the equivalent trait to IntoIterator (Sequence) is the primary interface used for iteration, with the interface equivalent to Iterator (IteratorProtocol) having a much harder to use name.
Thanks to Lukas Wirth for pointing out that the range type change didn't end up making the cut for the edition. It's been a couple of months since I checked, and it seems it was removed for this edition. My understanding is that this change is still desired, and might make it in for a future edition.
So if not Iterator, what name could we use? Well, recently I wrote a little library called Iterate that tries to answer that question. Let me walk you through it.
Note: this post is intended to be a public exploration, not a concrete proposal. It's a starting point, asking: "... what if?" I'm a firm advocate for sharing ideas in public, especially if they're not yet fully fleshed out. There are a lot of good reasons to do that, but above all else: I think it's fun!
In Rust most interfaces use verbs as their names. To read bytes from a stream you use the trait named Read. To write bytes you use Write. To debug something you use Debug. And to calculate numbers you can use Add, Mul (multiply), or Sub (subtract). Most traits in the Rust stdlib are used to perform concrete operations with, and the convention is to use verbs for that.
The stdlib does have one particularly interesting pairing in the form of Hash (verb) and Hasher (noun). From the documentation: "Types implementing Hash are able to be hashed with an instance of Hasher." Or put differently: the trait Hash represents the operation and the trait Hasher represents the state.
/// A hashable type.
pub trait Hash {
    fn hash<H: Hasher>(&self, state: &mut H);
}

/// Represents state that is changed while hashing data.
pub trait Hasher {
    fn finish(&self) -> u64;
    fn write(&mut self, bytes: &[u8]);
}
What the trait IntoIterator really represents is: "An iterable type". And the trait Iterator can be reasonably described as: "The state that is changed while iterating over items". The verb/noun split present in Hash/Hasher feels like it easily applies to iteration too.
If Iterator is the noun that represents the iteration state, what is the verb that represents the capability? The obvious choice would be Iterate. Which I think ends up working out somewhat nicely. To iterate over items you implement Iterate, which provides you with a stateful Iterator.
/// An iterable type.
pub trait Iterate {
    type Item;
    type Iterator: Iterator<Item = Self::Item>;
    fn iterate(self) -> Self::Iterator;
}

/// Represents state that is changed while iterating.
pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}
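To get a feel for how this reads in practice, here's a quick sketch of my own. It assumes the Iterate and Iterator definitions above are in scope, and the Countdown type is purely illustrative, not part of any library:
// Assumes the Iterate/Iterator traits sketched above are in scope.
// An iterable countdown: Countdown(3) yields 3, 2, 1.
struct Countdown(u32);

// The iteration state for Countdown.
struct CountdownIter(u32);

impl Iterate for Countdown {
    type Item = u32;
    type Iterator = CountdownIter;

    fn iterate(self) -> Self::Iterator {
        CountdownIter(self.0)
    }
}

impl Iterator for CountdownIter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

// The bound reads as "anything iterable over u32", without spelling out IntoIterator.
fn sum(iterable: impl Iterate<Item = u32>) -> u32 {
    let mut iter = iterable.iterate();
    let mut total = 0;
    while let Some(n) = iter.next() {
        total += n;
    }
    total
}

fn main() {
    assert_eq!(sum(Countdown(3)), 6);
}
Compared to the same function written against IntoIterator, the only real difference is the name of the trait and of its method.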
With our goal to make IntoIterator less jarring to use in interfaces, the name Iterate doesn't seem half-bad. And it neatly follows the existing verb/noun split pairing we're already using elsewhere in the stdlib.
People familiar with Rust's stdlib will be quick to note that Iterator and IntoIterator are not the only iteration traits in use. We also have FromIterator that functions as the inverse of IntoIterator. Where one exists to convert types to iterators, the other exists to convert iterators back to types. The latter is typically used via the Iterator::collect function.
But FromIterator has a less-known but equally useful sibling: Extend. Where FromIterator collects items into new instances of types, the Extend trait is used to collect items into existing instances of types. It would feel pretty weird to rename IntoIterator to Iterate, but then keep FromIterator as-is.
What if instead of anchoring FromIterator as a dual to IntoIterator, we instead treated it as a sibling to Extend? The obvious verb for this would be Collect:
/// Creates a collection with the contents of an iterator.
pub trait Collect<A>: Sized {
    fn collect<T>(iter: T) -> Self
    where
        T: Iterate<Item = A>;
}
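As another small sketch of my own (assuming the Iterate, Iterator and Collect definitions from this post are in scope, so treat it as illustrative rather than real stdlib code), an implementation of Collect for Vec could look like this:
// Assumes the Iterate, Iterator and Collect traits sketched in this post.
impl<A> Collect<A> for Vec<A> {
    fn collect<T>(iter: T) -> Self
    where
        T: Iterate<Item = A>,
    {
        let mut iter = iter.iterate();
        let mut out = Vec::new();
        // Drain the iterator into a fresh Vec, one item at a time.
        while let Some(item) = iter.next() {
            out.push(item);
        }
        out
    }
}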
It's interesting to note that the type T in FromIterator is bound by IntoIterator rather than Iterator. Being able to use T: Iterate as a bound here definitely feels a little nicer. And speaking of nicer: this would also make the provided Iterator::collect and Iterator::collect_into methods feel a little better:
/// Represents state that is changed while iterating.
pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    /// Create a collection with the contents of this iterator.
    fn collect<C>(self) -> C
    where
        C: Collect<Self::Item>,
        Self: Sized;

    /// Extend a collection with the contents of this iterator.
    fn collect_into<E>(self, collection: &mut E) -> &mut E
    where
        E: Extend<Self::Item>,
        Self: Sized;
}
I don't think this looks half-bad. And honestly: it might also be more consistent overall, as traits representing other effects don't have an equivalent to FromIterator. The Future trait only has IntoFuture, and variations on that in the ecosystem like Race.
No longer having a trait called FromIterator would help remove some confusion.
I guess we've broached the async topic now, so we might as well keep going. We added the trait IntoFuture to Rust in 2022 because we wanted an equivalent to IntoIterator but for the async effect. You can find some motivating use cases for this in my async builders (2019) post. We chose the name IntoFuture because it matched the existing convention set by IntoIterator/Iterator.
We already have Try for fallibility, and we just discussed using Iterate for iteration, so what would the verb-based trait name be for asynchrony? The obvious choice would be something like Await, as that is the name of the operation:
trait Await {
    type Output;
    type Future: Future<Output = Self::Output>;
    fn await(self) -> Self::Future;
}
This however runs into one major limitation: await is a reserved keyword, which means we can't use it as the name of the method. Which means I'm not actually sure what this trait should be called. With iterators we're lucky that we don't have any shortage of related words: loop, iterate, generate, while, sequence and so on. With async we're a little shorter on words. If anyone has any good ideas for verbs to use here, I'd love to hear suggestions!
TLDR: I really wouldn't mind Iterate being the main interface name for iteration in Rust. That seems like it would be a step up from writing IntoIterator in bounds everywhere. Just by changing a name, without the need for any special new language features.
Now for whether we should make this change... maybe? I honestly don't know. It's not just a matter of introducing a simple trait alias either: the method names and associated types are also different, and we can't alias those. And I'm not particularly keen for Rust to start dabbling in additional trait hierarchies here either. Iteration is complex enough as it is; more super-traits are not going to make things any simpler here.
//! Renaming and aliasing the trait is not
//! enough, the method names and associated
//! type names would need to be aliased too.
pub trait Iterate { .. }
pub trait IntoIterator = Iterate;
So I think the only way this rename would actually make sense to follow through on is if the process to make changes like these would make that change easy. I don't believe it is today, but I definitely believe we should want it to become easy in the future. It would be nice if we could freely rename traits, methods, and maybe even types across editions without causing any breakage.
Either way though: I had a lot of fun writing this post. If you want to try the Iterate trait yourself today to get a better feel for it - check out the iterate-trait crate. It has everything I've described in this post, as well as iterator combinators like map. Probably don't use it for anything serious, but definitely go have fun with it.
Augusto Campos 19/01/2025 | Source: TRILUX
DID YOU KNOW? Whenever your behavior is judged on the basis of some book, it is worth asking whether that same book also forbids eating shrimp and wearing clothes made of mixed fabric, and then making your adjustments according to the answer.
The article "Sobre literatura" was originally published on the TRILUX site, by Augusto Campos.
Augusto Campos 19/01/2025 | Source: TRILUX
One thing that makes me genuinely SAD when it happens to others, and that I have learned to ignore when it happens to me, is a dynamic that is unfortunately common on social networks:
1️⃣ The victorious person shares their joy at having done or achieved something
2️⃣ A clueless person replies trying to dictate the next goal ("you already know Spanish, now you can learn Mandarin!") to someone who is still celebrating a previous achievement
3️⃣ A third person wants metrics so they can evaluate it ("what distance?" "how many pages?")
If you do item 2 or item 3, please improve urgently.
The article "Não atrapalhe a comemoração alheia" was originally published on the TRILUX site, by Augusto Campos.
Augusto Campos 19/01/2025 | Source: TRILUX
When I was a child, I lived for a few years next to a town called Guaramirim, which at the time had a rather picturesque peculiarity: its mayor was named Prefeito ("Mayor").
I remembered this today and went looking for records, to check that it wasn't a delusion, and it was confirmed: it really happened.
The article "O prefeito chamado Prefeito" was originally published on the TRILUX site, by Augusto Campos.
Armin Ronacher 19/01/2025 | Source: Armin Ronacher's Thoughts and Writings
When I developed Werkzeug (and later Flask), the most important part of the developer experience for me was enabling fast, automatic reloading. In Werkzeug (and with it Flask), this is achieved by using two processes at all times. The parent process holds on to the file descriptor of the socket on which the server listens, and a subprocess picks up that file descriptor. That subprocess restarts when it detects changes. This ensures that no matter what happens, there is no window where the browser reports a connection error. At worst, the browser will hang until the process finishes reloading, after which the page loads successfully. In case the inner process fails to come up during restarts, you get an error message.
A few years ago, I wanted to accomplish the same experience for working with Rust code, which is why I wrote systemfd and listenfd. However, I realized that I never really wrote here about how they work, and, disappointingly, I think those crates (and a good auto-reloading experience in Rust) are largely unknown.
Firstly one needs to monitor the file system for changes. While in theory I could have done this myself, there was already a tool that could do that.
At the time there was cargo watch. Today one might instead use it together with the more generic watchexec. Either one monitors your workspace for changes and then executes a command. So you can for instance tell it to restart your program. One of these will work:
watchexec -r -- cargo run
cargo watch -x run
You will need a tool like that to do the watching part. At this point I recommend the more generic watchexec which you can find on homebrew and elsewhere.
But what about the socket? The solution to this problem I picked comes from systemd. Systemd has a “protocol” that standardizes passing file descriptors from one process to another through environment variables. In systemd parlance this is called “socket activation,” as it allows systemd to only launch a program if someone started making a request to the socket. This concept was originally introduced by Apple as part of launchd.
To make this work with Rust, I created two crates: systemfd, a command line tool that opens sockets and passes them to the process it launches using the systemd socket-activation protocol, and listenfd, a library that lets a Rust server pick up sockets passed that way.
It's worth noting that systemfd is not exclusively useful to Rust. The systemd protocol can be implemented in other languages as well, meaning that if you have a socket server written in Go or Python, you can also use systemfd.
So here is how you use it.
First you need to add listenfd to your project:
cargo add listenfd
Then, modify your server code to accept sockets via listenfd before falling back to listening itself on ports provided through command-line arguments or configuration files. Here is an example using listenfd in axum:
use axum::{routing::get, Router};
use tokio::net::TcpListener;

async fn index() -> &'static str {
    "Hello, World!"
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let app = Router::new().route("/", get(index));
    let mut listenfd = listenfd::ListenFd::from_env();
    let listener = match listenfd.take_tcp_listener(0)? {
        Some(listener) => TcpListener::from_std(listener),
        None => TcpListener::bind("0.0.0.0:3000").await,
    }?;
    axum::serve(listener, app).await?;
    Ok(())
}
The key point here is to accept socket 0 from the environment as a TCP listener and use it if available. If the socket is not provided (e.g. when launched without systemd/systemfd), the code falls back to opening a fixed port.
Finally you can use cargo watch / watchexec together with systemfd:
systemfd --no-pid -s http::8888 -- watchexec -r -- cargo run
systemfd --no-pid -s http::8888 -- cargo watch -x run
This is what the parameters mean: --no-pid tells systemfd not to set the LISTEN_PID environment variable, so the socket can be picked up by the grandchild process that watchexec or cargo watch spawns, and -s http::8888 asks systemfd to open a TCP socket on port 8888 (tagged as HTTP so it prints a clickable URL) and pass it down.
The end result is that you can edit your code, and it will recompile automatically and restart the server without dropping any requests. When you run it, and perform changes, it will look a bit like this:
$ systemfd --no-pid -s http::5555 -- watchexec -r -- cargo run
~> socket http://127.0.0.1:5555/ -> fd #3
[Running: cargo run]
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/axum-test`
[Running: cargo run]
   Compiling axum-test v0.1.0 (/private/tmp/axum-test)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.52s
     Running `target/debug/axum-test`
For easier access, I recommend putting this into a Makefile or similar so you can just run make devserver and it runs the server in watch mode.
To install systemfd you can use curl to bash:
curl -sSfL https://github.com/mitsuhiko/systemfd/releases/latest/download/systemfd-installer.sh | sh
Now how does this work on Windows? The answer is that systemfd and listenfd have a custom, proprietary protocol that also makes socket passing work on Windows. That's a more complex system which involves a local RPC server. However the system does also support Windows and the details about how it works are largely irrelevant for you as a user — unless you want to implement that protocol for another programming language.
I really enjoy using this combination, but it can be quite frustrating to require so many commands, and the command line workflow isn't optimal. Ideally, this functionality would be better integrated into specific Rust frameworks like axum and provided through a dedicated cargo plugin. In a perfect world, one could simply run cargo devserver, and everything would work seamlessly.
However, maintaining such an integrated experience is a much more involved effort than what I have. Hopefully, someone will be inspired to further enhance the developer experience and achieve deeper integration with Rust frameworks, making it more accessible and convenient for everyone.
Eryn Wells 19/01/2025 | Source: Eryn Rachel Wells
My birthday was this week. I’m 37, which is a pretty neat number, as it turns out.
Blue Origin launched their New Glenn rocket on Wednesday night. A very cool launch of a very large rocket.
Conversely, SpaceX launched Starship 7 and it .. blew up catastrophically and spread debris all over the Gulf of Mexico. I hate seeing the environmental and human impacts of such accidents, but it does give me small joy when Elon takes the L. 🙃
Some articles I read this week:
An acquaintance referred to Bunpro in a conversation we had on social media about Japanese grammar. This site has lots of detailed information about the language and how it works.
Ken Shirriff 18/01/2025 | Source: Ken Shirriff's blog
Addition is harder than you'd expect, at least for a computer. Computers use multiple types of adder circuits with different tradeoffs of size versus speed. In this article, I reverse-engineer an 8-bit adder in the Pentium's floating point unit. This adder turns out to be a carry-lookahead adder, in particular, a type known as "Kogge-Stone."1 In this article, I'll explain how a carry-lookahead adder works and I'll show how the Pentium implemented it. Warning: lots of Boolean logic ahead.
The die photo above shows the main functional units of the Pentium. The adder, in the lower right, is a small component of the floating point unit. It is not a general-purpose adder, but is used only for determining quotient digits during division. It played a role in the famous Pentium FDIV division bug, which I wrote about here.
The photo below shows the carry-lookahead adder used by the divider. The adder itself consists of the circuitry highlighted in red. At the top, logic gates compute signals in parallel for each of the 8 pairs of inputs: partial sum, carry generate, and carry propagate. Next, the complex carry-lookahead logic determines in parallel if there will be a carry at each position. Finally, XOR gates apply the carry to each bit. Note that the sum/generate/propagate circuitry consists of 8 repeated blocks, and the same with the carry XOR circuitry. The carry lookahead circuitry, however, doesn't have any visible structure since it is different for each bit.2
The large amount of circuitry in the middle is used for testing; see the footnote.3 At the bottom, the drivers amplify control signals for various parts of the circuit.
The problem with addition is that carries make addition slow. Consider calculating 99999+1 by hand. You'll start with 9+1=10, then carry the one, generating another carry, which generates another carry, and so forth, until you go through all the digits. Computer addition has the same problem: If you're adding two numbers, the low-order bits can generate a carry that then propagates through all the bits. An adder that works this way—known as a ripple carry adder—will be slow because the carry has to ripple through all the bits. As a result, CPUs use special circuits to make addition faster.
One solution is the carry-lookahead adder. In this adder, all the carry bits are computed in parallel, before computing the sums. Then, the sum bits can be computed in parallel, using the carry bits. As a result, the addition can be completed quickly, without waiting for the carries to ripple through the entire sum.
It may seem impossible to compute the carries without computing the sum first, but there's a way to do it.
For each bit position, you determine signals called "carry generate" and "carry propagate".
These signals can then be used to determine all the carries in parallel.
The generate signal indicates that the position generates a carry. For instance, if you add binary 1xx and 1xx (where x is an arbitrary bit), a carry will be generated from the top bit, regardless of the unspecified bits. On the other hand, adding 0xx and 0xx will never produce a carry. Thus, the generate signal is produced for the first case but not the second.
But what about 1xx plus 0xx? We might get a carry, for instance, 111+001, but we might not get a carry, for instance, 101+001. In this "maybe" case, we set the carry propagate signal, indicating that a carry into the position will get propagated out of the position. For example, if there is a carry out of the middle position, 1xx+0xx will have a carry from the top bit. But if there is no carry out of the middle position, then there will not be a carry from the top bit. In other words, the propagate signal indicates that a carry into the top bit will be propagated out of the top bit.
To summarize, adding 1+1 will generate a carry. Adding 0+1 or 1+0 will propagate a carry.
Thus, the generate signal is formed at each position by Gn = An·Bn, where A and B are the inputs. The propagate signal is Pn = An+Bn, the logical-OR of the inputs.4
Now that the propagate and generate signals are defined, they can be used to compute the carry Cn at each bit position:
C1 = G0: a carry into bit 1 occurs if a carry is generated from bit 0.
C2 = G1 + G0P1: A carry into bit 2 occurs if bit 1 generates a carry or bit 1 propagates a carry from bit 0.
C3 = G2 + G1P2 + G0P1P2: A carry into bit 3 occurs if bit 2 generates a carry, or bit 2 propagates a carry generated from bit 1, or bits 2 and 1 propagate a carry generated from bit 0.
C4 = G3 + G2P3 + G1P2P3 + G0P1P2P3: A carry into bit 4 occurs if a carry is generated from bit 3, 2, 1, or 0 along with the necessary propagate signals.
... and so forth, getting more complicated with each bit ...
The important thing about these equations is that they can be computed in parallel, without waiting for a carry to ripple through each position. Once each carry is computed, the sum bits can be computed in parallel: Sn = An ⊕ Bn ⊕ Cn. In other words, the two input bits and the computed carry are combined with exclusive-or.
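To make the equations concrete, here is a small sketch of my own: it is just the textbook equations written out in code, not the Pentium's actual circuit, and it checks an 8-bit example against ordinary addition.
// Sketch: 8-bit carry-lookahead addition using per-bit generate/propagate
// signals and the expanded carry equations (illustrative, not the Pentium's netlist).
fn carry_lookahead_add(a: u8, b: u8) -> (u8, bool) {
    let mut g = [false; 8]; // generate: Gn = An AND Bn
    let mut p = [false; 8]; // propagate: Pn = An OR Bn
    for i in 0..8 {
        let (ai, bi) = ((a >> i) & 1 == 1, (b >> i) & 1 == 1);
        g[i] = ai && bi;
        p[i] = ai || bi;
    }

    // c[i] is the carry into bit i. Each equation depends only on g and p,
    // so in hardware all of them can be evaluated in parallel.
    let mut c = [false; 9];
    for i in 1..=8 {
        let mut carry = false;
        for j in 0..i {
            // A carry generated at bit j, propagated through bits j+1 .. i-1.
            let mut term = g[j];
            for k in (j + 1)..i {
                term = term && p[k];
            }
            carry = carry || term;
        }
        c[i] = carry;
    }

    // Sum bit: Sn = An XOR Bn XOR Cn.
    let mut sum = 0u8;
    for i in 0..8 {
        let bit = ((a >> i) & 1) ^ ((b >> i) & 1) ^ (c[i] as u8);
        sum |= bit << i;
    }
    (sum, c[8]) // (8-bit sum, carry out)
}

fn main() {
    let (s, carry_out) = carry_lookahead_add(0b1001_1011, 0b0110_0111);
    // Check against ordinary addition: 155 + 103 = 258.
    assert_eq!(s as u16 + ((carry_out as u16) << 8), 155 + 103);
    println!("sum = {s:08b}, carry out = {carry_out}");
}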
The straightforward way to implement carry lookahead is to directly implement the equations above. However, this approach requires a lot of circuitry due to the complicated equations. Moreover, it needs gates with many inputs, which are slow for electrical reasons.5
The Pentium's adder implements the carry lookahead in a different way, called the "parallel prefix adder."7 The idea is to produce the propagate and generate signals across ranges of bits, not just single bits as before. For instance, the propagate signal P32 indicates that a carry in to bit 2 would be propagated out of bit 3. And G30 indicates that bits 3 to 0 generate a carry out of bit 3.
Using some mathematical tricks,6 you can take the P and G values for two smaller ranges and merge them into the P and G values for the combined range. For instance, you can start with the P and G values for bits 0 and 1, and produce P10 and G10. These could be merged with P32 and G32 to produce P30 and G30, indicating if a carry is propagated across bits 3-0 or generated by bits 3-0. Note that Gn0 is the carry-lookahead value we need for bit n, so producing these G values gives the results that we need from the carry-lookahead implementation.
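That merge step itself is tiny. Here is a sketch of my own, using the inclusive-or propagate definition from earlier, of how the (G, P) pair for a high range combines with the pair for the adjacent lower range to give the pair for the whole range:
// Combine (generate, propagate) for a high range with the adjacent lower range.
// For example, merging (G32, P32) with (G10, P10) yields (G30, P30).
fn merge(hi: (bool, bool), lo: (bool, bool)) -> (bool, bool) {
    let (g_hi, p_hi) = hi;
    let (g_lo, p_lo) = lo;
    // The combined range generates a carry if the high part generates one,
    // or if the low part generates one and the high part propagates it.
    let g = g_hi || (p_hi && g_lo);
    // The combined range propagates a carry only if both parts propagate.
    let p = p_hi && p_lo;
    (g, p)
}

fn main() {
    // Low range generates a carry, high range propagates it: the whole range generates.
    assert_eq!(merge((false, true), (true, false)), (true, false));
}
Because this merge operator is associative, the merges can be arranged into different tree shapes, which is exactly where Kogge-Stone and Brent-Kung differ.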
This merging process is more efficient than the "brute force" implementation of the carry-lookahead logic since logic subexpressions can be reused. This merging process can be implemented in many ways, including Kogge-Stone, Brent-Kung, and Ladner-Fischer. The different algorithms have different tradeoffs of performance versus circuit area. In the next section, I'll show how the Pentium implements the Kogge-Stone algorithm.
The Pentium's adder is implemented with four layers of circuitry. The first layer produces the propagate and generate signals (P and G) for each bit, along with a partial sum (the sum without any carries). The second layer merges pairs of neighboring P and G values, producing, for instance G65 and P21. The third layer generates the carry-lookahead bits by merging previous P and G values. This layer is complicated because it has different circuitry for each bit. Finally, the fourth layer applies the carry bits to the partial sum, producing the final arithmetic sum.
Here is the schematic of the adder, from my reverse engineering. The circuit in the upper left is repeated 8 times to produce the propagate, generate, and partial sum for each bit. This corresponds to the first layer of logic. At the left are the circuits to merge the generate and propagate signals across pairs of bits. These circuits are the second layer of logic.
The circuitry at the right is the interesting part—it computes the carries in parallel and then computes the final sum bits using XOR. This corresponds to the third and fourth layers of circuitry respectively. The circuitry gets more complicated going from bottom to top as the bit position increases.
The diagram below is the standard diagram that illustrates how a Kogge-Stone adder works. It's rather abstract, but I'll try to explain it. The diagram shows how the P and G signals are merged to produce each output at the bottom. Each line corresponds to both the P and the G signal. Each square box generates the P and G signals for that bit. (Confusingly, the vertical and diagonal lines have the same meaning, indicating inputs going into a diamond and outputs coming out of a diamond.) Each diamond combines two ranges of P and G signals to generate new P and G signals for the combined range. Thus, the signals cover wider ranges as they progress downward, ending with the Gn0 signals that are the outputs.
It may be easier to understand the diagram by starting with the outputs. I've highlighted two circuits: The purple circuit computes the carry into bit 3 (out of bit 2), while the green circuit computes the carry into bit 7 (out of bit 6). Following the purple output upward, note that it forms a tree reaching bits 2, 1, and 0, so it generates the carry based on these bits, as desired. In more detail, the upper purple diamond combines the P and G signals for bits 2 and 1, generating P21 and G21. The lower purple diamond merges in P0 and G0 to create P20 and G20. Signal G20 indicates if bits 2 through 0 generate a carry; this is the desired carry value into bit 3.
Now, look at the green output and see how it forms a tree going upward, combining bits 6 through 0. Notice how it takes advantage of the purple carry output, reducing the circuitry required. It also uses P65, P43, and the corresponding G signals. Comparing with the earlier schematic shows how the diagram corresponds to the schematic, but abstracts out the details of the gates.
Comparing the diagram to the schematic, each square box corresponds to the circuit in the upper left of the schematic that generates P and G, the first layer of circuitry. The first row of diamonds corresponds to the pairwise combination circuitry on the left of the schematic, the second layer of circuitry. The remaining diamonds correspond to the circuitry on the right of the schematic, with each column corresponding to a bit, the third layer of circuitry. (The diagram ignores the final XOR step, the fourth layer of circuitry.)
Next, I'll show how the diagram above, the logic equations, and the schematic are related. The diagram below shows the logic equation for C7 and how it is implemented with gates; this corresponds to the green diamonds above. The gates on the left below compute G63; this corresponds to the middle green diamond on the left. The next gate below computes P63 from P65 and P43; this corresponds to the same green diamond. The last gates mix in C3 (the purple line above); this corresponds to the bottom green diamond. As you can see, the diamonds abstract away the complexity of the gates. Finally, the colored boxes below show how the gate inputs map onto the logic equation. Each input corresponds to multiple terms in the equation (6 inputs replace 28 terms), showing how this approach reduces the circuitry required.
There are alternatives to the Kogge-Stone adder. For example, a Brent-Kung adder (below) uses a different arrangement with fewer diamonds but more layers. Thus, a Brent-Kung adder uses less circuitry but is slower. (You can follow each output upward to verify that the tree reaches the correct inputs.)
The photo below shows the adder circuitry. I've removed the top two layers of metal, leaving the bottom layer of metal. Underneath the metal, polysilicon wiring and doped silicon regions are barely visible; they form the transistors. At the top are eight blocks of gates to generate the partial sum, generate, and propagate signals for each bit. (This corresponds to the first layer of circuitry as described earlier.) In the middle is the carry lookahead circuitry. It is irregular since each bit has different circuitry. (This corresponds to the second and third layers of circuitry, jumbled together.) At the bottom, eight XOR gates combine the carry lookahead output with the partial sum to produce the adder's output. (This corresponds to the fourth layer of circuitry.)
The Pentium uses many adders for different purposes: in the integer unit, in the floating point unit, and for address calculation, among others. Floating-point division is known to use a carry-save adder to hold the partial remainder at each step; see my post on the Pentium FDIV division bug for details. I don't know what types of adders are used in other parts of the chip, but maybe I'll reverse-engineer some of them. Follow me on Bluesky (@righto.com) or RSS for updates. (I'm no longer on Twitter.)
Strangely, the original paper by Kogge and Stone had nothing to do with addition and carries. Their 1973 paper was titled, "A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations." It described how to solve recurrence problems on parallel computers, in particular the massively parallel ILLIAC IV. As far as I can tell, it wasn't until 1987 that their algorithm was applied to carry lookahead, in Fast Area-Efficient VLSI Adders. ↩
I'm a bit puzzled why the circuit uses an 8-bit carry-lookahead adder since only 7 bits are used. Moreover, the carry-out is unused. However, the adder's bottom output bit is not connected to anything. Perhaps the 8-bit adder was a standard logic block at Intel and was used as-is. ↩
I probably won't make a separate blog post on the testing circuitry, so I'll put details in this footnote. Half of the circuitry in the adder block is used to test the lookup table. The reason is that a chip such as the Pentium is very difficult to test: if one out of 3.1 million transistors goes bad, how do you detect it? For a simple processor like the 8080, you can run through the instruction set and be fairly confident that any problem would turn up. But with a complex chip, it is almost impossible to come up with an instruction sequence that would test every bit of the microcode ROM, every bit of the cache, and so forth. Starting with the 386, Intel added circuitry to the processor solely to make testing easier; about 2.7% of the transistors in the 386 were for testing.
To test a ROM inside the processor, Intel added circuitry to scan the entire ROM and checksum its contents. Specifically, a pseudo-random number generator runs through each address, while another circuit computes a checksum of the ROM output, forming a "signature" word. At the end, if the signature word has the right value, the ROM is almost certainly correct. But if there is even a single bit error, the checksum will be wrong and the chip will be rejected. The pseudo-random numbers and the checksum are both implemented with linear feedback shift registers (LFSR), a shift register along with a few XOR gates to feed the output back to the input. For more information on testing circuitry in the 386, see Design and Test of the 80386, written by Pat Gelsinger, who became Intel's CEO years later. Even with the test circuitry, 48% of the transistor sites in the 386 were untested. The instruction-level test suite to test the remaining circuitry took almost 800,000 clock cycles to run. The overhead of the test circuitry was about 10% more transistors in the blocks that were tested.
In the Pentium, the circuitry to test the lookup table PLA is just below the 7-bit adder. An 11-bit LFSR creates the 11-bit input value to the lookup table. A 13-bit LFSR hashes the two-bit quotient result from the PLA, forming a 13-bit checksum. The checksum is fed serially to test circuitry elsewhere in the chip, where it is merged with other test data and written to a register. If the register is 0 at the end, all the tests pass. In particular, if the checksum is correct, you can be 99.99% sure that the lookup table is operating as expected. The ironic thing is that this test circuit was useless for the FDIV bug: it ensured that the lookup table held the intended values, but the intended values were wrong.
Why did Intel generate test addresses with a pseudo-random sequence instead of a sequential counter? It turns out that a linear feedback shift register (LFSR) is slightly more compact than a counter. This LFSR trick was also used in a touch-tone chip and the program counter of the Texas Instruments TMS 1000 microcontroller (1974). In the TMS 1000, the program counter steps through the program pseudo-randomly rather than sequentially. The program is shuffled appropriately in the ROM to counteract the sequence, so the program executes as expected and a few transistors are saved.
The bits 1+1 will set generate, but should propagate be set too? It doesn't make a difference as far as the equations are concerned. This adder sets propagate for 1+1 but some other adders do not. The answer depends on whether you use an inclusive-or or exclusive-or gate to produce the propagate signal. ↩
One solution is to implement the carry-lookahead circuit in blocks of four. This can be scaled up with a second level of carry-lookahead to provide the carry lookahead across each group of four blocks. A third level can provide carry lookahead for groups of four second-level blocks, and so forth. This approach requires O(log(N)) levels for N-bit addition. This approach is used by the venerable 74181 ALU, a chip used by many minicomputers in the 1970s; I reverse-engineered the 74181 here. The 74182 chip provides carry lookahead for the higher levels. ↩
I won't go into the mathematics of merging P and G signals; see, for example, Adder Circuits, Adders, or Carry Lookahead Adders for additional details. The important factor is that the carry merge operator is associative (actually a monoid), so the sub-ranges can be merged in any order. This flexibility is what allows different algorithms with different tradeoffs. ↩
The idea behind a prefix adder is that we want to see if there is a carry out of bit 0, bits 0-1, bits 0-2, bits 0-3, 0-4, and so forth. These are all the prefixes of the word. Since the prefixes are computed in parallel, it's called a parallel prefix adder. ↩
Anonymous 17/01/2025 | Source: is this it?
St Bathans is a small town in the Maniototo with a history of gold mining. On a recent trip there I was reading a run-of-the-mill information board and noticed the photograph accompanying it. It was unexpectedly large and incredibly detailed for such an old image. Tagged in the corner was Burton Bros – a photography studio that operated in Dunedin between 1866 and 1914. The studio sold prints to the general public, and has many landscapes and portraits of old New Zealand.
Some of the original photographic plates had been preserved and purchased by Te Papa. The museum has scanned them and has made the collection available on the internet. I’ve gone through the collection, done some minor and tasteful touchups to the St Bathans photographs, and am sharing them here.
The gold rush in St Bathans started in 1862. One can think of the surface and stream-based gold mining done in the early days as giving way to the extensive hydraulic sluicing portrayed in these photographs. Some of the photographs have the date written on them; I've added that as a caption. The others are undated, but were likely taken between 1869 and 1879.
The photograph above is the only confirmed image from 1869. Compare the look and construction of the St Bathans township just ten years later.
Sluicing.
Sluicing.
Sluicing.
Sluicing.
Sluicing.
Welshman's Gully (Cambrians).
St Bathans.
Sluicing.
Sluicing.
St Bathans. 1879.
Sluicing.
St Bathans postcard. Perhaps early 1900s? Going by the trees.
Hawkdun Station.
Further reading:
Anonymous 16/01/2025 | Source: Irrational Exuberance
Some people I’ve worked with have lost hope that engineering strategy actually exists within any engineering organizations. I imagine that they, reading through the steps to build engineering strategy, or the strategy for navigating private equity ownership, are not impressed. Instead, these ideas probably come across as theoretical at best. In less polite company, they might describe these ideas as fake constructs.
Let's talk about it! Because they're right. In fact, they're right in two different ways. First, this book is focused on explaining how to create clean, refined and definitive strategy documents, where initially most real strategy artifacts look rather messy. Second, applying these techniques in practice can require a fair amount of creativity. It might sound easy, but it's quite difficult in practice.
This chapter will cover:
Alright, let’s dive into the many ways that praxis doesn’t quite line up with theory.
This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts.
As explored in Making engineering strategies more readable, documents that feel intuitive to write are often fairly difficult to read. That’s because thinking tends to be a linear-ish journey from a problem to a solution. Most readers, on the other hand, usually just want to know the solution and then to move on. That’s because good strategies are read for direction (e.g. when a team wants to understand how they’re supposed to solve a specific issue at hand) far more frequently than they’re read to build agreement (e.g. building stakeholder alignment during the initial development of the strategy).
However, many organizations only produce writer-oriented strategy documents, and may not have any reader-oriented documents at all. If you’ve predominantly worked in those sorts of organizations, then the first reader-oriented documents you encounter will seem artificial.
There are also organizations that have many reader-oriented documents, but omit the rationale behind those documents. Those documents feel prescriptive and heavy-handed, because the infrequent reader who does want to understand the thinking can’t find it. Further, when they want to propose an alternative, they have to do so without the rationale behind the current policies: the absence of that context often transforms what was a collaborative problem-solving opportunity into a political match.
With that in mind, I’d encourage you to see the frequent absence of these documents as a major opportunity to drive strategy within your organization, rather than evidence that these documents don’t work. My experience is that they do.
The most frequent failure mode I see for strategy is when it’s rushed, and its authors accept that thinking must stop when the artificial deadline is reached. Taking annual planning at Stripe as an example, Claire Hughes Johnson argued that planning expands to fit any timeline, and consequently set a short planning timeline of several weeks. Some teams accepted that as a fixed timeline and stopped planning when the timeline ended, whereas effective teams never stopped planning before or after the planning window.
When strategy work is given an artificial or unrealistic timeline, you should deliver the best draft you can. Afterwards, rather than being finished, you should view yourself as starting the refinement process. An open strategy secret is that many strategies never leave the refinement phase, and continue to be tweaked throughout their lifespan. Why should a strategy with an early deadline be any different?
Well, there is one important problem to acknowledge: I’ve often found that the executive who initially provided the unrealistic timeline intended it as a forcing function to inspire action and quick thinking. If you have a discussion with them directly, they’re usually quite open to adjusting the approach. However, the intermediate layers of leadership between that executive and you often calcify on a particular approach which they claim that the executive insists on precisely following.
Sometimes having the conversation with the responsible executive is quite difficult. In that case, you do have to work with individuals taking the strategy as literal and unalterable until either you can have the conversation or something goes wrong enough that the executive starts paying attention again. Usually, though, you can find someone who has a communication path, as long as you can articulate the issue clearly.
Some engineers will argue that the only valid strategy altitude is the highest one defined by executives, because any other strategy can be invalidated by a new, higher altitude strategy. They would claim that teams simply cannot do strategy, because executives might invalidate it. Some engineering executives would argue the same thing, instead claiming that they can’t work on an engineering strategy because the missing product strategy or business strategy might introduce new constraints.
I don’t agree with this line of thinking at all. To do strategy at any altitude, you have to come to terms with the certainty that new information will show up, and you’ll need to revise your strategy to deal with that.
Uber's service provisioning strategy is a good counterexample against the idea that you have to wait for someone else to set the strategy table. We were able to find a durable diagnosis despite being a relatively small team within a much larger organization that was relatively indifferent to helping us succeed. When it comes to using strategy, effective diagnosis trumps authority. In my experience, at least as many executives' strategies are ravaged by reality's pervasive details as are overridden by higher altitude strategies. The only way to be certain your strategy will fail is to wait until you're certain that no new information might show up and require it to change.
How should you adopt LLMs? discusses how a company should plot a path through the rapidly evolving LLM ecosystem. Periods of rapid technology evolution are one reason why your strategy might encounter a pocket of chaos, but there are many others. Pockets of rapid hiring, as well as layoffs, create chaos. The departure of load-bearing senior leaders can change a company quickly. Slowing revenue in a company’s core business can also initiate chaotic actions in pursuit of a new business.
Strategies don't require stable environments. Instead, strategies require awareness of the environment that they're operating in. In a stable period, a strategy might expect to run for several years and expect relatively little deviation from the initial approach. In a dynamic period, the strategy might know you can only protect capacity in two-week chunks before a new critical initiative pops up. It's possible to do good strategy in either scenario, but it's impossible to do good strategy if you don't diagnose the context effectively.
Oftentimes, the way forward for a strategy is very obvious if a few key decisions are made: you know who is supposed to make those decisions, but you simply cannot get them to decide. My most visceral experience of this was conducting a layoff where the CEO wouldn't define a target cost reduction or a thesis of how much various functions (e.g. engineering, marketing, sales) should contribute to those reductions. With those two decisions, engineering's approach would be obvious, and without that clarity things felt impossible.
Although I was frustrated at the time, I’ve since come to appreciate that missing decisions are the norm rather than the exception. The strategy on Navigating Private Equity ownership deals with this problem by acknowledging a missing decision, and expressly blocking one part of its execution on that decision being made. Other parts of its plan, like changing how roles are backfilled, went ahead to address the broader cost problem.
Rather than blocking on missing information, your strategy should acknowledge what’s missing, and move forward where you can. Sometimes that’s moving forward by taking risk, sometimes that’s delaying for clarity, but it’s never accepting yourself as stuck without options other than pointing a finger.
Sometimes you will be told to follow something which is described as a strategy, but is really just a policy without any strategic thinking behind it. This is an unavoidable element of working in organizations and happens for all sorts of reasons. Sometimes, your organization’s leader doesn’t believe it’s valuable to explain their thinking to others, because they see themselves as the one important decision maker.
Other times, your leader doesn't agree with a policy they've been instructed to roll out. Adoption of "high hype" technologies like blockchain during the crypto boom was often top-down direction from company leadership that engineering disagreed with, but was obligated to align with. In this case, your leader is finding that it's hard to explain a strategy that they themselves don't understand either.
This is a frustrating situation. What I’ve found most effective is writing a strategy of my own, one that acknowledges the broader strategy I disagree with in its diagnosis as a static, unavoidable truth. From there, I’ve been able to make practical decisions that recognize the context, even if it’s not a context I’d have selected for myself.
I started this chapter by acknowledging that the steps to building engineering strategy are a theory of strategy, and one that can get quite messy in practice. Now you know why strategy documents often come across as overly pristine–because they’re trying to communicate clearly about a complex topic.
You also know how to navigate the many ways reality pulls you away from perfect strategy, such as unrealistic timelines, higher altitude strategies invalidating your own strategy work, working in a chaotic environment, and dealing with stakeholders who refuse to align with your strategy. Finally, we acknowledged that sometimes strategy work done by others is not what we'd consider strategy; it's often unsupported policy with neither a diagnosis nor an approach to operating the policy.
That’s all stuff you’re going to run into, and it’s all stuff you’re going to overcome on the path to doing good strategy work.
Anonymous 15/01/2025 | Source: Matheus Fidelis - Engineering Blog
The year this text was published was marked by interesting professional experiences in which I was able to solve very complex distributed-systems problems using the Saga model. So, as wonderful as it was to compile here all the bibliographic references and materials I consumed throughout that period, it was also extremely challenging to strip out the specifics of the work I did and keep the suggestions free of an excess of particularities from my own scenarios.
It is always great to contemplate a finished piece of material on microservices, architecture and distributed systems, but this chapter in particular was delivered with extreme happiness. I hope it is useful to everyone looking for references and experiences with this kind of implementation.
A Saga transaction is an architectural pattern that aims to guarantee data consistency in distributed transactions, especially in scenarios where those transactions depend on continuous execution across multiple microservices or take a long time to fully complete, and where any partial execution is undesirable.
The term Saga comes from the literal sense of a saga: the concept evokes an adventure, a story, a hero's journey, a journey made up of several chapters in which the "hero" must meet objectives, face challenges, push past certain limits and accomplish a predestined goal. Within an implementation of the Saga Pattern, a Saga has a sequential nature, in which the transaction depends on several microservices to be completed, with steps that must be executed one after the other in an ordered and distributed fashion.
The implementation of these steps can vary between Choreographed and Orchestrated approaches, which will be explored further on. Regardless of the approach chosen, the main objective is to manage transactions that involve data in different microservices and databases, or that are long-running, and to guarantee that all steps are executed without losing consistency and control; and, if one of the components fails, whether due to systemic errors or invalid input data, to be able to notify all participants of the saga to compensate the transaction by rolling back all the steps already executed.
Keep in mind that the main purpose of the Saga model is to guarantee reliability and consistency, not performance. In fact, many of its nuances pay a performance price to achieve those goals.
It is not the habit of this series of texts to dive too deeply into the academic and historical details of the topics covered. However, it is worth highlighting the origins of the Saga Pattern and the problem it was originally conceived to solve.
The Saga Pattern was first published by Hector Garcia-Molina and Kenneth Salem in 1987, in a paper for the Department of Computer Science at Princeton University titled "SAGAS". The paper set out to tackle the problem of Long Lived Transactions (LLTs) on the computers of the time, when people were already looking for a way to handle processes that took longer than traditional operations and could not simply lock computational resources until they finished.
As mentioned, the term "Saga" alludes to stories that unfold in smaller chapters; in other words, the proposal was to break a Long-Lived Transaction into several smaller transactions, each of which could be committed or undone independently. This turned one long atomic operation into small atomic transactions with a pragmatic level of supervision.
Therefore, although the Saga model was not initially designed to manage consistency in microservices, but rather to handle computational processes in databases, it has been revisited over time. As microservices and distributed systems became more common in the corporate environment, the principles of the Saga Pattern proved useful for handling failures and guaranteeing consistency in these modern, distributed architectures.
A distributed transaction is one that must take place across multiple systems and databases to be completed. By definition, it requires multiple participants writing and committing their data for it to succeed, and reporting their write status to whoever is coordinating the transaction.
Imagine the order system of a large e-commerce site. Its main job is to receive an order request and carry out every action needed to take that order all the way from request to delivery. To do so, it must interact with several microservices involved in this hypothetical flow, such as an Orders Service, a Payments Service, an Inventory Service, a Shipping Service, and a Notifications Service that keeps the customer informed at every step of the order.
Example of an initial distributed process
In a complex architecture with many interconnected services, each isolated domain must guarantee its part of the execution sequence for the order to complete successfully. As the number of components grows, so does the complexity, and with it the probability of failures and inconsistencies.
Example of an error in a distributed transaction
Imagine that, while these steps execute, one of the services fails for some reason that is not a resilience issue, such as an item being out of stock or the inventory service receiving invalid data. In those situations it may be impossible to keep calling the downstream services, such as the shipping service, even though critical steps like payment processing have already completed successfully. Knowing which previous steps to undo, and undoing them, can then become a complicated problem.
This scenario is a serious distributed-consistency problem. Without the right mechanisms, the system can end up in an inconsistent state where the payment went through but the order was never completed. The Saga Pattern sets out to solve exactly this kind of problem, guaranteeing that, even in the face of failures, the system preserves data integrity and returns to a consistent state across every service that takes part in the transaction.
In many scenarios, complex processes need a somewhat longer period to complete in full. For example, a request that must pass through several execution stages inside a system may take anywhere from milliseconds to weeks or months to finish entirely.
The waiting time between one microservice and the next can vary intentionally due to factors such as scheduling, external triggers, grouping of records within time windows, and so on. Examples include installment billing control, financial scheduling, consolidation of usage quotas for digital products, grouping requests for batch processing, invoice closing, and tracking customers' use of a system's resources.
Managing the lifecycle of these long-lived transactions is a significant architectural challenge, especially in terms of consistency and completion. We need mechanisms that can control transactions end to end in complex scenarios, track every step the transaction has gone through, and determine and manage its current state in a transparent, durable way. The Saga Pattern addresses these problems by decomposing long transactions into a series of smaller, independent transactions, each handled by a specific microservice. That makes it easier to guarantee consistency and to recover from failures from an operational-resilience standpoint.
Wrapping up the problem statement above: the Saga Pattern is an architectural pattern designed to handle distributed transactions that rely on eventual consistency across multiple microservices.
The idea behind applying the Saga Pattern is to decompose a long, complex transaction into a sequence of smaller, coordinated transactions that are managed so that the execution either succeeds or fails consistently, and above all so that data stays consistent across services that follow the "One Database Per Service" model.
Each Saga corresponds to a pseudo-atomic transaction within the system, where each request corresponds to the execution of one isolated operation. These sagas consist of a group of smaller operations that happen locally in each microservice taking part in the saga. Besides providing means to guarantee that every step completes, if one of the saga's operations fails the Saga Pattern defines compensating transactions to undo the operations already executed, ensuring the system remains consistent even in the presence of a failure.
When applied in asynchronous approaches, the Saga proposal removes the need for long, synchronous locks, as in Two-Phase Commit (2PC), which are computationally expensive and can become performance bottlenecks in distributed environments. Those long-held locks are also hard to re-establish after a failure.
There are two main models for implementing the Saga Pattern: the Orchestrated model and the Choreographed model. Each has different architectural characteristics for coordinating and communicating Saga transactions. The choice between them depends on the specific needs of how the system was designed and, above all, on the complexity of its transactions.
The Orchestrated model introduces a centralized orchestration component that manages the execution of sagas. The Orchestrator is responsible for starting the saga, coordinating the sequence of transactions, monitoring the responses, and managing the compensation flow in case of failure. It acts as a control plane that sends commands to the participating microservices and waits for their responses to decide the next steps or continue the saga.
Illustration of the Orchestrated Model
Consider that, to complete a purchase-order transaction, you need to trigger and await confirmation responses from a series of domains such as payments, inventory, notifications, and shipping. These are many distributed components, each with its own limitations, scaling capacity, usage patterns, and contracts, and they need to be invoked in a logical, sequential order for the transaction to complete. Assuming an asynchronous approach, an orchestrator uses the command/response pattern to trigger these microservices and, based on each response, trigger the next microservice in the saga, compensate the operations already performed in case of failure, or conclude and close the saga. An orchestrator can also work synchronously if needed, but resilience mechanisms that come "for free" with messaging, such as backoff, retries, and DLQs, then have to be implemented by hand to keep the saga's execution healthy.
So the orchestrator's job is essentially to build a "map of the saga" with every step needed to finish it, send messages and events to the corresponding microservices and, based on their responses, move on and trigger the next step of the Saga until it is fully complete, or compensate the operations already performed in case of failure.
The orchestrated model depends on implementing a State Machine pattern: it must be able to track the current state and, as responses arrive, change that state and take an action based on the new state. This lets us control the orchestration in a centralized, safe way, concentrating the complexity of microservice orchestration in a single component where we can measure every step, track the start and end of the saga's execution, keep history, and change state transactionally.
In modern Saga Pattern implementations, especially the orchestrated model, many of the interactions between Saga participants happen asynchronously and reactively. In this approach the saga's orchestrator (or a requesting service outside the saga pattern) sends a command for another microservice to perform an action, and waits for the response in a blocking or semi-blocking way before moving on to the next step of the Saga.
Command-and-response model for asynchronous flows
This assumes the services involved expose one topic for the action and another for responses to that action, so the orchestrator or requesting service knows where to send the command and where to wait for the success or failure reply.
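To make the command/response contract concrete, here is a minimal, illustrative sketch in Python. Everything in it is an assumption for illustration: the in-memory queue.Queue objects stand in for the broker's command and response topics, and the message fields (type, correlation_id, order_id, amount) are invented, not taken from the original article.

import queue
import uuid

# In-memory queues stand in for broker topics (purely illustrative):
# one topic receives commands, another carries that command's responses.
payment_commands = queue.Queue()    # e.g. a "payments.commands" topic
payment_responses = queue.Queue()   # e.g. a "payments.responses" topic

def send_charge_command(order_id: str, amount: float) -> str:
    """Orchestrator side: publish a command and return its correlation id."""
    correlation_id = str(uuid.uuid4())
    payment_commands.put({
        "type": "CHARGE",
        "correlation_id": correlation_id,
        "order_id": order_id,
        "amount": amount,
    })
    return correlation_id

def payments_service_step() -> None:
    """Participant side: consume one command, do the local work, publish the reply."""
    cmd = payment_commands.get()
    # ... charge the customer inside the service's own database ...
    payment_responses.put({"correlation_id": cmd["correlation_id"], "status": "CHARGED"})

# Demo: the orchestrator sends the command, the participant processes it,
# and the orchestrator consumes the reply to decide the saga's next step.
cid = send_charge_command("order-42", 99.90)
payments_service_step()
reply = payment_responses.get()
assert reply["correlation_id"] == cid
print(reply["status"])  # -> CHARGED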
The Choreographed model, unlike the Orchestrated one with its centralized component that knows every step of the saga, proposes that each microservice know the next and the previous service. The saga runs as a mesh of services: in a complex case, when a microservice is called and finishes its work, it knows the next microservice and the protocol that service uses to expose its functionality. That microservice takes responsibility for triggering the next step, and so on until the saga finishes.
The same logic applies to compensation and rollback operations, where the service that failed must notify the previous one, or hit a "panic button" so that the entire upstream chain rolls back the steps it has already confirmed.
The choreographed model, even though it is simpler and offers fewer guarantees than the orchestrated one at first glance, also works as an enabler of synchronous flows for saga architectures.
Saga approaches can vary and extend into several architectural patterns. In this section we will cover some of the patterns and approaches I consider most important and relevant to weigh when evaluating a Saga architecture for a project.
In distributed architectures, keeping track of the state of every step a saga must perform before it can be considered complete is perhaps the most critical concern. This kind of control lets us identify which sagas are still pending or have failed, and at which step, enabling monitoring, retries, saga resumption, compensation on errors, and so on.
A State Machine is responsible for handling states, events, transitions, and actions.
States represent the machine's current state and the possible states of the system. The current state describes the status of the transaction, literally something like Iniciado, Agendado, Pagamento Concluido, Entrega Programada, Finalizado, and so on. Events are relevant notifications from the process that may or may not change the machine's current state. For example, one of the steps might emit the events Pagamento Aprovado or Item não disponível no estoque, which can change the saga's planned course. These events may or may not produce a State Transition. Transitions are the change from one valid state to another valid state as a result of a received event. For example, if a record's state is Estoque Reservado and the payments system emits the Pagamento Concluído event, that can notify the machine and transition the state to Agendar Entrega. If the event emitted is Pagamento Recusado, the machine's state might transition to Pedido Cancelado instead. When moving from one State to another, the machine executes an Action to carry the execution forward. In the example above, on entering the Agendar Entrega state the machine needs to invoke the shipping microservice.
Within a saga model, the current state corresponds to the saga itself, and events are the inputs and outputs of the microservices and steps being called. A state machine must be able to store the current state and, upon receiving a change event in some way, determine whether there will be a new state transition and, if so, which action to take in response.
Imagine the saga starts by creating a new record in the state machine representing the beginning of an order-closing saga. That initial state could be NOVO. In the saga's mapping, we know that when the state is NOVO we must make sure the orders domain has recorded all the data about the request for analytical purposes.
Example of the saga's transition and action flow
As soon as the orders service confirms the record was written, the state can transition to RESERVANDO, where the next step of the saga takes care of reserving the item in inventory. Once that reservation is confirmed, the state becomes RESERVADO and the billing process starts, changing the state to COBRANDO. At that point the payments system is notified and may take a while to reply saying whether the payment went through or not.
On success, the state changes to COBRADO, and the shipping system is told which items to deliver and the destination address, transitioning the state to INICIAR_ENTREGA. From there we could have several intermediate states in which additional actions, such as sending e-mail notifications, take place. Examples include SEPARACAO, SEPARADO, DESPACHADO, EM_ROTA, and ENTREGUE. Finally, the saga reaches the FINALIZADO state and is considered fully complete.
On the other hand, if the payments system, starting from the COBRANDO state, moves to a failure state such as PAGAMENTO_NEGADO or NAO_PAGO, the saga must notify the reservation system to release the items so they can be made available for purchase again, as well as update the analytical state in the orders system.
Broadly speaking, the state machine follows logic along these lines: what event is this? → COBRADO COM SUCESSO; where am I now? → COBRANDO; if I am in COBRANDO and I receive COBRADO COM SUCESSO, which state should I go to? → INICIAR_ENTREGA; what do I do on entering INICIAR_ENTREGA? → notify the shipping system.
Basically, the control works by asking: "What event is this?", "Where am I now?", "Where do I go next?" and, finally, "What should I do here?". A minimal code sketch of this control loop follows.
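As a purely illustrative sketch of that control loop, the transition table below maps (current state, event) pairs to (next state, action). The state and event names mirror the example above; the PEDIDO_CANCELADO state, the notify_shipping/release_inventory placeholders, and the table contents are assumptions, not the author's implementation.

# Minimal saga state-machine sketch: (current_state, event) -> (next_state, action).
def notify_shipping(saga_id):
    print(f"saga {saga_id}: notifying the shipping service")

def release_inventory(saga_id):
    print(f"saga {saga_id}: releasing reserved items")

TRANSITIONS = {
    ("COBRANDO", "COBRADO COM SUCESSO"): ("INICIAR_ENTREGA", notify_shipping),
    ("COBRANDO", "PAGAMENTO_NEGADO"):    ("PEDIDO_CANCELADO", release_inventory),
}

def handle_event(saga_id, current_state, event):
    """Answer the four questions: which event, where am I, where do I go, what do I do."""
    next_state, action = TRANSITIONS[(current_state, event)]
    action(saga_id)
    return next_state

state = handle_event("saga-1", "COBRANDO", "COBRADO COM SUCESSO")
print(state)  # -> INICIAR_ENTREGA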
Keeping records of every step of the transaction can be extremely valuable, both for simpler sagas and, above all, for the more complex ones, though it can become costly if kept for a long time. The main advantage of coordinating state is the traceability of every saga: the ones that completed, the ones still in progress, and the ones that ended in error.
We can design data structures and models that provide full traceability of every step started and finished. That way the centralized component, in orchestrated models, records and documents the steps executed and their responses, making programmatic or manual control easier.
With that in place, it becomes straightforward to see which sagas hit errors, keeping those records in the data layer. These records feed resilience mechanisms smart enough to monitor, resume, restart, or retry the steps that failed, and they also help build an analytical view of how the service journey executes.
Designing distributed systems means accepting that we will constantly fight data-consistency problems. The compensation patterns within Saga transactions guarantee that every step, executed sequentially, can be reverted in case of failure.
Just as the Saga model exists to guarantee that every healthy transaction runs to completion, the compensation model ensures that, in the face of a systemic failure — invalid data, availability problems that cannot be recovered within the Saga's SLA, balance, payment, or credit-limit issues, stock availability, or bad input — the actions are fully reverted, letting the system return to a consistent state and preventing a situation where only part of the transaction is committed while the rest fails.
An efficient way to design handlers that receive stimuli and execute some step of the saga, whether through API endpoints or event/message listeners, is to expose those handlers alongside their reversal methods. That way there is always one handler that performs the action and another that undoes it; for example, reservaPassagens() and liberaPassagens(), cobrarPedido() and estornarCobranca(), or incrementarUso() and decrementarUso(), as in the sketch below.
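A minimal sketch of that pairing, assuming an in-process registry; the handler names come from the examples above, but their bodies, the SAGA_STEPS registry, and run_saga are invented for illustration.

# Each saga step registers its action together with the compensation that undoes it.
def reservaPassagens(order_id):
    print(f"reserving tickets for order {order_id}")

def liberaPassagens(order_id):
    print(f"releasing tickets for order {order_id}")

def cobrarPedido(order_id):
    print(f"charging order {order_id}")

def estornarCobranca(order_id):
    print(f"refunding order {order_id}")

# Step name -> (action, compensation)
SAGA_STEPS = {
    "reserve": (reservaPassagens, liberaPassagens),
    "charge":  (cobrarPedido, estornarCobranca),
}

def run_saga(order_id, steps):
    done = []
    try:
        for name in steps:
            action, _ = SAGA_STEPS[name]
            action(order_id)
            done.append(name)
    except Exception:
        # "Panic button": undo every confirmed step, in reverse order.
        for name in reversed(done):
            _, compensate = SAGA_STEPS[name]
            compensate(order_id)
        raise

run_saga("order-42", ["reserve", "charge"])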
Once we have the tools for the chosen orchestration model to trigger the microservices responsible for the requested actions, we can secure the saga's so-called "happy path".
With the Action-and-Compensation model in place, the saga orchestrator can also "hit the panic button" when needed, notifying every participating microservice to undo the actions it has confirmed. In an event-driven or messaging architecture that supports this kind of transaction, we can create a saga-compensation topic with multiple consumer groups, so that each one receives the same message and runs the compensation if the transaction has already been committed in that particular service.
Dual Write is known both as a problem and as a classic pattern in distributed architectures. It frequently shows up when an operation must write data to two different places: a database and a cache, a database and an external API, two different APIs, or a database and a queue or topic. In essence, whenever we need to guarantee an atomic write across multiple destinations, we are facing this kind of challenge.
To illustrate the problem in practice in an application that uses the Saga Pattern, consider a case where the operation is committed in one place but the other is unavailable. The commit is then not atomic, since the two writes must be considered together to keep the data consistent.
In the choreographed model, for an operation to complete end to end, each microservice performs its actions locally against its own database and then publishes an event to the broker so the next service can continue the flow. That is the saga's "happy path", with no consistency problems so far.
Choreographed model - dual write example
The consistency problems appear, for example, when the data is not saved in the database but the event is still emitted, or when the data is saved correctly but, because the message broker is unavailable, the event is never emitted. In both cases the system can end up in an inconsistent state.
Choreographed model - dual write failure example
In the orchestrated model the same problem can occur, if in a slightly different shape. In a command-and-response exchange between orchestrator and microservices, if one of them fails while trying to guarantee the double write (between its own dependencies and the response channel), we can end up with a lost saga, where intermediate steps are never confirmed and the saga gets "stuck" mid-flight for lack of a response or confirmation.
Orchestrated model - dual write failure example
Guaranteeing that every step executes with the necessary atomicity is perhaps the biggest complexity in implementing a Saga model. The control mechanisms need enough systemic resources to handle failures, adopting retries, saga-supervision processes, and ways to spot sagas that started long ago and have not finished or are in an inconsistent state. In an ACID database, for example, the most effective alternative is to publish the event inside a database transaction and only commit the data change once the communication steps have completed, guaranteeing that either all of the work happens or none of it does.
The Outbox Pattern has been mentioned a few times before, though solving different problems. Here we can use it to give a transactional character to the execution and control of the saga's steps: an additional relay process in an orchestrated model reads a synchronous queue table in the database, checks which steps of which sagas are pending, and only removes them from that "queue" once every part of the step has actually been executed.
This is an interesting approach to guard against Dual Write problems and to help the application stay resilient during full or partial outages of its dependencies.
Change Data Capture mechanisms can also be employed to transport the data to the next system. This approach can be applied in both architectural flavors of the Saga Pattern, although in the orchestrated model it is usually better, given the orchestrator's very purpose, to handle the transactions programmatically, manually controlling execution, fallbacks, and the business logic of each saga step.
Although the examples in this chapter have used asynchronous orchestration to detail the Saga implementations, it is worth exploring topics that help us keep certain levels of consistency in a synchronous, client/server (request/reply) context.
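A compact sketch of that relay, assuming sqlite3 as a stand-in for the service's database; the table layout, the publish() stub, and the "saga.responses" topic name are all assumptions made for illustration.

import sqlite3

# The saga step and the outbox row are written in the SAME local transaction,
# so either both exist or neither does. A relay later publishes pending rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT, published INTEGER DEFAULT 0)")

def confirm_reservation(order_id):
    with db:  # one ACID transaction: business write + outbox write
        db.execute("INSERT INTO orders VALUES (?, 'RESERVADO')", (order_id,))
        db.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                   ("saga.responses", f'{{"order_id": "{order_id}", "status": "RESERVADO"}}'))

def publish(topic, payload):
    print(f"publish to {topic}: {payload}")  # stand-in for the real broker client

def relay_once():
    rows = db.execute("SELECT id, topic, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)  # if this fails, the row stays pending and is retried later
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

confirm_reservation("order-42")
relay_once()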
Two-Phase Commit (2PC) is a well-known pattern for distributed systems. It proposes that, in a transaction with several participants, a coordinator guarantees that all of them are "pre-committed" (ready to write the transaction) before actually applying the changes to their respective states, thus committing in two phases. If any participant fails to confirm it is ready to commit, none of them receive the commit command. Beyond microservice implementations, this pattern is widely used in replication strategies.
Two-Phase Commit executed successfully
The 2PC protocol brings a sense of atomicity to the distributed services that make up a transaction, because the coordinator asks each participant for confirmation before committing. This can be very valuable in Saga transactions that must validate every step before finishing — especially in synchronous scenarios, where the client is waiting for an immediate response and the operation may be aborted abruptly, with no chance to compensate steps that have already run.
Two-Phase Commit executed with an error
If any of the services does not respond successfully, or does not respond in time for the transaction's coordination mechanism, the coordinator sends a rollback signal so that every participant discards its pending transaction.
As useful as this pattern is, it can also become a performance bottleneck in high-traffic environments, since it has to manage many connections held open at all times across different contexts. One way to optimize this approach is to adopt communication protocols that make long-lived connections easier to manage, such as gRPC, which can keep bidirectional connections open and reuse them for many requests.
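For illustration only, a toy coordinator that runs the two phases in-process; the Participant class and its prepare/commit/rollback methods are stand-ins for real services, not a production protocol implementation.

# Phase 1 asks every participant to prepare; phase 2 commits only if all of
# them voted "yes", otherwise everyone rolls back.
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit

    def prepare(self):
        print(f"{self.name}: prepared" if self.can_commit else f"{self.name}: cannot commit")
        return self.can_commit

    def commit(self):
        print(f"{self.name}: committed")

    def rollback(self):
        print(f"{self.name}: rolled back")

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # phase 1: voting
        for p in participants:
            p.commit()                           # phase 2: global commit
        return True
    for p in participants:
        p.rollback()                             # phase 2: global rollback
    return False

two_phase_commit([Participant("payments"), Participant("inventory", can_commit=False)])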
Even though the Saga Pattern's coordination mechanisms provide plenty of guard rails for executing transactions, systemic surprises can still happen and leave state inconsistent across microservices. In that scenario, business decisions are needed on how to handle significant failures among the saga's participants: mass compensations, or some saga-restart strategy.
In the case of a saga restart, it is essential that every microservice implement idempotency controls, so it can receive the same command multiple times without producing unexpected errors. For example, if a hotel-room reservation service repeatedly receives the same reservation request for the same room and the same user, it should accept the operation without overwriting or changing state, returning the appropriate success response. That makes state-resynchronization processes much easier.
When the coordination process (whether orchestrated or choreographed) receives a request to start a new saga with unique identifiers or idempotency keys that already exist for another saga, it can either restart that saga from scratch or check which steps were left incomplete and restart them from the point where no response arrived, thereby guaranteeing the consistency of the transactions.
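As a rough sketch of such an idempotency control, assuming the key travels with each saga command; the in-memory dict stands in for the service's persistent store, and reserve_room is an invented handler.

# Repeated commands with the same idempotency key return the stored result
# instead of redoing the work, so a restarted saga can safely replay steps.
processed = {}  # idempotency_key -> previous response

def reserve_room(idempotency_key, room_id, user_id):
    if idempotency_key in processed:
        return processed[idempotency_key]        # replay: same answer, no new side effects
    # ... perform the actual reservation in the local database ...
    response = {"status": "RESERVED", "room_id": room_id, "user_id": user_id}
    processed[idempotency_key] = response
    return response

first = reserve_room("saga-42-step-reserve", "room-7", "user-1")
again = reserve_room("saga-42-step-reserve", "room-7", "user-1")
assert first == again  # the saga can be restarted without double-booking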
SAGAS - Department of Computer Science Princeton University
Saga distributed transactions pattern
The Saga Pattern in a Reactive Microservices Environment
Enhancing Saga Pattern for Distributed Transactions within a Microservices Architecture
SAGA Pattern para microservices
Saga Pattern — Um resumo com Caso de Uso (Pt-Br)
Distributed Sagas: A Protocol for Coordinating Microservices
What is a Saga in Microservices?
Try-Confirm-Cancel (TCC) Protocol
Microservices Patterns: The Saga Pattern
Compensating Actions, Part of a Complete Breakfast with Sagas
Getting started with small-step operational semantics
Microserviços e o problema do Dual Write
Solving the Dual-Write Problem: Effective Strategies for Atomic Updates Across Systems
Outbox Pattern(Saga): Transações distribuídas com microservices
Saga Orchestration for Microservices Using the Outbox Pattern
Martin Kleppmann - Distributed Systems 7.1: Two-phase commit
Eli Bendersky 14/01/2025 | Source: Eli Bendersky's website
Automatic Differentiation (AD) is an important algorithm for calculating the derivatives of arbitrary functions that can be expressed by a computer program. One of my favorite CS papers is "Automatic differentiation in machine learning: a survey" by Baydin, Pearlmutter, Radul and Siskind (ADIMLAS from here on). While this post attempts to be useful on its own, it serves best as a followup to the ADIMLAS paper - so I strongly encourage you to read that first.
The main idea of AD is to treat a computation as a nested sequence of function compositions, and then calculate the derivative of the outputs w.r.t. the inputs using repeated applications of the chain rule. There are two methods of AD: forward mode and reverse mode.
Reverse mode AD is a generalization of the backpropagation technique used in training neural networks. While backpropagation starts from a single scalar output, reverse mode AD works for any number of function outputs. In this post I'm going to be describing how reverse mode AD works in detail.
While reading the ADIMLAS paper is strongly recommended but not required, there is one mandatory pre-requisite for this post: a good understanding of the chain rule of calculus, including its multivariate formulation. Please read my earlier post on the subject first if you're not familiar with it.
Let's start with a simple example where the computation is a linear chain of primitive operations: the Sigmoid function, S(x) = 1 / (1 + e^(-x)).
This is a basic Python implementation:
def sigmoid(x):
return 1 / (1 + math.exp(-x))
To apply the chain rule, we'll break down the calculation of S(x) into a sequence of function compositions: f(x) = -x, g(f) = e^f, w(g) = 1 + g, v(w) = 1/w, and S = v.
Take a moment to convince yourself that S(x) is equivalent to the composition v(w(g(f(x)))).
The same decomposition of sigmoid into primitives in Python would look as follows:
def sigmoid(x):
f = -x
g = math.exp(f)
w = 1 + g
v = 1 / w
return v
Yet another representation is this computational graph:
Each box (graph node) represents a primitive operation, together with the name assigned to its result (the green rectangle on the right of each box). Arrows (graph edges) represent the flow of values between operations.
Our goal is to find the derivative of S w.r.t. x at some point x0, denoted as dS/dx(x0). The process starts by running the computational graph forward with our value of x0. As an example, we'll use x0 = 0.5:
Since all the functions in this graph have a single input and a single output, it's sufficient to use the single-variable formulation of the chain rule.
To avoid confusion, let's switch notation so we can explicitly see which
derivatives are involved. For and as before, we can
write the derivatives like this:
Each of these is a function we can evaluate at some point; for example, we
denote the evaluation of at as .
So we can rewrite the chain rule like this:
Reverse mode AD means applying the chain rule to our computation graph, starting with the last operation and ending at the first. Remember that our final goal is to calculate:
Where S is a composition of multiple functions. The first composition we unravel is the last node in the graph, where v is calculated from w. This is the chain rule for it:
The formula for S is simply S = v, so its derivative dS/dv is 1. The formula for v is v = 1/w, so its derivative dv/dw is -1/w². Substituting the value of w computed in the forward pass, we get:
Continuing backwards from v to w:
We've already calculated in the previous step. Since , we know that , so:
Continuing similarly down the chain, until we get to the input x:
We're done; the value of the derivative of the sigmoid function at x0 = 0.5 is 0.24; this can be easily verified with a calculator using the analytical derivative of this function.
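For instance, a quick numeric check using the well-known closed-form derivative S'(x) = S(x)(1 - S(x)) (an identity not used in the post itself):

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

x0 = 0.5
s = sigmoid(x0)
print(f"{s * (1 - s):.2f}")  # analytical derivative S'(0.5) -> 0.24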
As you can see, this procedure is rather mechanical and it's not surprising that it can be automated. Before we get to automation, however, let's review the more common scenario where the computational graph is a DAG rather than a linear chain.
The sigmoid sample we worked through above has a very simple, linear computational graph. Each node has a single predecessor and a single successor; moreover, the function itself has a single input and single output. Therefore, the single-variable chain rule is sufficient here.
In the more general case, we'll encounter functions that have multiple inputs, may also have multiple outputs [1], and the internal nodes are connected in non-linear patterns. To compute their derivatives, we have to use the multivariate chain rule.
As a reminder, in the most general case we're dealing with a function that has n inputs, denoted x1, ..., xn, and m outputs, denoted f1, ..., fm. In other words, the function is mapping R^n → R^m.
The partial derivative of output i w.r.t. input j at some point a is:
Assuming f is differentiable at a, then the complete derivative of f w.r.t. its inputs can be represented by the Jacobian matrix:
The multivariate chain rule then states that if we compose f(g(x)) (and assuming all the dimensions are correct), the derivative is:
D(f∘g)(a) = Df(g(a)) · Dg(a)
This is the matrix multiplication of the Jacobian of f (evaluated at g(a)) and the Jacobian of g (evaluated at a).
As a warmup, let's start with a linear node that has a single input and a single output:
In all these examples, we assume the full graph output is S, and its derivative w.r.t. the node's output f, ∂S/∂f, is already available. We're then interested in finding ∂S/∂x. Since f maps a single input to a single output, the Jacobian is just a scalar:
And the chain rule is:
No surprises so far - this is just the single variable chain rule!
Let's move on to the next scenario, where f has two inputs:
Once again, we already have the derivative ∂S/∂f available, and we're interested in finding the derivative of S w.r.t. the inputs.
In this case, f maps two inputs to a single output (f: R² → R), so the Jacobian is a 1x2 matrix:
And the chain rule here means multiplying a 1x1 matrix by a 1x2 matrix:
Therefore, we see that the output derivative propagates to each input separately:
In the most general case, f may have multiple inputs but its output may also be used by more than one other node. As a concrete example, here's a node with three inputs and an output that's used in two places:
While we denote each output edge from f with a different name, f has a single output! This point is a bit subtle and important to dwell on: yes, f has a single output, so in the forward calculation both f1 and f2 will have the same value. However, we have to treat them differently for the derivative calculation, because it's very possible that ∂S/∂f1 and ∂S/∂f2 are different!
In other words, we're reusing the machinery of multi-output functions here. If f had multiple outputs (e.g. a vector function), everything would work exactly the same.
In this case, since we treat f as a function from three inputs to two outputs (f: R³ → R²), its Jacobian is a 2x3 matrix:
The Jacobian of S w.r.t. the two output copies is a 1x2 matrix:
Applying the chain rule:
Therefore, we have:
The key point here - which we haven't encountered before - is that the derivatives through f add up for each of its outputs (or for each copy of its output). Qualitatively, it means that the sensitivity of f's input to the output is the sum of its sensitivities across each output separately. This makes logical sense, and mathematically it's just the consequence of the dot product inherent in matrix multiplication.
Now that we understand how reverse mode AD works for the more general case of DAG nodes, let's work through a complete example.
Consider this function (a sample used in the ADIMLAS paper): f(x1, x2) = ln(x1) + x1·x2 - sin(x2).
It has two inputs and a single output; once we decompose it to primitive operations, we can represent it with the following computational graph [2]:
As before, we begin by running the computation forward for the values of x1 and x2 at which we're interested to find the derivative. Let's take x1 = 2 and x2 = 5:
Recall that our goal is to calculate ∂f/∂x1 and ∂f/∂x2. Initially we know that ∂f/∂f = 1 [3].
Starting with the v5 node, let's use the fan-in formulas developed earlier:
Next, let's tackle v4. It also has a fan-in configuration, so we'll use similar formulas, plugging in the value of ∂f/∂v4 we've just calculated:
On to v1. It's a simple linear node, so:
Note the (1) superscript though! Since x1 is a fan-out node, it will have more than one contribution to its derivative; we've just computed the one from v1. Next, let's compute the one from v2. That's another fan-in node:
We've calculated the other contribution to the ∂f/∂x1 derivative, and the first out of two contributions for the ∂f/∂x2 derivative. Next, let's handle v3:
Finally, we're ready to add up the derivative contributions for the input arguments. x1 is a "fan-out" node, with two outputs. Recall from the section above that we just sum their contributions:
And, for x2:
And we're done! Once again, it's easy to verify - using a calculator and the analytical derivatives of f - that these are the right derivatives at the given points.
A quick note on reverse mode AD vs forward mode (please read the ADIMLAS paper for much more details):
Reverse mode AD is the approach commonly used for machine learning and neural networks, because these tend to have a scalar loss (or error) output that we want to minimize. In reverse mode, we have to run AD once per output, while in forward mode we'd have to run it once per input. Therefore, when the input size is much larger than the output size (as is the case in NNs), reverse mode is preferable.
There's another advantage, and it relates to the term vector-jacobian product (VJP) that you will definitely run into once you start digging deeper in this domain.
The VJP is basically a fancy way of saying "using the chain rule in reverse mode AD". Recall that in the most general case, the multivariate chain rule is:
However, in the case of reverse mode AD, we typically have a single output from the full graph, so the derivative of S w.r.t. the node's outputs is a row vector. The chain rule then means multiplying this row vector by a matrix representing the node's jacobian. This is the vector-jacobian product, and its output is another row vector. Scroll back to the Fan-out sample to see an example of this.
This may not seem very profound so far, but it carries an important meaning in terms of computational efficiency. For each node in the graph, we don't have to store its complete jacobian; all we need is a function that takes a row vector and produces the VJP. This is important because jacobians can be very large and very sparse [4]. In practice, this means that when AD libraries define the derivative of a computation node, they don't ask you to register a complete jacobian for each operation, but rather a VJP.
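As a rough numeric illustration (using numpy, which the post's toy implementation doesn't use), here is a hypothetical elementwise-multiply node whose VJP is computed without ever materializing its full, sparse Jacobian:

import numpy as np

# Hypothetical node: y = x * w (elementwise), with x and w vectors of length n.
# Its full Jacobian w.r.t. x is an n x n diagonal matrix - large and sparse -
# but the VJP only needs the diagonal, so we never build that matrix.
def node_forward(x, w):
    return x * w

def node_vjp_wrt_x(v, x, w):
    # v is the row vector dS/dy coming from downstream; dy_i/dx_i = w_i.
    return v * w

n = 5
x, w = np.arange(1.0, n + 1), np.full(n, 2.0)
y = node_forward(x, w)               # forward pass: [2., 4., 6., 8., 10.]
v = np.ones(n)                       # pretend upstream derivative dS/dy
full_jacobian = np.diag(w)           # what we avoid storing in practice
assert np.allclose(node_vjp_wrt_x(v, x, w), v @ full_jacobian)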
This also provides an additional way to think about the relative efficiency of reverse mode AD for ML applications; since a graph typically has many inputs (all the weights), and a single output (scalar loss), accumulating from the end going backwards means the intermediate products are VJPs that are row vectors; accumulating from the front would mean multiplying full jacobians together, and the intermediate results would be matrices [5].
Enough equations, let's see some code! The whole point of AD is that it's automatic, meaning that it's simple to implement in a program. What follows is the simplest implementation I could think of; it requires one to build expressions out of a special type, which can then calculate gradients automatically.
Let's start with some usage samples; here's the Sigmoid calculation presented earlier:
xx = Var(0.5)
sigmoid = 1 / (1 + exp(-xx))
print(f"xx = {xx.v:.2}, sigmoid = {sigmoid.v:.2}")
sigmoid.grad(1.0)
print(f"dsigmoid/dxx = {xx.gv:.2}")
We begin by building the Sigmoid expression using Var values (more on this later). We can then run the grad method on a Var, with an output gradient of 1.0 and see that the gradient for xx is 0.24, as calculated before.
Here's the expression we used for the DAG section:
x1 = Var(2.0)
x2 = Var(5.0)
f = log(x1) + x1 * x2 - sin(x2)
print(f"x1 = {x1.v:.2}, x2 = {x2.v:.2}, f = {f.v:.2}")
f.grad(1.0)
print(f"df/dx1 = {x1.gv:.2}, df/dx2 = {x2.gv:.2}")
Once again, we build up the expression, then call grad on the final value. It will populate the gv attributes of input Vars with the derivatives calculated w.r.t. these inputs.
Let's see how Var works. The high-level overview is:
Here's the Var class:
class Var:
def __init__(self, v):
self.v = v
self.predecessors = []
self.gv = 0.0
v is the value (forward calculation) of this Var. predecessors is the list of predecessors, each of this type:
@dataclass
class Predecessor:
multiplier: float
var: "Var"
Consider the v5 node in the DAG sample, for example. It represents the calculation v4-v3. The Var representing v5 will have a list of two predecessors, one for v4 and one for v3. Each of these will have a "multiplier" associated with it: 1.0 for v4 and -1.0 for v3, matching the derivatives of v4-v3 w.r.t. each input.
Let's see some overloaded operators of Var [6]:
def __add__(self, other):
other = ensure_var(other)
out = Var(self.v + other.v)
out.predecessors.append(Predecessor(1.0, self))
out.predecessors.append(Predecessor(1.0, other))
return out
# ...
def __mul__(self, other):
other = ensure_var(other)
out = Var(self.v * other.v)
out.predecessors.append(Predecessor(other.v, self))
out.predecessors.append(Predecessor(self.v, other))
return out
And some of the custom math functions:
def log(x):
"""log(x) - natural logarithm of x"""
x = ensure_var(x)
out = Var(math.log(x.v))
out.predecessors.append(Predecessor(1.0 / x.v, x))
return out
def sin(x):
"""sin(x)"""
x = ensure_var(x)
out = Var(math.sin(x.v))
out.predecessors.append(Predecessor(math.cos(x.v), x))
return out
Note how the multipliers for each node are exactly the derivatives of its output w.r.t. corresponding input. Notice also that in some cases we use the forward calculated value of a Var's inputs to calculate this derivative (e.g. in the case of sin(x), the derivative is cos(x), so we need the actual value of x).
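The operators above also rely on ensure_var (see footnote [6]); going by that description, it is presumably little more than:

def ensure_var(v):
    """Wrap constants in a Var so they become part of the computational graph."""
    return v if isinstance(v, Var) else Var(v)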
Finally, this is the grad method:
def grad(self, gv):
self.gv += gv
for p in self.predecessors:
p.var.grad(p.multiplier * gv)
Some notes about this method:
The full code for this sample is available here.
The goal of this post is to serve as a supplement for the ADIMLAS paper; once again, if the topic of AD is interesting to you, I strongly encourage you to read the paper! I hope this post added something on top - please let me know if you have any questions.
Industrial strength implementations of AD, like autograd and JAX, have much better ergonomics and performance than the toy implementation shown above. That said, the underlying principles are similar - reverse mode AD on computational graphs. To explore how such a system works, see my radgrad project.
[1] | In this post we're only looking at single-output graphs, however, since these are typically sufficient in machine learning (the output is some scalar "loss" or "error" that we're trying to minimize). That said, for functions with multiple outputs the process is very similar - we just have to run the reverse mode AD process for each output variable separately. |
[2] | Note that the notation here is a bit different from the one used for the sigmoid function. This notation is adopted from the ADIMLAS paper, which uses for all temporary values within the graph. I'm keeping the notations different to emphasize they have absolutely no bearing on the math and the AD algorithm. They're just a naming convention. |
[3] | For consistency, I'll be using the partial derivative notation throughout this example, even for nodes that have a single input and output. |
[4] | For an example of gigantic, sparse jacobians see my older post on backpropagation through a fully connected layer. |
[5] | There are a lot of additional nuances here to explain; I strongly recommend this excellent lecture by Matthew Johnson (of JAX and autograd fame) for a deeper overview. |
[6] | These use the utility function ensure_var; all it does is wrap its argument in a Var if it's not already a Var. This is needed to wrap constants in the expression, to ensure that the computational graph includes everything. |
Eryn Wells 14/01/2025 | Source: Eryn Rachel Wells
Today is my birthday! 🥳
37 is prime, and also a sexy prime. It has the interesting property of being the number most likely offered when you ask someone to pick a number between 1 and 100 [1]. Thirty-seven is neat!
Dennis is 37. He’s not old!
Eta Boötes, 37 lightyears away, is now in my light cone. NGC 2169 is an open star cluster in Orion that looks remarkably like the number 37.
thirty-seven.org is a website dedicated to the collection of artifacts marked with the number 37.
This Veritasium video is a cool roundup of the interesting things about 37.
[1] When you control for 69 and 42, which have certain cultural significance.