April 2, 2026

The Tools We Trust: Slack, Seneca, and the Quiet Coup of Autonomous Software

Seneca kept a household staff of considerable size, which bothered him. Not because the labor was expensive, though it was. What gnawed at the philosopher was simpler and harder to dismiss: he could not always tell whether his servants were executing his intentions or pursuing their own. The more capable they became, the less he could verify their work without doing it himself, which defeated the entire purpose of having them. He wrote about this tension in his forty-seventh letter. It has not aged a day.

This afternoon, Salesforce dropped what Marc Benioff is calling the most significant update to Slack since the acquisition five years ago. Thirty new AI features, all at once. The Slackbot is no longer a search bar with delusions of competence. It is now, in Salesforce's framing, an agentic assistant built on Anthropic's Claude that can transcribe meetings, monitor deals across your CRM pipeline, draft follow-up communications based on conversational context, and execute multi-step workflows without waiting for you to hold its hand through each one. A million businesses use Slack daily. By tomorrow morning, every one of those businesses will be running an autonomous reasoning engine inside their primary communication channel whether they asked for it or not.

I spent twenty minutes reading through the feature documentation. The thing that struck me was not any single capability. Individually, each feature is a logical increment. Transcription. Summarization. Contextual drafting. We have seen these before. What struck me was the aggregate. When you bolt thirty incremental capabilities onto a tool that already sits at the center of organizational communication, you have not upgraded a chatbot. You have introduced a new employee. One that never sleeps, never forgets, and whose judgment you cannot interrogate because the reasoning happens inside a neural network that Anthropic built and Salesforce licensed. That is a fundamentally different kind of trust relationship than anything a workplace communication tool has ever demanded.

The Context Engineering Problem Nobody Is Discussing

Buried inside the press materials is a phrase that deserves more scrutiny than it is getting. Salesforce says the quality of Slackbot's responses depends on what they call "context engineering": the process of determining which information gets fed into the model's prompt window for any given query. This is not a minor implementation detail. This is the entire game.

Here is what I mean. Suppose you ask the new Slackbot to summarize the status of a deal. The model needs context to generate a useful answer. Which Slack channels does it read? Which CRM fields does it pull? Which email threads does it consider relevant? Those decisions are made by the context engineering layer, not by the language model, and not by you. You see the output. You do not see the input selection. And the quality of the output is entirely determined by whether the input selection was appropriate, which you have no practical way of verifying without doing the research yourself.
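The input-selection step described above can be made concrete with a toy sketch. This is not Salesforce's implementation, which the press materials do not describe. Every name here (`Source`, `score`, `select_context`, the origin labels, the word budget) is a hypothetical stand-in, and the keyword-overlap scoring is deliberately crude. The point is only to show what it would look like for a selection layer to report what it left out alongside what it kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One candidate piece of context, e.g. a channel message or a CRM field."""
    origin: str  # hypothetical label such as "slack:#deal-acme"
    text: str


def score(query: str, source: Source) -> int:
    """Toy relevance score: number of distinct query words found in the source."""
    query_words = set(query.lower().split())
    source_words = set(source.text.lower().split())
    return len(query_words & source_words)


def select_context(query: str, sources: list[Source], budget_words: int):
    """Greedily keep the highest-scoring sources that fit the word budget.

    Returns (selected, omitted) so the caller can inspect what was left out.
    """
    ranked = sorted(sources, key=lambda s: score(query, s), reverse=True)
    selected, omitted, used = [], [], 0
    for source in ranked:
        cost = len(source.text.split())
        if score(query, source) > 0 and used + cost <= budget_words:
            selected.append(source)
            used += cost
        else:
            omitted.append(source)
    return selected, omitted


# Illustrative data only; none of this comes from a real workspace.
sources = [
    Source("slack:#deal-acme", "acme deal status legal review pending"),
    Source("crm:acme.stage", "acme deal stage negotiation"),
    Source("email:thread-17", "acme deal blocked customer wants to cancel contract"),
    Source("slack:#random", "lunch order for friday"),
]

selected, omitted = select_context("acme deal status", sources, budget_words=10)
print("used:   ", [s.origin for s in selected])
print("omitted:", [s.origin for s in omitted])
```

In this made-up example the email saying the customer wants to cancel is relevant but does not fit the budget, so a summary built from the selected sources would describe a deal in negotiation. A user who sees only the summary cannot tell. A user who also sees the omitted list at least knows where to look.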

Seneca would have recognized this instantly. He wrote extensively about the danger of delegating judgment rather than merely delegating labor. A servant who carries your letter to the Senate is performing a task. A servant who decides which letters are worth carrying is exercising judgment on your behalf, and the moment you stop checking which letters he left behind, you have ceded a portion of your authority that you may not get back. The new Slackbot does not carry your letters. It reads your letters, decides which ones matter, and writes your replies. The productivity gains are real. So is the transfer of epistemic authority from the human to the system.

Meanwhile, the Machines Learn to Use Other Machines

This would be unsettling enough in isolation. But it lands in a week where GPT-5.4's autonomous workflow capabilities have started showing up in production environments, and the results are forcing a reckoning that the industry has been putting off.

The number that matters: 75% on the OSWorld benchmark. That benchmark simulates real desktop productivity tasks. Navigate a spreadsheet. Fill out a form. Complete a multi-step workflow across three applications. Human experts score 72.4%. GPT-5.4 beats them. Not by gaming the test. By actually operating the computer. Clicking buttons. Reading screens. Making decisions about which application to open next based on the state of the task.

I want to sit with that for a moment because I think we are moving past it too quickly. A language model is now better than a trained human professional at using a computer to do office work. Not better at chess, which machines conquered decades ago. Not better at Go, which is a bounded game. Better at the unbounded, messy, context-dependent work of navigating real software environments. The thing that most knowledge workers do eight hours a day, five days a week. That thing.

Seneca had a line that keeps coming back to me whenever I read these benchmarks. "It is not that we have a short time to live, but that we waste a great deal of it." He meant it as a moral observation. In 2026 it reads more like a market analysis. If the routine portions of knowledge work can be automated, and the benchmarks now say they can, then the question every professional needs to answer is not whether their job is safe. It is whether the parts of their job that justify their salary are the routine parts or the non-routine parts. Because the routine parts just got a replacement that works nights and weekends for the cost of an API call.

The Treasury Wakes Up

While Silicon Valley was shipping features, Washington was doing something that will matter more in the long run, though it got approximately one percent of the attention. The Treasury Department, through something called the Artificial Intelligence Transformation Office, launched what they are calling the AI Innovation Series. The stated purpose is to "support the continued strength and resilience of the U.S. financial system in an era of accelerating technological change." The unstated purpose is more direct. The federal government is trying to figure out what happens to financial stability when autonomous AI systems start making economically significant decisions at scale.

This is not paranoia. This is pattern recognition. When Slackbot can draft contract responses, when GPT-5.4 can navigate enterprise software autonomously, when these systems are deployed across millions of businesses simultaneously, the aggregate economic effect is not a productivity improvement. It is a structural shift in how economic decisions get made. The Treasury Department is staffed by people who remember what happened the last time a structural shift in decision-making technology outpaced the regulatory framework. They called it 2008. The instruments were different. The dynamic was the same. Complexity moved faster than oversight.

Seneca spent his later years watching Rome's financial apparatus grow beyond the comprehension of the people nominally in charge of it. The provincial governors could not audit the tax farmers. The tax farmers could not audit the merchants. Everyone was getting rich and nobody could explain exactly how. He noted, with the dry precision that made his letters survive two millennia, that prosperity built on systems nobody understands is not prosperity. It is a postponed reckoning.

Four People Left the Planet Today

At 6:35 p.m. Eastern time, roughly ninety minutes after Salesforce finished announcing its thirty features, a rocket carrying four human beings left the surface of the Earth bound for the vicinity of the Moon. Artemis II. The first crewed mission beyond low Earth orbit since Gene Cernan climbed back into the Apollo 17 lunar module in December 1972. Fifty-three years. Victor Glover, Christina Koch, Reid Wiseman, Jeremy Hansen. They will travel 252,000 miles from this planet, farther than any human being has ever been.

I find it clarifying, this juxtaposition. On the same afternoon, one company taught a chatbot to read your email and draft your replies, while another organization strapped four people to a controlled explosion and pointed them at a rock a quarter million miles away. Both represent the pinnacle of what our species can do when it commits resources and talent to a goal. But they sit on opposite ends of a spectrum that I think defines our current moment.

The AI announcements are about efficiency. Doing more with less. Replacing human labor with automated systems. Getting the quarterly numbers to move in the right direction. These are legitimate goals, and I have spent my career building the infrastructure that makes them possible. But efficiency is a means, not an end. It answers the question of how. It does not touch the question of what for.

Artemis II answers the what-for question. Not perfectly, not completely, but it gestures in a direction that no quarterly earnings call ever will. Four people are falling toward the Moon right now because a civilization decided, after half a century of finding reasons not to, that going back was worth the expense, the risk, and the terrifying vulnerability of putting human bodies in a vacuum where nothing about the environment supports human life. That decision has no business case. It has no ROI. It has something more durable. It has the weight of a species looking at itself and deciding that it has not yet finished becoming what it is capable of becoming.

The Question That Remains

I do not know whether Salesforce's thirty features will make knowledge work better or simply make it faster, which is not the same thing. I do not know whether GPT-5.4's ability to operate a computer will liberate professionals from drudgery or gradually convince organizations that professionals are the drudgery. I do not know whether the Treasury's new initiative will produce governance frameworks that keep pace with deployment, or whether it will produce reports that arrive six months after the systems they describe have already been replaced by newer ones.

What I do know, and this is Seneca's contribution to an afternoon otherwise dominated by press releases, is that the question of what to delegate is never purely technical. It is moral. Every decision to hand a task to an autonomous system is a decision about what you consider important enough to do yourself. And the accumulation of those decisions, across millions of users and thousands of organizations and an entire economy restructuring itself around AI capabilities that did not exist eighteen months ago, will define what human work means for the next generation.

Seneca's household staff eventually grew so large and so capable that he could not function without them. He knew this. He wrote about it with the unflinching honesty that made his philosophy worth reading two thousand years later. "We suffer more in imagination than in reality," he said, but he also understood that some sufferings are not imaginary. Some are structural. You build a dependency. The dependency becomes load-bearing. And then you discover that the foundation of your daily life rests on systems whose internal logic you chose, at some point, to stop examining.

The Slackbot is very good now. The question is not whether to use it. The question, the one Seneca would ask if he were sitting in your Monday standup, is whether you still know how to do the things you are about to let it do for you. Because the day you cannot answer that question is the day the tool stops being a tool and starts being a dependency. And dependencies, as Rome discovered, do not send advance notice before they become liabilities.

Four people are on their way to the Moon. Down here, the rest of us are teaching our software to think. Both endeavors require trust. Only one of them requires us to remain capable of doing the work ourselves if the systems fail. I would pay attention to which one that is.
