Deere's Physical-World Data Loop: See & Spray and the Berkshire Test

A Sprayer Boom in an Iowa Field

Picture a Deere sprayer running a soybean field in late June. The boom is 120 feet wide, pulled by a tractor at fifteen miles per hour. Mounted along the boom is a row of cameras and the GPU compute that reads them, each camera covering a strip of ground a few feet wide. The system is looking for green-on-green: weeds growing inside the canopy of the crop the farmer planted. The compute decides which of the boom's individual nozzles to open and which to leave shut, in roughly a hundredth of a second per frame. A nozzle opens over a pigweed; the next nozzle, eighteen inches further along the boom, stays closed because the patch below it is bare soil. The farmer, watching the as-applied map on the cab screen, sees chemical consumption come in at a fraction of a broadcast pass.

That is John Deere's See & Spray, and it is the cleanest example I have found of AI applied at the bottleneck of a heavy industrial operation. The economics that matter for a row-crop farmer in 2026 are dollars-per-acre at the spraying step and bushels-per-acre at harvest. See & Spray attacks the first one directly. Every pass labels its own training data, because each nozzle decision is paired with a downstream observation: did the weed die, did the crop survive, did the herbicide tank empty at the rate the model predicted.

Where the Money Lives in a Row-Crop Field

The bottleneck in row-crop agriculture is herbicide cost per acre at the spraying step, and the yield it produces at harvest. A farmer's revenue per acre is set by yield times price; price is mostly out of the farmer's hands. On the cost side, chemical inputs sit near the top of the line items after seed and machinery depreciation. Herbicide is the input where the technology has not changed in twenty years: a broadcast pattern across the whole field, chemical hitting crop and weeds and bare ground indifferently. The farmer pays for the gallons that went on the ground regardless of whether each gallon hit something the chemical needed to kill.

In Goldratt's vocabulary, the spraying step is where the next dollar of operating cost stops producing throughput. Broadcasting herbicide across an acre that is 95% bare soil and 5% weeds is exactly that kind of waste: all of the chemical gets consumed, but only about 5% of the spray volume reaches a weed. The boom is the bottleneck. Anything that does not improve what happens at the boom is the AI-era version of the optimize-everywhere reflex Goldratt warned against.
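The arithmetic behind the 95%/5% example can be made concrete. The sketch below compares herbicide cost per acre for a broadcast pass against a targeted pass. The application rate, the price per gallon, and the overspray buffer are placeholder assumptions chosen for illustration, not Deere figures or agronomic recommendations.

```python
def spray_cost_per_acre(rate_gal_per_acre, price_per_gal, treated_fraction):
    """Cost of the herbicide mix actually discharged on one acre."""
    return rate_gal_per_acre * price_per_gal * treated_fraction


# Hypothetical inputs, for illustration only.
RATE_GAL_PER_ACRE = 15.0   # spray mix applied per acre at full broadcast
PRICE_PER_GAL = 2.0        # dollars per gallon of mix
WEED_FRACTION = 0.05       # share of the acre that is actually weeds
BUFFER = 3.0               # targeted pass treats 3x the weed area as margin

# A broadcast pass treats the whole acre.
broadcast_cost = spray_cost_per_acre(RATE_GAL_PER_ACRE, PRICE_PER_GAL, 1.0)

# A targeted pass treats the weed area plus a safety buffer, capped at 100%.
targeted_fraction = min(1.0, WEED_FRACTION * BUFFER)
targeted_cost = spray_cost_per_acre(RATE_GAL_PER_ACRE, PRICE_PER_GAL, targeted_fraction)

savings = 1.0 - targeted_cost / broadcast_cost

print(f"broadcast: ${broadcast_cost:.2f}/acre")
print(f"targeted:  ${targeted_cost:.2f}/acre")
print(f"savings:   {savings:.0%}")
```

Under these assumed numbers the targeted pass uses well under a fifth of the broadcast volume. The real figure depends on weed pressure and on how wide a margin the nozzle pattern needs around each weed.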

The location matters for one more reason. Yield at harvest is a downstream label on every decision the sprayer made roughly four months earlier. A weed that survived the spray competes with the crop for water, nitrogen, and light through the back half of the growing season, and shows up as a yield reduction at the combine. The combine, with its own yield-monitoring instrumentation, labels each square meter the sprayer treated. The two axes are coupled at the level of physics, not at the level of software.

How See & Spray Works

See & Spray is the lever. The system places a computer-vision model at the boom, pairs it with a per-nozzle solenoid that can open or close in milliseconds, and discharges herbicide only on the spots where the model has identified a weed. Deere has described the product in successive earnings calls and in its SEC filings; the Investor Relations page is where the most recent quarterly disclosures live. Technically, the architecture is a vision model running on the sprayer itself, with a control loop into the hardware. Operationally, it is a fleet of sprayers, each one a labeled-data factory running at fifteen miles per hour across the row-crop acreage of North America.

A second lever sits in the background. The John Deere Operations Center is the cloud-side counterpart to the in-field equipment. It aggregates machine telemetry, agronomic data, prescription maps, and yield results from connected machines across hundreds of millions of acres. It is where a particular weed's spray decision in June gets joined with that same square meter's yield reading in October. The vision model at the boom is the visible end of the loop. The Operations Center is the part that turns each pass into next year's training data.
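The join described here, a June spray decision matched to the same patch's October yield reading, can be sketched as a grid-cell lookup. This is a minimal illustration and not the Operations Center's actual pipeline: the function names, the record layouts, and the use of local field coordinates in meters are all assumptions.

```python
def cell_key(easting_m, northing_m, cell_size_m=1.0):
    """Snap a position (meters, local field frame) to a grid cell index."""
    return (int(easting_m // cell_size_m), int(northing_m // cell_size_m))


def join_spray_to_yield(spray_log, yield_map, cell_size_m=1.0):
    """Pair each spray decision with the mean yield observed in its cell.

    spray_log: iterable of (easting_m, northing_m, sprayed: bool)
    yield_map: iterable of (easting_m, northing_m, yield_bu_per_acre)
    Spray records with no yield reading in their cell are dropped.
    """
    yield_by_cell = {}
    for easting, northing, yield_value in yield_map:
        key = cell_key(easting, northing, cell_size_m)
        yield_by_cell.setdefault(key, []).append(yield_value)

    labeled = []
    for easting, northing, sprayed in spray_log:
        key = cell_key(easting, northing, cell_size_m)
        readings = yield_by_cell.get(key)
        if readings:
            labeled.append({
                "cell": key,
                "sprayed": sprayed,
                "yield": sum(readings) / len(readings),
            })
    return labeled


# Hypothetical records: three spray decisions, three combine readings.
spray_log = [(10.2, 4.7, True), (11.6, 4.1, False), (30.0, 9.0, True)]
yield_map = [(10.8, 4.3, 58.0), (10.1, 4.9, 62.0), (11.2, 4.5, 61.0)]

labeled = join_spray_to_yield(spray_log, yield_map)
for row in labeled:
    print(row)
```

The third spray record has no matching yield reading and is dropped, which is the honest default: a decision without an outcome is not yet a training example.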

The model at the boom makes a perceptual decision, many times per second per camera, about whether the pixels in front of it are crop or weed. The strategic decisions about rotation, seed varieties, and herbicide programs are made by the farmer and the agronomist. The AI's job is narrower and more load-bearing: a perceptual layer that lets the sprayer act on individual weeds rather than on the average over the field. The narrowness is the point.
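To make the narrowness concrete, the per-frame decision can be sketched as a threshold over per-strip model outputs. The `StripReading` type, the `nozzle_commands` function, and the 0.5 threshold are hypothetical names and values for illustration; the source does not describe Deere's actual control logic at this level.

```python
from dataclasses import dataclass


@dataclass
class StripReading:
    """Model output for the strip of ground under one nozzle (hypothetical)."""
    nozzle_id: int
    weed_probability: float


def nozzle_commands(readings, threshold=0.5):
    """Return {nozzle_id: True to open, False to stay shut} for one frame."""
    return {r.nozzle_id: r.weed_probability >= threshold for r in readings}


# One frame covering three adjacent nozzles: weed, bare soil, weed.
frame = [
    StripReading(nozzle_id=0, weed_probability=0.92),
    StripReading(nozzle_id=1, weed_probability=0.03),
    StripReading(nozzle_id=2, weed_probability=0.61),
]

commands = nozzle_commands(frame)
open_fraction = sum(commands.values()) / len(commands)

print(commands)
print(f"{open_fraction:.0%} of nozzles open on this frame")
```

The threshold is the one agronomic dial in this sketch: lowering it trades more chemical for fewer missed weeds, a choice that stays with the farmer and the agronomist rather than with the model.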

Condition	Score (0–4)	Evidence sentence
Proprietary data origin	4	The vision corpus is generated by Deere's equipment fleet on customers' fields and joined to outcome data only Deere observes via the Operations Center; no third party can buy the same corpus.
Self-labeling workflow	4	Every spray pass produces paired image-and-outcome data: each nozzle decision is labeled by the downstream combine yield map and by the spray-tank consumption telemetry.
Decreasing marginal cost	3	Once the in-cab vision infrastructure, the per-nozzle solenoid retrofit, and the Operations Center pipeline are paid for, each additional connected acre adds to the corpus at near-zero marginal infrastructure cost.
Defensible asymmetry	4	The embedded equipment fleet is the moat; a vision-only competitor without the sprayer install base cannot label its own data, and the sprayer-install-base barrier compounds over equipment-replacement cycles measured in years.

The Four Conditions

Condition 1 — Proprietary data origin. The vision corpus is generated by Deere equipment, on Deere customers' fields, joined to outcome data those customers' machines transmitted to Deere's Operations Center. James Currier's analysis of data network effects names the test directly: you are where the data is generated. Deere is. There is no third-party dataset of in-canopy weed images at the per-square-meter resolution See & Spray produces, joined to the yield label from the same square meter four months later, that a competitor could license. The corpus does not live on the open internet, cannot be purchased from a satellite-imagery vendor, and exists only as the residue of Deere's customers running Deere's equipment over Deere's connected acreage. The provenance is the moat.

Condition 2 — Self-labeling workflow. Each spray pass produces a per-nozzle decision log, geo-tagged to a square meter of field. The downstream combine, running over the same square meter at harvest, produces a yield reading that labels the sprayer's decisions made roughly four months earlier. The chemical-consumption telemetry from the spray tank is a second label, available in near real time: the model's prediction of how many ounces would be discharged on this pass should match what the tank actually emptied, and the residual is itself a signal. This is the AI factory pattern Iansiti and Lakhani named in Competing in the Age of AI, instantiated in steel and fiberglass. There is no human in the loop hand-labeling weed-versus-crop pixels at scale; the operation labels itself.
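The tank-consumption label can be sketched as a simple residual check: predicted discharge from the nozzle log against the measured drop in the tank. The function names, the constant flow rate per nozzle, and the 10% tolerance are assumptions for illustration, not documented Deere behavior.

```python
def tank_residual(nozzle_open_seconds, flow_oz_per_second, tank_drop_oz):
    """Compare predicted discharge with what the tank actually emptied.

    nozzle_open_seconds: total open time per nozzle over the pass
    flow_oz_per_second: assumed constant flow through an open nozzle
    tank_drop_oz: measured drop in tank contents over the same pass
    Returns (predicted_oz, residual_oz); a positive residual means the tank
    emptied faster than the decision log predicts.
    """
    predicted_oz = sum(nozzle_open_seconds) * flow_oz_per_second
    return predicted_oz, tank_drop_oz - predicted_oz


def needs_review(predicted_oz, residual_oz, tolerance=0.10):
    """Flag a pass whose residual exceeds a fraction of the prediction."""
    if predicted_oz == 0:
        return residual_oz != 0
    return abs(residual_oz) / predicted_oz > tolerance


# Hypothetical pass: four nozzles, one of which never opened.
open_seconds = [12.0, 0.0, 7.5, 20.5]
predicted, residual = tank_residual(open_seconds, flow_oz_per_second=0.5,
                                    tank_drop_oz=23.0)
flagged = needs_review(predicted, residual)

print(f"predicted {predicted:.1f} oz, residual {residual:+.1f} oz, flagged={flagged}")
```

A flagged pass does not say what went wrong. The gap could come from a leaking nozzle, a miscalibrated flow rate, or a logging error, which is why the residual is a signal to investigate rather than a label to train on directly.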

Condition 3 — Decreasing marginal cost per cycle. The infrastructure that makes See & Spray work is paid for once per piece of equipment: camera array, per-nozzle solenoid retrofit, in-cab compute, cellular telemetry, Operations Center ingestion pipeline. After that capital is sunk, each additional acre that the machine crosses adds to the training corpus at a marginal cost approaching the cost of the diesel that moved the tractor. The J-curve Brynjolfsson, Rock and Syverson described in The Productivity J-Curve is the shape of the Deere build-out: heavy intangible investment up front, suppressing measured productivity in the install years, followed by a long tail of compounding return as the install base grows. I score this condition at 3 rather than 4 because the diesel and the equipment-replacement cycle remain a non-trivial floor on the marginal cost per acre.

Condition 4 — Defensible asymmetry. The asymmetry is the embedded equipment fleet. A pure-software competitor with a vision model and no install base cannot reach the same corpus, because the corpus is generated by the act of spraying, and the act of spraying is done by the equipment. A competitor with a different equipment line (a CNH Industrial, an AGCO) has conceptual access to the loop but has to build out a comparable install base of sprayers. They have to retrofit those sprayers with cameras and solenoids. They have to connect them to a cloud backbone of comparable maturity. And they have to do all of that on the replacement cycle of agricultural equipment, which is measured in years, not quarters. That cycle is itself the moat. This is Carlota Perez's deployment-phase pattern: the incumbent with the embedded operational footprint wins by out-positioning the entrant on the install base the model needs to do its work.

The skeptical case worth naming is the one Casado and Lauten lay out in The Empty Promise of Data Moats. Most claimed data moats plateau within a quarter or two: more data stops helping past the point where the model has learned the easy patterns. Casado and Lauten are correct about the general case. The Deere loop survives the critique because the corpus is not just images; it is images joined to outcomes joined to a specific square meter at a specific moment in the growing season. Hand a competitor ten million weed images scraped from the internet and they will train a vision model that plateaus in a quarter. Hand them ten million in-canopy images each joined to a yield reading four months later and they have something that does not plateau.

The Easier Wrong Choice: AgriGPT

The wrong place. Imagine a Deere that, in the early 2020s, had spent its AI budget on a more obviously fashionable bet. Call it AgriGPT: a farmer-facing chatbot, deployed through the John Deere Operations Center mobile app, that lets a grower ask questions in natural language about weather, agronomy, equipment status, and prescription planning. The product would have been a fine demo. It would have generated press coverage. To a board reviewing the 2023 capital allocation, it would have looked like the obvious AI move for an industrial company entering the LLM era.

Why it would have looked attractive. The early signals would have been encouraging. Adoption metrics in a pilot rollout would probably have looked strong for the first two quarters; farmers like answers, and a chatbot trained on Deere documentation, agronomic literature, and equipment manuals would have delivered useful ones. The internal team would have shown a deck of farmer testimonials, screenshots of helpful answers, and a usage curve trending up and to the right. Cost-savings projections from reduced call-center load would have been straightforward to model. The board would have approved a second round of investment.

The failure mechanics. The failure would have arrived in the eighteen-to-twenty-four-month window. Climate Corporation's FieldView, Granular, and a half-dozen agritech startups would have shipped similar farmer-facing chatbots within a year, on the same commodity LLM capabilities Deere was using. Luke Sernau's argument in We Have No Moat, And Neither Does OpenAI lands with full force here: a capability built on commodity infrastructure is available to everyone who buys the same infrastructure, and first-mover advantage decays in months rather than years. The chatbot would have become table stakes across the industry by 2025. None of the conversations would have labeled anything that fed back into Deere's equipment. Herbicide cost per acre at the spraying step would have been unchanged by the entire investment.

The time to failure. Roughly eighteen months from the second-round funding decision, after which the differentiator collapses and the chatbot becomes a free feature in every competitor's app. By then the executive who sponsored the bet has likely moved on, and the chatbot is a sunk-cost line item the next CIO inherits.

The early-warning signal. A careful observer could have seen it up front, in one question: does the chatbot have a label loop? It does not. Each conversation produces a usage event, not a paired outcome the next training cycle can ingest. The vision model at the boom has a label loop, because the combine in October labels the spray in June. The most generous reading of the chatbot is that a user's thumbs-up or thumbs-down is a label, but it is a label about the chatbot's helpfulness, not about the underlying agronomic question the farmer was trying to answer. Compounding requires the labels to live on the same axis as the bottleneck, and a chatbot's labels do not.

What Deere Teaches

The Deere loop teaches one rule. Vision-plus-action at a physical bottleneck beats conversation-plus-recommendation at a knowledge bottleneck, when the physical bottleneck is where the operating cost lives. The chatbot would have been a Roman Candle precisely because it was attached to the wrong step in the value chain. The cameras on the boom are attached to the right step, and the rightness is observable in the way the labels flow.

The pattern here, a perceptual model at the bottleneck with a per-decision outcome label arriving downstream, does not generalize to most AI products being built in 2026, which are knowledge surfaces rather than perception-and-action surfaces. But it does generalize to a recognizable class of industrial problems. The class has three properties: there is a physical step where the unit cost lives, the step can be instrumented with sensing and per-unit action, and a downstream measurement labels each per-unit decision. That class is small, valuable, and underbuilt.

What You Can Do

If you run an industrial operation, look at your value chain and ask the labeling question first. For the step where the next dollar of unit cost lives, is there a downstream measurement that could be paired back to each decision made at that step? If yes, you have the pre-condition for a Deere-style loop, and the AI investment belongs at that step with sensing-plus-action, not at the front of the value chain with a chatbot. If no, the labeling pre-condition is what to build first. Any AI investment that ignores the missing label loop is a candidate Roman Candle regardless of how good the model is. The test is not whether the AI is impressive. The test is whether the work produces the labels the next cycle of the AI will train on, at the place in the operation where the cost actually lives.

Back to the framework: The Berkshire Test for AI.

Continue the series: Progressive's Risk-Selection Flywheel — the textbook case where all four conditions hold strongly. Mastercard's Network of Labeled Outcomes — chargebacks as the label substrate, with a candid asterisk on the Visa peer. Mayo Clinic's Outcome-Labeled Corpus — longitudinal clinical outcomes labeling decade-old ECG readings.
