Edition #37 April 2026

The ClickedOn AI Wrap — Week of 07 Apr 2026

GPT-5.4 just beat humans at using a computer and that changes the marketing playbook.

Another seven days, another shift in what AI actually means for your business. This week we saw a frontier model quietly cross a line humans held for decades, a free open model land that rivals last year's paid giants, and an AI answer engine go live for 250 million US Yahoo users.

TL;DR

  • GPT-5.4 with native computer use: OpenAI's new model scored 75.0% on OSWorld-Verified, beating the 72.4% human benchmark for the first time.
  • Gemma 4 goes fully open: Google shipped a four-size family under Apache 2.0, with the 31B variant beating much larger closed models on human preference scoring.
  • Yahoo Scout launches: A Claude-powered AI answer engine goes live across 250 million US Yahoo users, adding a fourth major AI search surface marketers have to optimise for.
  • Anthropic changes Claude tool billing: From 4 April, Pro and Max subscribers can no longer route plan usage through third-party frameworks like OpenClaw without paying extra.
  • Microsoft ships MAI voice, image and transcription models: A clear signal Microsoft is reducing its OpenAI dependence for first-party features.
  • Utah greenlights AI prescription renewals: The first US state to let AI systems renew drug prescriptions, a meaningful move from diagnosis to decisioning.

Frontier Models

GPT-5.4 becomes the first model to beat humans at using a computer

OpenAI released GPT-5.4 this week, and the headline number is the one that should make every operations and marketing leader pay attention. On OSWorld-Verified — the benchmark that measures whether a model can actually drive a desktop through screenshots, mouse and keyboard — GPT-5.4 hit 75.0%. GPT-5.2 managed 47.3% only a few months ago. Human expert testers score 72.4%.

This is the first general-purpose model OpenAI has shipped with computer use built directly into the base model rather than bolted on as a separate tool. It means GPT-5.4 can open software, configure settings, fill forms, navigate browsers and organise folders without needing a custom scaffolding layer.

Why it matters for marketers and business owners: the things you pay people to do inside ad platforms, analytics dashboards, CRMs and CMSs are exactly the tasks OSWorld measures. Campaign uploads, bulk edits, reporting exports, landing page QA, bid adjustments — the repetitive desktop work that eats junior marketer hours is now something a model can do reliably enough to trust with supervision. We are not at full autonomy, but we are past the point where "AI can't really use my tools" is a safe assumption.

The practical move this week is to audit your team's recurring desktop workflows and rank them by how rules-based they are. The tasks near the top of that list are the ones to pilot with a computer-use agent over the next quarter. The tasks near the bottom — the ones that need judgement, context and stakeholder communication — are the ones your people should be doing instead. This is how you get measurable growth from AI rather than a pile of half-finished experiments.
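One way to make that audit concrete is a simple scored list. The sketch below is illustrative only: the 1 to 5 "rules-based" scale, the hours figures and the workflow names are placeholder assumptions, not data from any source, and you would swap in your own team's tasks.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    rules_based: int       # hypothetical scale: 1 = pure judgement, 5 = fully rules-based
    hours_per_week: float  # time the team currently spends on it


def rank_for_pilot(workflows):
    """Most rules-based first; ties broken by hours spent per week."""
    return sorted(
        workflows,
        key=lambda w: (w.rules_based, w.hours_per_week),
        reverse=True,
    )


# Placeholder entries for illustration only.
workflows = [
    Workflow("Landing page QA", 3, 2.5),
    Workflow("Stakeholder strategy updates", 1, 5.0),
    Workflow("Weekly PPC reporting export", 5, 4.0),
    Workflow("Bulk campaign uploads", 4, 3.0),
]

ranked = rank_for_pilot(workflows)
for w in ranked:
    print(f"{w.rules_based}  {w.hours_per_week:>4.1f}h  {w.name}")
```

The top of the printed list is your pilot shortlist. The bottom is the work that stays with your people.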

OpenAI & The Next Web

Open Models

Gemma 4 resets the open model bar under Apache 2.0

Google DeepMind released Gemma 4 on 2 April, and it is the most important open model drop of the year so far. The family ships in four sizes (2B, 4B, 26B MoE and 31B dense), all under a fully permissive Apache 2.0 licence. That licence change matters as much as the benchmarks: unlike earlier Gemma versions with custom terms, you can now use, modify and redistribute these models commercially, subject only to Apache 2.0's standard attribution and notice requirements.

The 31B dense variant posted an Arena Elo of 1,452, beating several closed models with far more parameters on human preference scoring. All sizes are multimodal across text and image, the edge models add audio input, and context windows run up to 256K tokens. Google is positioning Gemma 4 as the foundation for the next generation of Gemini Nano, so the same code targets on-device deployment from Raspberry Pi through to high-end GPUs.

For businesses, this is the moment where "running your own model" stops being a research project and becomes a real option for privacy-sensitive workloads. If you have data you would rather not send to a third-party API — customer records, internal documents, confidential campaign data — a Gemma 4 4B or 26B deployment on your own infrastructure now performs well enough to handle drafting, summarisation, classification and retrieval-augmented generation for most use cases. Combine that with the permissive licence and you can finally build AI into products you ship without an ongoing per-token bill.

Google DeepMind & Hugging Face

AI Search & GEO

Yahoo Scout puts a fourth AI answer engine in front of 250 million people

Yahoo quietly turned on Scout this week, an AI answer engine built on Anthropic's Claude and using Microsoft's Bing grounding API. It is now live across Yahoo Search, Mail, News, Finance and Sports, which together reach 250 million US users and sit on top of 500 million user profiles.

Why this is the most underrated story of the week: until now the AI search conversation has been Google AI Overviews versus ChatGPT Search versus Perplexity. Scout adds a fourth major surface, and crucially it is one most marketers are not tracking. Yahoo's demographic skews older and higher-income than Google's, and the audience overlap with Finance and Mail means Scout will be used for a lot of commercial and research queries — exactly the ones that drive purchases.

What this means for your GEO and AI visibility work: optimising for Google AI Overviews alone is no longer enough. You need to measure citations across at least four answer engines, and the weighting of each depends on where your customers actually are. For Australian brands selling into US markets — or running US campaigns — Scout is now a meaningful new surface. Bing Webmaster Tools already added AI Performance reporting earlier this year, which gives you the first direct signal of when your content is cited across AI-driven experiences grounded on Bing, including Scout.

The practical play is to pull your last 90 days of high-intent content, check it against Bing's AI Performance report, and identify which pages are being cited and which are not. The gap is your roadmap.
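The gap check itself is a set difference. Here is a minimal sketch, assuming you have exported both lists to CSV with a `url` column. That column name, the sample rows and the `example.com` addresses are all hypothetical, and the real layout of the AI Performance export may differ, so adjust the column name to whatever your export contains.

```python
import csv
import io


def read_urls(csv_text, column="url"):
    """Return the set of URLs in one CSV column, ignoring trailing slashes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip().rstrip("/") for row in reader if row.get(column)}


# Hypothetical sample data standing in for your two exports.
high_intent_csv = """url
https://example.com/pricing
https://example.com/guides/geo-basics/
https://example.com/case-studies/retail
"""

cited_csv = """url,citations
https://example.com/guides/geo-basics,12
"""

high_intent = read_urls(high_intent_csv)
cited = read_urls(cited_csv)

# Pages you care about that are not showing up as cited.
gap = sorted(high_intent - cited)
for url in gap:
    print(url)
```

In practice you would read the two files from disk instead of inline strings. The output is the list of high-intent pages to rework first.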

Real Internet Sales & Microsoft Advertising

Quick Hits

  • Anthropic tightens Claude subscriptions: From 4 April, Pro and Max users can no longer route plan usage through third-party tools like OpenClaw. Existing subscribers get a one-time credit until 17 April. (TechCrunch)
  • Microsoft ships MAI models: New MAI voice, image and transcription models rolled out this week, reducing Microsoft's reliance on OpenAI for first-party features. (Labla)
  • Gemini API adds Flex and Priority tiers: Google introduced explicit cost, latency and reliability tradeoffs on the Gemini API so developers can tune inference per workload. (Google Cloud)
  • Anthropic partners with Australia on AI safety: A new federal partnership focused on safety research and tracking the economic impact of frontier models on Australian industry. (Anthropic)
  • Utah approves AI drug prescription renewals: The first US state to let AI systems renew prescriptions without a human clinician in the loop. A clear escalation from diagnostic assist to treatment decisioning. (AI and News)
  • Agentic AI market hits $7.51B: Enterprise agentic AI spend is now growing at 27.3% CAGR, per fresh analyst figures released this week. (BIA)
  • ASOS rolls out virtual try-ons with AIUTA: The retailer is live-testing AI-generated try-ons across body types, heights and skin tones, a useful signal for ecommerce brands weighing up generative imagery. (CNBC)
  • Anthropic retires 1M context beta on Sonnet 4.5: From 30 April, the 1M token context beta on Claude Sonnet 4.5 and Sonnet 4 goes away. Opus 4.6 and Sonnet 4.6 now support 300k max tokens via Batches API. (Anthropic Docs)

The ClickedOn Take

Three stories this week all point in the same direction: AI has quietly moved from the "suggest things to a human" era into the "do things for a human" era. GPT-5.4 now scores above the human expert baseline on a benchmark of real desktop tasks. Gemma 4 means you can run capable models on your own infrastructure without a licence fee. And Scout means your customers are getting AI-generated answers on surfaces you are not yet measuring.

For Australian businesses and marketers, the practical response is not to panic-buy ten new tools. It is to pick one workflow, one content gap and one measurement blind spot, and fix them this fortnight. That might look like piloting computer-use automation on your weekly PPC reporting pack, deploying a small Gemma 4 instance for internal document Q&A, and adding Bing's AI Performance report to your GEO tracking. None of those are glamorous and all of them compound.

The brands winning in AI search right now are not the ones with the biggest budgets. They are the ones being cited by AI models because they publish clear, differentiated, factually tight content that models find easy to trust. That is a transparent, performance-driven game, and it is the one we think will define marketing for the rest of 2026.

Tool of the Week

Gemma 4 on Ollama

Google's newly Apache-licensed Gemma 4 is already available on Ollama. Pull the 4B variant and have a capable multimodal model running on your laptop in under five minutes, handling text, images and long context up to 256K tokens. It is the cheapest way to experiment with private, on-device AI. Keep Claude or GPT-5.4 for reasoning-heavy work and use Gemma for high-volume, lower-complexity tasks where privacy and cost matter more than raw capability.
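Once the model is pulled, you can call it from a script through Ollama's local HTTP API. The sketch below is a rough starting point, not an official recipe. The endpoint is Ollama's default local address, but the model tag `gemma4:4b` is an assumption, so check the Ollama model library for the exact name before running it.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "gemma4:4b"  # ASSUMPTION: confirm the real tag in the Ollama library


def build_request(prompt, model=MODEL_TAG):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model=MODEL_TAG):
    """Send the prompt and return the generated text. Needs Ollama running locally."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Example call, once Ollama is running and the model is pulled:
# print(ask("Summarise this campaign brief in three bullet points: ..."))
```

Because everything stays on localhost, the prompt and any pasted documents never leave your machine, which is the point of the exercise.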

Sources This Week

OpenAI, Google DeepMind, Hugging Face, The Next Web, TechCrunch, Anthropic, Google Cloud, Microsoft Advertising, Real Internet Sales, CNBC, Boston Institute of Analytics, AI and News, Labla, Ollama.
