Anthropic Computer Control, Ideogram's Canvas and SD3.5

Top 10 AI News #weekly

# 1 Claude AI tool can now carry out jobs such as filling forms and booking trips

Anthropic is introducing an upgraded Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku. It is also introducing a new capability in beta: computer use.

Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.
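To make the "look at the screen, move, click, type" cycle concrete, here is a minimal sketch of an observe-and-act loop. It is not Anthropic's API: `Action`, `FakeScreen`, `run_agent`, and `scripted_policy` are hypothetical names, the screen is simulated in memory, and a fixed script stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step requested by the model (hypothetical schema, not a real API)."""
    kind: str                                    # "move", "click", or "type"
    position: Optional[Tuple[int, int]] = None   # used by "move"
    text: str = ""                               # used by "type"


@dataclass
class FakeScreen:
    """In-memory stand-in for a desktop so the sketch runs anywhere."""
    cursor: Tuple[int, int] = (0, 0)
    typed: List[str] = field(default_factory=list)
    clicks: List[Tuple[int, int]] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; a text summary is enough here.
        return f"cursor={self.cursor} typed={''.join(self.typed)!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "move" and action.position is not None:
            self.cursor = action.position
        elif action.kind == "click":
            self.clicks.append(self.cursor)
        elif action.kind == "type":
            self.typed.append(action.text)
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_agent(
    screen: FakeScreen,
    decide: Callable[[str, int], Optional[Action]],
    max_steps: int = 10,
) -> int:
    """Observe, ask for the next action, apply it; stop when decide returns None."""
    for step in range(max_steps):
        action = decide(screen.screenshot(), step)
        if action is None:
            return step
        screen.apply(action)
    return max_steps


def scripted_policy(observation: str, step: int) -> Optional[Action]:
    """Placeholder for the model: replays a fixed form-filling script."""
    script = [
        Action("move", position=(120, 240)),
        Action("click"),
        Action("type", text="Jane Doe"),
    ]
    return script[step] if step < len(script) else None
```

Calling `run_agent(FakeScreen(), scripted_policy)` walks through the three scripted steps and then stops. In a real setup the policy would be a model call that receives an actual screenshot, and the `max_steps` cap is one simple way to keep such a loop bounded.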

Further Reading

# 2 AI video startup Genmo launches Mochi 1, an open source rival to Runway, Kling, and others

Genmo, an AI company focused on video generation, has released a research preview of Mochi 1, a new open-source model for generating high-quality videos from text prompts. The company claims performance comparable to, or exceeding, leading proprietary rivals such as Runway’s Gen-3 Alpha, Luma AI’s Dream Machine, Kuaishou’s Kling, Minimax’s Hailuo, and others.

Further Reading

# 3 Ideogram launches infinite Canvas for manipulating and combining generated images, with highly accurate text baked into the image itself

Canadian AI image startup Ideogram, founded last year by former AI researchers from Google Brain, has made a name for itself among AI creators with its text-to-image models that produce a wide range of styles from realistic to fantastical, and most impressively of all, highly accurate text baked into the image itself (something other leading image generators, including Midjourney, took a while to implement and still struggle to generate reliably).

Further Reading

# 4 Stable Diffusion 3.5 Updates With New Models and Expanded Features, Allowing Both Commercial and Non-Commercial Use

The SD 3.5 family is designed to run on consumer-grade systems, even low-end ones by some standards, making advanced image generation more accessible. Stability AI also says it has heard the complaints about the previous version, so this one promises to be a lot better.

Another important aspect of this release is the new licensing model. Stable Diffusion 3.5 comes under a more permissive license, allowing both commercial and non-commercial use. Small businesses and individuals with less than $1,000,000 in annual revenue can use and build on these models for free.

Further Reading

# 5 ElevenLabs Introduces Voice Design: Describe the age, accent, tone or character and create a new voice in seconds

ElevenLabs just introduced Voice Design, a new AI voice generation feature that lets you generate a unique voice from a text prompt alone.

You can describe the age, accent, tone, or character to generate a new and accurate AI voice in seconds. Voice Design is fairly easy to use, and ElevenLabs has stated that the API will be available within a week.

Further Reading

# 6 Google tool makes AI-generated writing easily detectable

Google DeepMind has been using its AI watermarking method on Gemini chatbot responses for months – and now it’s making the tool available to any AI developer.

Further Reading

# 7 Microsoft introduces ‘AI employees’ that can handle client queries

Microsoft is introducing autonomous artificial intelligence agents, or virtual employees, that can perform tasks such as handling client queries and identifying sales leads.

The US tech company is giving customers the ability to build their own AI agents as well as releasing 10 off-the-shelf bots that can carry out a range of roles including supply chain management and customer service.

Further Reading

# 8 Anthropic Introduces New Analysis Tool in Claude That Can Write and Run JavaScript Code

Anthropic’s Claude chatbot can now write and run JavaScript code.

Today, Anthropic launched a new analysis tool that helps Claude respond with what the company describes as “mathematically precise and reproducible answers.” With the tool enabled — it’s currently in preview — Claude can perform calculations and analyze data from files like spreadsheets and PDFs, rendering the results as interactive visualizations.

Further Reading

# 9 Microsoft releases OmniParser, a general screen parsing tool

OmniParser is a general screen parsing tool that converts UI screenshots into a structured format to improve existing LLM-based UI agents.
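As a rough illustration of what a "structured format" for a screenshot could look like, the sketch below models each detected UI element as a record with a type, a label, and a bounding box. This is not OmniParser's actual output schema: `UIElement`, `to_json`, `find_by_label`, and the sample `parsed_screen` data are all hypothetical and chosen only to show why structure helps an agent.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """One detected on-screen element (hypothetical schema)."""
    element_id: int
    kind: str                          # e.g. "button", "text_field"
    label: str                         # visible text or a short description
    bbox: Tuple[int, int, int, int]    # left, top, right, bottom in pixels

    def center(self) -> Tuple[int, int]:
        # Pixel an agent could click to hit the middle of the element.
        left, top, right, bottom = self.bbox
        return ((left + right) // 2, (top + bottom) // 2)


def to_json(elements: List[UIElement]) -> str:
    """Serialize the parsed screen so it can be placed in an LLM prompt."""
    return json.dumps([asdict(e) for e in elements], indent=2)


def find_by_label(elements: List[UIElement], label: str) -> Optional[UIElement]:
    """Case-insensitive exact match on the label; returns None if absent."""
    wanted = label.strip().lower()
    for element in elements:
        if element.label.strip().lower() == wanted:
            return element
    return None


# Made-up sample data standing in for a parser's output.
parsed_screen = [
    UIElement(0, "text_field", "Email", (100, 80, 400, 110)),
    UIElement(1, "button", "Submit", (100, 130, 200, 160)),
]
```

With a list like this, an agent can refer to "element 1" or look up the "Submit" button and click `center()` instead of guessing coordinates from raw pixels.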

Further Reading

# 10 Google is developing J.A.R.V.I.S., an AI that can take over a person’s web browser to complete tasks

Google is “developing artificial intelligence that takes over a person’s web browser to complete tasks such as gathering research, purchasing a product or booking a flight.”

“Project Jarvis” — in a nod to J.A.R.V.I.S. in Iron Man — would operate in Google Chrome and is a consumer-facing (rather than enterprise) feature to “automate everyday, web-based tasks.” The article doesn’t specify whether this would be for mobile or desktop.

At I/O, Pichai showed off “Gemini and Chrome working together to help you do a number of things to get ready: Organizing, reasoning, synthesizing on your behalf.” That on-stage scenario ran generically through gemini.google.com with no other UI shown, unlike the previous example, which ran through Gemini for Android.

Further Reading

Best Prompt(s) #weekly

Prompt from Anthropic Claude Sonnet 3.5

Prompt from Anthropic for the new Claude 3.5 Sonnet.

Best Open Source Alternatives to Proprietary Software #weekly

Zerox OCR

Zerox OCR is a zero-configuration OCR tool powered by GPT-4o mini that converts documents such as PDFs, Word files and images into Markdown.

Agent.exe

Agent.exe is a simple Electron app that lets Claude 3.5 Sonnet control your local computer directly.

Latest AI Tools In CogList #weekly

# Moz Pro

Moz Pro is an SEO tool that helps you rank higher and drive qualified traffic to your website with keyword research, link building, site audits, and rank tracking.

Moz Pro Review

# Ahrefs

Ahrefs is a keyword research tool with AI for indie hackers, digital marketers, and SEO specialists.

Ahrefs Review

# Google Trends

Google Trends is a trend analysis tool that helps marketers, researchers, and businesses analyze the popularity of Google search queries over a period of time.

Google Trends Review

# Rebecca AI

Rebecca AI is an AI idea validation tool that helps startup founders analyze, develop and validate business ideas efficiently.

Rebecca AI Review

# DimeADozen AI

DimeADozen AI is a business idea validation tool that generates a detailed 40-section report covering various aspects of a business idea, including risks and the MVP.

DimeADozen AI Review

This Week's Summary

In this edition, we have explored a variety of developments in the world of artificial intelligence. Big moments include Anthropic's new "computer use" ability for its Claude AI model, allowing it to automate tasks by directly controlling computers; Ideogram's "Infinite Canvas" for manipulating and combining generated images with accurate text; Genmo's open-source Mochi 1 model for high-quality video generation; and ElevenLabs' innovative Voice Design tool that can generate personalized AI voices from text prompts. We also cover news on Microsoft's AI employees, Google's AI watermarking tool, and updates to Stable Diffusion 3.5.

From Anthropic's Computer Control to Ideogram's Infinite Canvas, This Week is Filled with Key Insights You Won't Want to Miss
