The AI Operating System

An AI Agent that uses your computer, including the browser, Excel, and PowerPoint, to do tasks


Gist: An AI Agent that uses your computer, including the browser, Excel, and PowerPoint, to do tasks.

Who: Shanghai AI Lab plus others

What did they do:

  • Built an agent called Friday from a mix of Python code and GPT-4 language model prompts

  • Friday controls a Linux or macOS computer, including the browser, Excel, and PowerPoint, to perform tasks

  • Friday also self-improves

How did they do it?

  • Created a set of sequential prompts and code, grouped into components such as:

    • Planner - decomposes user requests into smaller tasks

    • Configurator - middleware that takes each task and configures it with data from memory or how-tos from the tool repository before passing it to the Executor

    • Declarative memory - user profile and history of previous actions

    • Tool repository - tools available

    • Working memory - where the next steps for tasks and previous history are kept

    • Executor - generates an executable command for the task

    • Critic - assesses whether a task has been completed successfully or whether another iteration is needed

  • GPT-4 was the underlying AI model
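
To make the flow between those components concrete, here is a minimal sketch in Python. It is not the authors' code: every name in it (Memory, planner, configurator, executor, critic, run_agent) is hypothetical, the GPT-4 prompts are replaced by simple stand-in functions, and the "commands" are plain strings that are never actually run.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Memory:
    # Declarative memory: user profile and history of previous actions.
    declarative: Dict[str, str] = field(default_factory=dict)
    # Tool repository: tool name -> how-to text (hypothetical format).
    tools: Dict[str, str] = field(default_factory=dict)
    # Working memory: every command generated so far, in order.
    working: List[str] = field(default_factory=list)


def planner(request: str) -> List[str]:
    """Stand-in for a prompt that decomposes a request into smaller tasks."""
    return [part.strip() for part in request.split(" then ") if part.strip()]


def configurator(task: str, memory: Memory) -> dict:
    """Attach the user profile and any how-to whose tool name appears in the task."""
    hints = [howto for name, howto in memory.tools.items() if name in task.lower()]
    return {"task": task, "hints": hints, "profile": dict(memory.declarative)}


def executor(configured: dict, attempt: int) -> str:
    """Stand-in for a prompt that generates an executable command."""
    # Toy behaviour: with no how-to available, the first attempt is a bad one.
    if not configured["hints"] and attempt == 0:
        return f"echo 'unsure how to: {configured['task']}'"
    return f"echo 'doing: {configured['task']}'"


def critic(command: str) -> bool:
    """Stand-in for a prompt that judges whether the task succeeded."""
    return "unsure" not in command


def run_agent(request: str, memory: Memory, max_attempts: int = 3) -> List[str]:
    """Plan, then configure, execute, and critique each task, retrying on failure."""
    completed = []
    for task in planner(request):
        for attempt in range(max_attempts):
            configured = configurator(task, memory)
            command = executor(configured, attempt)
            memory.working.append(command)
            if critic(command):
                completed.append(command)
                break
    return completed


memory = Memory(
    declarative={"user": "demo"},
    tools={"excel": "hypothetical how-to for editing spreadsheets"},
)
result = run_agent("open excel sheet then set dark mode", memory)
print(result)
```

In this toy run the second task has no matching how-to, so the Critic rejects the first attempt and the loop tries again. That retry is the iteration step described in the list above.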

Generating Python code to set dark mode on an app

What did they find?

  • Friday (their agent framework) outperformed GPT-4 with Plugins on GAIA, a benchmark for general agents

  • It could perform tasks in both Excel and PowerPoint

Comparison of FRIDAY agent on the GAIA agent benchmark

What are the implications?

  • This is actually a working demonstration of Andrej Karpathy’s proposal for an AI Operating System

  • The ideas behind it have been circulating for a while now

  • These systems will get better
