Recently, AI researcher Simon Willison wanted to add up his charges from using a cloud service, but the payment values and dates he needed were scattered among a dozen separate emails. Inputting them manually would have been tedious, so he turned to a technique he calls "video scraping," which involves feeding a screen recording video into an AI model, similar to ChatGPT, for data extraction purposes.
What he discovered seems simple on its surface, but the quality of the result has deeper implications for the future of AI assistants, which may soon be able to see and interact with what we're doing on our computer screens.