Robot vacuum cleaner products are by far the largest category of consumer robots. They roll around on floors, hoovering up dust and dirt so we don’t have to, all while avoiding obstacles.

The industry leader, iRobot, has been cleaning up the robot vacuum market for two decades. Over this time, the company has steadily gained fans and a sterling reputation, including around security and privacy.

And then, something shocking happened. Someone posted on Facebook a picture of a woman sitting on the toilet in her home bathroom — a picture taken by a Roomba.

And the world responded: “Wait, what?!”

Why robots are taking pictures in bathrooms

iRobot has continuously improved its products since introducing the first Roomba roughly 20 years ago. In 2015, these improvements started involving cameras. The 7th-generation Roomba sported an upward-facing camera for identifying “landmarks” like lamps and sofas to improve navigation. It also included a downward-facing infrared sensor for mapping the floor. These sensors work in tandem with onboard intelligence to process the visual data. Roombas can now recognize some 80 different objects commonly found in homes.

In other words, computer vision, which requires cameras, is one key to making home vacuum robots work better.

From a cybersecurity perspective, sensor-laden rolling robots of any kind fall into a special class. They’re equivalent to multiple IoT devices that can move all over the house, all while maintaining an internet connection.

Why pictures taken by robots ended up on Facebook

According to an investigation by MIT Technology Review, iRobot claims that the bathroom-photo scandal has been misunderstood by the public.

That bathroom photo was one of a series that was snapped in 2020 by an experimental R&D Roomba unit. People with these units had supposedly agreed to appear in photos, though iRobot declined to prove it by showing consent agreements. The photos, iRobot claimed, were part of the company’s ongoing efforts to train its AI algorithms. And they were captured with specially modified robot vacuums designed to share such pictures and also to notify nearby humans using lights that pictures were being taken.

There’s a kind of irony in the image sharing. Ideally, the vacuum’s AI would automatically avoid taking pictures of humans. But to do that, iRobot first needs to use pictures of people to teach the AI what a human looks like.

In other words, Roombas are sharing pictures of people to prevent Roombas from sharing pictures of people.

A focus on security

iRobot has long enjoyed a reputation for solid privacy and security, built on encryption, regular security patches, transparency with customers about how it uses the data it collects and limits on which employees can access consumer data. But data policies and technologies don’t change the inherent privacy risk of imagery that is handled widely and globally.

iRobot says 95% of its images for training AI come from the real homes of volunteers, including some iRobot employees, who know they’re participating in product-improving computer vision research. The company recently started asking customers through its app if they consent to share user-selected photos for the purposes of training AI. iRobot specifies in its privacy policy that it will collect audio, video and pictures only when customers explicitly share them with the company.

The way iRobot’s competitors handle photos, on the other hand, runs the gamut. Some don’t use customer photos at all. Others have built-in features for sharing customer photos with the company. The rest fall somewhere in between.

The unknowable future of iRobot’s privacy

It’s possible that iRobot’s sterling privacy reputation doesn’t matter. The company signed an agreement in August to be acquired by Amazon for $1.7 billion. Pending the Federal Trade Commission’s antitrust investigation, the deal is likely to go through.

Critics of the deal point out that Amazon’s “Astro” home robot, which launched to the public last year and navigates homes like a Roomba, largely failed in the market. Privacy advocates also criticized that product.

Astro’s features include the ability to use cameras to map interior spaces, to live-stream audio and video to users on a mobile app, to use face recognition technology to recognize people in the home and more. The device and software also enable customers to switch off data collection and opt out of data sharing with Amazon.

Another concern is the data Amazon captures through its eCommerce site, its Echo smart speakers and Ring doorbell cameras. Mashing all this data together could give Amazon a very detailed picture of customers and their homes.

The iRobot acquisition adds another home personal data stream for Amazon to add to the mix.

What we’ve learned from the scandal

One major takeaway from all this is that it’s difficult in practice to assure total privacy in the world of AI. iRobot is a perfect example of why that’s true: Pictures, including privacy-violating photos, destined for AI training tend to travel globally.

iRobot, which gathers photos from users all over the world, partnered on AI with San Francisco-based Scale AI. That company uses contract workers abroad, in this case gig workers in Venezuela. According to iRobot, those workers posted the pictures online “in violation of a written non-disclosure agreement” between iRobot and Scale AI, and the relationship has been terminated. The gig workers’ job is to analyze photos and document what’s in them, a step in AI training called “data annotation”. The posting on social media involved data annotation workers looking for help identifying objects and generally discussing their work.

The fact is that many companies in many industries share a huge quantity of private photos across the globe, and the practice will likely grow as those companies develop and implement more visually oriented AI technologies. This global sharing of photos is already far more widespread than the public imagines.

In fact, while iRobot shared just 15 images with MIT Technology Review, the company says it has shared more than 2 million with Scale AI and an unknown number with other unspecified data annotation companies. So just one company is responsible for millions of photos, and there are thousands of companies doing similar AI-training development work. Without laws, industry best practices or significant pushback by consumers, pictures are bound to get leaked.

Robot data gathering remains a sticky issue

We’re quickly moving into a world of ubiquitous AI and computer vision. And these technologies need to be trained with real-world data. Locking that down, especially when these technologies involve hundreds or thousands of people around the world, is extremely difficult and likely to result in errors, leaks and hacks.

The only advice security experts can offer to consumers is this: Read privacy policies. Buy reputable brands. Look for new security labels for consumer IoT devices. Know how to shut off data captures and sharing. And assume that home appliances with cameras and microphones are watching and listening.

Or: Buy a broom and sweep the floor yourself.
