Hospitals pledge to protect patient privacy. Almost all their websites leak visitor data like a sieve

Every hospital in America promises to protect the privacy of its patients and the details of their medical care. And almost every one of them uses sophisticated data tools to track and share the personal information of visitors as soon as they start clicking on their websites.

A new study found that 99% of U.S. hospitals employed online data trackers in 2021 that transmitted visitors’ information to a broad network of outside parties, including major technology companies, data brokers, and private equity firms.

The data captured included visits to pages on specific conditions such as depression, breast cancer, and Alzheimer’s disease. The ubiquitous use of the tracking tools may clash with the privacy expectations — if not the legal protections — that consumers take for granted as they browse online in search of medical care and information.


“The scale and scope of this continues to shock me even as I work on this research,” said Matthew McCoy, a co-author of the study and assistant professor of medical ethics and health policy at the University of Pennsylvania. “One cannot really access a hospital website in this country without being exposed to really significant levels of tracking.”

The study found that hospitals were commonly sharing visitor information not only with the online advertising giants Meta and Alphabet, but also with companies such as AT&T, Verizon, Amazon, the media giant Nielsen, and Golden Gate Capital, a San Francisco-based private equity company.


The data trade forms the backbone of a multibillion-dollar economy that quietly compiles information on consumers to target advertisements and help make decisions about how to recruit employees and distribute products such as prescription drugs and medical devices. Because such decisions are made behind corporate walls, it remains unclear how much personal information these companies gather, and exactly how they use it.

The federal privacy rules created under HIPAA, which governs the sharing of personal information collected on patients, prohibit the disclosure of certain pieces of information that could identify patients. In December 2022, the federal Department of Health and Human Services clarified that those rules apply to hospital websites that use tracking codes to collect and share information such as patients’ IP addresses, health conditions, and symptoms.

That doesn’t necessarily mean that the information scraping spotlighted in the study, published Monday in Health Affairs, constitutes a HIPAA violation, said Brad Malin, director of the health information privacy lab at Vanderbilt University. That’s because it involved data transmitted on the hospital home pages and public-facing areas, not portals where patients share specific information about their conditions and health needs with their doctors.

“If the user had logged in to these sites, such that the trackers were on pages associated with their diagnosis…then it would be a violation of HIPAA without a doubt,” Malin said.

To conduct the study, researchers at the University of Pennsylvania used an open-source tool known as webXray to record third-party tracking tools present on hospital websites during a three-day period in August 2021. The researchers also recorded the presence of “cookies,” or snippets of data stored on a user’s web browser that allow them to be tracked across multiple sites. They used a webXray database to link the tracking domains to their parent companies so they could see where the data were being routed.
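For readers curious about the mechanics, the general idea of spotting third-party requests and attributing them to parent companies can be sketched in a few lines of Python. This is only an illustration, not webXray's actual code or the researchers' method: the function names, the small `DOMAIN_OWNERS` table, and the `.example` addresses are all hypothetical, and the domain matching is deliberately naive.

```python
from urllib.parse import urlparse

# Hypothetical domain-to-owner table for illustration only;
# it is not the webXray database used in the study.
DOMAIN_OWNERS = {
    "google-analytics.com": "Alphabet",
    "doubleclick.net": "Alphabet",
    "facebook.net": "Meta",
}


def base_domain(url):
    """Naive base domain: the last two labels of the hostname.

    This shortcut mishandles suffixes such as .co.uk; a real tool
    would consult a public-suffix list instead.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def third_party_owners(page_url, request_urls):
    """Map each third-party domain contacted by a page to its owner."""
    first_party = base_domain(page_url)
    owners = {}
    for url in request_urls:
        domain = base_domain(url)
        # Skip unparseable URLs and requests to the page's own domain.
        if domain and domain != first_party:
            owners[domain] = DOMAIN_OWNERS.get(domain, "unknown")
    return owners


# Made-up requests a hospital page might trigger when it loads.
requests_seen = [
    "https://www.hospital.example/style.css",
    "https://www.google-analytics.com/collect",
    "https://connect.facebook.net/en_US/fbevents.js",
    "https://cdn.tracker.example/pixel.gif",
]
owners = third_party_owners(
    "https://www.hospital.example/conditions/depression", requests_seen
)
for domain, owner in owners.items():
    print(domain, "->", owner)
```

In this sketch, the request for the hospital's own stylesheet is ignored, while the other three are flagged as third-party transfers and matched against the owner table where an entry exists.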

Hospitals use tracking tools supplied by technology companies for the same reason many other businesses do: They want data on the use of their web pages as consumers interact with them online.

“Companies have become hyper-specialized in providing this type of support, such that the health care organizations are going to take it because it’s cheap and it’s useful for them,” Malin said. “But it ends up creating a view into an individual’s life that the (hospitals) probably were not really considering” when they created their websites.

The study found that the home pages of more than 3,700 hospitals initiated a median of 16 data transfers to third parties. It also found that the tracking tools were equally present on pages used by patients to research specific medical conditions. Malin said that it is difficult to know what other information the companies receiving the data already have about a person, such as consumer data on shopping or personal interests.

Although the study found nearly all hospitals used such tools, it also revealed that nonprofit hospitals with medical school affiliations and those serving urban areas tended to expose patients to higher levels of third-party tracking.

The issue of health data tracking extends beyond hospitals: In December, an investigation by STAT and The Markup found that dozens of direct-to-consumer telehealth companies were collecting sensitive information from users and sharing it with the world’s largest advertising platforms. The Federal Trade Commission has started to crack down on that type of data sharing, and has reached settlements with both BetterHelp and GoodRx for health data leaks this year.

But ultimately, the burden still largely falls on consumers to protect themselves as they seek out health care services online, even if their ability to do so is significantly constrained by the volume of information now floating around about them. Those data may be used to shape both the information and opportunities that surround them on a daily basis.

“It might also be that you don’t get shown an ad for a particular job because of things that are found out from your health-related tracking,” said Ari Friedman, a co-author of the study and physician at the University of Pennsylvania. “The remedy there is hard because the details are so obscure, and so difficult to access.”

This story is part of a series examining the use of artificial intelligence in health care and practices for exchanging and analyzing patient data. It is supported with funding from the Gordon and Betty Moore Foundation.

Source: STAT