Vermont scientists mine #hashtags to hunt for unknown side effects

A team of scientists has invented a new technique for discovering potentially dangerous drug interactions and unknown side-effects — before they show up in medical databases, like PubMed, or even before doctors and researchers have heard of them at all.

The far-seeing tool? A computer program that can efficiently search millions of tweets on Twitter for the names of many drugs and medicines — and build a map of how they’re connected, using the #hashtags that link them.

“Our new algorithm is a great way to make discoveries that can be followed up and tested by experts like clinical researchers and pharmacists,” said Ahmed Abdeen Hamed, a computer scientist at the University of Vermont who led the creation of the new tool. A report on how the algorithm works, and its preliminary discoveries, was published online, June 8, in the Journal of Biomedical Informatics.

“We may not know what the interaction is, but with this approach we can quickly find clear evidence of drugs that are linked together via hashtags,” Hamed said.

Matching PubMed

The new approach could also be used to generate public alerts, Hamed said, before a clinical investigation is started or before health care providers have received updates. “It can tell us: we may be seeing a drug/drug interaction here,” Hamed said. “Beware.”

And the research team also aims to help overcome a long-standing problem in medical research: published studies are too often not linked to new scientific findings, because digital libraries “suffer infrequent tagging,” the scientists write, and updating keywords and metadata associated with studies is a laborious manual task, often delayed or incomplete.

“Mining Twitter hashtags can give us a link between emerging scientific evidence and PubMed,” the massive database run by the U.S. National Library of Medicine, Hamed said. Using their new algorithm, the Vermont team has created a website that will allow an investigator to explore the connections between search terms (say “albuterol”), existing scientific studies indexed in PubMed — and Twitter hashtags associated with the terms and studies.

Heeding #hashtags

Previous studies have shown that Twitter can be mined for bad drug interactions, but the Vermont team advances this idea by focusing on the distinctive information contained in hashtags — like “#overprescribed,” “#kidneystoneprobs,” and “#skinswelling” — to find new associations. “Each individual hashtag functions almost like a neuron in the human brain, sending a specific signal,” the scientists write, that can reveal a surprising pathway between two or more drugs.

The team’s approach involves building what they call a “K-H network,” essentially a dense map of links between keywords and hashtags, and then pruning out a lot of the “noise and trash” (“this is Twitter!” Hamed says) to find the terms that are central to the network. Then the algorithm, called HashPairMiner, searches this cleaned-up network for the shortest paths between a pair of search terms and their intervening hashtags.
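For readers who want a feel for the idea, here is a minimal sketch in Python. It is not the team’s HashPairMiner code: the function names, the toy tweets, and the pruning rule (keep only hashtags that touch at least two keywords) are assumptions made for illustration, and a simple degree threshold stands in for whatever centrality measure the study actually uses. It builds a small keyword-hashtag graph, prunes it, and runs a breadth-first search for the shortest path between two drug names.

```python
from collections import defaultdict, deque
import re

HASHTAG_RE = re.compile(r"#(\w+)")


def build_kh_network(tweets, keywords):
    """Link each keyword to every hashtag that appears in the same tweet.

    Keywords are assumed to be lowercase; hashtag nodes keep their '#'
    prefix so they never collide with keyword nodes.
    """
    graph = defaultdict(set)
    for text in tweets:
        lowered = text.lower()
        tags = {"#" + tag for tag in HASHTAG_RE.findall(lowered)}
        found = {kw for kw in keywords if kw in lowered}
        for kw in found:
            for tag in tags:
                graph[kw].add(tag)
                graph[tag].add(kw)
    return graph


def prune(graph, keywords, min_degree=2):
    """Keep all keywords, plus any node with at least min_degree links.

    This threshold is an illustrative stand-in for the pruning step
    described in the article, not the published method.
    """
    keep = set(keywords) | {n for n, nbrs in graph.items() if len(nbrs) >= min_degree}
    return {
        node: {nbr for nbr in nbrs if nbr in keep}
        for node, nbrs in graph.items()
        if node in keep
    }


def shortest_path(graph, source, target):
    """Breadth-first search; returns the node list, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Toy data, invented for this sketch.
tweets = [
    "Took aspirin after dinner #happythanksgiving",
    "Benadryl knocked me out #happythanksgiving #sleepy",
    "aspirin again #headache",
]
keywords = ["aspirin", "benadryl"]

network = prune(build_kh_network(tweets, keywords), keywords)
print(shortest_path(network, "aspirin", "benadryl"))
# ['aspirin', '#happythanksgiving', 'benadryl']
```

In this toy run, the hashtags “#sleepy” and “#headache” each touch only one drug and are pruned away, leaving the single shared hashtag as the bridge between the two search terms.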

The overall goal of the project, supported by the National Science Foundation, is to “discover any relationship between two drugs that is not known,” said Hamed. But to “ground-truth the hypothesis” that data-mining in Twitter can find unknown drug interactions, the team wanted to demonstrate that their approach “can produce interactions that are already known,” says Tamer Fandy. He’s a professor of pharmaceutical sciences at the Albany College of Pharmacy’s campus in Vermont, and a co-author on the new study with Hamed and two other computer scientists, Xindong Wu and Robert Erickson, professors in UVM’s College of Engineering and Mathematical Sciences.

“It does,” said Hamed. In one example from the new study, the algorithm detected a path between aspirin and the allergy medication Benadryl, which are known to interact; in one instance, the two drugs were linked, perhaps not too surprisingly, by the hashtag “#happythanksgiving.”

Marijuana and memory

The new system began with what UVM’s Hamed initially thought was an error in November of 2013. An earlier version of the current algorithm “discovered something shocking: ibuprofen and medical marijuana, which you would think have nothing to do with each other, were linked by a hashtag called #Alzheimer's,” Hamed says.

“I thought that has to be an error. I looked at my code. I repeated my experiment. I gathered different tweet data sets — and I got the same result,” he said. But he couldn’t find any support for the association on PubMed or other databases of clinical literature. In fact, the only study he could find, from 1989, suggested the opposite, that there was no interaction between ibuprofen and marijuana.

It turned out that Hamed had inadvertently discovered people in the Twitterverse who were sharing the results of a brand-new peer-reviewed study suggesting that ibuprofen has some ability to block or reduce the memory-damaging effects of regular marijuana use, which has been associated with the development of Alzheimer’s disease. “It appeared on Twitter before PubMed,” Hamed said.

As more states legalize marijuana, Hamed said, there may be increasing discussion of its interactions with other drugs, ahead of researchers’ capacity to study these interactions. “If we’re able to detect concerns, say chatter about headaches or drops in blood pressure or whatever,” he said, “that may lead pharmacists or researchers to a hypothesis that can be followed up by a clinical trial or other medical test.”

PUBLISHED

06-29-2015
Joshua E. Brown