It’s easy to do bad things with Facebook data. From targeting ads for bizarrely specific T-shirts to manipulating an electorate, the questionable purposes to which the social media behemoth can be put are numerous. But there are also some people out there trying to use Facebook for good—or, at least, to improve the diagnosis of mental illness. On December 3, a group of researchers reported that they had managed to predict psychiatric diagnoses with Facebook data—using messages sent up to 18 months before a user received an official diagnosis.

The team worked with 223 volunteers, who all gave the researchers access to their personal Facebook messages. Using an artificial intelligence algorithm, the researchers leveraged attributes extracted from these messages, as well as the Facebook photos each participant had posted, to predict whether they had a mood disorder (like bipolar disorder or depression), a schizophrenia spectrum disorder, or no mental health issues. According to their results, swear words were indicative of mental illness in general, and perception words (like see, feel, hear) and words related to negative emotions were indicative of schizophrenia. And in photos, more bluish colors were associated with mood disorders.

To evaluate how successful their algorithm was, the researchers used a common metric in artificial intelligence that measures the trade-off between false positives and false negatives. As the algorithm categorizes more and more participants as positive (say, as having a schizophrenia spectrum disorder), it will miss fewer participants who really do have schizophrenia (a lower false negative rate), but it will mislabel more healthy participants as having schizophrenia (a higher false positive rate). A perfect algorithm can have no false positives and no false negatives at the same time; such an algorithm would be assigned a score of 1. An algorithm that guessed randomly would have a score of 0.5. The research team achieved scores ranging from 0.65 to 0.77, depending on the specific predictions they asked the algorithm to make. Even when the researchers restricted themselves to messages sent more than a year before the subjects received a diagnosis, they could make these predictions substantially better than would have been expected by chance.
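The article does not name the metric, but the description (1 for a perfect algorithm, 0.5 for random guessing) matches the area under the ROC curve, or AUC. Assuming that is the metric in question, the sketch below shows one way such a score can be computed: it is the chance that a randomly chosen participant with the diagnosis receives a higher score from the algorithm than a randomly chosen participant without it. The function name and the toy labels and scores are invented for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 for participants with the diagnosis, 0 for those without.
    scores: the algorithm's confidence that each participant is positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as a coin flip
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three diagnosed participants, three healthy ones.
labels = [1, 1, 1, 0, 0, 0]

perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every positive outranks every negative
constant = [0.5] * 6                        # the algorithm cannot tell anyone apart
mixed = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7]      # some pairs are ranked the wrong way

print(roc_auc(labels, perfect))    # 1.0
print(roc_auc(labels, constant))   # 0.5
print(round(roc_auc(labels, mixed), 2))
```

In the last case the algorithm ranks six of the nine positive-negative pairs correctly, which puts its score in the same range as the 0.65 to 0.77 reported by the researchers.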

According to H. Andrew Schwartz, an assistant professor of computer science at Stony Brook University who was not involved in the study, these scores are comparable to those achieved by the PHQ-9, a standard, nine-question survey used to screen for depression. This result raises the possibility that Facebook data could be used for mental illness screening, potentially long before a patient would otherwise have received a diagnosis.

Michael Birnbaum, an assistant professor at the Feinstein Institutes for Medical Research in Manhasset, New York, who led the study, believes that this sort of AI tool could make an enormous difference in the treatment of psychiatric illnesses. “We now understand this idea that cancer has many different stages,” Birnbaum says. “If you catch cancer at Stage I, it’s drastically different than if you catch it once it metastasizes. In psychiatry, we have a tendency to start working with people once it’s already metastasized. But there’s the potential to catch people earlier.”

Birnbaum is far from the first researcher to have used social media data to predict the presence of mental illness. Previously, researchers have used Facebook statuses, tweets, and Reddit posts to identify diagnoses ranging from depression to attention deficit hyperactivity disorder. But he and his team broke new ground by working directly with patients who had existing psychiatric diagnoses. Other researchers haven’t, in general, been able to work off of clinically confirmed diagnoses—they have taken subjects’ word for their diagnoses, asked them for self-diagnoses, or had them take questionnaires like the PHQ-9 as a proxy for diagnosis. Everyone in Birnbaum’s study, in contrast, had an official diagnosis from a psychiatric professional. And since the researchers had definitive dates for when these diagnoses were made, they could try to make predictions from messages sent before the patients knew about their mental illnesses.