New machine learning model could revolutionize early autism detection

A recent study published in JAMA Network Open introduces an advanced machine-learning model that predicts autism spectrum disorder in young children using limited information, with nearly 80% accuracy for children under two years old. The model, called AutMedAI, was designed to use basic behavioral and medical information that is often available during routine pediatric visits, making it both accessible and practical for wide-scale application in healthcare settings. This model could be instrumental for early autism detection, helping provide necessary interventions sooner to enhance developmental outcomes.

Autism spectrum disorder is a neurodevelopmental condition that affects how individuals perceive and interact with the world around them. It is characterized by challenges in social communication, repetitive behaviors, and limited interests. People with autism may experience difficulties in understanding social cues, forming relationships, and adapting to new environments, with symptoms ranging from mild to severe.

While the causes of autism are complex and involve a mix of genetic and environmental factors, early intervention has been shown to greatly benefit children with autism, particularly in improving social, communication, and behavioral skills. However, diagnosing autism can be challenging, as it often relies on observing specific behaviors that may not fully emerge until after the first few years of life. This has led to a gap between when early signs of autism first appear and when a diagnosis is typically made, delaying potentially helpful interventions.

The motivation behind this study lies in addressing the limitations of current autism screening and diagnostic tools. Traditional screening often relies on questionnaires and checklists, which are useful but can miss subtle signs, may be influenced by interpretation biases, and often require specialized knowledge for accurate assessment. These tools may delay diagnosis, as they typically target children who are already showing pronounced signs of autism, often around age three or later.

Researchers at the Karolinska Institutet in Sweden aimed to develop a more accessible, accurate tool that could identify autism risk in very young children using readily available medical and developmental data. By creating a machine-learning model that analyzes common early-life factors—such as age at first smile or language milestones—they hoped to facilitate earlier identification of autism risk. This early detection could open doors to timely intervention and better developmental support, ultimately improving outcomes for children with autism and their families.

The researchers used data from the SPARK (Simons Foundation Powering Autism Research for Knowledge) database, one of the largest autism research datasets in the United States. The SPARK database includes detailed medical and background information on more than 30,000 children, both with and without autism. For this study, the team focused on a sample of approximately 12,000 children from SPARK to train and validate their machine-learning models. The data was selected to include only information that would typically be available from routine medical visits during a child’s early years, such as age at key developmental milestones and specific behavioral traits.

To build the model, the researchers used 28 distinct factors, chosen to be accessible, non-invasive, and easily reportable by parents. These factors included the age at which a child first smiled or first formed short sentences, and whether the child had eating difficulties. The study’s focus was on children under 24 months of age, a critical period for developmental assessment. The team tested several machine-learning algorithms, including logistic regression and random forest, to compare how well each predicted autism from this data. Their best-performing model, AutMedAI, was chosen after multiple rounds of testing and refinement to maximize predictive accuracy while relying only on readily available data.

AutMedAI was trained and validated on the SPARK dataset, which was split into separate subsets so that the model could be tested on data it had not seen during training. Specifically, 60% of the data was used for training, 20% for tuning model parameters, and the remaining 20% for final validation. This method helped ensure that the model was accurate not only within the sample used to train it but also for “unseen” data, mimicking real-world application. The researchers also took steps to prevent overfitting so that the model could generalize well to new cases.
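For readers curious what a 60/20/20 split looks like in practice, here is a minimal Python sketch. It is an illustration only, not the researchers' code: the function name `split_train_tune_validate`, the fixed random seed, and the use of plain shuffling (rather than any stratified scheme the study may have used) are all assumptions.

```python
import random


def split_train_tune_validate(records, seed=0):
    """Shuffle records and split them 60% / 20% / 20% (hypothetical helper)."""
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split

    n = len(shuffled)
    n_train = n * 60 // 100  # integer arithmetic avoids float rounding
    n_tune = n * 20 // 100

    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    validate = shuffled[n_train + n_tune:]  # whatever remains, about 20%
    return train, tune, validate


if __name__ == "__main__":
    # Stand-in IDs; real records would hold each child's reported factors.
    train, tune, validate = split_train_tune_validate(range(100), seed=1)
    print(len(train), len(tune), len(validate))
```

The key property is that the three subsets never overlap, so performance on the final 20% reflects data the model has not encountered during training or tuning.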

The AutMedAI model was evaluated on a sample of around 12,000 children and achieved approximately 80% accuracy in predicting autism, correctly identifying a large portion of children who had autism spectrum disorder. The model was particularly effective in flagging children with more profound difficulties in social interaction and cognitive functioning, two areas closely associated with autism.
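To make the accuracy figure concrete, the sketch below shows how accuracy is typically computed for a yes/no screening prediction, alongside two related measures (sensitivity and specificity). The function name `screening_metrics` and the ten toy labels are invented for illustration; they are not data or results from the study.

```python
def screening_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Labels are 1 (autism diagnosis) or 0 (no diagnosis). Hypothetical helper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged in error
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases

    return {
        # Share of all children whose prediction matched the true label.
        "accuracy": (tp + tn) / len(pairs),
        # Share of children with a diagnosis that the model flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Share of children without a diagnosis that the model cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


if __name__ == "__main__":
    # Toy labels only: 10 made-up cases, 8 of them predicted correctly.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    print(screening_metrics(y_true, y_pred))
```

Accuracy alone does not say how many children with autism were missed, which is why screening tools are usually also described in terms of sensitivity and specificity.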

“The results of the study are significant because they show that it is possible to identify individuals who are likely to have autism from relatively limited and readily available information,” said study first author Shyam Rajagopalan, an affiliated researcher at the Karolinska Institutet and currently an assistant professor at the Institute of Bioinformatics and Applied Technology in India.

Several specific factors emerged as strong predictors within the model, including the age of the child’s first smile, when they began using short sentences, and the presence of eating difficulties. This combination of predictors was both informative and practical, showing that common developmental milestones can be strong indicators of autism risk when analyzed together.

The researchers emphasized that AutMedAI is not meant to replace detailed clinical assessments but rather to serve as an initial screening tool. By flagging children who may need further evaluation, the model could help ease the strain on diagnostic services and provide families with earlier insights into their child’s development.

Early intervention is especially important for children with autism, as targeted therapies and support systems can significantly improve long-term outcomes, particularly in communication and social skills. The model’s accessibility also holds promise for rural or underserved areas where specialized autism diagnostic services may be less available, offering a valuable option for preliminary screening.

One of the most promising aspects of AutMedAI is its reliance on data that can be gathered without invasive testing or extensive clinical assessments, making it feasible to integrate into routine pediatric care. The researchers plan to conduct further testing and validation in clinical settings to confirm the model’s reliability outside of research environments. They are also exploring the potential to include genetic information in future iterations of the model, which could further improve accuracy and enable even more personalized screening.

“To ensure that the model is reliable enough to be implemented in clinical contexts, rigorous work and careful validation are required. I want to emphasize that our goal is for the model to become a valuable tool for health care, and it is not intended to replace a clinical assessment of autism,” said Kristiina Tammimies, an associate professor at KIND, the Department of Women’s and Children’s Health, Karolinska Institutet and senior author of the study.

The study, “Machine Learning Prediction of Autism Spectrum Disorder From a Minimal Set of Medical and Background Information,” was authored by Shyam Sundar Rajagopalan, Yali Zhang, Ashraf Yahia, and Kristiina Tammimies.