W&M Professor Denys Poshyvanyk Recognized for Essential Role in AI Development
In 2025, large language models from the likes of OpenAI and Google are more capable and accessible than ever before. Beyond using AI tools to write essays, summarize documents, and plan trips, companies and individuals alike have been flocking to them for software development assistance.
Many of the developments seen today in computer-generated code can be traced back to foundational research on machine-learning-driven code completion performed by Chancellor Professor of Computer Science Denys Poshyvanyk.
While machine learning and AI have existed in the academic realm for many years, progress was held back by a lack of hardware power and data availability. In 2015, when Poshyvanyk and his graduate students published their paper “Toward Deep Learning Software Repositories,” code completion was rudimentary and based upon syntactical “dictionary-like” methods. Ten years later, Poshyvanyk and his co-authors are set to receive a “most influential paper” award at the 2025 MSR conference (co-located with ICSE) in Ottawa for the impact their work has had over the last decade.
“Although language models were used before, this was the very first paper that applied neural large language models for the task of code completion,” Poshyvanyk said. “[We] showed that it requires only a few bits of information to encode what comes next, and this was a pretty significant improvement on the latest and greatest paper at the time, which was just a canonical language model.”
Poshyvanyk’s paper helped to kick off the AI and machine learning research boom leading to the development of consumer-focused tools that allow a wider range of individuals to craft their own case-specific scripts and software with little to no programming experience. “I'm personally very happy to see that programming has been democratized to the degree where I hear people who sort of never took a programming class, can program literally in English, which is great,” Poshyvanyk said.
Despite the success the paper has had over the last decade, its journey from idea to acceptance was not completely smooth, and it began when Poshyvanyk took on Martin White as a PhD student. White sought to complete a substantive dissertation, and after his former advisor moved to North Carolina, he approached Poshyvanyk and the two began to discuss research topics.
“We actually had maybe a couple hour brainstorming conversation, where, at the time, I noticed all these papers which were being published and language models started showing up,” Poshyvanyk said.
White and Poshyvanyk both had extensive prior experience with machine learning and realized the potential of applying neural networks to software development. With a tight timeline, however, they would need to bring others on board.
“We had three months before the deadline for the paper,” Poshyvanyk said. “We couldn't just do it quickly with him and me in three months, so I involved my other PhD students at the time, Christopher Vendome and Mario Linares-Vasquez. Christopher was an expert in mining data from software repositories, so we got him to help us quickly obtain vast amounts of software data that we could train these models on.”
With the team assembled, the group narrowed their focus to code completion and produced their paper in time for the ICSE conference, where it received some interest and support but also intense scrutiny, and was ultimately rejected.
“Some of these new papers are like field establishing papers. Usually, it's not easy to get them accepted, what reviewers like is incremental and polished research, they like things that they understand,” said Poshyvanyk of the initial rejection.
Undeterred, the group resubmitted their paper the following year, but it remained controversial among reviewers.
“Again, we had one camp of reviewers who loved it, who said, wow, this is the future, this is one of the most innovative papers we've seen in years. And the second camp, again, started arguing on applicability and some technical details. So literally, our scores for the paper were strong accept and strong reject, like love and hate, and it was really surprising,” Poshyvanyk said.
Upon acceptance, Poshyvanyk and his co-authors received attention from interested colleagues and field experts. The paper marked a successful start to the publishing career of White, who would go on to publish two more articles that, together with the initial paper, would become his PhD dissertation. For Poshyvanyk, who had performed research on machine learning on and off for years prior to 2015, the paper became a major highlight of his academic career.
“I was a weird kid. I wanted to do a PhD when I was a freshman,” Poshyvanyk joked. “I was always attracted to the idea of doing research and science, and somehow I was kind of laser focused.”
Poshyvanyk would put his research dreams on hold following college graduation in Ukraine, as PhD programs there were mostly part-time and unfunded. After a stint working in the software development industry, Poshyvanyk would make his way to graduate school in the U.S. following recommendations from his colleagues.
“I had no idea that you could do research full time as a student and be paid for this. To me, given my Ukrainian understanding of things, this was a dream thing to do,” Poshyvanyk said. “You can publish papers and go to conferences and research here is important and it's valid. And I couldn't believe it.”
Despite a career already full of accolades, Poshyvanyk maintained that “Toward Deep Learning Software Repositories” was some of the most important work of his time at the College.
“This [award] is very special, because [the paper] was done just with my PhD students, and down the road, this idea really was the first idea in this area,” Poshyvanyk said. “So probably in terms of actually what happened after this paper, it had the largest impact, in my opinion, of all the research that we’ve done, and I was very happy that we were among the very first pioneers in this field.”
After the paper was published in 2015, fresh attention came to Poshyvanyk and his research at William & Mary, and soon the National Science Foundation came knocking.
“This particular paper, even before being recognized as a most influential paper, made a huge difference in my career at the time, because I remember the National Science Foundation noticed this work that I was doing, and actually right after this paper, I got a grant funded to do more of this,” Poshyvanyk explained.
A few years later, in 2019, the NSF recognized Poshyvanyk as a leader in the field, and he was selected to head a workshop that brought together members of both the software engineering and machine learning communities with the goal of discussing next steps for research.
Much can change in ten years, and a community that was skeptical of a paper covering generative code completion is now flush with research and publications in the area.
“The paper was barely accepted, and then 10 years later, 400 citations later, we got this award,” Poshyvanyk said.
Although it was a long road for Poshyvanyk and his students, the influence of their once-controversial paper is now undeniable.