Majors
Computer Science and Data Science
Project title
“Function Call Graphs with Code Embeddings from Source Code”
Project description
Alfred developed a tool that enables future research into program classification, source code authorship, plagiarism detection, malware identification, and related tasks. The novel aspect of the tool is that it represents a program as a combination of code2vec embeddings and a call graph. Code2vec is a neural model that represents snippets of code as continuous distributed vectors. Representing a program as a graph preserves the relationships between its functions, which is expected to give a better representation of the overall program.
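To make the idea concrete, here is a minimal sketch of a call graph whose nodes carry one vector per function. It is not Alfred's tool: it assumes Python source parsed with the standard `ast` module, and the hypothetical `placeholder_embedding` helper only hashes the function body as a stand-in for a real code2vec vector.

```python
import ast
import hashlib

# Toy program to analyze (illustrative only).
SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''

def placeholder_embedding(code, dim=4):
    """Hypothetical stand-in for a code2vec vector.

    This is NOT a learned embedding: it just scales hash bytes to [0, 1]
    so that every function node has a fixed-length feature vector.
    """
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]

def build_call_graph(source):
    """Return (nodes, edges): node features per function and caller -> callee pairs."""
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    nodes = {}
    edges = set()
    for name, fn in functions.items():
        nodes[name] = placeholder_embedding(ast.dump(fn))
        for node in ast.walk(fn):
            # Only direct calls by plain name to functions defined in this file.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in functions:
                    edges.add((name, node.func.id))
    return nodes, sorted(edges)

nodes, edges = build_call_graph(SOURCE)
print(edges)
```

Here the edges record that `main` calls both `load` and `parse`, while each node keeps its own vector, so a downstream model can see both what each function looks like and how the functions relate.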
Faculty Mentor
Dr. Marco Alvarez, Department of Computer Science and Statistics
“I created intermediate representations (IR) of source code in the form of graphs to be fed into a graph neural network (GNN). We got preliminary results of 70% accuracy on a device mapping task (determining whether a code kernel will run faster on a GPU or a CPU) and are continuing to improve the model's performance. Working with Professor Alvarez was great, and I am continuing to work with him this semester on his research. I hope to continue doing research as an undergraduate and am considering pursuing research as a career.”