Complementing Machine Learning Classifiers Via Dynamic Symbolic Execution: Human vs. Bot Generated Tweets.

Published in RAISE, 2018

Recommended citation: Shrestha, Sohil L., Saroj Panda, and Christoph Csallner. "Complementing Machine Learning Classifiers via Dynamic Symbolic Execution: Human vs. Bot Generated Tweets." 2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE). IEEE, 2018.

Download paper here

Abstract

Recent machine learning approaches for classifying text as human-written or bot-generated rely on training sets that are large, labeled diligently, and representative of the underlying domain. While valuable, these machine learning approaches ignore programs as an additional source of such training sets. To address this problem of incomplete training sets, this paper proposes to systematically supplement existing training sets with samples inferred via program analysis. In our preliminary evaluation, training sets enriched with samples inferred via dynamic symbolic execution improved machine learning classifier accuracy for simple string-generating programs.
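
The idea of enriching a training set with program-derived samples can be illustrated with a minimal sketch. It is not the paper's implementation: the toy bot `bot_tweet`, its inputs, and the helpers `infer_bot_samples` and `enrich` are all hypothetical, and exhaustive enumeration of the branch inputs stands in for dynamic symbolic execution, which would derive one input per feasible path with a constraint solver rather than by brute force.

```python
from itertools import product


def bot_tweet(promo: bool, urgent: bool, tag: int) -> str:
    """Hypothetical toy bot: a simple string-generating program with three branches."""
    text = "Check out our deal" if promo else "New post is up"
    if urgent:
        text += " now!"
    text += " #" + ("sale" if tag == 0 else "news")
    return text


def infer_bot_samples() -> list[str]:
    """Collect one output per branch combination of the toy bot.

    Assumption: brute-force enumeration replaces dynamic symbolic execution here;
    it only works because the toy program has a tiny input space.
    """
    outputs = {
        bot_tweet(promo, urgent, tag)
        for promo, urgent, tag in product([False, True], [False, True], [0, 1])
    }
    return sorted(outputs)


def enrich(training_set: list[tuple[str, str]], inferred: list[str],
           label: str = "bot") -> list[tuple[str, str]]:
    """Append inferred samples under the given label, skipping texts already present."""
    seen = {text for text, _ in training_set}
    return training_set + [(s, label) for s in inferred if s not in seen]


# A small labeled seed set that covers only one of the bot's possible outputs.
seed = [
    ("Check out our deal now! #sale", "bot"),
    ("good morning everyone", "human"),
]
enriched = enrich(seed, infer_bot_samples())
```

The enriched list can then be passed to any text classifier in place of the seed set. Whether accuracy improves depends on the classifier and the domain, so this sketch makes no claim beyond showing the data flow.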