MATLAB/Simulink is a de facto standard tool in several safety-critical industries, such as automotive, aerospace, healthcare, and industrial automation, for system modeling and analysis, compiling models to code, and deploying code to embedded hardware. On one hand, testing cyber-physical system (CPS) development tools such as MathWorks’ Simulink is important because a bug in the toolchain may propagate to the artifacts it produces. On the other hand, it is equally important to understand modeling practices and model evolution in order to support the engineers and scientists who rely on these tools for the design, simulation, and verification of CPS models. Existing work in this area is limited by two main factors: (1) the inefficiency of state-of-the-art testing schemes in finding critical toolchain bugs and (2) the lack of a reusable corpus of public Simulink models. In my thesis, I propose to (1) curate a large reusable corpus of Simulink models to help understand modeling practices and model evolution and (2) leverage such a corpus with deep-learning-based language models to test the toolchain.