SA
Skip to main content

Learning Execution Semantics from Micro-Traces for Binary Similarity

  • Attended CS Colloquium on 2023-03-28.
  • [2012.08680] Trex: Learning Execution Semantics from Micro-Traces for Binary Similarity
  • In Natural Language, semantics usually release info about surroundings
  • In Binary, not so much.
  • We can't do Dynamic Analysis due to Scalability problems.
  • Teach AI approximate learning on binary code instructions and make mental executions.
    • Dynamic Execution has coverage problems.
    • So here, we set arbitrary registers to run small code. This is for pre-training purposes, not to make accurate extrapolations.
  • Masked Language Modeling. Mask certain portions of binaries and make the AI guess.
  • Questions may arise in Microexecutions and Logical reasoning. Employing Higher level programming language mapping is possible often, but not researched through.
  • Now, fine-tune this to use static analysis.
  • Can analyze Semantically Similar Binary Functions
  • Function Dependencies and Signatures are also predictable
  • Can find vulnerabilities
  • Limitations: in-code only. Cannot find multimodal programming vulnerabilities. i.e., involving many interacting components.