From ca839f1e922e9bb3d1cbde708bc02feddf6c9069 Mon Sep 17 00:00:00 2001
From: Audrey Dutcher
Date: Wed, 10 Sep 2025 07:50:44 -0700
Subject: [PATCH] stragglers

---
 install.sh |  0
 notes      | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)
 mode change 100755 => 100644 install.sh
 create mode 100644 notes

diff --git a/install.sh b/install.sh
old mode 100755
new mode 100644
diff --git a/notes b/notes
new file mode 100644
index 0000000..712e0cd
--- /dev/null
+++ b/notes
@@ -0,0 +1,66 @@
+DESIGN:
+- output of analysis is a full-program dataflow graph
+  - nodes are atoms annotated with properties (usage notes)
+  - edges are directed, annotated with
+    - op sequences indicating how the data is transformed or related at the two locations (can be concatenated, simplified, inverted)
+    - control flow indicating how to treat this edge with respect to context sensitivity
+- graph is traversed through a "relative atom graph"
+  - initialized with respect to a single atom
+  - each node of the RAG corresponds to a single node of the atom graph plus context sensitivity info
+  - nodes are annotated with the relationship between the seed atom and the target atom (the sum of the ops along the path, simplified)
+    - in case of multiple paths to a single node, store all the op-paths.
+    - these paths can traverse reverse edges, the ops are inverted in this case
+    - control flow info prevents meaningless paths from being analyzed
+  - op sequences can be used as functions to transform the node properties of the target atom to be "about" the seed atom
+  - this graph is technically infinite, so it starts off with a single node and a "frontier" of edges that have yet to be explored
+    - methods for expanding the graph from each edge on the frontier
+- the complexity of this graph is managed through a hierarchical graph, a view on the RAG
+  - arbitrary collapsing of groups of atoms into single "group" nodes, recursively
+  - efficient algorithms
+
+passes:
++ per-block live data flow and manipulations
++ inter- and intra-block passive register data flow, data flowing across callsites
+- passive memory flow - given a store, track where live references to it propagate and are loaded
+- constant propagation - need to re-evaluate ops which are marked as variable-dependent?
+
+ideas for reducing chaos:
+- only allow paths which follow only forward or only backward edges (add edges from referents to references)
+- discard-with-connection nodes which have no interesting properties
+- group atoms which refer to the same variable
+- use sfdp on large graphs
+- set initial positions based on codeloc position in disassembly graph or something
+
+next in the queue:
+- memcpy procedure
+- recursive mark layout algorithm
+- search for node text in current view
+- start from disassembly atoms
+- start from function argument atoms
+- navigate to decompilation
+
+TODO
+- Attempt to expand graph automatically by refusing to recurse
+- globals...
+
+Research directions
+- Type confusion as weird structures
+- Pick more applications
+
+ReMind
+Why Johnny Can't reverse malware https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7546501
+Dan Voytika
+Hackers vs testers
+something ghidra - Michell Mazerik x3
+
+https://dl.acm.org/doi/abs/10.1145/2896499
+
+applications:
+Type inference
+decompilation
+taint tracking
+
+targets:
+rust
+golang
+c++ virtual inheritance
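
The DESIGN notes in this patch describe op sequences that can be concatenated, simplified and inverted, and a relative atom graph (RAG) that grows lazily from a seed atom through a frontier. A minimal sketch of how those two pieces might fit together follows. Everything in it is an assumption made for illustration: the names `Edge`, `RelativeAtomGraph`, `invert`, `simplify` and `expand_one` are hypothetical, ops are modelled only as integer offsets, and the control-flow and context-sensitivity annotations are left out entirely.

```python
from collections import deque
from dataclasses import dataclass

# Assumption: an op is ("add", n), meaning "value at dst = value at src + n".
# Real op sequences would presumably cover far more than integer offsets.


def invert(ops):
    """Invert an op sequence: reverse its order and negate each offset."""
    return tuple(("add", -n) for _, n in reversed(ops))


def simplify(ops):
    """Fold consecutive additions into at most one op."""
    total = sum(n for _, n in ops)
    return (("add", total),) if total else ()


@dataclass(frozen=True)
class Edge:
    """A directed dataflow edge between two atoms (hypothetical shape)."""
    src: str
    dst: str
    ops: tuple


class RelativeAtomGraph:
    """Lazily explored view of the atom graph relative to one seed atom."""

    def __init__(self, edges, seed):
        self.edges = list(edges)
        # atom -> set of simplified op-paths relating the seed to that atom;
        # all distinct op-paths to the same atom are kept.
        self.paths = {seed: {()}}
        # Unexplored (atom, op-path) steps; the graph starts as one node.
        self.frontier = deque(self._steps_from(seed, ()))

    def _steps_from(self, atom, ops_so_far):
        """Yield neighbours of atom with the extended, simplified op-path."""
        for e in self.edges:
            if e.src == atom:  # forward edge: concatenate its ops
                yield e.dst, simplify(ops_so_far + e.ops)
            if e.dst == atom:  # reverse edge: concatenate the inverted ops
                yield e.src, simplify(ops_so_far + invert(e.ops))

    def expand_one(self):
        """Explore one frontier step; return False once the frontier is empty."""
        if not self.frontier:
            return False
        atom, ops = self.frontier.popleft()
        known = self.paths.setdefault(atom, set())
        if ops not in known:
            known.add(ops)
            self.frontier.extend(self._steps_from(atom, ops))
        return True
```

For example, with edges `Edge("a", "b", (("add", 8),))` and `Edge("b", "c", (("add", 4),))` and seed `"b"`, repeatedly calling `expand_one()` reaches `a` through a reverse edge with the inverted op and `c` through a forward edge. Expansion is done one step at a time because a cycle whose ops do not cancel would keep producing new op-paths, which is one reading of "technically infinite" in the notes.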