Skip to content
Snippets Groups Projects

Draft: Repairable

Closed Xavier Routh requested to merge repairable into main

Sample of debug_info with repairable. Can easily be implemented for partitions with slight changes to pass.rs and implmenting repairable trait for plans.

New implementation of gvn with FunctionEditor interface, automatically generates FunctionChanges. Still slightly bugged as changes sets are not always completely merged (nondeterministic). (xavier is working on it)

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
4
5 /*
6 * This file contains untilities to help debug the hercules compiler. It is not meant to support
7 * emitting debug symbols into compiled code.
8 */
9
10
11 #[derive(Debug, Clone)]
12 pub struct DebugInfo {
13 pub names: HashMap<NodeID, String>, // NodeInfo<String>, // TODO: Optimize this to a vec.
14 }
15
16 impl Repairable for DebugInfo {
17 fn repair(&mut self, function: &Function, function_changes: &FunctionChanges) -> () {
18 match function_changes {
19 FunctionChanges::GravestoneUpdates(vec) => {
  • I don't think we should have a separate path for gravestone updates, which I assume is what DCE generates by just deleting nodes? I think we can accomplish the "same thing" by using a disjoint set mapping each deleted node to the empty set.

  • Author Developer

    It can be represented by the disjoint set mapping, just gravestone updates are probably the case for half of the repairs, so it made sense to leave room to optimize the common case.

  • I think we should stick to implementing the general case at first - this will be a non-insignificant technical contribution in the paper, so having a single general mechanism will be valuable in that sense as well.

  • Please register or sign in to reply
  • rarbore2
  • 26 };
    27 new_names.insert(b, new_name.to_string());
    28 }
    29
    30 self.names = new_names;
    31 }
    32 FunctionChanges::DisjointSets(disjoint_set_bi_map) => {
    33 let mut new_names: HashMap<NodeID, String> = HashMap::new();
    34
    35 // Each right is a union of all left names, unless left got deleted (checked by checking
    36 // function.nodes[left.idx()].is_start())
    37 for right in (0..function.nodes.len()).map(NodeID::new) {
    38 let names = disjoint_set_bi_map.get_by_right(&right)
    39 .drain()
    40 .filter_map(|node|
    41 // TODO: (@xrouth) remove the && false condition
  • rarbore2
  • 23 /* TODO (@xrouth): custom implementation of this, if this proves to be too slow. */
    24
    25 #[derive(Debug, Clone)]
    26 pub struct DisjointSetBiMap<L: ElementType + std::hash::Hash, R: ElementType + std::hash::Hash> {
    27 map: bimap::BiMap<L,R>, // If identity mapping is default / required, we may not need this.
    28 left_sets: UnionFind<L>,
    29 right_sets: UnionFind<R>,
    30
    31 left_size: usize, // Elements in left are L::from_usize(0..left_size)
    32 right_size: usize, // Likewise for right_size
    33 }
    34
    35 impl <L: ElementType + std::hash::Hash, R: ElementType + std::hash::Hash> DisjointSetBiMap<L, R> {
    36 // O(n^2)
    37 pub fn print_map(&self) {
    38 // FIXME: UnionFind crate simply lies, these aren't the representative elements.
  • rarbore2
  • 52 let right_sets: UnionFind<R> = UnionFind::new(len);
    53
    54 let mut map = BiMap::new();
    55
    56 for (l, r) in (0..len).map(|i| (L::from_usize(i).unwrap(), R::from_usize(i).unwrap())) {
    57
    58 let left = left_sets.find(l);
    59 let right = right_sets.find(r);
    60
    61 map.insert(left, right);
    62 }
    63
    64 DisjointSetBiMap { map, left_sets, right_sets, right_size: len, left_size: len}
    65 }
    66
    67 /** Given an element in right, moves it from one set in right to another. */
    • Is the thinking here that an optimization starts with an identity map and calls remap_ functions as it modifies the IR?

    • Author Developer

      Kinda, each optimization starts with an identity map, and then merges sets by left or by right element. It feels awkward but if we don't care about deletions, we don't actually use any remap_ functions, and we might be able to get away with it.

    • So the way we'd represent deletions rn is by right-unioning the deleted node ID with the new node IDs (i.e. if one node became like four after a rewrite)?

    • Author Developer

      Yeah.

    • I think we can make it so the bimap entry would be {old} -> {new1, new2, new3} without needing to delete entries from disjoint sets. I think the current design tries to automatically infer the changes from just adding/removing nodes and replacing uses in the function. However, I don't think this is enough info on its own (especially if we don't support deletion in the disjoint sets). Although I'm now seeing why replace_all_uses_with currently does affect the bimap while being an "edge only" edit. I'll think about this more.

    • Please register or sign in to reply
  • rarbore2
  • 54 let mut map = BiMap::new();
    55
    56 for (l, r) in (0..len).map(|i| (L::from_usize(i).unwrap(), R::from_usize(i).unwrap())) {
    57
    58 let left = left_sets.find(l);
    59 let right = right_sets.find(r);
    60
    61 map.insert(left, right);
    62 }
    63
    64 DisjointSetBiMap { map, left_sets, right_sets, right_size: len, left_size: len}
    65 }
    66
    67 /** Given an element in right, moves it from one set in right to another. */
    68 pub fn remap_right(&mut self, ele: &R, to: &R) -> () {
    69 // TODO (@xrouth): Unimplementable with current UnionFind library.
    • This would involve splitting an individual element out of a disjoint set right? I think this might be tricky because each such operation would need to iterate the entire UnionFind.

    • Author Developer

      Yeah, wanted to avoid that so didn't implement. Not sure if this is needed.

    • I think this boils down to if this is needed in any of the APIs in the function editor. We should list all of these out somewhere so we can determine which bimap operations are needed to support them.

      Edited by rarbore2
    • Please register or sign in to reply
  • rarbore2
  • 97 97 .map(NodeID::new)
    98 98 .collect::<Vec<_>>()
    99 99 };
    100
    101 // Temo:
  • rarbore2
  • 27 // node_numbers: HashMap<NodeID, u32>,
    28
    29 edits: DisjointSetBiMap<NodeID, NodeID>,
    30 // deleted_nodes: HashSet<NodeID>,
    31 // edits: FunctionChanges, // Keep track of changes
    32
    33 // Mutable DefUse Information
    34 users: NodeInfo<HashSet<NodeID>> // Map NodeID -> Vec<NodeID>
    35 }
    36
    37
    38
    39 /* Some design decisions to make this feel safer, these may be modified if a use case is found.
    40 * 1) We don't mutate / reuse nodeIDs, any changes to a node result in a new NodeID.
    41 * the appropriate pattern is to grab the node data, copy over fields you want to keep the same, and set
    42 * fields you want to chagne to new values, and then re-insert.
  • rarbore2
  • 39 /* Some design decisions to make this feel safer, these may be modified if a use case is found.
    40 * 1) We don't mutate / reuse nodeIDs, any changes to a node result in a new NodeID.
    41 * the appropriate pattern is to grab the node data, copy over fields you want to keep the same, and set
    42 * fields you want to chagne to new values, and then re-insert.
    43 *
    44 * We have two replace functions: TODO: (these need to be renamed / designed better).
    45 * replace_all_uses_with
    46 * - Used for when we are doing the mutation pattern that results in a new NodeID, and automatically
    47 * updates use-def information.
    48 *
    49 * replace_with (DANGEROUS)
    50 * - New node with/ same ID, so external datastructures tracking the ID don't need to be repaired.
    51 *
    52 * Errors:
    53 * - right now these funcitons juist panic on different errors, such as trying to use a node that
    54 * isn't in this editors mutable nodes list
  • rarbore2
  • 35 }
    36
    37
    38
    39 /* Some design decisions to make this feel safer, these may be modified if a use case is found.
    40 * 1) We don't mutate / reuse nodeIDs, any changes to a node result in a new NodeID.
    41 * the appropriate pattern is to grab the node data, copy over fields you want to keep the same, and set
    42 * fields you want to chagne to new values, and then re-insert.
    43 *
    44 * We have two replace functions: TODO: (these need to be renamed / designed better).
    45 * replace_all_uses_with
    46 * - Used for when we are doing the mutation pattern that results in a new NodeID, and automatically
    47 * updates use-def information.
    48 *
    49 * replace_with (DANGEROUS)
    50 * - New node with/ same ID, so external datastructures tracking the ID don't need to be repaired.
    • I think even just for the sake of potentially catching bugs earlier on in the compiler, we should always use replace_all_uses_with (i.e. always allocate new node ID in passes, and separately do ID compaction / gravestone deletion)

    • Author Developer

      Yes, agree. There was something about GVN that made just using a normal replace_with make sense, will have to look at that again to remember what it was. I don't think it will be very common to use replace_with over replace_all_uses_with. Also, these should both be named better.

      Edited by Xavier Routh
    • Please register or sign in to reply
  • rarbore2
  • 50 * - New node with/ same ID, so external datastructures tracking the ID don't need to be repaired.
    51 *
    52 * Errors:
    53 * - right now these funcitons juist panic on different errors, such as trying to use a node that
    54 * isn't in this editors mutable nodes list
    55 *
    56 */
    57 impl <'f> FunctionEditor<'f> {
    58
    59
    60 // TODO (@xrouth): Think about what we want the type of predicate to be, if it should be a
    61 // closure that can captrue any environment, if it should be a pointer that just takes a
    62 // Function and NodeID, or if it should take the same as PassManagers predicates.
    63 // Also is this should be generic or a Boxed<> Closure.
    64 pub fn function_editor(function: &'f mut Function, predicate: impl Fn(&Function, NodeID) -> bool,
    65 map: &ImmutableDefUseMap) -> FunctionEditor<'f> {
  • rarbore2
  • 49 * replace_with (DANGEROUS)
    50 * - New node with/ same ID, so external datastructures tracking the ID don't need to be repaired.
    51 *
    52 * Errors:
    53 * - right now these funcitons juist panic on different errors, such as trying to use a node that
    54 * isn't in this editors mutable nodes list
    55 *
    56 */
    57 impl <'f> FunctionEditor<'f> {
    58
    59
    60 // TODO (@xrouth): Think about what we want the type of predicate to be, if it should be a
    61 // closure that can captrue any environment, if it should be a pointer that just takes a
    62 // Function and NodeID, or if it should take the same as PassManagers predicates.
    63 // Also is this should be generic or a Boxed<> Closure.
    64 pub fn function_editor(function: &'f mut Function, predicate: impl Fn(&Function, NodeID) -> bool,
    • My guess is that predicate will need to be a closure, since it'll probably need to capture partitions & partition ID at some point. I didn't know Rust had this "inline" impl syntax, the usual way I'd do this is having fn function_editor(..., predicate: F, ...) -> ... where F: Fn(NodeID) -> bool (then it can also capture &function if it needs to)

    • Please register or sign in to reply
  • rarbore2
  • 65 map: &ImmutableDefUseMap) -> FunctionEditor<'f> {
    66
    67 let users_vec = vec![HashSet::new(); function.nodes.len()];
    68
    69 let mut mutable_nodes = HashSet::new();
    70
    71 for node in (0..function.nodes.len()).map(NodeID::new).filter(|node| predicate(&function, *node)) {
    72 mutable_nodes.insert(node);
    73 }
    74
    75 println!("mutable nodes: {:?}", mutable_nodes);
    76
    77 // let changes = FunctionChanges::disjoint_set(function);
    78
    79 // Map every node to the same ID in output.
    80 let changes = DisjointSetBiMap::new_identity(function.nodes.len());
  • rarbore2
  • 109 mutable_nodes,
    110 users: users_vec,
    111 };
    112
    113 s.init_users_from_map(map);
    114 s
    115 }
    116
    117 pub fn init_users_from_map(&mut self, map: &ImmutableDefUseMap) {
    118 // TODO (@xrouth): Optimize!
    119 self.users.clear();
    120 self.users = vec![HashSet::new(); self.function.nodes.len()];
    121
    122 for node in (0..self.function.nodes.len()).map(NodeID::new) {
    123 // TODO (@xrouth): Unecessary allocation here in .to_owned() here?
    124 let this_node_users: HashSet<NodeID> = HashSet::from_iter(map.get_users(node).to_owned());
    • since you're using from_iter, I think you can use .as_ref instead of .to_owned? maybe .as_ref().into_iter()/iter(), I forget (maybe you also need to .map(|x|*x))

    • Please register or sign in to reply
  • rarbore2
  • 132 }
    133
    134 pub fn node_is_mutable(&self, id: &NodeID) -> bool {
    135 self.mutable_nodes.contains(id)
    136 }
    137
    138 pub fn node_data(&self, id: NodeID) -> &Node {
    139 &self.function.nodes[id.idx()]
    140 }
    141
    142 pub fn function(&self) -> &Function {
    143 &self.function
    144 }
    145
    146 // TODO (@xrouth): Optimize to *not* go through immutable def use map, OR refactor and change memory layout
    147 // ahgkjawehgjkwaehlfkjawhefljkwahelfkjh of def use map.
    • I'm not sure I understand why this is needed? If we're tracking individual updates to the function, we should be able to repair the mutable def-use iteratively

    • Author Developer

      This is useful for the current replace_node implementation. Instead of figuring out how to edit the users, just recalculate def use from scratch after the node has been modified. replace_node definitely should be optimized to something reasonable, then we can get rid of this.

    • Please register or sign in to reply
  • rarbore2
  • 139 &self.function.nodes[id.idx()]
    140 }
    141
    142 pub fn function(&self) -> &Function {
    143 &self.function
    144 }
    145
    146 // TODO (@xrouth): Optimize to *not* go through immutable def use map, OR refactor and change memory layout
    147 // ahgkjawehgjkwaehlfkjawhefljkwahelfkjh of def use map.
    148 pub fn repair_users(&mut self) {
    149 let map = def_use(&self.function);
    150 self.init_users_from_map(&map);
    151 }
    152
    153 pub fn get_nodes_using(&self, node: NodeID) -> impl Iterator<Item=NodeID> {
    154 self.users[node.idx()].clone().into_iter()
  • rarbore2
  • 151 }
    152
    153 pub fn get_nodes_using(&self, node: NodeID) -> impl Iterator<Item=NodeID> {
    154 self.users[node.idx()].clone().into_iter()
    155 }
    156
    157 /* */
    158 /* Self standing iterator w/ no long living reference to the Function */
    159 pub fn get_uses_cloned(&self, node: NodeID ) -> impl IntoIterator<Item=NodeID> {
    160 let binding = self.get_uses(node);
    161 let temp: Vec<_> = binding.as_ref().iter().cloned().collect();
    162 temp
    163 }
    164
    165 pub fn get_uses(&self, node: NodeID) -> NodeUses {
    166 // TODO (@xrouth): Possibly make iteartor for NodeUses and return impl Iterator?
  • rarbore2
  • 157 /* */
    158 /* Self standing iterator w/ no long living reference to the Function */
    159 pub fn get_uses_cloned(&self, node: NodeID ) -> impl IntoIterator<Item=NodeID> {
    160 let binding = self.get_uses(node);
    161 let temp: Vec<_> = binding.as_ref().iter().cloned().collect();
    162 temp
    163 }
    164
    165 pub fn get_uses(&self, node: NodeID) -> NodeUses {
    166 // TODO (@xrouth): Possibly make iteartor for NodeUses and return impl Iterator?
    167 get_uses(&self.function.nodes[node.idx()])
    168 }
    169
    170 // Mutating api functions:
    171 /** Replace all uses of replaced in function with replaced_by */
    172 pub fn replace_all_uses_with(&mut self, replaced: NodeID, replaced_by: NodeID) -> () {
  • rarbore2
  • 182 // TODO (@xrouth): How to set only one use? How to specify that use?
    183 // Is there a use case for setting only one use out of multiple?
    184 pub fn set_uses(&mut self, user: NodeID, replaced: NodeID, replaced_by: NodeID) -> () {
    185
    186 // These node IDs aren't guaranteed to exist in the left set of edits (they may be created by some transformation, so )
    187 self.edits.union_by_right(&replaced, &replaced_by);
    188
    189 assert!(self.mutable_nodes.contains(&user));
    190 let user_node = &mut self.function.nodes[user.idx()];
    191 for u in get_uses_mut(user_node).as_mut() {
    192 if **u == replaced {
    193 **u = replaced_by
    194 }
    195 }
    196
    197 let user_set = &mut self.users[user.idx()];
    • Is this correct? The user's use of replaced get's switched to replaced_by, but the users of that user are unaffected. I think the user set that needs to be updated (or rather created de novo) is the user set of replaced_by (and the user set of replaced should be emptied)

    • Author Developer

      Yeah seems wrong.

    • Please register or sign in to reply
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading