Graph replacement mechanisms

These notes were inspired by the Graal Technical Documentation

Once GraphBuilderPhase is done, the graph undergoes "reductions", in the form of canonicalization, lowering, simplification, and replacements. A replacement, the focus of this section, involves replacing a "from" node with a "to" subgraph. That can be done programmatically (as with canonicalization, lowering, and simplification) but there's a more maintainable way: expressing the "to" subgraph as a Java method body. We'll use the term "splicee" instead of "to subgraph".

A method defining a replacement can be recognized by the annotations described below, which mirror the shape of the splicee of interest:

Annotation Splicee
@Fold constant (primitive, string, or Class instance value)
@NodeIntrinsic a Node instance
@Snippet a method-graph, obtained from compiling a Graal-defined method including all of its (direct or indirect) callees.
@MethodSubstitution a method-graph (in this aspect similar to @Snippet but otherwise quite different, as discussed below)
@MacroNode an instance of a MacroNode subclass

Common to all the annotations above is the lexical scope where callsites (targeting methods annotated by them) may occur. Given those callsites denote a graph replacement, the callsite must occur in a @Snippet-annotated method, or in a direct or indirect callee of a @Snippet-annotated method; similarly for @MethodSubstitution and for @MacroNode (ie such callsites may also occur in methods annotated with @MethodSubstitution or @MacroNode, or any of their callees, at any depth).

At the risk of some imprecision, the replacement workflow can be summarized as: replacements of the @Snippet and @MethodSubstitution varieties are registered automatically by Graal, before any replacements are made. Afterwards, callsites targeting replacement-defining-methods are expanded as they are found. The remaining replacements (@Fold, @NodeIntrinsic, and @MacroNode) must be contained in the body of a previous replacement in order to be expanded, their expansion performed upon traversing the result of a previous replacement.

Each section below covers a replacement variety. Although the presentation tries to keep cross-references to a minimum, some are unavoidable, due to the interplay that's available by design.

@Fold

A @Fold-annotated method is a side-effect-free method invoked via reflection during compilation, returning a constant value for its arguments (which must be compile-time constants). Such value is spliced-in as a ConstantNode (wrapping a Constant of the appropriate kind) into the graph under compilation. Typically a folded method encapsulates:

There are no examples of @Fold methods taking params, which would be rarely needed anyway because node canonicalization does the job already.

@NodeIntrinsic

A callsite targeting a @NodeIntrinsic-annotated method denotes a Node instance, with inputs given by the actual arguments at the callsite. Accordingly, the callsite's argument must allow selecting a single constructor (of the Node subclass indicated by the @NodeIntrinsic annotation).

An example of a shorthand for DeoptimizeNode is a callsite targeting:


    @NodeIntrinsic
    public static native void deopt(@ConstantNodeParameter DeoptimizationAction action, @ConstantNodeParameter DeoptimizationReason reason);
  

Usually a method annotated @NodeIntrinsic lacks a body, but here's an exception: CurrentJavaThreadNode (a floating-node class) defines an intrinsifying method:


    @NodeIntrinsic(setStampFromReturnType = true)
    public static Word get() {
        return Word.unsigned(unsafeReadWord(Thread.currentThread(), eetopOffset()));
    }
  

In short, the body of a @NodeIntrinsic annotated method exists purely so that the (@MethodSubstitution annotated) calling method can be interpreted should it be deoptimized.

But, we aren't done with the example. The eetopOffset() callsite above (more on this in a moment) refers to helper method also in CurrentJavaThreadNode:


    private static int eetopOffset() {
        try {
            return (int) UnsafeAccess.unsafe.objectFieldOffset(Thread.class.getDeclaredField("eetop"));
        } catch (Exception e) {
            throw new GraalInternalError(e);
        }
    }
  

One of the callsites to the intrinsifying CurrentJavaThreadNode.get() can be found in ThreadSubstitutions:


/**
 * Substitutions for {@link java.lang.Thread} methods.
 */
@ClassSubstitution(java.lang.Thread.class)
public class ThreadSubstitutions {

    @MethodSubstitution
    public static Thread currentThread() {
        return (Thread) CurrentJavaThreadNode.get().readObject(threadObjectOffset(), LocationIdentity.FINAL_LOCATION);
    }
  

Finally, two more questions can be answered:

  1. the body of CurrentJavaThreadNode.get() contains an invocation of the method it intends to substitute (java.lang.Thread.currentThread()). Doesn't this lead to infinite rewriting? No way, because the @MethodSubstituion where that callsite occurs won't be subject itself (rather, its graph) to replacement.
  2. why isn't eetopOffset() marked @Fold? As before, its graph will be spliced-in anyway, because the default "inlining policy" (think of it as "splicing-in policy") is "everything" for @MethodSubstitution.

@Snippet

Let's jump into an example:


/**
 * Snippet used for lowering {@link CheckCastDynamicNode}.
 */
public class CheckCastDynamicSnippets implements Snippets {

    @Snippet
    public static Object checkcastDynamic(Word hub, Object object) {
        if (probability(NOT_FREQUENT_PROBABILITY, object == null)) {
            isNull.inc();
        } else {
            BeginNode anchorNode = BeginNode.anchor();
            Word objectHub = loadHubIntrinsic(object, getWordKind(), anchorNode);
            if (!checkUnknownSubType(hub, objectHub)) {
                DeoptimizeNode.deopt(InvalidateReprofile, ClassCastException);
            }
        }
        BeginNode anchorNode = BeginNode.anchor();
        return piCast(verifyOop(object), StampFactory.forNodeIntrinsic(), anchorNode);
    }
  

The code shown thus far provides the basis for specializing a snippet template (specialization involves arguments-based lookup followed by instantiation if needed). The specialization aspect for the example is discussed briefly in section "Under the hood".

@MethodSubstitution

The motivation for @MethodSubstitution becomes clear from its main use case: replacing an InvokeNode (to a resolved method, for example a native method) with the snippet of a Graal-defined method. As a benefit, the callee becomes amenable to optimization (via inlining, escape analysis) as opposed to a native method which must be regarded as a black box with heap-effect "may-access-everywhere".

A (mostly self-contained) example of the above is:


/**
 * Substitutions for {@link CRC32}.
 */
@ClassSubstitution(value = CRC32.class)
public class CRC32Substitutions {

    /**
     * Gets the address of {@code StubRoutines::x86::_crc_table} in {@code stubRoutines_x86.hpp}.
     */
    @Fold
    private static long crcTableAddress() {
        return runtime().getConfig().crcTableAddress;
    }

    @MethodSubstitution(isStatic = true)
    static int update(int crc, int b) {
        int c = ~crc;
        int index = (b ^ c) & 0xFF;
        int offset = index << 2;
        int result = Word.unsigned(crcTableAddress()).readInt(offset);
        result = result ^ (c >>> 8);
        return ~result;
    }
  

@MacroNode

A method annotated with @MacroNode indicates an instance of a MacroNode subclass should be instantiated upon replacement. Similarities and differences with @NodeIntrinsic and @MethodSubstitution are discussed next.

As with @NodeIntrinsic, the callsite's arguments (one in this case) allows picking a constructor, ie the single constructor of a MacroNode subclass must take an InvokeNode as single argument. Unlike @NodeIntrinsic, the resulting node can't be floating but must subclass FixedWithNextNode (this is taken care of by MacroNode which extends AbstractMemoryCheckpoint).

The main differences as compared to @MethodSubstitution are:

A concise example of MacroNode is ClassGetComponentTypeNode, reproduced next:


/**
 * {@link MacroNode Macro node} for {@link Class#getComponentType()}.
 * 
 * @see ClassSubstitutions#getComponentType(Class)
 */
public class ClassGetComponentTypeNode extends MacroNode implements Canonicalizable {

    public ClassGetComponentTypeNode(Invoke invoke) {
        super(invoke);
    }

    private ValueNode getJavaClass() {
        return arguments.get(0);
    }

    @Override
    public Node canonical(CanonicalizerTool tool) {
        ValueNode javaClass = getJavaClass();
        if (javaClass.isConstant()) {
            Class c = (Class) javaClass.asConstant().asObject();
            if (c != null) {
                Class componentType = c.getComponentType();
                return ConstantNode.forObject(componentType, tool.getMetaAccess(), graph());
            }
        }
        return this;
    }
}
  

In case canonical() doesn't manage to remove the ClassGetComponentTypeNode, the fallback substitution is performed:


    @MacroSubstitution(macro = ClassGetComponentTypeNode.class, isStatic = false)
    @MethodSubstitution(isStatic = false)
    public static Class getComponentType(final Class thisObj) {
        Word klass = loadWordFromObject(thisObj, klassOffset());
        if (klass.notEqual(0)) {
            if ((readLayoutHelper(klass) & arrayKlassLayoutHelperIdentifier()) != 0) {
                return piCast(klass.readObject(arrayKlassComponentMirrorOffset(), LocationIdentity.FINAL_LOCATION), Class.class, true, true);
            }
        }
        return null;
    }
  

Under the hood

Registration

In order for a method or macro substitution to be available, its defining method must be declared in a class annotated with @ClassSubstitution, and that class must be fed to:


  com.oracle.graal.nodes.spi.Replacements::registerSubstitutions(Class)
  

Registering the method and macro substitutions of a class doesn't compute right away any splicee (ie no graph is prepared just yet). That's deferred till invoking


  Replacements::getMethodSubstitution(ResolvedJavaMethod)
  

The lookup of a macro-substitution-method doesn't give back a StructuredGraph but a Class (the MacroNode subclass).

Registering a snippet comprises a few steps:

  1. a class defining one or more snippets needs to:
  2. the constructor of a template subclass registers the snippets it knows about, caching a SnippetInfo for each.

Specialization of snippet templates

Once a snippet has been registered, obtaining a graph for it involves again a few steps. In a nutshell, "non-specialized snippet graphs" are cached. When about to perform a particular replacement, a specialized copy is obtained, ie specialized to the (non-constant) input nodes for the replacement in question. Caching is provided by AbstractTemplates, a subclass exists for each snippet. Continuing with the CheckCastDynamicSnippets example, its AbstractTemplates subclass appears below, with method lower() encapsulating the code for arguments-based lookup followed by template instantiation if needed:


    public static class Templates extends AbstractTemplates {

        private final SnippetInfo dynamic = snippet(CheckCastDynamicSnippets.class, "checkcastDynamic");

        public Templates(Providers providers, TargetDescription target) {
            super(providers, target);
        }

        public void lower(CheckCastDynamicNode checkcast, LoweringTool tool) {
            StructuredGraph graph = checkcast.graph();
            ValueNode object = checkcast.object();

            Arguments args = new Arguments(dynamic, graph.getGuardsStage(), tool.getLoweringStage());
            args.add("hub", checkcast.hub());
            args.add("object", object);

            SnippetTemplate template = template(args);
            Debug.log("Lowering dynamic checkcast in %s: node=%s, template=%s, arguments=%s", graph, checkcast, template, args);
            template.instantiate(providers.getMetaAccess(), checkcast, DEFAULT_REPLACER, args);
        }
    }
  

Detecting replacement opportunities

With registration out of the way, it remains to explain where in the pipeline are "candidate callsites" checked for substitutions. It's all part of InliningPhase:


    private void tryToInline(GraphInfo callerGraphInfo, MethodInvocation calleeInfo, MethodInvocation parentInvocation, int inliningDepth, HighTierContext context) {
        InlineInfo callee = calleeInfo.callee();
        Assumptions callerAssumptions = parentInvocation.assumptions();

        if (inliningPolicy.isWorthInlining(context.getReplacements(), callee, inliningDepth, calleeInfo.probability(), calleeInfo.relevance(), true)) {
            doInline(callerGraphInfo, calleeInfo, callerAssumptions, context);
        } else if (context.getOptimisticOptimizations().devirtualizeInvokes()) {
            callee.tryToDevirtualizeInvoke(context.getMetaAccess(), callerAssumptions);
        }
        metricInliningConsidered.increment();
    }
  

Briefly, isWorthInlining() caters for the substitutions case (that's why it receives a Replacements as actual argument, ie the registry of substitutions). Afterwards, the actual substitution is performed inside doInline(). Details can be gathered from the following methods in InliningUtil:

Both @Fold and @NodeIntrinsic are expanded by com.oracle.graal.replacements.NodeIntrinsificationPhase. After their expansion, NodeIntrinsificationVerificationPhase checks whether any such replacement has been omitted.

A few pointers to additional details

  1. ReplacementsProvider and its subclasses:
    
    /**
     * Interface for service providers that register replacements with the compiler.
     */
    public interface ReplacementsProvider {
    
        void registerReplacements(MetaAccessProvider metaAccess, LoweringProvider lowerer, Replacements replacements, TargetDescription target);
    }
      
  2. InjectedNodeParameter
    
        /**
         * Denotes an injected parameter in a {@linkplain NodeIntrinsic node intrinsic} constructor. If
         * the constructor is called as part of node intrinsification, the node intrinsifier will inject
         * an argument for the annotated parameter. Injected parameters must precede all non-injected
         * parameters in a constructor.
         */
        @Retention(RetentionPolicy.RUNTIME)
        @Target(ElementType.PARAMETER)
        public static @interface InjectedNodeParameter {
        }
      
  3. BranchProbabilityNode and its intrinsic:
    
        /**
         * This intrinsic should only be used for the condition of an if statement. The parameter
         * condition should also only denote a simple condition and not a combined condition involving
         * && or || operators. It injects the probability of the condition into the if statement.
         * 
         * @param probability the probability that the given condition is true as a double value between
         *            0.0 and 1.0.
         * @param condition the simple condition without any && or || operators
         * @return the condition
         */
        @NodeIntrinsic
        public static boolean probability(double probability, boolean condition) {
            assert probability >= 0.0 && probability <= 1.0;
            return condition;
        }