These notes were inspired by the Graal Technical Documentation
Once GraphBuilderPhase
is done, the graph undergoes "reductions", in the form of canonicalization, lowering, simplification, and replacements.
A replacement, the focus of this section, involves replacing a "from" node with a "to" subgraph.
That can be done programmatically (as with canonicalization, lowering, and simplification) but there's a more maintainable way: expressing the "to" subgraph as a Java method body.
We'll use the term "splicee" instead of "to subgraph".
A method defining a replacement can be recognized by the annotations described below, which mirror the shape of the splicee of interest:
Annotation | Splicee |
---|---|
@Fold | constant (primitive, string, or Class instance value) |
@NodeIntrinsic | a Node instance |
@Snippet | a method-graph, obtained from compiling a Graal-defined method including all of its (direct or indirect) callees. |
@MethodSubstitution | a method-graph (in this aspect similar to @Snippet but otherwise quite different, as discussed below) |
@MacroNode | an instance of a MacroNode subclass |
Common to all the annotations above is the lexical scope where callsites (targeting methods annotated by them) may occur.
Given those callsites denote a graph replacement, the callsite must occur in a @Snippet
-annotated method, or in a direct or indirect callee of a @Snippet
-annotated method;
similarly for @MethodSubstitution
and for @MacroNode
(ie such callsites may also occur in methods annotated with @MethodSubstitution
or @MacroNode
, or any of their callees, at any depth).
At the risk of some imprecision, the replacement workflow can be summarized as: replacements of the @Snippet
and @MethodSubstitution
varieties are registered automatically by Graal,
before any replacements are made. Afterwards, callsites targeting replacement-defining-methods are expanded as they are found.
The remaining replacements (@Fold
, @NodeIntrinsic
, and @MacroNode
) must be contained in the body of a previous replacement in order to be expanded,
their expansion performed upon traversing the result of a previous replacement.
Each section below covers a replacement variety. Although the presentation tries to keep cross-references to a minimum, some are unavoidable, due to the interplay that's available by design.
@Fold
A @Fold
-annotated method is a side-effect-free method invoked via reflection during compilation,
returning a constant value for its arguments (which must be compile-time constants).
Such value is spliced-in as a ConstantNode
(wrapping a Constant
of the appropriate kind) into the graph under compilation.
Typically a folded method encapsulates:
@Fold
public static int pageSize() {
return Unsafe.getUnsafe().pageSize();
}
@Fold
public static Kind getWordKind() {
return runtime().getTarget().wordKind;
}
@Fold
private static boolean logging() {
return Boolean.getBoolean("graal.logExceptionHandlerStub");
}
There are no examples of @Fold
methods taking params, which would be rarely needed anyway because node canonicalization does the job already.
@NodeIntrinsic
A callsite targeting a @NodeIntrinsic
-annotated method denotes a Node
instance, with inputs given by the actual arguments at the callsite. Accordingly, the callsite's argument must allow selecting a single constructor (of the Node
subclass indicated by the @NodeIntrinsic
annotation).
An example of a shorthand for DeoptimizeNode
is a callsite targeting:
@NodeIntrinsic
public static native void deopt(@ConstantNodeParameter DeoptimizationAction action, @ConstantNodeParameter DeoptimizationReason reason);
Usually a method annotated @NodeIntrinsic
lacks a body, but here's an exception: CurrentJavaThreadNode
(a floating-node class) defines an intrinsifying method:
@NodeIntrinsic(setStampFromReturnType = true)
public static Word get() {
return Word.unsigned(unsafeReadWord(Thread.currentThread(), eetopOffset()));
}
In short, the body of a @NodeIntrinsic
annotated method exists purely so that the (@MethodSubstitution
annotated) calling method
can be interpreted should it be deoptimized.
But, we aren't done with the example. The eetopOffset()
callsite above (more on this in a moment) refers to helper method also in CurrentJavaThreadNode
:
private static int eetopOffset() {
try {
return (int) UnsafeAccess.unsafe.objectFieldOffset(Thread.class.getDeclaredField("eetop"));
} catch (Exception e) {
throw new GraalInternalError(e);
}
}
One of the callsites to the intrinsifying CurrentJavaThreadNode.get()
can be found in ThreadSubstitutions
:
/**
* Substitutions for {@link java.lang.Thread} methods.
*/
@ClassSubstitution(java.lang.Thread.class)
public class ThreadSubstitutions {
@MethodSubstitution
public static Thread currentThread() {
return (Thread) CurrentJavaThreadNode.get().readObject(threadObjectOffset(), LocationIdentity.FINAL_LOCATION);
}
Finally, two more questions can be answered:
CurrentJavaThreadNode.get()
contains an invocation of the method it intends to substitute (java.lang.Thread.currentThread()
).
Doesn't this lead to infinite rewriting? No way, because the @MethodSubstituion
where that callsite occurs won't be subject itself (rather, its graph) to replacement.eetopOffset()
marked @Fold
? As before, its graph will be spliced-in anyway, because the default "inlining policy"
(think of it as "splicing-in policy") is "everything" for @MethodSubstitution
.@Snippet
Let's jump into an example:
/**
* Snippet used for lowering {@link CheckCastDynamicNode}.
*/
public class CheckCastDynamicSnippets implements Snippets {
@Snippet
public static Object checkcastDynamic(Word hub, Object object) {
if (probability(NOT_FREQUENT_PROBABILITY, object == null)) {
isNull.inc();
} else {
BeginNode anchorNode = BeginNode.anchor();
Word objectHub = loadHubIntrinsic(object, getWordKind(), anchorNode);
if (!checkUnknownSubType(hub, objectHub)) {
DeoptimizeNode.deopt(InvalidateReprofile, ClassCastException);
}
}
BeginNode anchorNode = BeginNode.anchor();
return piCast(verifyOop(object), StampFactory.forNodeIntrinsic(), anchorNode);
}
The code shown thus far provides the basis for specializing a snippet template (specialization involves arguments-based lookup followed by instantiation if needed). The specialization aspect for the example is discussed briefly in section "Under the hood".
@MethodSubstitution
The motivation for @MethodSubstitution
becomes clear from its main use case: replacing an InvokeNode
(to a resolved method, for example a native method)
with the snippet of a Graal-defined method. As a benefit, the callee becomes amenable to optimization (via inlining, escape analysis) as opposed to a native method
which must be regarded as a black box with heap-effect "may-access-everywhere".
A (mostly self-contained) example of the above is:
/**
* Substitutions for {@link CRC32}.
*/
@ClassSubstitution(value = CRC32.class)
public class CRC32Substitutions {
/**
* Gets the address of {@code StubRoutines::x86::_crc_table} in {@code stubRoutines_x86.hpp}.
*/
@Fold
private static long crcTableAddress() {
return runtime().getConfig().crcTableAddress;
}
@MethodSubstitution(isStatic = true)
static int update(int crc, int b) {
int c = ~crc;
int index = (b ^ c) & 0xFF;
int offset = index << 2;
int result = Word.unsigned(crcTableAddress()).readInt(offset);
result = result ^ (c >>> 8);
return ~result;
}
@MacroNode
A method annotated with @MacroNode
indicates an instance of a MacroNode
subclass should be instantiated upon replacement.
Similarities and differences with @NodeIntrinsic
and @MethodSubstitution
are discussed next.
As with @NodeIntrinsic
, the callsite's arguments (one in this case) allows picking a constructor,
ie the single constructor of a MacroNode
subclass must take an InvokeNode
as single argument.
Unlike @NodeIntrinsic
, the resulting node can't be floating but must subclass FixedWithNextNode
(this is taken care of by MacroNode
which extends AbstractMemoryCheckpoint
).
The main differences as compared to @MethodSubstitution
are:
@MacroNode
-annotated method that's not annotated with @MethodSubstitution
is lowered as any other node.
For example it may be removed by canonicalization, or it may be lowered with the help of the LoweringProvider
in effect
(there's a lowering provider for each architecture). Which brings us to a small difference regarding lowering.
The MacroNode
override of Lowerable.lower(LoweringTool)
may use two MacroNode
-defined helpers:
getLoweredSnippetGraph(LoweringTool)
and getLoweredSubstitutionGraph(LoweringTool)
to customize the lowering in terms of snippets or method substitutions.
@MacroNode
and @MethodSubstitution
is lowered in a first attempt as for the previous item.
In case it remains in the graph, it is replaced with the snippet of the annotated method.
A concise example of MacroNode
is ClassGetComponentTypeNode
, reproduced next:
/**
* {@link MacroNode Macro node} for {@link Class#getComponentType()}.
*
* @see ClassSubstitutions#getComponentType(Class)
*/
public class ClassGetComponentTypeNode extends MacroNode implements Canonicalizable {
public ClassGetComponentTypeNode(Invoke invoke) {
super(invoke);
}
private ValueNode getJavaClass() {
return arguments.get(0);
}
@Override
public Node canonical(CanonicalizerTool tool) {
ValueNode javaClass = getJavaClass();
if (javaClass.isConstant()) {
Class c = (Class) javaClass.asConstant().asObject();
if (c != null) {
Class componentType = c.getComponentType();
return ConstantNode.forObject(componentType, tool.getMetaAccess(), graph());
}
}
return this;
}
}
In case canonical()
doesn't manage to remove the ClassGetComponentTypeNode
, the fallback substitution is performed:
@MacroSubstitution(macro = ClassGetComponentTypeNode.class, isStatic = false)
@MethodSubstitution(isStatic = false)
public static Class> getComponentType(final Class> thisObj) {
Word klass = loadWordFromObject(thisObj, klassOffset());
if (klass.notEqual(0)) {
if ((readLayoutHelper(klass) & arrayKlassLayoutHelperIdentifier()) != 0) {
return piCast(klass.readObject(arrayKlassComponentMirrorOffset(), LocationIdentity.FINAL_LOCATION), Class.class, true, true);
}
}
return null;
}
In order for a method or macro substitution to be available, its defining method must be declared in a class annotated with @ClassSubstitution
, and that class must be fed to:
com.oracle.graal.nodes.spi.Replacements::registerSubstitutions(Class)
Registering the method and macro substitutions of a class doesn't compute right away any splicee (ie no graph is prepared just yet). That's deferred till invoking
Replacements::getMethodSubstitution(ResolvedJavaMethod)
The lookup of a macro-substitution-method doesn't give back a StructuredGraph
but a Class
(the MacroNode
subclass).
Registering a snippet comprises a few steps:
com.oracle.graal.replacements.Snippets
SnippetTemplate.AbstractTemplates
SnippetInfo
for each.
Once a snippet has been registered, obtaining a graph for it involves again a few steps.
In a nutshell, "non-specialized snippet graphs" are cached. When about to perform a particular replacement,
a specialized copy is obtained, ie specialized to the (non-constant) input nodes for the replacement in question.
Caching is provided by AbstractTemplates
, a subclass exists for each snippet.
Continuing with the CheckCastDynamicSnippets
example, its AbstractTemplates
subclass appears below,
with method lower()
encapsulating the code for arguments-based lookup followed by template instantiation if needed:
public static class Templates extends AbstractTemplates {
private final SnippetInfo dynamic = snippet(CheckCastDynamicSnippets.class, "checkcastDynamic");
public Templates(Providers providers, TargetDescription target) {
super(providers, target);
}
public void lower(CheckCastDynamicNode checkcast, LoweringTool tool) {
StructuredGraph graph = checkcast.graph();
ValueNode object = checkcast.object();
Arguments args = new Arguments(dynamic, graph.getGuardsStage(), tool.getLoweringStage());
args.add("hub", checkcast.hub());
args.add("object", object);
SnippetTemplate template = template(args);
Debug.log("Lowering dynamic checkcast in %s: node=%s, template=%s, arguments=%s", graph, checkcast, template, args);
template.instantiate(providers.getMetaAccess(), checkcast, DEFAULT_REPLACER, args);
}
}
With registration out of the way, it remains to explain where in the pipeline are "candidate callsites" checked for substitutions.
It's all part of InliningPhase
:
private void tryToInline(GraphInfo callerGraphInfo, MethodInvocation calleeInfo, MethodInvocation parentInvocation, int inliningDepth, HighTierContext context) {
InlineInfo callee = calleeInfo.callee();
Assumptions callerAssumptions = parentInvocation.assumptions();
if (inliningPolicy.isWorthInlining(context.getReplacements(), callee, inliningDepth, calleeInfo.probability(), calleeInfo.relevance(), true)) {
doInline(callerGraphInfo, calleeInfo, callerAssumptions, context);
} else if (context.getOptimisticOptimizations().devirtualizeInvokes()) {
callee.tryToDevirtualizeInvoke(context.getMetaAccess(), callerAssumptions);
}
metricInliningConsidered.increment();
}
Briefly, isWorthInlining()
caters for the substitutions case (that's why it receives a Replacements
as actual argument, ie the registry of substitutions).
Afterwards, the actual substitution is performed inside doInline()
.
Details can be gathered from the following methods in InliningUtil
:
canIntrinsify()
getIntrinsicGraph()
getMacroNodeClass()
inlineMacroNode().
createMacroNodeInstance()
Both @Fold
and @NodeIntrinsic
are expanded by com.oracle.graal.replacements.NodeIntrinsificationPhase
.
After their expansion, NodeIntrinsificationVerificationPhase
checks whether any such replacement has been omitted.
ReplacementsProvider
and its subclasses:
/**
* Interface for service providers that register replacements with the compiler.
*/
public interface ReplacementsProvider {
void registerReplacements(MetaAccessProvider metaAccess, LoweringProvider lowerer, Replacements replacements, TargetDescription target);
}
InjectedNodeParameter
/**
* Denotes an injected parameter in a {@linkplain NodeIntrinsic node intrinsic} constructor. If
* the constructor is called as part of node intrinsification, the node intrinsifier will inject
* an argument for the annotated parameter. Injected parameters must precede all non-injected
* parameters in a constructor.
*/
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
public static @interface InjectedNodeParameter {
}
BranchProbabilityNode
and its intrinsic:
/**
* This intrinsic should only be used for the condition of an if statement. The parameter
* condition should also only denote a simple condition and not a combined condition involving
* && or || operators. It injects the probability of the condition into the if statement.
*
* @param probability the probability that the given condition is true as a double value between
* 0.0 and 1.0.
* @param condition the simple condition without any && or || operators
* @return the condition
*/
@NodeIntrinsic
public static boolean probability(double probability, boolean condition) {
assert probability >= 0.0 && probability <= 1.0;
return condition;
}