Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make IRWitnessTable HOISTABLE #6417

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

jkwak-work
Copy link
Collaborator

@jkwak-work jkwak-work commented Feb 21, 2025

Make IRWitnessTable Hoistable

Intention of the PR

This commit makes IRWitnessTable Hoistable so that we can avoid duplicated IRWitnessTable.

Problems

This commit tries to address the following issues arise after turning IRWitnessTable into Hoistable:

  1. A Hoistable instance is immutable.
  2. When tries to create a duplicated child, you will get a previously created instance of IRWitnessTable, instead of a new one.
  3. We don't actually want to hoist IRWitnessTable.
  4. There can be only one instance of Hoistable and it cannot appear as childs multiple times.
  5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface.

Implementation

Solution for "1. A Hoistable instance is immutable."

IRWitnessTable::setConcreteType() is removed, because when an IRInst is Hoistable, it is treated as immutable. Any IRInst::setXXX() methods don't work anymore.

There were two places calling setConcreteType() and their logic had to change little bit.

DeclLoweringVisitor::visitInheritanceDecl() in source/slang/slang-lower-to-ir.cpp was calling setConcreteType(). It had a little strange logic around lowerType(). The IRWitnessTable was added with context->setGlobalValue() first and its concreteType was changed later. This commit works around in a way that it sets the parent of IRWitnessTable temporarily and reset it with the correct IRWitnessTable. Without this logic, it went into an infinite recursion.

AutoDiffPass::fillDifferentialTypeImplementation() in source/slang/slang-ir-autodiff.cpp was calling setConcreteType(). It was changing the concreteType of innerResult.diffWitness. This commit creates a new IRWitnessTable and copies its IRWitnessTableEntry.

Solution for "2. When tries to create a duplicated child, you will get a previously created instance of IRWitnessTable, instead of a new one"

After a call to IRBuilder::createWitnessTable(), this commit checks if the returned IRWitnessTable is a brand new or not. If it is not a new one, we have to avoid adding the decorations and children.

This commit decides when to add decorations and children based on whether IRWitnessTable has any of decorations or children already. It doesn't seem like a proper way to check. But when I tried, it was difficult to find a bottleneck point where the decorations and children are added to IRWitnessTable first time. Note that we are not trying to find when IRWitnessTable is created for the first time; we need to find if the decorations and children were added once.

It might be fine to have duplicated IRWitnessTableEntry in most of the cases, but I noticed that it fails an assertion check when shouldDeepCloneWitnessTable() returns false in cloneWitnessTableImpl().

Solution for "3. We don't actually want to hoist IRWitnessTable."

The reason why this commit makes IRWitnessTable is to prevent the duplicated instances of IRInst. But we don't really want to "Hoist" them.

When an IRWitnessTable gets Hoisted out, it causes unexpected problems and the specialization process fails due to the missing IRWitnessTable in the input.

This commit prevent from hoisting IRWitnessTable in _replaceInstUsesWith(). The way this is implemented feel little hack but we discussed on Slack and decided to go with this. One of the proper approaches could be to add a new flag in IROpFlags and have a new one like kIROpFlag_Deduplicate, which is different from just kIROpFlag_Hoistable.

Solution for "4. There can be only one instance of Hoistable and it cannot appear as childs multiple times."

When IRWitnessTable is Hoistable, there can be only a unique set of instances. And we cannot have an instance as a duplicated childs. It is because IRInst has only one set of IRInst* next and IRInst* prev.

Before this commit, an instance of IRGeneral could have duplicated instances of IRWitnessTable. As an example, IInteger interface inherits two other interfaces, IArithmetic and ILogical. And they both inherits from IComparable.

interface IInteger : IArithmetic, ILogical {}
interface IArithmetic : IComparable {}
interface ILogical : IComparable

When we specialize it in specializeGenericImpl(), an IRBlock gets the following list of children:

  • IRWitnessTable for IComparable,
  • IRWitnessTable for IArithmetic,
  • IRWitnessTable for IComparable,
  • IRWitnessTable for ILogical,

For the cloning during the specialize, "IRWitnessTable for IComparable" must be cloned before the cloning of "IRWitnessTable for IArithmetic". Because "IRWitnessTable for IArithmetic" refers "IRWitnessTable for IComparable" as its IRWitnessTableEntry. The order they appear in the IRBlock as children decides which instances will be cloned first. And "IRWitnessTable for IComparable" must appear before "IRWitnessTable for IArithmetic".

Note that "IRWitnessTable for IComparable" appears twice, The first one was added for "IRWitnessTable for IArithmetic". And the second one is added for "IRWitnessTable for ILogical".

With this commit "IRWitnessTable for IComparable" can appear as a child only once in IRBlock. So it causes an error if it gets the following list:

  • IRWitnessTable for IArithmetic,
  • IRWitnessTable for IComparable,
  • IRWitnessTable for ILogical,

In order to resolve the problem, "IRWitnessTable for IComparable" must appear before both "IRWitnessTable for IArithmetic" and "IRWitnessTable for ILogical" as following:

  • IRWitnessTable for IComparable,
  • IRWitnessTable for IArithmetic,
  • IRWitnessTable for ILogical,

To address the problem, the instances of IRWitnessTable is always added to the end of the children list. If it is already added to the list, we don't move. This works out because the AST tree is built based on the dependencies.

Solution for "5. Different import/export mangled names were used for the same Witness-table when its type is "enum" interface."

This issue was found while testing with Falcor tests where it uses Conformance-type feature of Slang.
We are using different import and export mangled names for a same Witness-table when the witness-table is for "Enum" interface.

The way we simplify the implementation of "Enum" causes a problem when it comes to generate export/import for the witness-table. And the exact repro step is still unclear.

There were two suggested solutions for the problem and this PR adopted the first option for now. Maybe we want to improve it with the second option later.

option 1, when we produce mangled names for those witness-table, we can use a mangled name with the underlying "int" type instead of the name of the enum type. In this way, all witness-tables for enum types whose underlying type is same will get the same mangled name. It will allow us to deduplicate the witness-table during the linking.

option 2, we can preserve type info for enum type when generating IR. We can still erase all other uses of the type info of enum types for now. But when we generate the witness-table, instead of filling the conforming type operand to IntType, we fill it as EnumType(IntType) where EnumType is a new global IROp code to represent all enum types (like InterfaceType/StructType). This way the operands for the two witness-tables will be different.

"option 1" is more quick and dirty and "option 2" is more proper way to address it.
I should go with "option 1" and improve it with "option 2" approach later.

@jkwak-work jkwak-work added the pr: non-breaking PRs without breaking changes label Feb 21, 2025
@jkwak-work jkwak-work self-assigned this Feb 21, 2025
@jkwak-work
Copy link
Collaborator Author

jkwak-work commented Feb 21, 2025

This PR is still work in progress, but any comments and feedback will be welcomed.

Currently 17 tests are failing with this PR:

  • tests/autodiff/existential-1.slang
  • tests/autodiff/existential-2.slang
  • tests/bugs/type-legalize-bug-1.slang
  • tests/compute/assoctype-func-param.slang
  • tests/compute/byte-address-buffer.slang
  • tests/compute/column-major.slang
  • tests/language-feature/generics/dependent-generic-2.slang
  • tests/language-feature/generics/dependent-generic-3.slang
  • tests/language-feature/generics/dependent-generic.slang
  • tests/language-feature/generics/generic-assoc-type-constraint.slang
  • tests/language-feature/generics/generic-interface-2.slang
  • tests/language-feature/generics/generic-return-type-requirement.slang
  • tests/language-feature/generics/generic-witness-derived.slang
  • tests/language-feature/generics/variadic-0.slang
  • tests/language-feature/interfaces/generic-requirement-synth-2.slang
  • tests/language-feature/interfaces/is-as-dynamic.slang
  • tests/metal/texture.slang

I am investigating what is happening with them.

@jkwak-work
Copy link
Collaborator Author

/format

@slangbot
Copy link
Contributor

🌈 Formatted, please merge the changes from this PR

@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch from 475b891 to 58a5db2 Compare February 28, 2025 22:45
@jkwak-work
Copy link
Collaborator Author

I pushed more code change.
It is still failing on two tests and I am working on it:

  • tests/compute/assoctype-func-param.slang
  • tests/metal/texture.slang

@jkwak-work
Copy link
Collaborator Author

For the remaining problem, here is a minimal repro shader,

RWStructuredBuffer<bool> outputBuffer;


__generic<T : IArithmetic>
bool test()
{
    return T(0) == T(0);
}

[numthreads(1, 1, 1)]
void computeMain()
{
    outputBuffer[0] = test<int4>();
}

When compiled, I get the following error,

(0): error 99999: Slang compilation aborted due to an exception of class Slang::InternalError: unimplemented: Unhandled global inst in spirv-emit:
let  %1 : %2    = undefined

It appears that some of IRinst-s are removed by DCE and their uses are replaced by "undefined" incorrectly.
When dumped IR, I see that some of functions are not specialized "AFTER-SPECIALIZE", probably because witness-tables are not properly applied.
I will keep digging.

@jkwak-work
Copy link
Collaborator Author

With my latest changes, the only test failing is

  • tests/compute/assoctype-func-param.slang

@jkwak-work
Copy link
Collaborator Author

/format

@slangbot
Copy link
Contributor

slangbot commented Mar 5, 2025

🌈 Formatted, please merge the changes from this PR

@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch 3 times, most recently from d18a5c7 to 2d863cb Compare March 6, 2025 23:49
@jkwak-work jkwak-work marked this pull request as ready for review March 7, 2025 00:38
@jkwak-work jkwak-work requested a review from a team as a code owner March 7, 2025 00:38
@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch from 68be3b0 to 6a8b99d Compare March 7, 2025 02:48
@jkwak-work
Copy link
Collaborator Author

/format

@slangbot
Copy link
Contributor

slangbot commented Mar 7, 2025

🌈 Formatted, please merge the changes from this PR

@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch from c926457 to 067d554 Compare March 7, 2025 02:56
@jkwak-work
Copy link
Collaborator Author

/format

@slangbot
Copy link
Contributor

slangbot commented Mar 7, 2025

🌈 Formatted, please merge the changes from this PR

@jkwak-work
Copy link
Collaborator Author

jkwak-work commented Mar 7, 2025

It looks like my PR is failing on a new test,

  • tests/bugs/gh-6504-linker.slang

And Falcor test is also crashing at

  • BSDFIntegratorTests.cpp:BSDFIntegrator (D3D12)

@jkwak-work
Copy link
Collaborator Author

Changing to draft until I fix the failing test and falcor tests

@jkwak-work jkwak-work marked this pull request as draft March 7, 2025 03:56
@jkwak-work
Copy link
Collaborator Author

It turned out that the test failure of the following test was due to an incorrect resolve of the merge conflict.

  • tests/bugs/gh-6504-linker.slang

@jkwak-work
Copy link
Collaborator Author

jkwak-work commented Mar 7, 2025

For the failing Falcor tests, there are two failing tests for "falcor-unit-test":

  • BSDFIntegrator
  • RGLAcquisition

Both of them looks to have a same assertion failure at

 	slang.dll!Slang::cast<Slang::IRInterfaceType>(Slang::IRInst * inst, Slang::IRInterfaceType * __formal) Line 871	C++
 	slang.dll!Slang::DifferentiableTypeConformanceContext::getConformanceTypeFromWitness(Slang::IRInst * witness) Line 1113	C++
 	slang.dll!Slang::DifferentiableTypeConformanceContext::getConformanceTypeFromWitness(Slang::IRInst * witness) Line 1148	C++
 	slang.dll!Slang::DifferentiableTypeConformanceContext::setFunc(Slang::IRGlobalValueWithCode * func) Line 1220	C++
 	slang.dll!Slang::AutoDiffPass::generateDifferentialImplementationForContextType(Slang::OrderedHashSet<Slang::IRInst *> & contextTypes, Slang::Dictionary<Slang::IRInst *,Slang::IRGlobalValueWithCode *,Slang::Hash<Slang::IRInst *>,std::equal_to<Slang::IRInst *>> typeToBwdFuncMap) Line 2931	C++
 	slang.dll!Slang::AutoDiffPass::lowerIntermediateContextType(Slang::IRBuilder * builder) Line 2811	C++
 	slang.dll!Slang::AutoDiffPass::processReferencedFunctions(Slang::IRBuilder * builder) Line 3485	C++
 	slang.dll!Slang::AutoDiffPass::processModule() Line 2671	C++
 	slang.dll!Slang::processAutodiffCalls(Slang::TargetProgram * target, Slang::IRModule * module, Slang::DiagnosticSink * sink, const Slang::IRAutodiffPassOptions & __formal) Line 3641	C++
 	slang.dll!Slang::linkAndOptimizeIR(Slang::CodeGenContext * codeGenContext, const Slang::LinkingAndOptimizationOptions & options, Slang::LinkedIR & outLinkedIR) Line 941	C++
 	slang.dll!Slang::CodeGenContext::emitEntryPointsSourceFromIR(Slang::ComPtr<Slang::IArtifact> & outArtifact) Line 1955	C++
 	slang.dll!Slang::CodeGenContext::emitEntryPointsSource(Slang::ComPtr<Slang::IArtifact> & outArtifact) Line 792	C++
 	slang.dll!Slang::CodeGenContext::emitWithDownstreamForEntryPoints(Slang::ComPtr<Slang::IArtifact> & outArtifact) Line 1355	C++
 	slang.dll!Slang::CodeGenContext::_emitEntryPoints(Slang::ComPtr<Slang::IArtifact> & outArtifact) Line 1830	C++
 	slang.dll!Slang::CodeGenContext::emitEntryPoints(Slang::ComPtr<Slang::IArtifact> & outArtifact) Line 1885	C++
 	slang.dll!Slang::TargetProgram::_createEntryPointResult(__int64 entryPointIndex, Slang::DiagnosticSink * sink, Slang::EndToEndCompileRequest * endToEndReq) Line 2081	C++
 	slang.dll!Slang::TargetProgram::getOrCreateEntryPointResult(__int64 entryPointIndex, Slang::DiagnosticSink * sink) Line 2119	C++
 	slang.dll!Slang::ComponentType::getEntryPointCode(__int64 entryPointIndex, __int64 targetIndex, ISlangBlob * * outCode, ISlangBlob * * outDiagnostics) Line 5067	C++
 	gfx.dll!gfx::RendererBase::getEntryPointCodeFromShaderCache(slang::IComponentType * program, __int64 entryPointIndex, __int64 targetIndex, ISlangBlob * * outCode, ISlangBlob * * outDiagnostics) Line 386	C++

Slang got ITWitnessTableType when it expects IRInterfaceType.

There are bunch failing tests for "falcor-image-test", but they seem to have a same problem.

  internal/renderpasses/test_DepthOfField_d3d12                : STARTED
  internal/renderpasses/test_DepthOfField_d3d12                : FAILED (89.2 s)
    Compiling core module on debug build, this can take a while.
    Compiling core module took 19.00 seconds.
    (Error) GFX Error: D:\sbf\git\slang\FalcorBin\build\windows-vs2022\bin\Debug\shaders\Scene/Material\MaterialTypes.slang(71): error 45001: unresolved external symbol '_SW13MaterialType.
    enum class ShadingModel
               ^~~~~~~~~~~~
    core(103): note: see declaration of 'int'
        bool equals(This other);
             ^~~~~~

    D:\sbf\git\slang\FalcorBin\build\windows-vs2022\bin\Debug\Mogwai.exe exited with return code 3221225477
    View test at: http://NV-F9JNWM3:8080//unknown/internal/renderpasses/test_DepthOfField_d3d12

Image tests FAILED (89.2 s).

@slangbot
Copy link
Contributor

🌈 Formatted, please merge the changes from this PR

This commit deduplicates Witnessable by making it HOISTABLE.

Three problems were observed when changed to HOISTABLE, and they are
addressed in this commit.
 1. A HOISTABLE IRInst is immutable once created. `SetOperand()`
    functions became unavailable. There were code that was modifying
    WitnessTable after creating it. This commit changed the behavior so
    that all operands are prepared before creating a new WitnessTable,
    and all operands are given to its constructor.
 1. WitnessTable started to have duplicated decorations and children.
    Due to the nature of HOISTABLE, when create a new WitnessTable, a
    pre-existing WitnessTable is returned and reused. There isn't an
    easy way to tell if the WitnessTable is a brand new or reused at the
    moment. This commit assumes that it is a brand new only when it has
    no decorations and chilrend after its creation.
 1. In `SimplifyIR()`, the order of children were slightly changed as a
    result of the optimization. The behavior was little different when
    WitnessTable became HOISTABLE. This resulted in an error where
    WitnessTable has a WitnessEntry pointing to an incorrect
    `IRSpecialize`. In order for it to function properly, `IRSpecialize`
    had to appear before `IRWitnessTable`.
Fixes the failing test, tests/compute/assoctype-func-param.slang
@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch from 336254b to 99b3f50 Compare March 20, 2025 21:11
@jkwak-work
Copy link
Collaborator Author

/format

@slangbot
Copy link
Contributor

🌈 Formatted, please merge the changes from this PR

@jkwak-work jkwak-work force-pushed the feature/make-witness-table-hoistable branch from a96b581 to f768ee7 Compare March 21, 2025 04:01
@jkwak-work
Copy link
Collaborator Author

jkwak-work commented Mar 21, 2025

This PR is ready for another review.
@csyonghe

// In order to de-duplicate them, Witness-table is marked as Hoistable.
// But it is not exactly a hoistable type and it can be added simpler.
if (inst->getOp() == kIROp_WitnessTable)
addDeduplicatedInst(inst);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry I wasn’t communicating my meaning clearly earlier. My question was is it sufficient to just call the existing addInst function here, instead of calling the new addDeduplicatedInst?

Copy link
Collaborator Author

@jkwak-work jkwak-work Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I used the exisitng IRBuilder::addInst() , one of tests was failing. So no, I couldn't use the function.

tests/autodiff/dynamic-dispatch-material.slang

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can investigate why it fails if IRBuilder::addInst() is preferred.
Considering that it is only failing test, I think there might be a problem for the specific use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: non-breaking PRs without breaking changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants