M. O’Boyle Keynote, CC
- Compilers redux vs. the end of Moore's law: roughly ~3.6% improvement
 
- Architecture can’t do it alone
 
- Idea: use FSMs for compiler verification only? All dialects have a lowering to FSMs
 
- Compiler ~ Jenga
 
- HW-SW contract is breaking down → SW can't fit on new (heterogeneous) HW → rethink the contract
 
- Can compilers help the way a DSL would, without programmers having to learn every DSL / manage the translation themselves? → ✨ABSTRACTION✨ → e.g. shift up: x86 → OpenCL → clBLAS/Halide/…
 
- Target nearer to the algorithm: will always perform well, BUT the target is complex and changeable → solutions must work for all APIs → AUTOMATION
 
- IDEA: Lift code to API/DSL and THEN to HW
 
- Learn the HW accelerator (grey-box program synthesis / infer program behaviour) and match SW to it
 
- Match libraries to legacy code automatically (neural classifier)
 
- Further: exploit domain knowledge with higher level abstractions (MLIR), multi-level tactics
 
- Neural decompilation
 
- Idea: generate an FSM testbench automatically with a neural network/LLM?
 
- Inductive Oracle/Program Synthesis?
 
- Challenges: Scalability, Data, Accuracy (verification?), Explanation
 
- Idea: represent dataflow/control flow with FSMs?
 
- Semantics-based checks can be very useful, especially between IRs (FSM sketch below)
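
To make the recurring FSM idea a bit more concrete, here is a minimal sketch of my own (not from the talk): two Moore-style FSMs and a brute-force equivalence check over their reachable product states, the kind of semantics-based check one might run between two IR-level encodings of the same control flow.

```python
# Minimal sketch (my own illustration, not from the talk): two Moore-style FSMs
# and a brute-force equivalence check over their reachable product states.
from collections import deque

class FSM:
    def __init__(self, init, trans, out):
        self.init = init        # initial state
        self.trans = trans      # (state, input) -> state
        self.out = out          # state -> output label

    def step(self, state, symbol):
        return self.trans[(state, symbol)]

def equivalent(a, b, alphabet):
    """Check output-equivalence of two FSMs by exploring the product automaton."""
    seen = set()
    queue = deque([(a.init, b.init)])
    while queue:
        sa, sb = queue.popleft()
        if (sa, sb) in seen:
            continue
        seen.add((sa, sb))
        if a.out[sa] != b.out[sb]:
            return False            # distinguishing state pair found
        for sym in alphabet:
            queue.append((a.step(sa, sym), b.step(sb, sym)))
    return True

# Two encodings of "parity of 1-bits seen so far" with different state names.
m1 = FSM("even", {("even", 0): "even", ("even", 1): "odd",
                  ("odd", 0): "odd",  ("odd", 1): "even"},
         {"even": 0, "odd": 1})
m2 = FSM("s0", {("s0", 0): "s0", ("s0", 1): "s1",
                ("s1", 0): "s1", ("s1", 1): "s0"},
         {"s0": 0, "s1": 1})
print(equivalent(m1, m2, alphabet=[0, 1]))  # True
```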
 
Albert Cohen Keynote, CC
- Abstractions are key for performance → lift the abstraction until something we know (domain-wise) and enjoy the ride
 
- Reflective Programming must be domesticated → JAX
 
- Multi-stage programming: useful when we know the domain very well / have a plan / know the optimisations in advance (AnyDSL, OOPSLA 2018) → manage partial evaluation (Cohen, SCP 2006) → cross-stage persistence + macro expansion (staging sketch after this list)
 
- Autotuning on top of existing compilers (A flexible approach to autotuning, PACT 2021) (flag-search sketch after this list)
 
- Beyond: reinforcement learning for register allocation → open up compiler interface to make it suitable for production
 
- Guided metaprogramming → scheduling languages, split compilation
 
- IRDL
 
- Rethink/Reevaluate correctness requirements
 
- Idea: find problems FSMs can describe efficiently
 
- Testing methodologies → verification?
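
A minimal sketch of the multi-stage programming / partial evaluation idea mentioned above, assuming nothing about AnyDSL's actual API: when a parameter is known at the first stage, generate specialised code for it instead of interpreting it at runtime.

```python
# Toy staging/partial-evaluation sketch (my own illustration, not AnyDSL):
# when the exponent is known at "compile time", generate a specialised,
# loop-free power function instead of interpreting a loop at runtime.

def stage_power(n: int):
    """First stage: build specialised source code for x**n, then compile it."""
    expr = " * ".join(["x"] * n) if n > 0 else "1"
    src = f"def power_{n}(x):\n    return {expr}\n"
    env = {}
    exec(src, env)               # "compile" the generated code
    return env[f"power_{n}"], src

power3, generated_src = stage_power(3)
print(generated_src)             # the specialised source: return x * x * x
print(power3(5))                 # 125, no loop executed at runtime
```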
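And a toy illustration of autotuning on top of an existing compiler (my own sketch, not the PACT 2021 system): exhaustively time a small kernel under a few real gcc flag sets and keep the fastest. Assumes gcc is on the PATH; the kernel and candidate flag sets are made up for the example.

```python
# Minimal autotuning sketch: try a few gcc flag sets on a toy kernel and keep
# the fastest one. Assumes gcc is available; purely illustrative.
import subprocess, tempfile, time, os

KERNEL = r"""
#include <stdio.h>
int main(void) {
    double acc = 0.0;
    for (long i = 1; i < 50000000; ++i) acc += 1.0 / (double)i;
    printf("%f\n", acc);
    return 0;
}
"""

CANDIDATES = [["-O0"], ["-O2"], ["-O3"], ["-O3", "-funroll-loops"]]

def measure(flags):
    with tempfile.TemporaryDirectory() as d:
        src, exe = os.path.join(d, "k.c"), os.path.join(d, "k")
        with open(src, "w") as f:
            f.write(KERNEL)
        subprocess.run(["gcc", *flags, src, "-o", exe], check=True)
        start = time.perf_counter()
        subprocess.run([exe], check=True, stdout=subprocess.DEVNULL)
        return time.perf_counter() - start

best = min(CANDIDATES, key=measure)
print("best flags:", best)
```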
 
Panel: Compilers and ML, CC
- Saman: compilers haven't changed much in the last ~30 years, although we have more computing power, better/faster algorithms and more data, none of which we actually use. We should have:
- CaaS: Compilation as a Service (processing power/accelerators/memory and storage)
 
- Can we naively use ML? → Neural Machine Translation BUT no correctness guarantee
 
- Centrally Collected Data for Continuous Improvement exploiting the usage information retrieved by CaaS (real data)
 
 
- Tobias: ML has a transformative impact on compilers, true or false? Compilers are pervasive across multiple domains → can we enable reuse? MLIR enables this → multi-level rewriting: we broke compilers into pieces → a bigger surface from which to retrieve data/information. Prolific areas for ML in compilers:
- CompCert → local verification of lowerings
 
- Peephole Rewrites/Proof Synthesis? (toy rewrite sketch after this list)
 
- ML-written automation support
 
- Integrate performance models with ML?
 
 
- Hyesoon: before → lots of pre-/post-processing, today → ChatGPT
 
- Fabrice: compilers for ML → yes, ML for compilers → lots of papers, but what about production compilers? Extend to:
- Scientific computing
 
- Generic compilers: pattern-spec embedded libraries
 
 
- Xipeng: LLMs bridge the semantic gap for compilers: high-level semantics is key → correctness challenges.
- LLM: good at high-level code semantics, but lacks precision/scalability/detailed compiler knowledge
 
- Compiler: precise, detailed transformations, but lacks high-level understanding and is easily blocked by ambiguities
 
- LLM-compiler synergy: compiler-assisted LLM + LLM-assisted compiler → potential = find new optimisations/rewritings via math formulae
 
 
- Q: ML interpretability for correctness
 
- Q: security of cloud compilation?
 
- Q: avoid black-box → improve interaction with the compiler + need for safety → expose bigger surface? → no clear answer: embrace the dark side
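
For the peephole-rewrite bullet above, a toy sketch of my own (no connection to any proof-synthesis tooling): local pattern matching over a made-up three-address IR, the kind of rewrite one might try to learn or verify automatically.

```python
# Tiny peephole-rewrite sketch (my own illustration): pattern-match single
# instructions in a toy three-address IR and rewrite them locally.
def peephole(ins):
    out = []
    for op, dst, a, b in ins:
        if op == "mul" and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
            out.append(("shl", dst, a, b.bit_length() - 1))   # x * 2^k -> x << k
        elif op == "add" and b == 0:
            out.append(("mov", dst, a, None))                 # x + 0 -> x
        else:
            out.append((op, dst, a, b))
    return out

prog = [("mul", "t0", "x", 8), ("add", "t1", "t0", 0)]
print(peephole(prog))   # [('shl', 't0', 'x', 3), ('mov', 't1', 't0', None)]
```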
 
Amir Shaikhha - A Tensor Algebra Compiler for Sparse Differentiation
- Program = bottleneck between data and hardware
 
- Challenges for sparse differentiation: it requires control-flow constructs and is complicated
 
- There are a lot of sparse storage formats
 
- Two levels of abstractions: logical (AD) + physical (compressed formats: expressive enough for the conversion)
 
- Semirings are cool and compositional → together with dictionaries they can represent sparse matrices (sketched below)
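
A small sketch of the semirings-plus-dictionaries idea (my own illustration, not the paper's actual IR): sparse data stored as nested dicts of nonzeros, with a matrix-vector product written once over an abstract semiring.

```python
# Sparse matrix as dict of dicts storing only nonzeros; the kernel is written
# once over an abstract semiring and reused for different algebras.
from collections import defaultdict

class Semiring:
    def __init__(self, zero, add, mul):
        self.zero, self.add, self.mul = zero, add, mul

PLUS_TIMES = Semiring(0.0, lambda a, b: a + b, lambda a, b: a * b)   # ordinary linear algebra
MAX_PLUS   = Semiring(float("-inf"), max, lambda a, b: a + b)        # e.g. longest paths

def spmv(matrix, vector, sr):
    """y = A x over semiring sr; matrix: {row: {col: val}}, vector: {col: val}."""
    y = defaultdict(lambda: sr.zero)
    for i, row in matrix.items():
        for j, a_ij in row.items():
            if j in vector:
                y[i] = sr.add(y[i], sr.mul(a_ij, vector[j]))
    return dict(y)

A = {0: {0: 2.0, 2: 1.0}, 1: {1: 3.0}}   # only nonzeros stored
x = {0: 1.0, 1: 1.0, 2: 4.0}
print(spmv(A, x, PLUS_TIMES))            # {0: 6.0, 1: 3.0}
```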
 
Katsumi Okuda - AskIt: Unified Programming Interface for Programming with Large Language Models
- Communication: program ↔ program via function calls; human ↔ program via a programming language; human ↔ LLM-based system via natural language
 
- Question: how do program and LLM communicate? → use a DSL for the LLM as part of the program
 
- Emergent abilities of LLMs = directly answerable questions
 
- GPT-4 as a parser: make prompt → send prompt → extract result
 
- AskIt: define type → define function → call function (hypothetical sketch after this list)
 
- Intersection of directly answerable tasks and codable tasks
 
- Programming interface to have LLMs and programmers interact easily with the program
 
- Does the generated code improve performance (GSM8K benchmark)? Yes
 
- Verification of results: no formal verification (AST); the generated programs are tested for correctness with test cases
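
A hypothetical sketch of the define-type → define-function → call-function workflow noted above. This is not AskIt's actual API; `query_llm` and `llm_function` are stand-ins I made up to show how an LLM call can look like an ordinary typed function inside a program.

```python
# Hypothetical sketch, NOT AskIt's real API: a typed function whose body is
# answered by an LLM. `query_llm` is a stand-in for a real LLM backend.
import json
from typing import Callable

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def llm_function(return_type: type, template: str) -> Callable:
    """Build a typed function that delegates its body to the LLM."""
    def fn(**kwargs):
        prompt = (template.format(**kwargs)
                  + f"\nAnswer with a single JSON value of type {return_type.__name__}.")
        raw = query_llm(prompt)
        value = json.loads(raw)          # parse the answer back into the program
        if not isinstance(value, return_type):
            raise TypeError(f"expected {return_type.__name__}, got {type(value).__name__}")
        return value
    return fn

# Usage: the LLM call is just another typed function from the program's view.
sentiment_is_positive = llm_function(bool, "Is the sentiment of {text!r} positive?")
# sentiment_is_positive(text="I loved this talk")  # -> True/False once a backend is plugged in
```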
 
Volker Seeker - Revealing Compiler Heuristics through Automated Discovery and Optimisation
- Motivation: compilers are great but difficult to optimise → heuristics guide code optimisations, and there are default optimisation settings
 
- Tuning heuristics is beneficial → ML-guided heuristic tuning
 
- impossible to find/tune everything manually, especially in large code bases that are constantly updated → automation: decide what to optimise w.r.t. the objectives → existing approaches:
- iterative compilation
 
- promising: RL, but only for exposed heuristics
 
 
- what they do: automatically find heuristics and expose them so tuning can be attached (toy tuning sketch after this list)
 
- how to find heuristics: try to find one that validates the output, then explore the output semantics → possibly improved → differential testing
 
- both application and compiler optimization
 
- NAS and PolyBench benchmark suites → identified different tuning targets for each benchmark (only individual ones) → Q: what is a tuning target / why do some benchmarks have more?
 
- core: enable fine-grained ML tuning for compiler heuristics → not only compilers
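
A toy sketch of the general "expose a heuristic value and tune it" idea (my own example; the paper works on real compiler heuristics, not Python code): a blocking factor is exposed as a parameter and searched by direct timing.

```python
# Toy illustration: a heuristic value (here a blocking factor) is exposed as a
# parameter and tuned by timing candidate values on the actual workload.
import time

def blocked_sum(data, block):            # `block` plays the role of the heuristic value
    total = 0
    for start in range(0, len(data), block):
        total += sum(data[start:start + block])
    return total

def tune(candidates, data):
    timings = {}
    for block in candidates:
        t0 = time.perf_counter()
        blocked_sum(data, block)
        timings[block] = time.perf_counter() - t0
    return min(timings, key=timings.get), timings

data = list(range(1_000_000))
best, timings = tune([64, 256, 1024, 8192], data)
print("best block size:", best)
```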
 
SLaDe
- Decompilation → considerable manual work, and the code is often mangled (e.g. Ghidra produces hard-to-read code)
 
- Neural Decompilation: automated, trained on human-written code → readable
 
- BTC: code is plausible but incorrect, no correctness evaluation
 
- ChatGPT: readable, but incorrect
 
- Tradeoff: correctness vs. readability → SLaDe achieves both (IO-testing sketch after this list)
 
- small, specialised transformers have potential
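
Sketch of IO-based correctness checking for decompiled code, which I assume is roughly what a correctness evaluation looks like here (not SLaDe's exact harness): compile the original and the decompiled C sources, run both on random inputs, and compare outputs. Assumes gcc is available; the example sources are made up.

```python
# Differential IO testing of a decompiled program against the original.
import random, subprocess, tempfile, os

def compile_c(src_text, workdir, name):
    src = os.path.join(workdir, name + ".c")
    exe = os.path.join(workdir, name)
    with open(src, "w") as f:
        f.write(src_text)
    subprocess.run(["gcc", src, "-o", exe], check=True)
    return exe

def io_equivalent(original_src, decompiled_src, trials=100):
    """Compare stdout of both binaries on random command-line inputs."""
    with tempfile.TemporaryDirectory() as d:
        ref = compile_c(original_src, d, "ref")
        cand = compile_c(decompiled_src, d, "cand")
        for _ in range(trials):
            arg = str(random.randint(-1000, 1000))
            out_ref = subprocess.run([ref, arg], capture_output=True, text=True).stdout
            out_cand = subprocess.run([cand, arg], capture_output=True, text=True).stdout
            if out_ref != out_cand:
                return False
        return True

ORIG = '#include <stdio.h>\n#include <stdlib.h>\nint main(int c,char**v){printf("%d\\n",atoi(v[1])*2);return 0;}\n'
DEC  = '#include <stdio.h>\n#include <stdlib.h>\nint main(int c,char**v){printf("%d\\n",atoi(v[1])+atoi(v[1]));return 0;}\n'
print(io_equivalent(ORIG, DEC))   # True: same IO behaviour
```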
 
Jean-Michel Gorius - FSM verification
- preserve properties across automata transformations
 
- translation validation throughout the compilation flow
 
- verifying that hw is consistent with initial hypotheses
 
- GPU optimisation → parallel semantics
 
- Synchronisation interaction → preserve barrier semantics → divergent control flow?
 
- thread coarsening → no barriers in divergent control flow → recursive interleaving → always legal here
 
- Granularity of coarsening factor → finer granularity yields better results
 
Computing Systems for the Foundation Model Era
- Provide scale and efficiency + achieve performance and programmability → vertically integrated solution: ML algorithms + dataflow compilers + reconfigurable dataflow architecture
 
- General-purpose monolithic model vs. hundreds of specialist models with automatic routing → smaller fine-tuned models are cheaper and sometimes better than large, generalist ones
 
- challenge: sparse training with accuracy and HW efficiency
 
- dataflow compilers: dataflow graph analysis is important → dataflow programming → optimisation by scheduling in space and time: fusion, tiling, pipelining → lower the representation closer to HW (e.g. the Spatial DSL, PLDI) (toy fusion sketch after this list)
 
- reconfigurable dataflow architecture
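
A toy sketch of the fusion part of "scheduling in space and time" (my own illustration, not the actual dataflow compiler): consecutive elementwise stages of a linear pipeline are fused into one composed stage, so the intermediate buffer is never materialised.

```python
# Toy operator fusion on a linear dataflow pipeline: adjacent elementwise
# stages are composed into a single stage before execution.
def make_pipeline(stages):
    """stages: list of (kind, fn); kind is 'elementwise' or 'reduce'."""
    fused = []
    for kind, fn in stages:
        if kind == "elementwise" and fused and fused[-1][0] == "elementwise":
            prev_fn = fused[-1][1]
            fused[-1] = ("elementwise", lambda x, f=fn, g=prev_fn: f(g(x)))  # compose in place
        else:
            fused.append((kind, fn))
    return fused

def run(pipeline, data):
    for kind, fn in pipeline:
        data = [fn(x) for x in data] if kind == "elementwise" else fn(data)
    return data

stages = [("elementwise", lambda x: x * 2),
          ("elementwise", lambda x: x + 1),       # fused with the stage above
          ("reduce", sum)]
pipeline = make_pipeline(stages)
print(len(pipeline), run(pipeline, [1, 2, 3]))    # 2 stages after fusion, result 15
```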