Commit d40d7f4

fix typo
1 parent a98c89b commit d40d7f4

13 files changed: +41 -36 lines changed

dist/_file/assets/landscape-placeholder.e488c3da.svg

-5
This file was deleted.

dist/_observablehq/minisearch.81ea0f11.json

-1
This file was deleted.

dist/_observablehq/minisearch.d0afa69b.json

+1
Large diffs are not rendered by default.

dist/_observablehq/search.js

+1-1
Some generated files are not rendered by default.

dist/index.html

+32-22
Large diffs are not rendered by default.

dist/team.html

+1-1
@@ -80,7 +80,7 @@ <h1 id="team" tabindex="-1"><a class="observablehq-header-anchor" href="#team">T
 <div class="card grid grid-cols-4 justify-center items-center">
 <img class="shadow rounded-full max-w-full h-auto align-middle border-none" src="./_file/team-images/didem-unat.3f693129.png" width="100px">
 <p class="grid-colspan-3">
-<b>Head of The Lab:</b> Assoc. Prof. Didem rnat (dunat@ku.edu.tr)
+<b>Head of The Lab:</b> Assoc. Prof. Didem Unat (dunat@ku.edu.tr)
 </p>
 </div>
 <div class="card grid grid-cols-4 justify-center items-center">

docs/index.md

+5-5
@@ -516,7 +516,7 @@ BeyondMoore Software Ecosystem
 <h3><a href="https://github.com/ParCoreLab/CPU-Free-model" class="text-xl font-semibold font-serif visited:text-teal-700">CPU Free Model</a><h3>
 </div>
 <p class="text-lg">This project introduces a fully autonomous execution model for multi-GPU applications, eliminating CPU involvement beyond initial kernel launch. In conventional setups, the CPU orchestrates execution, causing overhead. We propose delegating this control flow entirely to devices, leveraging techniques like persistent kernels and device-initiated communication. Our CPU-free model significantly reduces communication overhead. Demonstrations on 2D/3D Jacobi stencil and Conjugate Gradient solvers show up to a 58.8% improvement in communication latency and a 1.63x speedup for CG on 8 NVIDIA A100 GPUs compared to CPU-controlled baselines.</p>
-<p>
+<p>
 <a href="https://github.com/ParCoreLab/CPU-Free-model" class="text-xl font-semibold font-serif visited:text-teal-700">More details and git repository of the project.</a>
 </p>
 </div>
@@ -532,7 +532,7 @@ BeyondMoore Software Ecosystem
 <a href="https://github.com/ParCoreLab/snoopie" class="text-xl font-semibold font-serif visited:text-teal-700">Snoopie</a>
 </div>
 <p class="text-lg">With data movement posing a significant bottleneck in computing, profiling tools are essential for scaling multi-GPU applications efficiently. However, existing tools focus primarily on single GPU compute operations and lack support for monitoring GPU-GPU transfers and communication library calls. Addressing these gaps, we present Snoopie, an instrumentation-based multi-GPU communication profiling tool. Snoopie accurately tracks peer-to-peer transfers and GPU-centric communication library calls, attributing data movement to specific source code lines and objects. It offers various visualization modes, from system-wide overviews to detailed instructions and addresses, enhancing programmer productivity.</p>
-<p>
+<p>
 <a href="https://github.com/ParCoreLab/snoopie" class="text-xl font-semibold font-serif visited:text-teal-700">More details and git repository of the project.</a>
 </p>
 </div>
@@ -548,9 +548,9 @@ BeyondMoore Software Ecosystem
 <a href="https://github.com/msasongko17/multigpu_callback" class="text-xl font-semibold font-serif visited:text-teal-700">Multi-GPU Callbacks</a>
 </div>
 <p class="text-lg">To address resource underutilization in multi-GPU systems, particularly in irregular applications, we propose a GPU-sided resource allocation method. This method dynamically adjusts the number of GPUs in use based on workload changes, utilizing GPU-to-CPU callbacks to request additional devices during kernel execution. We implemented and tested multiple callback methods, measuring their overheads on Nvidia and AMD platforms. Demonstrating the approach in an irregular application like Breadth-First Search (BFS), we achieved a 15.7% reduction in time to solution on average, with callback overheads as low as 6.50 microseconds on AMD and 4.83 microseconds on Nvidia. Additionally, the model can reduce total device usage by up to 35%, improving energy efficiency.</p>
-<p>
+<p>
 <a href="https://github.com/msasongko17/multigpu_callback" class="text-xl font-semibold font-serif visited:text-teal-700">More details and git repository of the project.</a>
-</p>
+</p>
 </div>
 <div class="grid h-[100%] justify-center place-items-center">
 <img width="500px" src="./assets/Multi-GPU-callback.png" />
@@ -633,7 +633,7 @@ Publications
 </div>
 <div class="card"> Ilyas Turimbetov, MA Sasongko, and Didem Unat, <a href="https://dl.acm.org/doi/10.1145/3642961.3643799">GPU-Initiated Resource Allocation for Irregular Workloads</a>, International Workshop on Extreme Heterogeneity Solutions (ExHET), 2024 </div>
 <div class="card"> I Ismayilov, J Baydamirli, D Sagbili, M Wahib, D Unat, <a href="https://dl.acm.org/doi/abs/10.1145/3577193.3593713">Multi-GPU Communication Schemes for Iterative Solvers: When CPUs are Not in Charge</a>, ICS ’23: Proceedings of the 37th International Conference on Supercomputing, 192–202. </div>
-<div class="card"> MA Sasongko, M Chabbi, PHJ Kelly, D Unat, <a href="https://ieeexplore.ieee.org/document/10068807">Precise Event Sampling on AMD vs Intel: Quantitative and Qualitative Comparison</a>, IEEE Transactions on Parallel and Distributed Systems, vol. 34, no. 5, pp. 1594-1608, May 2023, doi: 10.1109/TPDS.2023.3257105.
+<div class="card"> MA Sasongko, M Chabbi, PHJ Kelly, D Unat, <a href="https://ieeexplore.ieee.org/document/10068807">Precise Event Sampling on AMD vs Intel: Quantitative and Qualitative Comparison</a>, IEEE Transactions on Parallel and Distributed Systems, vol. 34, no. 5, pp. 1594-1608, May 2023, doi: 10.1109/TPDS.2023.3257105.
 </div>

 <hr />

docs/team.md

+1-1
@@ -5,7 +5,7 @@
 <div class="card grid grid-cols-4 justify-center items-center">
 <img class="shadow rounded-full max-w-full h-auto align-middle border-none" src="./team-images/didem-unat.png" width="100px" />
 <p class="grid-colspan-3">
-<b>Head of The Lab:</b> Assoc. Prof. Didem rnat (dunat@ku.edu.tr)
+<b>Head of The Lab:</b> Assoc. Prof. Didem Unat (dunat@ku.edu.tr)
 </p>
 </div>
