HW 3 #415
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
The best configuration and time for me was: configuration ('coalesced', 128, 128): 0.00291864 seconds |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
Part 1 | ||
|
||
Maze 1: | ||
Finished after 878 iterations, 261.55712 ms total, 0.297901047836 ms per iteration | ||
Found 2 regions | ||
|
||
Maze 2: | ||
Finished after 517 iterations, 153.9384 ms total, 0.297753191489 ms per iteration | ||
Found 35 regions | ||
|
||
|
||
Part 2 | ||
|
||
Maze 1: | ||
Finished after 529 iterations, 158.00224 ms total, 0.298680982987 ms per iteration | ||
Found 2 regions | ||
|
||
Maze 2: | ||
Finished after 273 iterations, 81.45792 ms total, 0.298380659341 ms per iteration | ||
Found 35 regions | ||
|
||
|
||
Part 3 | ||
|
||
Maze 1: | ||
Finished after 11 iterations, 3.37152 ms total, 0.306501818182 ms per iteration | ||
Found 2 regions | ||
|
||
Maze 2: | ||
Finished after 9 iterations, 2.7204 ms total, 0.302266666667 ms per iteration | ||
Found 35 regions | ||
|
||
|
||
Part 4 | ||
|
||
Maze 1: | ||
Finished after 70 iterations, 52.56808 ms total, 0.750972571429 ms per iteration | ||
Found 2 regions | ||
|
||
Maze 2: | ||
Finished after 103 iterations, 76.77008 ms total, 0.745340582524 ms per iteration | ||
Found 35 regions | ||
|
||
|
||
In my case, serializing the "finding grandparents" step does not pay off per | ||
iteration: it leads to a roughly 2.5x increase in time per iteration (about | ||
0.75 ms versus about 0.30 ms in Parts 1-3). | ||
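For reference, the serialized lookup that Part 4 measures can be sketched in plain C as below. This is a host-side illustration under stated assumptions, not the kernel from this PR: `lookup_grandparents_serial`, `tile`, `num_pixels`, and `global_reads` are hypothetical names, labels are assumed to be non-negative, and background cells are assumed to hold a value of at least `num_pixels`. One worker walks the tile and reuses the previous lookup whenever adjacent entries share a label, which saves reads of `labels` at the cost of a loop the rest of the workgroup has to wait for.

```c
#include <assert.h>
#include <stddef.h>

/* Replace each foreground entry of tile[] with labels[entry] (its
 * "grandparent"), remembering the most recent lookup so that a run of
 * equal labels costs one read of labels[] instead of one per entry.
 * Entries >= num_pixels are treated as background and left unchanged.
 * Returns the number of reads of labels[] that were performed. */
int lookup_grandparents_serial(int *tile, size_t tile_len,
                               const int *labels, int num_pixels)
{
    assert(tile != NULL && labels != NULL);

    int prev_key = -1;      /* assumption: no valid label is negative */
    int prev_result = -1;
    int global_reads = 0;

    for (size_t i = 0; i < tile_len; i++) {
        int key = tile[i];
        if (key >= num_pixels)
            continue;                   /* background cell */
        if (key != prev_key) {
            prev_key = key;
            prev_result = labels[key];  /* the only "global" read */
            global_reads++;
        }
        tile[i] = prev_result;          /* written for every foreground cell */
    }
    return global_reads;
}
```

Note that this sketch writes the looked-up value for every foreground entry, including the first entry of each run, so a tile with many distinct labels still gets fully updated even though the cache rarely hits.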
|
||
|
||
Part 5 | ||
|
||
Suppose that our current label sees 2 other labels, both of which are | ||
smaller than our current one. In that case, if we do atomic updates, the | ||
stored label becomes the minimum of these 3 labels. However, if we did a | ||
plain "min" first and a separate "reassignment" afterwards, the order in which | ||
the 2 mins and the 2 reassignments interleave can make a difference. For example: | ||
|
||
Suppose our current label at a square is 3, and there are two neighbors | ||
with labels 2 and 1. We would like to update 3 -> 1. However, when two | ||
different threads compute the min of (3,2) and (3,1), they will get 2 and 1. | ||
Now, assume that we assign that label to be 1, and THEN assign it to be 2. | ||
UH OH! The square now holds 2 instead of 1, so we will have to run for at | ||
least another iteration to fix it. |
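The interleaving described above can be replayed deterministically in plain C. This is only a single-threaded model of the two strategies, not OpenCL code: `racy_min_then_assign` and `atomic_style_min` are hypothetical helpers, and the `*_last` flags are an assumed way of choosing which simulated thread finishes second.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Non-atomic "min, then assign": both simulated threads read the cell
 * before either one writes, then the two writes land in the chosen
 * order. The later write wins, even if it carries the larger value. */
int racy_min_then_assign(int cell, int neighbor_a, int neighbor_b,
                         int a_writes_last)
{
    assert(a_writes_last == 0 || a_writes_last == 1);

    int result_a = min_int(cell, neighbor_a);  /* thread A reads cell */
    int result_b = min_int(cell, neighbor_b);  /* thread B reads cell */

    if (a_writes_last) {
        cell = result_b;
        cell = result_a;   /* overwrites B's result */
    } else {
        cell = result_a;
        cell = result_b;   /* overwrites A's result */
    }
    return cell;
}

/* Atomic-style min: each read-compare-write is indivisible, so the
 * second update always sees the value left by the first one. */
int atomic_style_min(int cell, int neighbor_a, int neighbor_b,
                     int a_runs_last)
{
    assert(a_runs_last == 0 || a_runs_last == 1);

    if (a_runs_last) {
        cell = min_int(cell, neighbor_b);
        cell = min_int(cell, neighbor_a);
    } else {
        cell = min_int(cell, neighbor_a);
        cell = min_int(cell, neighbor_b);
    }
    return cell;
}
```

With the values from the example, `racy_min_then_assign(3, 2, 1, 1)` leaves the cell at 2, while `atomic_style_min(3, 2, 1, 1)` leaves it at 1 regardless of the order.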
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -75,23 +75,78 @@ propagate_labels(__global __read_write int *labels, | |
// the local buffer is loaded | ||
barrier(CLK_LOCAL_MEM_FENCE); | ||
|
||
int current = buf_y * buf_w + buf_x; | ||
// Fetch the value from the buffer that corresponds to | ||
// the pixel for this thread | ||
old_label = buffer[buf_y * buf_w + buf_x]; | ||
old_label = buffer[current]; | ||
|
||
// CODE FOR PARTS 2 and 4 HERE (part 4 will replace part 2) | ||
|
||
/* | ||
if (old_label < w * h) | ||
{ | ||
buffer[current] = labels[old_label]; // grab grandparent | ||
} | ||
*/ | ||
|
||
|
||
if ((lx == 0) && (ly == 0)) | ||
{ | ||
int prev_key = -1000; | ||
int prev_result; | ||
|
||
for (int i = 0; i < buf_w * buf_h; i++) | ||
{ | ||
int this_label = buffer[i]; | ||
|
||
if (this_label >= w * h) | ||
continue; | ||
|
||
if (prev_key == this_label) | ||
{ | ||
buffer[i] = prev_result; | ||
} | ||
|
||
else | ||
{ | ||
prev_key = this_label; | ||
prev_result = labels[prev_key]; | ||
} | ||
} | ||
} | ||
|
||
|
||
|
||
barrier(CLK_LOCAL_MEM_FENCE); | ||
|
||
// stay in bounds | ||
if ((x < w) && (y < h)) { | ||
// CODE FOR PART 1 HERE | ||
// We set new_label to the value of old_label, but you will need | ||
// to adjust this for correctness. | ||
new_label = old_label; | ||
Review comment: After parts 2 and 4, you should use buffer[buf_w * buf_y + buf_x] instead of old_label. |
||
|
||
if (new_label < w * h) | ||
{ | ||
int this = buf_y * buf_w + buf_x; | ||
new_label = | ||
min(buffer[(buf_y + 1) * buf_w + buf_x], | ||
min(buffer[(buf_y - 1) * buf_w + buf_x], | ||
min(buffer[this + 1], | ||
min(buffer[this - 1], new_label | ||
)))); | ||
} | ||
|
||
if (new_label != old_label) { | ||
// CODE FOR PART 3 HERE | ||
// indicate there was a change this iteration. | ||
// multiple threads might write this. | ||
|
||
// | ||
atomic_min(&labels[old_label], new_label); | ||
|
||
atomic_min(&labels[y * w + x], new_label); | ||
|
||
*(changed_flag) += 1; | ||
labels[y * w + x] = new_label; | ||
Review comment: This should be removed, as you are doing the atomic_min in line 148. |
||
} | ||
|
Review comment: You should also update the buffer of the current index.
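One possible reading of the three review comments, taken together, is sketched below in plain C. It is a single-threaded model of one work item's update, not a drop-in replacement for the kernel: `update_pixel`, `min_into`, `tile`, `tile_h`, `num_pixels`, and `pixel_index` are hypothetical names, `min_into` is only a stand-in for OpenCL's `atomic_min`, and writing `new_label` back into the local buffer is an assumption about what "update the buffer of the current index" means.

```c
#include <assert.h>

static int min2(int a, int b) { return a < b ? a : b; }

/* Single-threaded stand-in for OpenCL's atomic_min: stores
 * min(*addr, value) and returns the previous contents. */
static int min_into(int *addr, int value)
{
    int old = *addr;
    if (value < old)
        *addr = value;
    return old;
}

/* Model of one work item's update. tile is the local buffer with a
 * one-cell halo, (bx, by) is this pixel's position inside it, and
 * pixel_index plays the role of y * w + x in the global labels array.
 * Returns 1 if the label changed, 0 otherwise. */
int update_pixel(int *tile, int tile_w, int tile_h, int bx, int by,
                 int *labels, int num_pixels, int pixel_index,
                 int old_label)
{
    assert(bx >= 1 && bx < tile_w - 1);   /* halo on the left and right */
    assert(by >= 1 && by < tile_h - 1);   /* halo above and below */

    int here = by * tile_w + bx;

    /* Start from the buffer value, which may already hold the
     * grandparent label, rather than from old_label. */
    int new_label = tile[here];
    if (new_label >= num_pixels)
        return 0;                                      /* background */

    new_label = min2(new_label, tile[here - tile_w]);  /* above */
    new_label = min2(new_label, tile[here + tile_w]);  /* below */
    new_label = min2(new_label, tile[here - 1]);       /* left  */
    new_label = min2(new_label, tile[here + 1]);       /* right */

    if (new_label == old_label)
        return 0;

    min_into(&labels[old_label], new_label);    /* merge the old parent */
    min_into(&labels[pixel_index], new_label);  /* no plain store afterwards */
    tile[here] = new_label;                     /* keep the local copy current */
    return 1;
}
```

In the kernel itself the two `min_into` calls would be `atomic_min` calls on `labels`, and the changed flag would be set where this sketch returns 1.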