Commit 8598de3

Add: post on copy semantics
1 parent 4377631 commit 8598de3

3 files changed

+250
-0
lines changed

_posts/cpp/2023-09-02-hello-world.md

+1
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
---
2+
authors: [gaurav]
23
layout: post
34
title: Hello World
45
categories: [cpp]
+246
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,246 @@
1+
---
2+
authors: [gaurav]
3+
layout: post
4+
title: Copy Semantics
5+
categories: [cpp]
6+
tags: [c++, c++98, c++03, copy, copy-constructor, copy-semantics]
7+
---
8+
9+
Copy semantics are the rules by which an object is copied when it is used to initialize another
object, assigned to another object, or passed to a function by value. A copy produces objects that
are `equivalent` and `independent`, i.e.
12+
13+
1. `source == destination`
14+
2. a modification to one object does not affect the other
15+
16+
{: file="copy.cpp" }
17+
18+
```c++
#include <cassert>

struct Rectangle {  // plain old data (POD) type
    int length;
    int breadth;
};

int area(Rectangle r) {  // pass by value, implicit copy
    return r.length * r.breadth;
}

int main() {
    int x = 10;
    int y = x;  // 1. copy-initialization, implicit copy
    assert(x == y);

    Rectangle rect {10, 20};
    assert(area(rect) == 200);  // 2. call area, rect is copied into parameter r
}
```
39+
40+
1. `int y = x;` copies the value `10` from variable `x` to `y`.
2. The call `area(rect)` copies the values stored in `rect.length` and `rect.breadth` to the
   parameter's `r.length` and `r.breadth`.
43+
44+
By default, the compiler generates a copy constructor and a copy assignment operator when they are
needed. Both perform a `member-wise copy`, i.e. they copy each member from source to destination.
46+
47+
For basic datatypes such as `int`, this just copies a single value (`y = x`). For plain old
datatypes (POD), it copies each member variable from source to destination (`r = rect`).
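
The two properties from the introduction, equivalence and independence, can be checked directly for
a compiler-generated copy. The following is a minimal sketch; the helper names `same` and
`copy_is_equivalent_and_independent` are made up for illustration and are not part of the post's
sample files.

```c++
#include <cassert>

struct Rectangle {
    int length;
    int breadth;
};

// Hypothetical helper: member-wise equality, since a plain struct has no built-in operator==.
bool same(const Rectangle& a, const Rectangle& b) {
    return a.length == b.length && a.breadth == b.breadth;
}

// Returns true when a compiler-generated copy is equivalent to, and independent of, its source.
bool copy_is_equivalent_and_independent() {
    Rectangle source = {10, 20};
    Rectangle destination = source;  // implicit copy constructor, member-wise
    bool equivalent = same(source, destination);

    destination.length = 99;         // modify only the copy
    bool independent = (source.length == 10) && (destination.length == 99);

    Rectangle other = {1, 1};
    other = source;                  // implicit copy assignment, also member-wise
    return equivalent && independent && same(other, source);
}
```

Calling `copy_is_equivalent_and_independent()` from `main` and asserting the result is enough to
confirm both properties for this POD type.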
49+
50+
This is also known as a `shallow copy`. For basic datatypes and POD this is not an issue, but for
user-defined classes/structures that own memory through pointer member variables it can be
problematic.
52+
53+
## Shallow Copy
54+
55+
{: file="shallow_copy.cpp" }
56+
57+
```c++
#include <cstddef>

struct DynamicArray {
    DynamicArray(size_t size)
        : m_size {size}
        , m_ptr {new int[m_size]}
    {}

    ~DynamicArray() {
        delete[] m_ptr;
    }

    size_t m_size;
    int* m_ptr;
};

int main() {
    DynamicArray arr(10);
    {
        DynamicArray arr_copy(arr);  // copies arr into arr_copy, member by member
    }
}
```
81+
82+
{: file="output" }
83+
{: .nolineno }
84+
85+
```bash
g++ -std=c++11 -fsanitize=address shallow_copy.cpp && ./a.out

=================================================================
==11528==ERROR: AddressSanitizer: attempting double-free on 0x604000000010 in thread T0:
    #0 0x7ff4a3010780 in operator delete[](void*)
```
92+
93+
As described earlier, the compiler-provided copy constructor performs a member-by-member copy, i.e.
`arr_copy.m_ptr = arr.m_ptr`.

Now the `m_ptr` of both objects points to the same address, which causes several problems.
97+
98+
1. `No independent modifications`: Changes made through `arr.m_ptr` are visible through
   `arr_copy.m_ptr` and vice versa.
2. `Undefined behavior`: `arr_copy` lives in the inner scope, so it is destroyed first and its
   destructor deletes the memory pointed to by `arr_copy.m_ptr`. Any later access through
   `arr.m_ptr` is undefined behavior, because the memory block it points to has already been
   deleted. A pointer to deleted memory is known as a `dangling pointer`.
3. `Double free`: When `arr` goes out of scope at the end of `main`, its destructor deletes the
   same memory a second time. This is the error shown in the output.
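
The first problem, shared memory, can be observed safely with a small sketch. `ShallowArray` is a
hypothetical type introduced only for this illustration: it mirrors `DynamicArray` but has no
destructor, so the aliasing can be shown without triggering the double free.

```c++
#include <cassert>
#include <cstddef>

// Hypothetical type for illustration: like DynamicArray but with no destructor,
// so the memory is released manually and exactly once.
struct ShallowArray {
    std::size_t m_size;
    int* m_ptr;
};

// Returns true when a write through the copy is visible through the original.
bool shallow_copies_share_memory() {
    ShallowArray arr = {3, new int[3]()};  // three zero-initialized ints
    ShallowArray arr_copy = arr;           // member-wise copy: only the pointer value is copied

    arr_copy.m_ptr[0] = 42;                // write through the copy
    bool shared = (arr.m_ptr == arr_copy.m_ptr) && (arr.m_ptr[0] == 42);

    delete[] arr.m_ptr;                    // release once; arr_copy.m_ptr is now dangling
    return shared;
}
```

After the `delete[]`, `arr_copy.m_ptr` still holds the old address, which is exactly the dangling
pointer described in the second item.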
107+
108+
To deal with such issues we need `deep copy`.
109+
110+
## Deep Copy
111+
112+
The issue with shallow copy is that it copies values directly, even for pointer variables.
To fix this, a deep copy first allocates separate memory for each pointer member variable and then
copies the contents from the source memory to the newly allocated memory.
Each copy now owns its own set of data, even if that data includes references or
pointers to other objects.
117+
118+
A deep copy is implemented explicitly by the programmer by providing a user-defined
`copy constructor` and `copy assignment operator`.
120+
121+
![Copy Semantics](/assets/img/cpp/copy.svg){: width="800" }
122+
123+
As the image shows, after the copy operation the members of `destination` and `source` point to
different addresses but have the same contents (shapes).
125+
126+
### Copy Constructor
127+
128+
{: file="copy_constructor.cpp" }
129+
130+
```c++
#include <algorithm>
#include <cstddef>
#include <iostream>

struct DynamicArray {
    DynamicArray(size_t size)
        : m_size {size}
        , m_ptr {new int[m_size]()}  // value-initialized so the elements print as 0
    {}

    DynamicArray(const DynamicArray& o)
        : m_size {o.m_size}
        , m_ptr {new int[m_size]}  // 1. memory allocation
    {
        std::copy(o.m_ptr, o.m_ptr + o.m_size, m_ptr);  // 2. copy values from source to dest
    }

    ~DynamicArray() {
        delete[] m_ptr;
    }

    size_t m_size;
    int* m_ptr;
};

void printArray(DynamicArray arr) {  // pass by value, invokes copy constructor
    for (size_t i = 0; i < arr.m_size; ++i) {
        std::cout << arr.m_ptr[i] << std::endl;  // print the element, not the index
    }
}

int main() {
    DynamicArray arr(10);
    {
        DynamicArray arr_copy(arr);  // invokes copy constructor
    }
    printArray(arr);  // copy-constructs the parameter from arr
}
```
155+
156+
1. `m_ptr {new int[m_size]}`: allocates separate memory for `arr_copy.m_ptr`.
2. `std::copy(o.m_ptr, o.m_ptr + o.m_size, m_ptr);`: copies the values stored at the memory pointed
   to by `arr.m_ptr` into the newly allocated memory at `arr_copy.m_ptr`.
159+
160+
### Why Copy Constructor Takes Argument By Reference
161+
162+
The canonical signature of a copy constructor is `DynamicArray(const DynamicArray& o)`.

If it received its argument by value, invoking the copy constructor would first require a copy of
the argument, which would in turn invoke the copy constructor, which would again need a copy, and
so on recursively until the stack overflows. In fact, C++ rejects a constructor declared as
`DynamicArray(DynamicArray o)`.

So it takes its parameter by reference.
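
The cost of passing by value can be made visible by counting copy constructor calls. This is a
minimal sketch; `Tracker`, `by_value` and `by_reference` are hypothetical names used only here.

```c++
#include <cassert>

// Hypothetical type that counts how many times its copy constructor runs.
struct Tracker {
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    static int copies;
};
int Tracker::copies = 0;

void by_value(Tracker) {}             // the parameter is a new object built by the copy constructor
void by_reference(const Tracker&) {}  // the parameter is just another name for the argument

// Returns the number of copies made by one pass-by-value call.
int copies_made_by_value() {
    Tracker t;
    Tracker::copies = 0;
    by_value(t);
    return Tracker::copies;
}

// Returns the number of copies made by one pass-by-reference call.
int copies_made_by_reference() {
    Tracker t;
    Tracker::copies = 0;
    by_reference(t);
    return Tracker::copies;
}
```

Passing by value costs one copy per call and passing by reference costs none. A copy constructor
that took its own parameter by value would need that one copy before it could even start.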
169+
170+
### Why Copy Constructor Takes Const Argument
171+
172+
Taking the argument as `const` prevents accidental modifications to the source object. A const
reference (`const &`) also allows the copy constructor to accept `const` objects and
`temporary objects`.
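
Both effects of `const &` can be sketched with a small hypothetical type `Box` (not part of the
post's sample files): copying from a `const` source, and binding a const reference to a temporary.

```c++
#include <cassert>

// Hypothetical type with the canonical copy constructor signature.
struct Box {
    explicit Box(int v) : value(v) {}
    Box(const Box& o) : value(o.value) {}  // const&: the source may be const
    int value;
};

// A const reference parameter can bind to a temporary; a non-const `Box&` could not.
int read(const Box& b) { return b.value; }

int copy_from_const_source() {
    const Box original(5);
    Box copy(original);   // would not compile if the constructor took `Box&`
    return copy.value;
}

int read_from_temporary() {
    return read(Box(7));  // Box(7) is a temporary bound to `const Box&`
}
```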
174+
175+
### Copy Assignment
176+
177+
The copy constructor solves only half of the problem. The same issues appear if a copy assignment
operator is not provided.
179+
180+
```c++
int main() {
    DynamicArray arr(10);
    DynamicArray other(5);
    other = arr;  // invokes compiler-provided copy assignment
}
```
187+
188+
Since both objects already exist, `other = arr` invokes the compiler-provided copy assignment
operator, which performs a member-by-member copy. This again leaves two pointers pointing to the
same memory, and the memory originally owned by `other` is leaked.
191+
192+
{: file="copy_assignment.cpp" }
193+
194+
```c++
// Member of DynamicArray; std::copy requires <algorithm>.
DynamicArray& operator=(const DynamicArray& o)
{
    if (this == &o) {  // 1. guard against self-assignment, arr = arr
        return *this;
    }
    delete[] m_ptr;  // 2. release the memory this object currently owns
    m_size = o.m_size;
    m_ptr = new int[m_size];
    std::copy(o.m_ptr, o.m_ptr + o.m_size, m_ptr);
    return *this;  // 3. return a reference to self to allow chaining
}

int main() {
    DynamicArray arr(10);
    DynamicArray other(1);
    arr = other;  // arr already exists, so this invokes copy assignment
}
```
213+
214+
The implementation of the copy assignment operator is almost the same as the copy constructor,
with three small but important differences.

1. Self-assignment check `if (this == &o)`: A statement such as `arr = arr` is a self-assignment.
   Without the check, the operator would first delete `m_ptr`, then allocate new memory and copy
   from that same uninitialized memory, so the object would lose its original contents.
2. Delete previously allocated memory `delete[] m_ptr;`: In an assignment both objects already
   exist and `m_ptr` might point to valid memory. Allocating new memory without deleting the old
   block would result in a memory leak.
3. `return *this;`: Returning a reference to self is not mandatory, but without it you cannot
   chain assignments (`a = b = c`).
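
Putting the pieces together, the sketch below assembles the constructor, copy constructor, copy
assignment operator and destructor into one `DynamicArray` and checks the behaviors discussed in
the list. The zero-initialization in the first constructor and the two checking functions are
additions for this illustration only.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>

struct DynamicArray {
    explicit DynamicArray(std::size_t size)
        : m_size {size}
        , m_ptr {new int[m_size]()}  // zero-initialized; an addition for this sketch
    {}

    DynamicArray(const DynamicArray& o)
        : m_size {o.m_size}
        , m_ptr {new int[m_size]}
    {
        std::copy(o.m_ptr, o.m_ptr + o.m_size, m_ptr);
    }

    DynamicArray& operator=(const DynamicArray& o)
    {
        if (this == &o) {
            return *this;
        }
        delete[] m_ptr;
        m_size = o.m_size;
        m_ptr = new int[m_size];
        std::copy(o.m_ptr, o.m_ptr + o.m_size, m_ptr);
        return *this;
    }

    ~DynamicArray() { delete[] m_ptr; }

    std::size_t m_size;
    int* m_ptr;
};

// Self-assignment must leave the object unchanged.
bool self_assignment_keeps_contents() {
    DynamicArray a(2);
    a.m_ptr[0] = 7;
    DynamicArray& alias = a;  // assign through an alias to avoid self-assignment warnings
    a = alias;
    return a.m_size == 2 && a.m_ptr[0] == 7;
}

// Chained assignment works because operator= returns *this.
bool chained_assignment_gives_independent_copies() {
    DynamicArray a(1), b(2), c(3);
    c.m_ptr[0] = 9;
    a = b = c;                // b = c runs first, then a = b
    bool same_contents = a.m_size == 3 && b.m_size == 3 && a.m_ptr[0] == 9 && b.m_ptr[0] == 9;
    a.m_ptr[0] = 1;           // modify only a
    bool independent = b.m_ptr[0] == 9 && c.m_ptr[0] == 9;
    return same_contents && independent;
}
```

Compiling a small `main` that asserts both functions with `-fsanitize=address` should report
neither a double free nor a leak, under the assumption that the sketch matches the post's final
class.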
225+
226+
## Note
227+
228+
> If you find the need to provide a custom implementation for either the `copy constructor`,
229+
`copy assignment operator`, or `destructor`, it's a strong indication that you should consider
230+
providing custom implementations for all three of them. This principle is commonly referred to as
231+
the `Rule of Three`.
232+
{: .prompt-info }
233+
234+
## Conclusion
235+
236+
Understanding copy semantics is crucial for managing the behavior of your C++ programs and
237+
controlling how user defined structures are copied especially when pointer member variables are
238+
involved.
239+
240+
Copy semantics is an important concept, but it has downsides. Copying small objects is cheap, but
copying large ones can noticeably degrade performance when many temporary copies are created.

To address this inefficiency, C++11 introduced `move semantics`. If you want to optimize your code
and understand the inner workings of move semantics, read our next post on the topic.

assets/img/cpp/copy.svg

+3