How to walk through a container
Just a small article explaining how to work with the STL containers. Vector is taken as an example.
Аt the beginning I would like to say thanks to Anton.
The reason for this article is multiple examples of incorrect iteration seen everywhere in Internet:
vector<int>::iterator p2; // Declare iterator.
// Iterate through STL container.
for (p2 = v.begin (); p2 != v.end (); v++)
cout << "vector has " << *p2 << endl;;
return 0;
People see the code and use it everywhere without any doubts. We are going to explain why you should not code like that, to be precise; you can do it but better not. I will take a simple vector which contains int, as an example.
vector<int> v;
We are going to consider the examples from the most primitive to the sophisticated ones.
- It’s how a guy with the C-related past would write:
for ( unsigned int i = 0; i < v.size(); i++ )
cout << v[i] << endl;
This piece of code works because of the overloaded operator [].
Strictly speaking doing so is not too bad. Yes, you call .size() on each and every cycle iteration, but on almost each and every container .size() has O(1) so not a big deal.
What is probably more important is you have no STL-based control over iteration. If you misuse "i" (imagine if you have some more complex cycle) this will be your problem only. STL will not help you here as it might help you in some other situations as we will see below. - Most common (wrong) example:
for ( vector<int>::iterator I = v.begin(); I != v.end(); I++ )
cout << *I << endl;
Now we are using iterators. There are three things here that can be improved:- /Substitute I++ with ++I/. The reason is in the following code inside a vector (from STLPort):
_Self& operator++() {
_M_bump_up();
return *this;
}
_Self operator++(int) {
_Self __tmp = *this;
_M_bump_up();
return __tmp;
}
Pay attention to the difference: a post-fixed operator needs to create an object copy. It's an overhead! Use a prefixed writing if possible:for ( vector<int>::iterator I = v.begin(); I != v.end(); ++I )
cout << *I << endl; - I != v.end() has to be called for at each iteration of the cycle. You are lucky if your compiler is clever enough and knows how to take a constant out of the loop. However, the experienced C++ers try to make the compiler’s life easier, obviously, not harming readability:
for (vector<int>::iterator I = v.begin(), E = v.end(); I != E; ++I )
cout << *I << endl; - In case if we do not change the content in the cycle body, it worth to look at const_iterator as following:
for ( vector<int>::const_iterator I = v.begin(), E = v.end(); I != E; ++I )
cout << *I << endl;
However be aware there might be the times when you might not want to use const_iterator. See Meyers &Effective STL&, Item 26 for more details.
- /Substitute I++ with ++I/. The reason is in the following code inside a vector (from STLPort):
- Use algorithms:
ostream_iterator< int > output( cout, "\n" );
copy( v.begin(), v.end(), output );
No loop any longer as you see. Meyers devoted the whole chapter of his absolutely fabulous book to algorithms.
Yet one more option is to use for_each:template<typename T>
void output(const T& val) {
cout << val << endl;
}
for_each( v.begin(), v.end(), output<int> );
Put it simple: algorithms tell you to walk through a container via them but not via self-written loops. std::for_each and std::copy have to understand somehow what to do with the container given: that the third parameter is for. In some way it reminds good old qsort in C, but here everything is either template or functor.
Perhaps it might be interesting for you to compare for_each implementation (MSVC, STLPort) with the manual loops above.
MSVC:template<class _InIt,
class _Fn1> inline
_Fn1 for_each(_InIt _First, _InIt _Last, _Fn1 _Func)
{ // perform function for each element
_DEBUG_RANGE(_First, _Last);
_DEBUG_POINTER(_Func);
_CHECKED_BASE_TYPE(_InIt) _ChkFirst(_CHECKED_BASE(_First));
_CHECKED_BASE_TYPE(_InIt) _ChkLast(_CHECKED_BASE(_Last));
for (; _ChkFirst != _ChkLast; ++_ChkFirst)
_Func(*_ChkFirst);
return (_Func);
}
STLPort:// for_each. Apply a function to every element of a range.
template <class _InputIter, class _Function>
_STLP_INLINE_LOOP _Function
for_each(_InputIter __first, _InputIter __last, _Function __f) {
for ( ; __first != __last; ++__first)
__f(*__first);
return __f;
}
Iterator invalidation
If I mention about walking through a container I have to touch the iterator invalidation thing as well.
Remind the previous example:
for ( unsigned int i = 0; i < v.size(); i++ )
cout << v[i] << endl;
Actually, even if you use STL iterators instead you still have no guarantee, for example you may go *over* the container end. No wonder that neither "unsigned int" nor iterator has been protected from flying up to the sky, even if it sounds surprising. However, there are a few aspects that we do have to pay attention to.
An iterator can get invalid. Langer offers the following classification for invalid iterators. This classification looks pretty complete too me.
So if you are going to accept her terms you keep in mind the difference between "inconsistent iterator" and "iterator invalidation".
SGI offers a pretty impressive description on STL. For example: "A vector's iterators are invalidated when its memory is reallocated. Additionally, inserting or deleting an element in the middle of a vector invalidates all iterators that point to elements following the insertion or deletion point. It follows that you can prevent a vector's iterators from being invalidated if you use reserve() to preallocate as much memory as the vector will ever use, and if all insertions and deletions are at the vector's end."
So, there are simple rules for a vector:
- Avoid reallocation.
- No kidding with erase/insert:
vector<int> v(10);
for (vector<int>::iterator I = v.begin(); I != v.end(); ++I)
v.erase(I);
Use it the following way:vector<int> v(10);
for (vector<int>::iterator I = v.begin(); I != v.end();)
I = v.erase(I);
Notice, that for the security reasons we call for .end() at each loop iteration. I also would like to emphasize that what the SGI guys call "iterator invalidation", Langer calls "inconsistent iterator", which I think is more correct.
Now if you do need some really invalid iterator example, we can easily give you one:
char string[23] = "A string to be copied.";
char result[23];
copy(string+20, string+10, result);
For more details please go to Meyers and Langer; let them deal with it professionally. Here I would like to discuss a very practical aspect – how to fish for these things. There are two approaches obviously:
- Experience.
- Use STL debug mode.
MS STL and STLPort developers spent a lot of time and efforts to make this fishing process reliable and easy.
- Fishing for invalid iterators in MS STL:
- _SECURE_SCL
- _HAS_ITERATOR_DEBUGGING
(Note: we are not giving the direct links to MSDN as MS guys seem to have a habit of changing them at least a couple of times every week)
The keys are not equal and have to be used simultaneously. The funny thing is MS is debugging this debug mode :) - Fishing for invalid iterators in STLPort:
STLPort also has its own debug mode - _STLP_DEBUG.
I tried a couple of examples with invalid iterators. What I must say is I'm not happy with STLPort port at all. MS STL messages were clear and understandable, while for the STLPort ones you had to download the debugger and walk up the stack to find the reason for a crash. By the way it is not that easy to build STLPort by VS 2008 at all!
So, bottom line, know about invalid iterator and actively use of the STL DEBUG mode for fishing those situations out.
- Posted by: volodya 7.3.2009 at 05:31 0 comments
You're currently an anonymous user. Just browsing around? That's totally cool with us. We won't bug you until you're ready to write a comment. Otherwize you have to enter your OpenID credentials to log in. If you have not one, you can easily create it!
Example OpenIDs:
- http://openid.aol.com/yourname
- http://yourname.myopenid.com/
- https://me.yahoo.com/yourname (alternately, http://yahoo.com/ works too)
- http://claimid.com/yourname
- http://yourname.wordpress.com/
- http://yourname.blogspot.com/
- http://technorati.com/people/technorati/yourname
- http://yourname.pip.verisignlabs.com/
- http://yourname.livejournal.com/
- http://www.flickr.com/photos/yourname
WHAT'S NEW
- A couple of words about TDD
- Unit-test coding supposes to be one of the most significant methodological achievements of the industry, let’s say, for about last 15 years. The Internet is full of enthusiastic exclamations [1, ...
- February 21, 2010
- CodeExample plugin for Trac
- The Trac plugin for code examples colouring. It supports three types of examples - a simple, a correct one and an incorrect. Further details see at
- February 13, 2010
- Metrics - LoC
- This is going to be a small set of articles devoted to metrics. The first one is about LoC - Line of Code. I think that the first reaction on ...
- May 11, 2009
- Metric - Cyclomatic Complexity
- There is a simple and logic explanation: the more “if”, “while”, “for”, and etc. in code the higher the complexity of the code improvement, management, understandability and refactoring. Cyclomatic (here ...
- May 11, 2009
- SESE vs SEME
- SESE/SEME are terms of structural programming and were introduced at 80-s. Nothing new. However, experience shows that some programmers do not know about them till today. That’s why it makes ...
- April 08, 2009