GNU C++ hash_set vs STL std::set: my notebook
A set is a C++ container that stores unique elements. The C++ Standard Template library (STL) defines a C++ template set<T>
that is typically implemented as a binary search tree.
#include<set>
But the GNU C++ library also provides a (non-standard) hash-based set:
#include<ext/hash_set>
In the following code I've created some random #rs numbers and I print the time needed to insert/remove them from a (GNU/hash_set or STL ) set:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#include <string> | |
#include <iostream> | |
#include <cstdlib> | |
#include <ctime> | |
#include <sstream> | |
#if defined(__GNUC__) && defined(WITH_HASHSET) | |
#include <ext/hash_set> | |
namespace __gnu_cxx | |
{ | |
/* hack from http://www.moosechips.com/2008/10/using-gcc-c-hash-classes-wit | |
h-strings/ */ | |
template<> struct hash< std::string > | |
{ | |
size_t operator()( const std::string& x ) const | |
{ | |
return hash< const char* >()( x.c_str() ); | |
} | |
}; | |
} | |
typedef __gnu_cxx::hash_set<std::string, __gnu_cxx::hash<std::string> > set_of_string; | |
#else | |
#include <set> | |
typedef std::set<std::string> set_of_string; | |
#endif | |
using namespace std; | |
int main(int argc,char** argv) | |
{ | |
set_of_string rs_list; | |
for(int t=0;t<2;++t) | |
{ | |
srand(0U); | |
for(int i=0;i< 10000000;++i) | |
{ | |
ostringstream os; | |
os << "rs" << rand(); | |
string rs(os.str()); | |
if(t==0) | |
{ | |
rs_list.insert(rs); | |
} | |
else | |
{ | |
set_of_string::iterator r=rs_list.find(rs); | |
if(r!=rs_list.end()) rs_list.erase(r); | |
} | |
} | |
} | |
cout << "Time: "<< (((float)std::clock())/CLOCKS_PER_SEC) << "seconds.\n"; | |
return 0; | |
} |
STL version
$ g++ -Wall -O3 testset.cpp $ ./a.out Time: 109.27seconds.
GNU hash_set version
$ g++ -DWITH_HASHSET=1 -Wall -O3 testset.cpp In file included from /usr/include/c++/4.6/ext/hash_set:61:0, from jeter.cpp:7: /usr/include/c++/4.6/backward/backward_warning.h:33:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed (...) $ ./a.out Time: 49.69seconds.
That's it,
Pierre
1 comment:
g++ also now includes the (standard) unordered_set, from the header , that provides a tr1 (and c++11) compatible interface to a random-access unordered set. This should have the same (or very similar) performance characteristics to the hash set with the added benefit of being standard.
Post a Comment