[GiNaC-list] little optimization => lots of Memory usage
Alexei Sheplyakov
alexei.sheplyakov at gmail.com
Fri Oct 22 14:45:24 CEST 2010
Hello,
On Wed, Oct 20, 2010 at 8:47 PM, Cristobal Navarro <axischire at gmail.com> wrote:
> yes, it adds some memoery usage, but doing the math i found that it isnt too much,
> for the 8000x8000 problem, each string should be in the worst case length 21
> (for example string s = "Ld0x1d2x3d4x5d6x7d8x9" ), assuming std::string is 1
> byte per length unit:
> stringMemory <= 21 bytes * 64M <= 1344Mbytes <= 1.3GB
I'm afraid your calculation is wrong.
Let's forget about GiNaC for a moment and estimate the memory
usage of map<string, map<string, long> >. std::map is a red-black
tree. Every node has 3 pointers (2 children and parent), color (int),
and the actual data (key and value). So, every node consumes
3*sizeof(void *) + sizeof(int) + padding
+ sizeof(string) + sizeof(long) + strlen(TypicalString)
= 4*sizeof(void *) + sizeof(int) + padding
+ sizeof(long) + strlen(TypicalString)
(std::string is essentialy a char* as far as memory usage is
concerned). We got N trees each of N nodes, thus we need
(4*sizeof(void *) + sizeof(int) + padding + sizeof(long) +
strlen(TypicalString))*N^2
bytes. On x86_64 this will be (48 + strlen(TypicalString))*N^2.
If typical string is 16 characters long, this equals 64*N^2
# N kb
1000 62500
2000 250000
3000 562500
4000 1000000
To check this theory consider the following program (it doesn't use GiNaC):
// mapmap.cpp
#include <map>
#include <string>
#include <sstream>
#include <iostream>
#include <cstdlib>
#include <unistd.h>
#include <sys/resource.h>
using namespace std;
int main(int argc, char** argv)
{
cout << "sizeof(string) " << sizeof(string) << endl;
const int dflt_size = 800;
int size = dflt_size;
if (argc >= 2)
size = atoi(argv[1]);
if (size <= 10)
size = dflt_size;
map<string, map<string, long> > big_map;
for (int i = 0; i < size; ++i) {
map<string, long> m;
for (int j = 0; j < size; ++j) {
ostringstream oss;
oss << "Row " << i << " Col" << j; // ~ 16 chars
m[oss.str()] = i + j;
}
ostringstream oss;
oss << "Row" << i;
big_map[oss.str()] = m;
}
struct rusage rus;
getrusage(RUSAGE_SELF, &rus);
cout << size << "\t" << rus.ru_maxrss << std::endl << std::flush;
return 0;
}
$ g++ -O2 -Wall -pipe -march=native -o mapmap mapmap.cpp
$ for i in 1000 2000 3000 4000; do ./mapmap $i; done
On my system (Athlon 64 X2 running Debian GNU/Linux) it prints
# N kb
1000 112096
2000 450804
3000 1019464
4000 1833620
Experimental values are ~ 2x bigger than ones predicted by theory.
Perhaps I've missed a pointer to vtbl somewhere. Anyway, it's clear
that memory usage is O(N^2), and N = 8000 would consume
~ 2^2 * 2Gb = 8Gb (I'm unable to check this experimentally -- my
computer has 2Gb of RAM).
Conclusion: the binary tree is not suitable large amounts of data.
Perhaps you need to use a different data structure.
Also, use ex instead of ex* (ex is actually a wrapped pointer).
Hope this helps,
Alexei
More information about the GiNaC-list
mailing list