How is it possible to get an estimate for the memory requirement of a graph
in graph-tool?
I know that graph-tool is built upon C++ and Boost, and the adjacency list
is stored via a hash-map.
Apart from the cost of storing the vertex and edge indices as
`unsigned long` values, what is the memory overhead of the structures
used to store the graph?
For example, for a network of 1M vertices and 100M links without
attributes, how much real memory should I plan to use, excluding
temporaries?
Sorry if this question has been asked before, but I could not find it in
the previous mailing list posts.
> How is it possible to get an estimate for the memory requirement of a
> graph in graph-tool?
Yes, it is, and I should put this in the documentation somewhere.
> I know that graph-tool is built upon C++ and Boost, and the adjacency
> list is stored via a hash-map.
Not quite; we use an adjacency list based on std::vector<>.
> Apart from the cost of storing the vertex and edge indices as
> `unsigned long` values, what is the memory overhead of the structures
> used to store the graph?
We use a vector-based bidirectional adjacency list, so each edge appears
twice: once in the out-list of its source and once in the in-list of its
target. Each stored entry consists of two size_t (uint64_t) values, one
for the target/source vertex and one for the edge index, so each edge
costs 2 * 16 = 32 bytes. For each vertex we need a std::vector<>, whose
header is 24 bytes, plus a uint64_t to separate the in-/out-lists, so we
also need 32 bytes per node.
Therefore we need in total:
(N + E) * 32 bytes
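The per-edge and per-vertex accounting above can be checked with Python's
struct module. This is a sketch assuming a 64-bit platform and the common
three-pointer std::vector<> header layout; the constant names are mine,
not graph-tool's:

```python
import struct

# Per stored adjacency entry: target/source vertex index plus edge
# index, both 64-bit unsigned integers.
BYTES_PER_STORED_ENTRY = struct.calcsize("QQ")   # 16 bytes

# The bidirectional list stores every edge twice (in- and out-list).
BYTES_PER_EDGE = 2 * BYTES_PER_STORED_ENTRY      # 32 bytes

# Per vertex: a std::vector<> header (three pointers on typical 64-bit
# implementations) plus one uint64_t separating the in-/out-lists.
BYTES_PER_VECTOR_HEADER = struct.calcsize("PPP")                   # 24 bytes
BYTES_PER_VERTEX = BYTES_PER_VECTOR_HEADER + struct.calcsize("Q")  # 32 bytes

print(BYTES_PER_EDGE, BYTES_PER_VERTEX)
```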
> For example, for a network of 1M vertices and 100M links without
> attributes, how much real memory should I plan to use, excluding
> temporaries?
That would be:
(1,000,000 + 100,000,000) * 32 bytes = 3,232,000,000 bytes ≈ 3.01 GiB
In practice you will need a little more, since std::vector<> tends to
over-commit.
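The estimate can be reproduced with a small helper; `estimate_graph_memory`
is an illustrative name, not part of graph-tool's API:

```python
def estimate_graph_memory(num_vertices, num_edges):
    """Rough lower bound on graph-tool's adjacency-list memory:
    32 bytes per vertex plus 32 bytes per edge (each edge stored
    twice, 16 bytes per copy).  Excludes std::vector<> over-allocation,
    property maps, and temporaries."""
    return (num_vertices + num_edges) * 32

total = estimate_graph_memory(1_000_000, 100_000_000)
print(total)            # 3232000000 bytes
print(total / 2**30)    # ≈ 3.01 GiB
```

Note this is a lower bound: std::vector<> typically grows its capacity
geometrically, so the resident size can exceed it.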
I've tried to measure the RES memory requirement for such a graph with 1M
nodes and 100M links (using the `htop` command), but the result is almost
twice the estimate, a total of 6.2 GB.
Is there a reason why I get that figure?
I am on macOS 10.15.6 Catalina.