It's well understood, graphs can be conveniently represented as matrices/tables/relations, and they are equivalent to edge lists.
It might be interesting to discuss to what extent did "graph databases" (you know who you are!) get a foothold because relational database platforms were slow to develop convenient notations and algebras (libraries) for working with abstract graphs. As Hou points out, there is some justifiable skepticism about the argument that graph databases are somehow intrinsically "more efficient" than relational databases for working with graphs. This would be surprising, given the obsession with optimization and performance that dominated the database community for many years, while issues like usability were a bit neglected (leaving the door open for other communities to innovate graph databases and data visualization platforms.) (Another point, don't I sometimes need both relational and graph algebras?)
Because I'm looking mainly for expressive convenience (with good in-memory runtime performance) it's not enough to know that Datalog can represent any abstract graph. If I find textbook pseudocode for, say, maximum matching in graphs, or transitive closure or connected components, how hard will it be to program in the target graph programming system? I'm confident that Datalog, Recursive SQL, and Cymbal or Gremlin can all get the job done, but at what expressive cost (assuming my algorithm is not already a graph language primitive)? Will anyone still even recognize the algorithm.
graph databases win on actual data layouts in disk blocks. trying to make relational databases ‘understand’ graphs really happens only at the query language layer. the data on disk remains optimized for relational queries. i’m not aware of any relational database shipping a storage engine optimized for graphs. if you know any, please share. thanks :)
On the other hand, nearly everybody’s data fits in RAM nowadays, and NVMe is so fast the disk layout constraints have changed a lot. So maybe now’s a good time to rethink anyway.
But only if the data is valuable enough to cover the (much) higher costs that come with those assumptions.
There's more potential applicability/overlap for columnar relational engines (vs. row stores) - this 2023 paper offers some useful background: https://arxiv.org/pdf/2308.08702.pdf
https://relational.ai/product
Maybe I'm missing what you're saying here, but matrices are not "equivalent" to edge lists, they are abstract mathematical objects decoupled from whatever storage method you use in a computer. For example, SuiteSparse:GraphBLAS has four storage formats: dense, bitmap, sparse, and hypersparse. None of these are equivalent to edge lists.
Edge lists are a way to store a representation to a graph, but they are not a graph. Like a matrix, a graph is a abstract mathematical concept that has different methods of representing itself in computers. But mathematically, graphs and matrices are isomorphic, every graph is a matrix, and every matrix is a graph. The isomorphism extends to operations as well, every BFS is a matrix multiplication, and vice versa.
Can you expand on that?
When you consider that a graph and a matrix are isomorphic, doing vector matrix multiplication takes a vector with a set value, say row 4, and multiplies it by a matrix where row 4 has values present that represent edges to the nodes that are adjacent to it (ie "adjacency" matrix). The result is a vector with the next "step" in a BFS across the graph, do that in a loop and you step across the whole graph.
A cool result of this is, for example, taking an adjacency matrix and squaring it is the "Friend of a Friend" graph. It takes every node/row and multiplies it by itself, returning a matrix that are adjacent to the adjacencies of each node, ie, the friends (adjacencies of the adjacencies) of friends (adjacencies) of the nodes.
Deeper traversal are just higher powers, a matrix cubed are the friends of the friends of the friends. You literally say `A @ A @ A` in Python, and you're done, no dictionary or lists or loops or anything else.
A picture is worth a thousand words, see figure 7 of this paper:
https://arxiv.org/pdf/1606.05790.pdf
Also check out figure 8, this shows how incidence matrices can work to represent hyper and multi graphs. An pair of incidence matrices reprsent two graphs, one from nodes to edges and the other from edges to nodes, these are n by m and m by n. When you multiply them, you get a square adjacency matrix that "projects" the incidence into an adjacency. This can be used to collapse hypergraphs into simple graphs that use different semirings to combine the multiple edges.
For some pretty pictures of this kind of stuff, check out CoinBLAS (note I am not a crypto-bro, it was just a very handy extremely large multi-graph that I could easily download in chunks to play with):
https://github.com/Graphegon/CoinBLAS/
I believe what they are saying is that any graph can be represented by either a matrix or edge list. So they are equivalently capable representations of graphs. The "they" refers to matrices, not graphs, when they said "and they are equivalent to edge lists". That's how I read it anyways.
GP says "graphs can be conveniently represented as matrices", which I believe is totally correct.
The GP's claim that graphs (not matrices) are equivalent to edge lists is almost correct. The textbook definition of graphs is G = (V, E) , where V are the set of vertices and E are the set of edges. Implicitly E contains relationships between some of the vertices, so the only thing in the V which isn't part of an edge in E are vertices that are not related to any edge. So if you have a connected graph, you can unambiguously define it by way of a set of edges (~= "edge lists") without explicitly mentioning the vertices.
In short, I don't see any fundamental error in GP's quoted sentence unless you're trying to be extra pedantic.
Earlier versions of SQL did not support recursive queries (though there were vendor-specific extensions) so graph databases had a very real edge there. In fact, recent SQL versions have added "property-based" syntactic sugar for such queries to further improve ease of use.