How Queries Work
Cypher is a declarative query language for graphs, and should be simple to read and write for anyone who is familiar with SQL. Brahmand’s implementation of Cypher is based on the openCypher standard.
Brahmand processes each Cypher query in three high-level phases:
- Parse & Anchor Selection
  - Identify an “anchor” node to start traversal.
  - Current heuristic: pick the node with the most `WHERE` predicates; if tied, choose the node with more properties referenced in `RETURN`.
  - (Future: cost-based optimization.)
- Traversal Planning
  - Build ClickHouse Common Table Expressions (CTEs) that traverse edges using the main edge table and precomputed edge-index tables.
  - Apply `WHERE` filters as early as possible on the anchor to limit data volume.
- Join & Final SELECT
  - Join the intermediate CTEs once traversal reaches the target node(s).
  - Assemble the final `SELECT` with any remaining filters, `GROUP BY`, `ORDER BY`, and `LIMIT`.
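The anchor heuristic from the first phase can be sketched in a few lines of Python. This is a minimal illustration, not Brahmand's actual code: the `PatternNode` class, its fields, and `pick_anchor` are hypothetical names introduced here, and the counts are assumed to have been gathered during parsing.

```python
from dataclasses import dataclass


@dataclass
class PatternNode:
    """Hypothetical summary of one node in a MATCH pattern."""
    alias: str              # e.g. "p" in (p:Post)
    label: str              # e.g. "Post"
    where_predicates: int   # number of WHERE predicates on this node
    return_properties: int  # number of its properties referenced in RETURN


def pick_anchor(nodes: list[PatternNode]) -> PatternNode:
    """Most WHERE predicates wins; ties are broken by RETURN properties."""
    if not nodes:
        raise ValueError("pattern has no nodes")
    # Tuples compare element by element, so the second field only
    # matters when the first one is equal.
    return max(nodes, key=lambda n: (n.where_predicates, n.return_properties))


nodes = [
    PatternNode("p", "Post", where_predicates=1, return_properties=0),
    PatternNode("u", "User", where_predicates=0, return_properties=2),
]
anchor = pick_anchor(nodes)
print(anchor.alias, anchor.label)  # p Post
```

With one `WHERE` predicate on `Post` and none on `User`, the sketch picks `Post`, even though `User` has more properties in `RETURN`; the tie-breaker only applies when the predicate counts are equal.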
Example
Cypher Query

```cypher
MATCH (p:Post)-[:CREATED]->(u:User)
WHERE p.PostTypeId = 2
RETURN u.UserId AS UserId, u.DisplayName
ORDER BY p.created_date DESC
LIMIT 10;
```
ClickHouse SQL query
```sql
WITH Post_p AS (
    SELECT postId
    FROM Post
    WHERE PostTypeId = 2
),
CREATED_incoming_ab7d65838c AS (
    SELECT from_id, arrayJoin(bitmapToArray(to_id)) AS to_id
    FROM CREATED_incoming
    WHERE from_id IN (SELECT postId FROM Post_p)
)
SELECT u.UserId AS UserId, u.DisplayName
FROM User AS u
JOIN CREATED_incoming_ab7d65838c AS ab7d65838c
    ON ab7d65838c.to_id = u.UserId
JOIN Post_p AS p ON p.postId = ab7d65838c.from_id
GROUP BY UserId, u.DisplayName
ORDER BY p.created_date DESC
LIMIT 10
```
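To make the early-filtering step concrete, here is a rough Python sketch of how an anchor CTE such as `Post_p` might be rendered. The `anchor_cte` helper and its parameters are assumptions made for illustration; they are not part of Brahmand, and a real planner would validate identifiers and bind values rather than concatenate strings.

```python
def anchor_cte(table: str, alias: str, id_column: str, predicates: list[str]) -> str:
    """Render a CTE that selects the anchor's ID column with its filters applied."""
    name = f"{table}_{alias}"  # naming pattern mirrors Post_p in the example
    sql = f"{name} AS (SELECT {id_column} FROM {table}"
    if predicates:
        # Push every anchor predicate into the CTE so later steps see fewer rows.
        sql += " WHERE " + " AND ".join(predicates)
    return sql + ")"


print(anchor_cte("Post", "p", "postId", ["PostTypeId = 2"]))
# Post_p AS (SELECT postId FROM Post WHERE PostTypeId = 2)
```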
Explanation:
- Anchor Node: Only `Post` has a `WHERE` filter, so it becomes the anchor.
- Early Filtering: Applying `PostTypeId = 2` in the `Post_p` CTE limits the data scanned.
- Edge Traversal: Traverses the `CREATED` relationship via `CREATED_incoming`, expanding each bitmap of target IDs into rows with `arrayJoin(bitmapToArray(to_id))`.
- Final Join: Joins the `User` table to the edge CTE and `Post_p`, then applies `GROUP BY`, `ORDER BY`, and `LIMIT` to produce the final result.
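The same steps can be walked through in memory with a small Python simulation. This is only a sketch of the data flow: the toy rows, the user names, and the use of a Python `set` as a stand-in for the ClickHouse bitmap are all made up for illustration, and `ORDER BY` and `LIMIT` are left out.

```python
# Toy data; every value here is invented for illustration.
posts = [
    {"postId": 1, "PostTypeId": 2},
    {"postId": 2, "PostTypeId": 1},
    {"postId": 3, "PostTypeId": 2},
]
# Stand-in for CREATED_incoming: from_id -> set of target IDs (the bitmap).
created_incoming = {1: {10}, 2: {11}, 3: {10, 12}}
users = {10: "alice", 11: "bob", 12: "carol"}

# Early filtering: the Post_p CTE keeps only the matching post IDs.
post_p = {p["postId"] for p in posts if p["PostTypeId"] == 2}

# Edge traversal: expand each bitmap into one (from_id, to_id) row per target,
# which is the effect of arrayJoin(bitmapToArray(to_id)).
edge_rows = [
    (from_id, to_id)
    for from_id, targets in created_incoming.items()
    if from_id in post_p
    for to_id in sorted(targets)
]

# Final join plus GROUP BY: one output row per distinct user reached.
result = sorted({(to_id, users[to_id]) for _, to_id in edge_rows if to_id in users})
print(result)  # [(10, 'alice'), (12, 'carol')]
```

User 10 is reached from two posts but appears once in the result, which is the role `GROUP BY` plays in the generated SQL.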