Query

Feature Request: Support for "NOT IN" / Anti-Selector in CDC db.cdc.query Selectors
Summary: Add support for an "anti-selector" or NOT IN logic in CDC event selectors to allow querying all changes except those matching a given set of criteria (e.g., excluding specific properties, labels, operations, etc.).

Problem Statement: Currently, the CDC query system allows users to include specific changes via detailed selectors (e.g., filter by labels, changesTo, operation, etc.). However, there is no built-in way to express negative filtering or exclusion, such as querying for all changes except updates to certain properties or nodes with specific labels. This limitation forces users to either:
- Post-process results client-side to remove undesired changes, or
- Construct overly complex selectors to include every other desired case.
This results in inefficiency, verbosity, and increased client-side logic.

Proposed Feature: Introduce support for negation or anti-selectors in the CDC selector format. This could be achieved through an explicit "exclude" or "not" keyword, or extended support for NOT IN logic within fields like "labels", "changesTo", and "operation".

Example syntax 1, using "exclude":
{ "select": "n", "exclude": { "labels": ["SystemLog", "Audit"], "changesTo": ["lastSeen", "sessionToken"], "operation": ["u"] } }

Example syntax 2, using NOT IN logic:
{ "select": "n", "labels": {"NOT IN": ["SystemLog", "Audit"]}, "changesTo": {"NOT IN": ["lastSeen", "sessionToken"]}, "operation": {"NOT IN": ["u"]} }

Use Cases:
- Exclude noisy audit logs or system entities from streaming pipelines.
- Avoid processing frequent but low-value property changes (e.g., lastAccessed, viewCount).
- Simplify subscriptions to business-relevant changes while ignoring background updates.
- Reduce downstream filtering overhead in real-time processing or analytics systems.

Benefits:
- Enhances expressiveness of CDC query language.
- Reduces unnecessary computation and data transfer.
- Enables clearer, more maintainable CDC queries.
- Aligns with standard filtering capabilities found in query languages like Cypher and SQL.

Requesting Teams / Roles Impacted:
- Data engineering teams building CDC pipelines.
- Developers consuming real-time changes via db.cdc.query.
- Users of Neo4j Aura, Enterprise, and dedicated cloud environments.

Conclusion: Adding a negation or anti-selector mechanism would be a valuable enhancement to the CDC query API. It improves query flexibility, simplifies data processing workflows, and enables more targeted use of CDC in production environments. We would greatly appreciate consideration of this feature in a future Neo4j release.
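For reference, the client-side post-processing workaround mentioned in the problem statement might look roughly like the sketch below. It is only an illustration under stated assumptions: the flattened change dictionaries (with hypothetical keys "operation", "labels", and "changedProperties") are not the actual db.cdc.query result shape, and the exclusion semantics (a change is dropped only when it matches every field given in "exclude", with any overlap counting as a match within a field) are one possible reading of the proposal, not a defined behavior.

```python
from typing import Dict, List


def is_excluded(change: Dict, exclude: Dict) -> bool:
    """Return True when `change` matches every criterion present in `exclude`.

    Assumption: `change` is a simplified, hypothetical dict, not the real
    CDC event structure. Within a field, any overlap counts as a match.
    """
    checks = []
    if "labels" in exclude:
        checks.append(bool(set(change.get("labels", [])) & set(exclude["labels"])))
    if "changesTo" in exclude:
        changed = set(change.get("changedProperties", []))
        checks.append(bool(changed & set(exclude["changesTo"])))
    if "operation" in exclude:
        checks.append(change.get("operation") in exclude["operation"])
    # An empty exclude block excludes nothing.
    return bool(checks) and all(checks)


def filter_changes(changes: List[Dict], exclude: Dict) -> List[Dict]:
    """Keep only the changes that do not match the exclusion criteria."""
    return [c for c in changes if not is_excluded(c, exclude)]


# Exclusion criteria taken from example syntax 1 in the request.
EXCLUDE = {
    "labels": ["SystemLog", "Audit"],
    "changesTo": ["lastSeen", "sessionToken"],
    "operation": ["u"],
}

# Hypothetical sample changes for illustration only.
SAMPLE_CHANGES = [
    {"operation": "u", "labels": ["SystemLog"], "changedProperties": ["lastSeen"]},
    {"operation": "c", "labels": ["SystemLog"], "changedProperties": []},
    {"operation": "u", "labels": ["Person"], "changedProperties": ["name"]},
]

kept = filter_changes(SAMPLE_CHANGES, EXCLUDE)
```

Every consumer has to carry logic like this today, and the excluded changes are still transferred before being discarded, which is the overhead a server-side anti-selector would remove.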
0
Cosine similarity score is wrong/misleading in Neo4j
I've noticed when using db.index.vector.queryNodes(...) that the scores I get back are too high. But perhaps this is because I expected cosine similarity scores, and nothing in the docs explicitly tells me that the 'score' returned by this function isn't actually cosine similarity. The difference is quite large. To test, I ran a similarity search, selected two nodes and noted the score returned by queryNodes, then manually looked up those nodes, copied the embeddings, and calculated the cosine similarity locally. In one test, the true similarity is 0.67, but Neo4j reports 0.84. I assume this is already known and explained by the first A in ANN, but it certainly wasn't clear to me as a newbie. I think many users might assume this 'score' is actually cosine similarity, apply existing knowledge they have (like filtering out items where similarity is < 0.9), and find that things don't work as expected. Suggestions: you could make it clearer in the docs and course that this score is NOT cosine similarity and can differ by several tenths. Or, for the subset of matched items, actually calculate the cosine similarity (which is fast for just a handful, and would be even faster if I could indicate via options that my embeddings are normalised, so that the dot product alone gives the correct result). The workaround I'll use is to disregard the Neo4j score and calculate the similarity manually to apply further filters, which I assume is standard practice for those wishing to filter on the cosine similarity value.
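A minimal sketch of the manual check described above, in plain Python. One hedged observation: the numbers in this post are consistent with the score being a rescaling of cosine similarity into the range 0 to 1, that is score = (1 + cosine) / 2, since (1 + 0.67) / 2 = 0.835. As far as I know this is how the Neo4j vector index documentation describes the cosine score, but treat that mapping as an assumption to verify against the docs for your version. The helper names below are made up for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_to_score(cosine: float) -> float:
    """Assumed mapping from cosine similarity to the index score: (1 + cos) / 2."""
    return (1.0 + cosine) / 2.0


def score_to_cosine(score: float) -> float:
    """Inverse of the assumed mapping, to recover cosine similarity from a score."""
    return 2.0 * score - 1.0


# The similarity measured locally in the post, pushed through the assumed mapping.
expected_score = cosine_to_score(0.67)
```

If that mapping holds, a threshold such as cosine >= 0.9 would correspond to score >= 0.95, so the conversion could be applied to the returned score instead of recomputing the similarity from the raw embeddings.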
1