Design Philosophy & Architecture
This page covers the design principles, architecture, and performance characteristics of GraphCore.jl.
Design Philosophy
GraphCore.jl is designed around three key principles:
Performance-Oriented Design
- CSR Storage: Compressed Sparse Row format for cache-efficient traversal
- Type Specialization: Parametric types enable compiler optimizations and zero-cost abstractions
- Efficient Indexing: Direct O(1) access patterns with
Int32indexing for memory efficiency - Minimal Overhead: Compact memory layouts optimized for common graph operations
Flexible Storage Options
- Multiple Representations: CSR format (CoreGraph) for static analysis, adjacency lists (AdjGraph) for dynamic operations
- Unified Interface: All graph types implement the same
GraphInterfacefor consistent usage - Property Integration: Built-in support for type-safe vertex and edge properties
- External Array Support: Stable indexing schemes for user-managed data arrays
Type Safety and Reliability
- Compile-Time Checking: Parametric types catch common errors during compilation
- Clear Data Ownership: Explicit separation between graph structure, weights, and properties
- Stable Indexing: Consistent edge/vertex indices for reliable external array management
- Comprehensive Validation: Input checking and well-defined method contracts
Architecture Overview
GraphCore.jl
├── GraphInterface # Abstract interface definition
├── CoreGraph # CSR-based graphs (static, high-performance)
├── AdjGraph # Adjacency list graphs (dynamic, mutable)
└── Builders # Builder patterns for graph constructionPerformance Characteristics
| Operation | CoreGraph | WeightedGraph | AdjGraph | PropertyGraph | PropertyAdjGraph |
|---|---|---|---|---|---|
| Neighbor Access | O(1), ~2ns | O(1), ~2ns | O(1), ~2ns | O(1), ~2ns | O(1), ~2ns |
| Edge Lookup | O(degree), ~3ns | O(degree), ~3ns | O(degree), ~3ns | O(degree), ~3ns | O(degree), ~3ns |
| Bounds Checking | ✅ @boundscheck | ✅ @boundscheck | ✅ @boundscheck | ✅ @boundscheck | ✅ @boundscheck |
| @inbounds Safe | ✅ Performance | ✅ Performance | ✅ Performance | ✅ Performance | ✅ Performance |
| Add Edge | ✅ Efficient | ✅ Efficient | O(1) | ✅ Efficient* | O(1) |
| Remove Edge | ✅ Efficient | ✅ Efficient | O(degree) | ✅ Efficient* | O(degree) |
| Add Vertex | ✅ O(1) | ✅ O(1) | O(1) | ✅ O(1)* | O(1) |
| Remove Vertex | ✅ Efficient | ✅ Efficient | O(V+E) | ✅ Efficient* | O(V+E) |
| Input Validation | ✅ Comprehensive | ✅ Comprehensive | ✅ Basic | ✅ Comprehensive | ✅ Basic |
| Memory Overhead | Minimal | +weights | +pointers | +properties | +properties+pointers |
| Cache Efficiency | Excellent | Excellent | Good | Excellent** | Good** |
*PropertyGraph inherits mutation performance from its underlying graph type. **PropertyGraph inherits the performance characteristics of its underlying graph type.
Performance notes:
- Timings are median benchmarks on typical graphs (Petersen graph: 10 vertices, 15 edges)
- All operations benefit from
@inboundsoptimizations in performance-critical loops - Bounds checking can be disabled globally with
--check-bounds=nofor maximum speed - Edge lookup time depends on vertex degree but benefits from cache-efficient CSR layout
Design Decisions & Trade-offs
Why Julia-Idiomatic Bounds Checking?
- Safety First: All vertex/edge access is safe by default with clear error messages
- Performance When Needed:
@inboundsprovides zero-cost optimization for validated access - Familiar Pattern: Follows Julia's array indexing conventions that users already know
- Trade-off: Small overhead in tight loops, but eliminates silent corruption bugs
Why CSR for CoreGraph?
- Cache Efficiency: Neighbors stored contiguously in memory
- Space Efficiency: No pointer overhead compared to adjacency lists
- Index Stability: External arrays remain valid during graph analysis
- Efficient Mutations: Direct array manipulation preserves CSR benefits
- Trade-off: More complex mutation algorithms, but maintains performance characteristics
Why Dual Indexing?
- Flexibility: Support both shared and directional edge properties
- Performance: O(1) access during iteration via
edge_index(g, v, i) - Correctness: Clear separation between undirected and directed semantics
- Trade-off: Slightly more complex API, but with clear documentation
Why Multiple Graph Types?
- Specialization: Each type optimized for its use case
- Composability: PropertyGraph wraps any base type
- Migration Path: Easy conversion between representations
- Trade-off: More types to learn, but unified interface
When to Use Which Graph Type
CoreGraph{D} - Choose When:
- ✅ You need maximum performance for queries and analysis
- ✅ Graph structure is mostly static after initial construction
- ✅ Memory efficiency is important
- ✅ You'll be doing many neighbor traversals
WeightedGraph{W,D} - Choose When:
- ✅ You need type-safe edge weights
- ✅ Performance is critical (same as CoreGraph)
- ✅ You want built-in weight management
- ✅ You need directional weights in undirected graphs
AdjGraph{D} - Choose When:
- ✅ You'll be frequently adding/removing edges
- ✅ Graph construction is dynamic and unpredictable
- ✅ Flexibility is more important than memory efficiency
PropertyGraph{G,V,E} - Choose When:
- ✅ You need to attach custom data to vertices/edges
- ✅ Type safety for properties is important
- ✅ You want to compose with any underlying graph type
- ✅ Zero-overhead property access is required
Performance Best Practices
When to Use @inbounds
Use @inbounds in performance-critical inner loops where you can guarantee safety:
# ✅ SAFE: After explicit validation
function safe_fast_algorithm(g, valid_vertices)
# Pre-validate all vertices are in bounds
@assert all(has_vertex(g, v) for v in valid_vertices)
for v in valid_vertices
# Safe to use @inbounds - we validated vertices above
neighbors = @inbounds neighbor_indices(g, v)
for (i, neighbor) in enumerate(neighbors)
edge_idx = @inbounds directed_edge_index(g, v, i)
# Fast operations...
end
end
end
# ❌ UNSAFE: Don't use @inbounds on user input
function unsafe_example(g, user_vertex)
# DON'T DO THIS - user_vertex might be out of bounds!
neighbors = @inbounds neighbor_indices(g, user_vertex) # Potential crash
endBounds Checking Control
# For maximum performance in production (after thorough testing):
# julia --check-bounds=no script.jl
# For development and debugging (default):
# julia --check-bounds=yes script.jlMemory and Cache Optimization
# ✅ GOOD: Use CoreGraph for analysis workloads
analysis_graph = build_core_graph(edges; directed=false)
# ✅ GOOD: Batch operations for cache efficiency
function efficient_traversal(g)
for v in vertices(g)
neighbors = neighbor_indices(g, v) # O(1) view, cache-friendly
for neighbor in neighbors
# Process all neighbors together
end
end
end