Use inbounds in partial_distance2_coloring #250

Closed
amontoison wants to merge 1 commit into main from am/inbounds

@amontoison
Collaborator

I saw a big difference on my ANL laptop, but I don't know whether it comes from the CPU or from the @inbounds.
I will check with our benchmarks.

@amontoison added the "benchmark" label (Run benchmarks on PR) on Apr 24, 2025
@codecov

codecov Bot commented Apr 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (d9ec840) to head (b0884a1).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #250   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           15        15           
  Lines         1824      1825    +1     
=========================================
+ Hits          1824      1825    +1     

@github-actions
Contributor

Benchmark Results

Time (columns: benchmark, main, b0884a1..., ratio main/b0884a19e722e3...)
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.002 1.89 ± 0.012 ms 1.91 ± 0.017 ms 0.99
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.005 4.34 ± 0.028 ms 4.37 ± 0.036 ms 0.993
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.01 9.77 ± 0.084 ms 9.77 ± 0.16 ms 1
coloring/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 1.97 ± 0.027 s 1.94 ± 0.028 s 1.01
coloring/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.282 ± 0.0064 s 0.275 ± 0.0072 s 1.03
coloring/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.707 ± 0.02 s 0.713 ± 0.016 s 0.992
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 2.83 ± 0.017 ms 2.86 ± 0.026 ms 0.989
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 7.49 ± 0.049 ms 7.54 ± 0.085 ms 0.994
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 18.1 ± 0.25 ms 18.2 ± 0.33 ms 0.994
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 3.79 ± 0.047 s 3.73 ± 0.049 s 1.02
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 0.446 ± 0.0065 s 0.437 ± 0.015 s 1.02
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 1.33 ± 0.035 s 1.3 ± 0.0087 s 1.02
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.488 ± 0.01 ms 0.449 ± 0.011 ms 1.09
coloring/nonsymmetric/column/direct/n=1000/p=0.005 1.22 ± 0.0083 ms 1.12 ± 0.0088 ms 1.08
coloring/nonsymmetric/column/direct/n=1000/p=0.01 3.02 ± 0.0096 ms 2.8 ± 0.013 ms 1.08
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.506 ± 0.0098 s 0.439 ± 0.012 s 1.15
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.0813 ± 0.00039 s 0.073 ± 0.00071 s 1.11
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.204 ± 0.0014 s 0.18 ± 0.002 s 1.13
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.477 ± 0.01 ms 0.444 ± 0.01 ms 1.07
coloring/nonsymmetric/row/direct/n=1000/p=0.005 1.2 ± 0.0087 ms 1.13 ± 0.0085 ms 1.07
coloring/nonsymmetric/row/direct/n=1000/p=0.01 3.06 ± 0.013 ms 2.83 ± 0.015 ms 1.08
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.49 ± 0.01 s 0.436 ± 0.0065 s 1.13
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.0795 ± 0.0004 s 0.0721 ± 0.00068 s 1.1
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.198 ± 0.0027 s 0.185 ± 0.0038 s 1.07
coloring/symmetric/column/direct/n=1000/p=0.002 1.04 ± 0.0096 ms 1.04 ± 0.0093 ms 1
coloring/symmetric/column/direct/n=1000/p=0.005 2.29 ± 0.013 ms 2.29 ± 0.013 ms 1
coloring/symmetric/column/direct/n=1000/p=0.01 4.96 ± 0.023 ms 4.94 ± 0.024 ms 1
coloring/symmetric/column/direct/n=100000/p=0.0001 0.746 ± 0.016 s 0.739 ± 0.02 s 1.01
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.147 ± 0.0011 s 0.147 ± 0.0018 s 0.998
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.332 ± 0.0029 s 0.333 ± 0.0038 s 0.996
coloring/symmetric/column/substitution/n=1000/p=0.002 1.58 ± 0.014 ms 1.58 ± 0.015 ms 1
coloring/symmetric/column/substitution/n=1000/p=0.005 3.96 ± 0.024 ms 3.94 ± 0.022 ms 1
coloring/symmetric/column/substitution/n=1000/p=0.01 9.03 ± 0.052 ms 9.05 ± 0.058 ms 0.998
coloring/symmetric/column/substitution/n=100000/p=0.0001 1.47 ± 0.027 s 1.42 ± 0.018 s 1.03
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.226 ± 0.0032 s 0.224 ± 0.0027 s 1.01
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 0.591 ± 0.011 s 0.581 ± 0.0065 s 1.02
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.002 0.144 ± 0.0061 ms 0.142 ± 0.0049 ms 1.01
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.005 0.23 ± 0.0063 ms 0.225 ± 0.0076 ms 1.02
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.01 0.374 ± 0.0099 ms 0.361 ± 0.012 ms 1.03
decompress/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 0.0912 ± 0.0014 s 0.0884 ± 0.0018 s 1.03
decompress/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 30.5 ± 0.43 ms 30 ± 1.2 ms 1.02
decompress/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.0608 ± 0.00065 s 0.0591 ± 0.00065 s 1.03
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.146 ± 0.0056 ms 0.147 ± 0.005 ms 0.995
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.278 ± 0.0084 ms 0.277 ± 0.0088 ms 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.558 ± 0.012 ms 0.569 ± 0.015 ms 0.981
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 0.147 ± 0.0061 s 0.148 ± 0.0079 s 0.993
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 0.0326 ± 0.00037 s 0.0337 ± 0.00036 s 0.967
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 0.0647 ± 0.00058 s 0.0649 ± 0.00084 s 0.996
decompress/nonsymmetric/column/direct/n=1000/p=0.002 24.5 ± 0.9 μs 24 ± 1.2 μs 1.02
decompress/nonsymmetric/column/direct/n=1000/p=0.005 0.047 ± 0.0012 ms 0.0462 ± 0.0025 ms 1.02
decompress/nonsymmetric/column/direct/n=1000/p=0.01 0.0861 ± 0.0028 ms 0.0851 ± 0.0056 ms 1.01
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 24.2 ± 0.33 ms 23.9 ± 0.34 ms 1.01
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 4.23 ± 0.13 ms 4.15 ± 0.13 ms 1.02
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 11.7 ± 0.2 ms 11.6 ± 0.24 ms 1.01
decompress/nonsymmetric/row/direct/n=1000/p=0.002 24.5 ± 1.2 μs 23.5 ± 1.5 μs 1.04
decompress/nonsymmetric/row/direct/n=1000/p=0.005 0.0431 ± 0.0011 ms 0.0412 ± 0.0017 ms 1.05
decompress/nonsymmetric/row/direct/n=1000/p=0.01 0.078 ± 0.0025 ms 0.0726 ± 0.0043 ms 1.07
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 11 ± 0.23 ms 11.2 ± 0.24 ms 0.982
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 2.96 ± 0.056 ms 2.9 ± 0.092 ms 1.02
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 5.7 ± 0.14 ms 5.58 ± 0.17 ms 1.02
decompress/symmetric/column/direct/n=1000/p=0.002 23.8 ± 0.97 μs 24.7 ± 1.1 μs 0.964
decompress/symmetric/column/direct/n=1000/p=0.005 0.0442 ± 0.0013 ms 0.0462 ± 0.0018 ms 0.957
decompress/symmetric/column/direct/n=1000/p=0.01 0.0795 ± 0.0028 ms 0.0822 ± 0.0045 ms 0.968
decompress/symmetric/column/direct/n=100000/p=0.0001 21.3 ± 0.28 ms 22.9 ± 0.29 ms 0.931
decompress/symmetric/column/direct/n=100000/p=2.0e-5 3.61 ± 0.23 ms 3.82 ± 0.19 ms 0.946
decompress/symmetric/column/direct/n=100000/p=5.0e-5 11 ± 0.73 ms 11.2 ± 0.22 ms 0.982
decompress/symmetric/column/substitution/n=1000/p=0.002 0.0878 ± 0.0022 ms 0.0877 ± 0.0024 ms 1
decompress/symmetric/column/substitution/n=1000/p=0.005 0.173 ± 0.0062 ms 0.173 ± 0.0059 ms 1
decompress/symmetric/column/substitution/n=1000/p=0.01 0.34 ± 0.01 ms 0.341 ± 0.011 ms 0.998
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.0854 ± 0.0014 s 0.0836 ± 0.0026 s 1.02
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 22.9 ± 0.23 ms 22.8 ± 0.32 ms 1.01
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.0435 ± 0.00048 s 0.0434 ± 0.0006 s 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.733 ± 0.0089 ms 0.737 ± 0.0095 ms 0.994
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 2.59 ± 0.023 ms 2.6 ± 0.022 ms 0.997
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 8.02 ± 0.071 ms 8.03 ± 0.074 ms 0.999
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 1.51 ± 0.011 s 1.51 ± 0.036 s 0.999
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.131 ± 0.0021 s 0.134 ± 0.0025 s 0.981
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.493 ± 0.0096 s 0.489 ± 0.015 s 1.01
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.721 ± 0.0085 ms 0.723 ± 0.0093 ms 0.998
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 2.59 ± 0.022 ms 2.59 ± 0.022 ms 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 8.02 ± 0.073 ms 8.04 ± 0.1 ms 0.998
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 1.57 ± 0.008 s 1.58 ± 0.036 s 0.997
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.134 ± 0.0025 s 0.136 ± 0.0023 s 0.982
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.517 ± 0.012 s 0.512 ± 0.014 s 1.01
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.737 ± 0.0087 ms 0.732 ± 0.0084 ms 1.01
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 2.55 ± 0.022 ms 2.57 ± 0.025 ms 0.994
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 7.94 ± 0.089 ms 8.04 ± 0.059 ms 0.987
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 1.58 ± 0.0096 s 1.59 ± 0.02 s 0.994
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.144 ± 0.0016 s 0.146 ± 0.0026 s 0.986
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.519 ± 0.014 s 0.513 ± 0.02 s 1.01
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.323 ± 0.0091 ms 0.321 ± 0.0094 ms 1.01
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.78 ± 0.01 ms 0.778 ± 0.0088 ms 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 1.96 ± 0.05 ms 1.91 ± 0.0083 ms 1.03
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 0.333 ± 0.0059 s 0.313 ± 0.0065 s 1.06
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 0.0568 ± 0.0012 s 0.0562 ± 0.0015 s 1.01
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 0.144 ± 0.0026 s 0.145 ± 0.0021 s 0.994
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.734 ± 0.0084 ms 0.732 ± 0.0086 ms 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 2.6 ± 0.026 ms 2.58 ± 0.021 ms 1.01
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 8.04 ± 0.091 ms 7.99 ± 0.068 ms 1.01
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 1.52 ± 0.014 s 1.53 ± 0.02 s 0.995
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.13 ± 0.0013 s 0.131 ± 0.0027 s 0.994
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.49 ± 0.0093 s 0.493 ± 0.0078 s 0.994
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.724 ± 0.0097 ms 0.726 ± 0.0093 ms 0.996
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 2.59 ± 0.024 ms 2.6 ± 0.024 ms 0.997
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 8.01 ± 0.06 ms 8.02 ± 0.079 ms 0.999
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 1.59 ± 0.0071 s 1.57 ± 0.027 s 1.01
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.136 ± 0.0018 s 0.136 ± 0.0024 s 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.501 ± 0.011 s 0.512 ± 0.014 s 0.979
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.732 ± 0.0098 ms 0.735 ± 0.0099 ms 0.996
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 2.56 ± 0.025 ms 2.57 ± 0.021 ms 0.997
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 7.99 ± 0.072 ms 7.98 ± 0.08 ms 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 1.57 ± 0.03 s 1.6 ± 0.04 s 0.977
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.148 ± 0.0018 s 0.146 ± 0.0013 s 1.01
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.522 ± 0.015 s 0.517 ± 0.0051 s 1.01
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.321 ± 0.0095 ms 0.321 ± 0.0095 ms 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.775 ± 0.011 ms 0.778 ± 0.0077 ms 0.997
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 1.92 ± 0.0093 ms 1.92 ± 0.012 ms 0.999
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 0.317 ± 0.0055 s 0.322 ± 0.0041 s 0.984
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 0.055 ± 0.0013 s 0.0554 ± 0.0014 s 0.992
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 0.143 ± 0.00089 s 0.146 ± 0.0022 s 0.984
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.22 ± 0.0096 ms 0.225 ± 0.006 ms 0.978
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 0.402 ± 0.01 ms 0.401 ± 0.011 ms 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 0.662 ± 0.01 ms 0.658 ± 0.011 ms 1.01
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 0.13 ± 0.0015 s 0.13 ± 0.002 s 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.0435 ± 0.00051 s 0.0434 ± 0.00037 s 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.0776 ± 0.00097 s 0.0771 ± 0.00087 s 1.01
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.213 ± 0.0085 ms 0.214 ± 0.0049 ms 0.996
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 0.395 ± 0.01 ms 0.394 ± 0.01 ms 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 0.667 ± 0.011 ms 0.661 ± 0.011 ms 1.01
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 0.13 ± 0.0019 s 0.129 ± 0.0028 s 1.01
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.0419 ± 0.00045 s 0.0411 ± 0.00057 s 1.02
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.0747 ± 0.00097 s 0.0735 ± 0.0016 s 1.02
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.189 ± 0.0095 ms 0.191 ± 0.005 ms 0.992
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 0.357 ± 0.0099 ms 0.357 ± 0.01 ms 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 0.577 ± 0.012 ms 0.579 ± 0.012 ms 0.997
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 0.128 ± 0.0019 s 0.125 ± 0.0024 s 1.02
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.0403 ± 0.00064 s 0.0396 ± 0.00054 s 1.02
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.0728 ± 0.00075 s 0.0723 ± 0.001 s 1.01
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.112 ± 0.0021 ms 0.118 ± 0.0038 ms 0.951
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.137 ± 0.0052 ms 0.145 ± 0.0089 ms 0.943
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 0.13 ± 0.0045 ms 0.135 ± 0.0092 ms 0.967
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 26.2 ± 0.13 ms 26.2 ± 0.13 ms 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 21.9 ± 0.081 ms 21.9 ± 0.081 ms 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 23.8 ± 0.2 ms 23.8 ± 0.1 ms 1
time_to_load 0.21 ± 0.0078 s 0.209 ± 0.00071 s 1
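
Reading the timing table above: the last column appears to be the ratio main / PR, going by its header `main/b0884a19e722e3...`, so a value above 1 would mean the PR branch ran faster. Below is a minimal Python sketch under that assumption, recomputing the ratio for three rows copied from the table. The helper name `speedup` is hypothetical, and because the displayed times are already rounded, a recomputed ratio can differ from the reported one in the last digit.

```python
# Rows copied from the timing table: (benchmark, time on main, time on PR branch).
# Both times in a row use the same unit, so the ratio is unitless.
rows = [
    ("coloring/nonsymmetric/column/direct/n=100000/p=0.0001", 0.506, 0.439),
    ("coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.002", 1.89, 1.91),
    ("coloring/symmetric/column/direct/n=1000/p=0.002", 1.04, 1.04),
]


def speedup(main_time, pr_time):
    """Assumed meaning of the last column: main / PR (above 1 means the PR is faster)."""
    return main_time / pr_time


ratios = {name: speedup(main_time, pr_time) for name, main_time, pr_time in rows}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3g}")
```

Under this reading, only the coloring/nonsymmetric/column and coloring/nonsymmetric/row groups show a consistent gain (roughly 7 to 15 percent), while most other groups stay within a few percent of 1.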
Memory (columns: benchmark, main, b0884a1..., ratio main/b0884a19e722e3...)
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.002 0.475 k allocs: 2.35 MB 0.475 k allocs: 2.35 MB 1
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.005 0.485 k allocs: 3.96 MB 0.485 k allocs: 3.96 MB 1
coloring/nonsymmetric/bidirectional/direct/n=1000/p=0.01 0.495 k allocs: 6.99 MB 0.495 k allocs: 6.99 MB 1
coloring/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 0.555 k allocs: 0.712 GB 0.555 k allocs: 0.712 GB 1
coloring/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.525 k allocs: 0.24 GB 0.525 k allocs: 0.24 GB 1
coloring/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.545 k allocs: 0.426 GB 0.545 k allocs: 0.426 GB 1
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.638 k allocs: 2.83 MB 0.638 k allocs: 2.83 MB 1
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.64 k allocs: 4.82 MB 0.64 k allocs: 4.82 MB 1
coloring/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.64 k allocs: 8.33 MB 0.64 k allocs: 8.33 MB 1
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 0.645 k allocs: 0.812 GB 0.645 k allocs: 0.812 GB 1
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 0.645 k allocs: 0.272 GB 0.645 k allocs: 0.272 GB 1
coloring/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 0.645 k allocs: 0.472 GB 0.645 k allocs: 0.472 GB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.002 0.135 k allocs: 0.354 MB 0.135 k allocs: 0.354 MB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.005 0.135 k allocs: 0.578 MB 0.135 k allocs: 0.578 MB 1
coloring/nonsymmetric/column/direct/n=1000/p=0.01 0.135 k allocs: 0.967 MB 0.135 k allocs: 0.967 MB 1
coloring/nonsymmetric/column/direct/n=100000/p=0.0001 0.135 k allocs: 0.0931 GB 0.135 k allocs: 0.0931 GB 1
coloring/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.135 k allocs: 0.0335 GB 0.135 k allocs: 0.0335 GB 1
coloring/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.135 k allocs: 0.0558 GB 0.135 k allocs: 0.0558 GB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.002 0.135 k allocs: 0.354 MB 0.135 k allocs: 0.354 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.005 0.135 k allocs: 0.578 MB 0.135 k allocs: 0.578 MB 1
coloring/nonsymmetric/row/direct/n=1000/p=0.01 0.135 k allocs: 0.967 MB 0.135 k allocs: 0.967 MB 1
coloring/nonsymmetric/row/direct/n=100000/p=0.0001 0.135 k allocs: 0.0931 GB 0.135 k allocs: 0.0931 GB 1
coloring/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.135 k allocs: 0.0335 GB 0.135 k allocs: 0.0335 GB 1
coloring/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.135 k allocs: 0.0558 GB 0.135 k allocs: 0.0558 GB 1
coloring/symmetric/column/direct/n=1000/p=0.002 0.22 k allocs: 0.584 MB 0.22 k allocs: 0.584 MB 1
coloring/symmetric/column/direct/n=1000/p=0.005 0.23 k allocs: 0.932 MB 0.23 k allocs: 0.932 MB 1
coloring/symmetric/column/direct/n=1000/p=0.01 0.24 k allocs: 1.57 MB 0.24 k allocs: 1.57 MB 1
coloring/symmetric/column/direct/n=100000/p=0.0001 0.3 k allocs: 0.168 GB 0.3 k allocs: 0.168 GB 1
coloring/symmetric/column/direct/n=100000/p=2.0e-5 0.27 k allocs: 0.056 GB 0.27 k allocs: 0.056 GB 1
coloring/symmetric/column/direct/n=100000/p=5.0e-5 0.29 k allocs: 0.0995 GB 0.29 k allocs: 0.0995 GB 1
coloring/symmetric/column/substitution/n=1000/p=0.002 0.4 k allocs: 0.945 MB 0.4 k allocs: 0.945 MB 1
coloring/symmetric/column/substitution/n=1000/p=0.005 0.405 k allocs: 1.65 MB 0.405 k allocs: 1.65 MB 1
coloring/symmetric/column/substitution/n=1000/p=0.01 0.411 k allocs: 2.86 MB 0.411 k allocs: 2.86 MB 1
coloring/symmetric/column/substitution/n=100000/p=0.0001 0.416 k allocs: 0.276 GB 0.416 k allocs: 0.276 GB 1
coloring/symmetric/column/substitution/n=100000/p=2.0e-5 0.41 k allocs: 0.0897 GB 0.41 k allocs: 0.0897 GB 1
coloring/symmetric/column/substitution/n=100000/p=5.0e-5 0.41 k allocs: 0.159 GB 0.41 k allocs: 0.159 GB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/bidirectional/direct/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/bidirectional/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/bidirectional/substitution/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/column/direct/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/column/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/column/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/nonsymmetric/row/direct/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/nonsymmetric/row/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/nonsymmetric/row/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/symmetric/column/direct/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/symmetric/column/direct/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/symmetric/column/direct/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/symmetric/column/direct/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/symmetric/column/direct/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/symmetric/column/direct/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
decompress/symmetric/column/substitution/n=1000/p=0.002 0.045 k allocs: 0.197 MB 0.045 k allocs: 0.197 MB 1
decompress/symmetric/column/substitution/n=1000/p=0.005 0.045 k allocs: 0.419 MB 0.045 k allocs: 0.419 MB 1
decompress/symmetric/column/substitution/n=1000/p=0.01 0.045 k allocs: 0.803 MB 0.045 k allocs: 0.803 MB 1
decompress/symmetric/column/substitution/n=100000/p=0.0001 0.045 k allocs: 0.0782 GB 0.045 k allocs: 0.0782 GB 1
decompress/symmetric/column/substitution/n=100000/p=2.0e-5 0.045 k allocs: 19.1 MB 0.045 k allocs: 19.1 MB 1
decompress/symmetric/column/substitution/n=100000/p=5.0e-5 0.045 k allocs: 0.0409 GB 0.045 k allocs: 0.0409 GB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/nonsymmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.105 k allocs: 0.196 MB 0.105 k allocs: 0.196 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 0.105 k allocs: 0.202 MB 0.105 k allocs: 0.202 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 0.105 k allocs: 0.216 MB 0.105 k allocs: 0.216 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 0.12 k allocs: 19.1 MB 0.12 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.105 k allocs: 19.1 MB 0.105 k allocs: 19.1 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 0.045 k allocs: 0.115 MB 0.045 k allocs: 0.115 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/nonsymmetric/row/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 0.045 k allocs: 11.4 MB 0.045 k allocs: 11.4 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.002 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.005 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=1000/p=0.01 0.09 k allocs: 0.157 MB 0.09 k allocs: 0.157 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=0.0001 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=2.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :high2low, false}()/n=100000/p=5.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.002 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.005 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=1000/p=0.01 0.09 k allocs: 0.157 MB 0.09 k allocs: 0.157 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=0.0001 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=2.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:back, :low2high, false}()/n=100000/p=5.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.002 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.005 0.09 k allocs: 0.156 MB 0.09 k allocs: 0.156 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=1000/p=0.01 0.09 k allocs: 0.157 MB 0.09 k allocs: 0.157 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=0.0001 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=2.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.DynamicDegreeBasedOrder{:forward, :low2high, false}()/n=100000/p=5.0e-5 0.09 k allocs: 15.3 MB 0.09 k allocs: 15.3 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.002 0.05 k allocs: 0.116 MB 0.05 k allocs: 0.116 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.005 0.05 k allocs: 0.116 MB 0.05 k allocs: 0.116 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=1000/p=0.01 0.05 k allocs: 0.116 MB 0.05 k allocs: 0.116 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=0.0001 0.05 k allocs: 11.4 MB 0.05 k allocs: 11.4 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=2.0e-5 0.05 k allocs: 11.4 MB 0.05 k allocs: 11.4 MB 1
order/symmetric/column/SparseMatrixColorings.LargestFirst()/n=100000/p=5.0e-5 0.05 k allocs: 11.4 MB 0.05 k allocs: 11.4 MB 1
time_to_load 0.159 k allocs: 11.2 kB 0.159 k allocs: 11.2 kB 1

@amontoison
Collaborator Author

@gdalle The left side shows the timings with the modifications of this PR.
The right side shows the version on main.
[Screenshot from 2025-04-23 23-31-00]

@gdalle
Member

gdalle commented Apr 24, 2025

My stance on @inbounds hasn't changed: the unsafety is not worth the speed boost. I spent 2h yesterday fighting with an Enzyme segfault, which can't happen with our packages precisely because we don't use that macro. So let's keep it that way.

Even if we actually wanted to use it, we would need tests beforehand to guarantee that the indices are indeed within the bounds (that's what base Julia does for dense linalg). For this sparse case, it would require iterating through all the row values to verify that they are in the correct interval, so the check has complexity O(nnz).
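The O(nnz) validation described above could look roughly like the following. This is only a sketch: `has_valid_pattern` is a hypothetical helper, not a function of SparseMatrixColorings.jl, and it assumes the standard `SparseMatrixCSC` layout from the SparseArrays standard library.

```julia
using SparseArrays

# Hypothetical helper: returns true when the column pointers are consistent
# and every stored row index lies in 1:size(A, 1). Cost is O(n + nnz).
function has_valid_pattern(A::SparseMatrixCSC)
    m, n = size(A)
    colptr = A.colptr
    rv = rowvals(A)
    length(colptr) == n + 1 || return false
    first(colptr) == 1 || return false
    issorted(colptr) || return false
    last(colptr) - 1 <= length(rv) || return false
    # The O(nnz) part: visit every stored row index once.
    for k in 1:(last(colptr) - 1)
        1 <= rv[k] <= m || return false
    end
    return true
end

A = sparse([1, 2, 3], [1, 2, 3], [1.0, 1.0, 1.0])
ok_before = has_valid_pattern(A)

# Simulate a corrupted pattern by writing an invalid row index in place.
rowvals(A)[2] = -1
ok_after = has_valid_pattern(A)
```

Only after a check of this kind has passed would skipping bounds checks inside the coloring loops be defensible, and the check itself costs a full pass over the stored entries.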

Besides, speeding up coloring was useful for the paper to compare with ColPack, but in the real world, this kind of coloring is fast enough that no one will bat an eye anyway.

I don't have many red lines, but unsafe memory accesses and aliasing are the biggest ones.

@amontoison
Collaborator Author

amontoison commented Apr 24, 2025

Enzyme is another beast and I totally agree with you that we need to put some limit sometimes.

I agree that people can still do crazy things and break the sparsity pattern of a CSC matrix (though only a very small percentage will).
Maybe we should add a comment saying that the difference is only related to bounds checks, but my concern is that some loops are probably not vectorized because of them.
Bounds checks themselves should be cheap, so from what I understand the speed-up is only a side effect of removing them.
I think it is still valuable to explain the gap with C++ in the paper (just one sentence).

Never program in low-level languages, Guillaume: we don't have bounds checks in C or Fortran.
You need ASAN, MSAN, UBSAN (or Valgrind) to track errors. Always a pleasure. :)

@gdalle
Member

gdalle commented Apr 24, 2025

The fact that SMC is coded in a high-level language and memory-safe by default is an asset, not a curse. I'll happily add a couple lines to the paper explaining that, but I don't want to compromise the safety of our software to get these last percentage points and match ColPack. If someone inadvertently puts a -1 in the rowvals while constructing their sparse matrix, they should get an error, not a REPL crash with random corruptions of memory.
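To illustrate the behavior described here, the toy loop below uses default bounds checking. The names `count_colored`, `colors` and `neighbors` are illustrative only and do not come from SMC; `neighbors` plays the role of the row values.

```julia
# Toy stand-in for a coloring loop over neighbor indices.
function count_colored(colors::Vector{Int}, neighbors::Vector{Int})
    c = 0
    for v in neighbors
        # Bounds-checked access: an invalid index raises a BoundsError.
        c += colors[v] > 0
    end
    return c
end

colors = [1, 0, 2]
n_colored = count_colored(colors, [1, 2, 3])  # all indices valid

# A stray -1 is caught and reported as an exception.
caught = try
    count_colored(colors, [1, -1, 3])
    nothing
catch err
    err
end
```

With `@inbounds` on the loop, the same invalid index would be undefined behavior rather than a catchable `BoundsError`.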

@gdalle
Member

gdalle commented Apr 24, 2025

That being said, the present experiment is valuable and I thank you for running it! I just think it belongs in the paper as a comment, like you suggested, but not in the production-ready software. SciML and friends will already find a thousand ways to misuse our code, let's not give them one that is actually dangerous.

@amontoison
Collaborator Author

amontoison commented Apr 24, 2025

I agree with you 🙂
I was curious to understand what could cause the performance difference in the "simple" partial distance-2 coloring.
It is just an optimization that the compiler can't perform because of the bounds checks.

@amontoison
Collaborator Author

amontoison commented Apr 24, 2025

Another alternative is to add a macro @inbounds_smc that is a no-op by default and that nobody but us can modify.
It would be hardcoded in SMC.jl, but we could change its behavior for some benchmarks (with a local edit of the macro).
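A minimal sketch of what such a macro might look like, assuming a hardcoded constant named `USE_INBOUNDS` (hypothetical, as is the example function `sum_colors`). Nothing here exists in SMC.jl; it only illustrates the idea.

```julia
# Hypothetical switch, hardcoded in the package source. Flipping it to `true`
# in a local checkout would enable @inbounds for benchmarking purposes.
const USE_INBOUNDS = false

macro inbounds_smc(ex)
    if USE_INBOUNDS
        # Forward to Base's @inbounds when the switch is on.
        return esc(:(Base.@inbounds $ex))
    else
        # Default: a no-op that returns the expression unchanged.
        return esc(ex)
    end
end

# Example use on an indirect-access loop.
function sum_colors(colors::Vector{Int}, idx::Vector{Int})
    s = 0
    @inbounds_smc for i in idx
        s += colors[i]
    end
    return s
end

total = sum_colors([1, 2, 3], [1, 3])
```

Because the constant is read when the macro is expanded, the choice is fixed at compile time and users cannot toggle it through the public API.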

It would make us more comparable to ColPack, because it removes the Julia / C++ language difference and lets us compare the "real" algorithmic modifications, like what you did for the buckets.

But I am also fine with just a comment about the bound checks in the paper.

@gdalle
Member

gdalle commented Apr 24, 2025

Let's stick with the comment then

@amontoison
Collaborator Author

amontoison commented Apr 24, 2025

I'm closing the PR.

Conclusion: The compiler can't vectorize some operations because of bounds checks. We want to keep them in SMC.jl so that user mistakes are easy to detect.

We will add a comment about that in the paper (C++ vs Julia). One short sentence is enough.
We now know where the performance gap comes from.

@amontoison amontoison closed this Apr 24, 2025
@amontoison amontoison deleted the am/inbounds branch April 24, 2025 06:05
@gdalle
Member

gdalle commented Apr 24, 2025

Note that we haven't proven it's due to lack of vectorization. It may also be that the bounds checking itself takes significant time. This could be detected by profiling.
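One rough way to start separating the two hypotheses is to time the same indirect-access loop with and without `@inbounds`. The sketch below uses illustrative function names and random data, not the actual coloring code, and plain `@elapsed` timings are noisy. A more careful investigation could inspect the generated code with `@code_llvm` (from InteractiveUtils) to look for vector instructions, or use the `Profile` standard library.

```julia
# Two versions of the same gather-style loop; names are illustrative.
function gather_checked(x::Vector{Float64}, idx::Vector{Int})
    s = 0.0
    for i in idx
        s += x[i]
    end
    return s
end

function gather_unchecked(x::Vector{Float64}, idx::Vector{Int})
    s = 0.0
    @inbounds for i in idx
        s += x[i]
    end
    return s
end

x = rand(10^6)
idx = rand(1:length(x), 10^6)

# Warm up so that compilation time is excluded from the measurements.
gather_checked(x, idx)
gather_unchecked(x, idx)

t_checked = @elapsed gather_checked(x, idx)
t_unchecked = @elapsed gather_unchecked(x, idx)
println("checked: ", t_checked, " s, unchecked: ", t_unchecked, " s")
```

If the unchecked version is faster but its LLVM output contains no vector instructions, that would point toward the cost of the checks themselves rather than missed vectorization.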
