@erezsh I appreciate you pointing me to this repo earlier.
I was able to demonstrate about a 2x speedup for my specific DSL, but I was curious how general that speedup is, and I was also looking for a nice way to quantify whether any changes to the Cython code would provide further speedups.
I took a peek at the Cython code and I think it is ripe for optimization. Python types are being used everywhere, and if we were able to refactor those with pure C types, then I think there could be a large speed gain. Unfortunately, I'm not the greatest Cython coder, so I wasn't able to make any simple changes that did anything.
However, I did write a reproducible benchmark to quantify the speedup of Cython lark over CPython lark across a range of input sizes.
The timing module I use is timerit (which I wrote). It works similarly to timeit, but can work inline in existing code. I also use pandas, matplotlib, and seaborn to generate a nice figure showing the speedup over different input sizes.
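For context, the overall shape of the benchmark loop can be sketched with stdlib `timeit` (the actual script uses timerit, and `make_input` here is a hypothetical stand-in for the grammar generator, not the real helper):

```python
import timeit

def make_input(size):
    # Hypothetical stand-in for the random lark-grammar generator in the script.
    return "\n".join(f'rule_{i}: "tok_{i}"' for i in range(size))

def benchmark(parse, sizes=(16, 87, 158)):
    """Time `parse` over several input sizes, keeping the best of N runs
    (the same best-of style of measurement that timerit reports)."""
    results = {}
    for size in sizes:
        text = make_input(size)
        times = timeit.repeat(lambda: parse(text), number=1, repeat=5)
        results[size] = min(times)
    return results

# Usage with a trivial stand-in "parser":
best = benchmark(lambda s: s.count("\n"))
```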
The stdout of the script is:
```
Timed best=1.144 ms, mean=1.704 ± 0.4 ms for method=parse_python,size=16
Timed best=5.974 ms, mean=9.279 ± 1.5 ms for method=parse_python,size=87
Timed best=10.813 ms, mean=17.676 ± 2.1 ms for method=parse_python,size=158
Timed best=16.715 ms, mean=25.435 ± 3.1 ms for method=parse_python,size=229
Timed best=20.364 ms, mean=31.777 ± 4.4 ms for method=parse_python,size=299
Timed best=19.721 ms, mean=37.681 ± 8.5 ms for method=parse_python,size=370
Timed best=23.847 ms, mean=47.875 ± 7.2 ms for method=parse_python,size=441
Timed best=29.769 ms, mean=47.659 ± 11.1 ms for method=parse_python,size=512
Timed best=0.751 ms, mean=1.226 ± 0.2 ms for method=parse_cython,size=16
Timed best=4.176 ms, mean=6.391 ± 0.7 ms for method=parse_cython,size=87
Timed best=7.068 ms, mean=11.063 ± 1.5 ms for method=parse_cython,size=158
Timed best=8.398 ms, mean=13.830 ± 3.0 ms for method=parse_cython,size=229
Timed best=11.037 ms, mean=19.378 ± 4.7 ms for method=parse_cython,size=299
Timed best=13.332 ms, mean=25.704 ± 4.5 ms for method=parse_cython,size=370
Timed best=14.691 ms, mean=25.771 ± 6.9 ms for method=parse_cython,size=441
Timed best=20.890 ms, mean=31.757 ± 6.4 ms for method=parse_cython,size=512

Statistics:
                                  min style_key size_key              hue_key        method  size      mean
key
method=parse_cython,size=16  0.000751        {}       {}  method=parse_cython  parse_cython    16  0.001226
method=parse_python,size=16  0.001144        {}       {}  method=parse_python  parse_python    16  0.001704
method=parse_cython,size=87  0.004176        {}       {}  method=parse_cython  parse_cython    87  0.006391
method=parse_python,size=87  0.005974        {}       {}  method=parse_python  parse_python    87  0.009279
method=parse_cython,size=158 0.007068        {}       {}  method=parse_cython  parse_cython   158  0.011063
method=parse_cython,size=229 0.008398        {}       {}  method=parse_cython  parse_cython   229  0.013830
method=parse_python,size=158 0.010813        {}       {}  method=parse_python  parse_python   158  0.017676
method=parse_cython,size=299 0.011037        {}       {}  method=parse_cython  parse_cython   299  0.019378
method=parse_cython,size=370 0.013332        {}       {}  method=parse_cython  parse_cython   370  0.025704
method=parse_cython,size=441 0.014691        {}       {}  method=parse_cython  parse_cython   441  0.025771
method=parse_python,size=229 0.016715        {}       {}  method=parse_python  parse_python   229  0.025435
method=parse_python,size=370 0.019721        {}       {}  method=parse_python  parse_python   370  0.037681
method=parse_python,size=299 0.020364        {}       {}  method=parse_python  parse_python   299  0.031777
method=parse_cython,size=512 0.020890        {}       {}  method=parse_cython  parse_cython   512  0.031757
method=parse_python,size=441 0.023847        {}       {}  method=parse_python  parse_python   441  0.047875
method=parse_python,size=512 0.029769        {}       {}  method=parse_python  parse_python   512  0.047659

Speedup:
           min style_key size_key              hue_key        method      mean  speedup_mean  speedup_min
size
16    0.000751        {}       {}  method=parse_cython  parse_cython  0.001226      1.390526     1.523632
87    0.004176        {}       {}  method=parse_cython  parse_cython  0.006391      1.451871     1.430450
158   0.007068        {}       {}  method=parse_cython  parse_cython  0.011063      1.597770     1.529840
229   0.008398        {}       {}  method=parse_cython  parse_cython  0.013830      1.839145     1.990309
299   0.011037        {}       {}  method=parse_cython  parse_cython  0.019378      1.639875     1.845177
370   0.013332        {}       {}  method=parse_cython  parse_cython  0.025704      1.465973     1.479169
441   0.014691        {}       {}  method=parse_cython  parse_cython  0.025771      1.857683     1.623239
512   0.020890        {}       {}  method=parse_cython  parse_cython  0.031757      1.500740     1.425010

Average speedup
                  mean       std       min       25%       50%       75%       max
speedup_mean  1.592948  0.176645  1.390526  1.462448  1.549255  1.689692  1.857683
speedup_min   1.605853  0.206135  1.425010  1.466989  1.526736  1.678723  1.990309
```
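The speedup columns are just per-size ratios of the CPython time to the Cython time, which is easy to sanity-check against the printed tables (values below are copied from the size=512 rows):

```python
# Best (min) and mean times in seconds for size=512, taken from the tables above.
py_min, cy_min = 0.029769, 0.020890
py_mean, cy_mean = 0.047659, 0.031757

# speedup = CPython time / Cython time
speedup_min = py_min / cy_min    # matches the table's 1.425010
speedup_mean = py_mean / cy_mean  # matches the table's 1.500740
```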
The benchmark runs against the lark.lark grammar that ships with lark. I have a helper function to generate a "random" (not truly random, but simple) lark file to pass to the parser. One question I had: is there a way to use lark to generate a string from a grammar? If so, that would make benchmarking more complex grammars much easier.
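For the curious, such a helper can be quite small. This is a hypothetical sketch, not the actual helper from my script: it emits a deterministic lark-style grammar with a given number of rules, so input size can be swept while keeping runs reproducible.

```python
import random

def random_lark_text(size, seed=0):
    """Generate a simple, deterministic lark-style grammar with `size` rules.

    Hypothetical stand-in for the helper described above: seeded so the
    "random" grammar is reproducible across benchmark runs.
    """
    rng = random.Random(seed)
    lines = ["start: " + " ".join(f"rule{i}" for i in range(size))]
    for i in range(size):
        # Each rule matches a short literal token.
        lines.append(f'rule{i}: "{rng.choice("abcdef")}{i}"')
    return "\n".join(lines)
```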
Anyways, I hope this is helpful. Thanks again for this library!