Commit 24df0bc

Update docs

1 parent b9e3ec8 commit 24df0bc
3 files changed: 22 additions & 8 deletions

docs/Documentation.ipynb (22 additions & 8 deletions)
@@ -743,20 +743,20 @@
 "source": [
 "# Profiling of Divide and Conquer (memory and time)\n",
 "We discuss the results of the profiling of the Divide et Impera algorithm.\n",
-"Profiling is performed using the `submit.sh` file in the `shell` folder, which internally \n",
+"Profiling is performed using the `submit.sh` file in the `shell` folder, which internally calls `scripts/profiling_memory_and_time.py`.\n",
 "We begin by discussing the memory consumption of the method, studying how it varies with respect to the matrix size and the number of processes, and comparing it to `numpy`'s and `scipy`'s built-in `eig` functions.\n",
 "\n",
 "*IMPORTANT*: please notice that we did not use `scipy.sparse`'s solver, as it cannot be used to retrieve all the eigenvalues, which would have made the comparison unfair.\n",
 "\n",
 "![Memory profiling](plots/memory_profiling.png)\n",
 "\n",
-"It is possible to see that cumulative memory consumption does not really depend on the number of processes, and that for low values of $n$ it behaves better than `numpy` and `scipy`, while performance degradates for high values of $n$.\n",
+"It is possible to see that cumulative memory consumption increases as the number of processes does.\n",
 "\n",
 "Now we do the same for runtime vs matrix size and number of processes.\n",
 "\n",
-"![Time profiling](images/plot_time.png)\n",
+"![Time profiling](plots/time_profiling.png)\n",
 "\n",
-"Based on this plot, we would be tempted to say that not only the execution time is much bigger that it is for `numpy` and `scipy`, but it might also seem that our method does not scale with respect to the number of processes.\n",
+"Based on the previous plot, we would be tempted to say that not only is the execution time much larger than it is for `numpy` and `scipy`, but also that our method does not scale with respect to the number of processes.\n",
 "However, running the file `shell/time_profile.sh` a single time, we notice that this is likely a problem related to how `time.time()` records its results.\n",
 "\n",
 "Running, for instance,\n",
@@ -765,21 +765,35 @@
 "```\n",
 "we get the following results:\n",
 "```text\n",
-"Some results\n",
+"[D&I] Total execution time: 0.3199 s\n",
+"[NumPy] Total execution time: 0.0388 s\n",
+"[SciPy] Total execution time: 0.0690 s\n",
 "```\n",
 "Re-running with `n_procs=2`, we obtain\n",
 "```text\n",
-"Even more results\n",
+"[D&I] Total execution time: 0.2230 s\n",
+"[NumPy] Total execution time: 0.8741 s\n",
+"[SciPy] Total execution time: 0.0364 s\n",
 "```\n",
 "Finally, for `n_procs=4`, we obtain\n",
 "```text\n",
-"Final results\n",
+"[D&I] Total execution time: 0.1842 s\n",
+"[NumPy] Total execution time: 0.0768 s\n",
+"[SciPy] Total execution time: 0.0362 s\n",
 "```\n",
+"(notice that there is some variance in the times taken by the other two methods, since `time.time()` is not extremely robust).\n",
+"\n",
 "The previous results suggest that the method scales well with the number of processes, and that its performance, while worse than `numpy`'s and `scipy`'s, compares much better than it first seemed to.\n",
 "We believe that the reason for this behavior is the execution of multiple scripts, which can have an impact on execution times as measured with `time.time()`.\n",
 "\n",
 "Notice that we parallelized everything that could be parallelized (except for the secular solver, which usually takes no more than $5\\%$ of the total time): the bottleneck is the Lanczos method, which cannot be parallelized.\n",
-"If the Lanczos method is not needed (that is, if the matrix $A$ of which we want to compute the eigenvalues and eigenvectors is already tridiagonal), then the execution time of our solver becomes comparable to the one of `numpy` and `scipy`."
+"If the Lanczos method is not needed (that is, if the matrix $A$ whose eigenvalues and eigenvectors we want to compute is already tridiagonal), then the execution time of our solver becomes comparable to that of `numpy` and `scipy`.\n",
+"\n",
+"*Remark*: in the plot used to profile execution times, the Lanczos method takes larger values than D&I when just one process is used.\n",
+"Of course this is not possible, since D&I includes Lanczos.\n",
+"However, the value that we plot for all the functions that do not depend on `n_procs` (including those of `numpy`, `scipy`, and Lanczos) is the average across all the runs with different numbers of processes.\n",
+"As a result, similar to what was remarked earlier for `numpy`'s eigenvalue solver, the execution time for large values of `n_procs` seems to increase, causing the average to grow and eventually exceed D&I's.\n",
+"However, running a single simulation with `shell/time_profile.sh` again tells us that this is not truly the case: the execution time of the Lanczos algorithm remains essentially the same as the number of processes increases."
 ]
},
{
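For reference, peak-memory figures like those behind `plots/memory_profiling.png` can be gathered with the standard library's `tracemalloc`. The sketch below only illustrates the technique; the commit's actual harness is `scripts/profiling_memory_and_time.py`, whose contents are not shown in this diff.

```python
import tracemalloc

import numpy as np


def peak_memory_mib(fn, *args):
    """Run fn(*args) and return the peak traced allocation in MiB."""
    tracemalloc.start()
    fn(*args)
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return peak / 2**20


# Compare numpy's dense solver across matrix sizes, as in the plot.
for n in (100, 200, 400):
    A = np.random.rand(n, n)
    A = (A + A.T) / 2  # symmetrize so the spectrum is real
    print(f"n={n}: numpy eig peak ~ {peak_memory_mib(np.linalg.eig, A):.1f} MiB")
```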
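On the `time.time()` robustness issue discussed above, here is a minimal sketch of a steadier timing harness, assuming a generic `solver` callable as a stand-in for any of the three methods being compared: `time.perf_counter()` is monotonic and high-resolution, and taking the best of several repeats damps interference from other scripts running on the same machine.

```python
import time

import numpy as np


def best_time(solver, *args, repeats=5):
    """Time solver(*args) several times and return the fastest run in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        solver(*args)
        best = min(best, time.perf_counter() - start)
    return best


A = np.random.rand(500, 500)
A = (A + A.T) / 2
# The minimum over repeats filters out one-off slowdowns that a single
# time.time() measurement would fold into the reported runtime.
print(f"[NumPy] best of 5: {best_time(np.linalg.eigh, A):.4f} s")
```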
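On the remark that the Lanczos step is the non-parallelizable bottleneck, the textbook sketch below (not the repository's implementation) shows why: each iteration of the three-term recurrence consumes the basis vectors produced by the previous one, so only the matrix-vector product inside the loop can be distributed across processes.

```python
import numpy as np


def lanczos_tridiag(A, k):
    """Reduce symmetric A to a k x k tridiagonal T (plain Lanczos, no
    reorthogonalization).  The loop is inherently sequential: step j
    needs q and q_prev from step j - 1."""
    n = A.shape[0]
    rng = np.random.default_rng(0)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(n)
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    b = 0.0
    for j in range(k):
        w = A @ q - b * q_prev  # only this product is parallelizable
        alpha[j] = q @ w
        w -= alpha[j] * q
        if j < k - 1:
            b = np.linalg.norm(w)
            beta[j] = b
            q_prev, q = q, w / b
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)


# Extremal eigenvalues of T approximate those of A.
A = np.random.rand(300, 300)
A = (A + A.T) / 2
T = lanczos_tridiag(A, 50)
print(np.linalg.eigvalsh(T)[-1], np.linalg.eigvalsh(A)[-1])
```

If $A$ is already tridiagonal this whole step is skipped, which matches the notebook's observation that the solver then becomes competitive with `numpy` and `scipy`.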

docs/plots/memory_profiling.png (1.12 KB)
docs/plots/time_profiling.png (60.7 KB)
