Updates from ShareLaTeX

Carl Pearson
2017-05-07 15:54:56 -07:00
parent 932c9f6770
commit 0be0e6b1c1

@@ -47,13 +47,16 @@ Fig.~\ref{fig:app_breakdown} shows the amount of time the full inverse-solver ap
``BW (32T)'' corresponds to a 32-thread OpenMP parallel run on a single XE node, and S822LC corresponds to a 160-thread OpenMP parallel run on the S822LC node.
Non-MLFMM operations account for a minority of the execution time, and for an even smaller proportion as the object reconstructions grow larger.
\begin{figure}[b]
\begin{figure}[h]
\begin{center}
\begin{tabular}{c}
\mbox{\psfig{figure=figures/cpu_matvec.pdf,width=8cm}}
\end{tabular}
\end{center}
\caption{A three-dimensional plot with gray-scale format.}
\caption{
Amount of application time spent in MLFMM for two different execution environments.
MLFMM is the dominant component even with CPU parallelization on a single node.
}
\label{fig:app_breakdown}
\end{figure}
@@ -61,30 +64,30 @@ Non-MLFMM operations are a minority of the time, and become an even smaller prop
\section{MLFMM Results}
As described in section \ref{sec:application} and shown in Table \ref{tab:components}, the MLFMM realization of matrix-vector multiplications forms the core computational kernel of the application, and its performance dominates that of the full inverse solver.
As described in Section~\ref{sec:application} and shown in Fig.~\ref{fig:app_breakdown}, the MLFMM realization of matrix-vector multiplications forms the core computational kernel of the application, and its performance dominates that of the full inverse solver.
This section presents an analysis of the performance of the MLFMM algorithm in three different environments.
\subsection{Evaluation Environments}
\begin{table}{}
\centering \caption{Evaluation Systems} \label{tab:systems}
\begin{tabular}{|c|c|c|c|}
\hline & \textbf{XK Node} & \textbf{XE Node} & \textbf{S822LC} \\
\hline
\hline \textbf{CPU 1} & AMD Opteron 6276 & AMD Opteron 6276 & IBM Power8 \\
\hline \textbf{CPU 2} & -- & AMD Opteron 6276 & IBM Power8 \\
\hline
\hline \textbf{GPU 1} & \makecell{K20X \\ (6 GB RAM) } & -- & P100 (16GB RAM) \\
\hline \textbf{GPU 2} & -- & -- & P100 (16GB RAM) \\
\hline \textbf{GPU 3} & -- & -- & P100 (16GB RAM) \\
\hline \textbf{GPU 4} & -- & -- & P100 (16GB RAM) \\
\hline \textbf{RAM} & 32GB & 64 GB & 512 GB \\
\hline \makecell{\textbf{CPU-GPU} \\ \textbf{Bus}} & PCIe & -- & NVLink \\
\hline
\end{tabular}
\end{table}
%\begin{table}{}
%\centering \caption{Evaluation Systems} \label{tab:systems}
%\begin{tabular}{|c|c|c|c|}
%\hline & \textbf{XK Node} & \textbf{XE Node} & \textbf{S822LC} \\
%\hline
%\hline \textbf{CPU 1} & AMD Opteron 6276 & AMD Opteron 6276 & IBM Power8 \\
%\hline \textbf{CPU 2} & -- & AMD Opteron 6276 & IBM Power8 \\
%\hline
%\hline \textbf{GPU 1} & \makecell{K20X \\ (6 GB RAM) } & -- & P100 (16GB RAM) \\
%\hline \textbf{GPU 2} & -- & -- & P100 (16GB RAM) \\
%\hline \textbf{GPU 3} & -- & -- & P100 (16GB RAM) \\
%\hline \textbf{GPU 4} & -- & -- & P100 (16GB RAM) \\
%\hline \textbf{RAM} & 32GB & 64 GB & 512 GB \\
%\hline \makecell{\textbf{CPU-GPU} \\ \textbf{Bus}} & PCIe & -- & NVLink \\
%\hline
%\end{tabular}
%\end{table}
The performance of MLFMM is evaluated in three different computing environments: Blue Waters XE nodes, Blue Waters XK nodes, and an IBM S822LC.
The performance of MLFMM is evaluated on three different computing systems: Blue Waters XE nodes, Blue Waters XK nodes, and an IBM S822LC.
The Blue Waters XE and XK nodes are two different kinds of computing nodes available on the Blue Waters supercomputer.
Each Blue Waters node is a two-socket system: the XE node has two AMD Opteron 6276 CPUs, each with eight floating-point units, hardware support for 16 executing threads, and $32$~GB of RAM.
The XK node replaces one of these CPUs with an NVIDIA K20X GPU with the Kepler architecture and $6$~GB of RAM.
@@ -100,7 +103,7 @@ The P100s are connected to the Power8 CPUs via $80$~GB/s NVLink connections.
All evaluations are done on a problem with these parameters. \todo{get from mert}
Fig.~\ref{fig:kernel_breakdown} shows the amount of of MLFMM execution time spent in computational kernels.
Fig.~\ref{fig:mlfmm_bw} shows the amount of MLFMM execution time spent in computational kernels.
\begin{figure}[b]
\begin{center}
@@ -109,10 +112,10 @@ Fig.~\ref{fig:kernel_breakdown} shows the amount of of MLFMM execution time spe
\end{tabular}
\end{center}
\caption{BW.}
\label{fig:kernel_breakdown}
\label{fig:mlfmm_bw}
\end{figure}
Fig.~\ref{fig:kernel_breakdown} shows the amount of of MLFMM execution time spent in computational kernels.
Fig.~\ref{fig:mlfmm_minsky} shows the amount of MLFMM execution time spent in computational kernels.
\begin{figure}[b]
\begin{center}
@@ -120,8 +123,8 @@ Fig.~\ref{fig:kernel_breakdown} shows the amount of of MLFMM execution time spe
\mbox{\psfig{figure=figures/mlfmm_minsky.pdf,width=8cm}}
\end{tabular}
\end{center}
\caption{A three-dimensional plot with gray-scale format.}
\label{fig:kernel_breakdown}
\caption{MLFMM execution time spent in computational kernels on the S822LC.}
\label{fig:mlfmm_minsky}
\end{figure}
@@ -136,7 +139,7 @@ Fig.~\ref{fig:kernel_breakdown} shows the amount of of MLFMM execution time spe
\mbox{\psfig{figure=figures/kernels.pdf,width=8cm}}
\end{tabular}
\end{center}
\caption{A three-dimensional plot with gray-scale format.}
\caption{Normalized breakdown of the computation time across different MLFMM kernels in different execution environments.}
\label{fig:kernel_breakdown}
\end{figure}
@@ -194,18 +197,18 @@ The papers are expected to be two-pages long.
\begin{table}{}
\centering \caption{Caption of the Table.} \label{table1}
\begin{tabular}{|c|c|c|c|}
\hline Item~1& Item~2
& Item~3 & Item~4\\
\hline\hline \multicolumn{4}{|c|}{Item~5} \\
\hline Item~6&
\multicolumn{3}{|c|}{Item~7}\\
\hline Item~8 & Item~9 & Item~10 & Item~11\\
\hline
\end{tabular}
\end{table}
%\begin{table}{}
%\centering \caption{Caption of the Table.} \label{table1}
%\begin{tabular}{|c|c|c|c|}
%\hline Item~1& Item~2
%& Item~3 & Item~4\\
%\hline\hline \multicolumn{4}{|c|}{Item~5} \\
%\hline Item~6&
%\multicolumn{3}{|c|}{Item~7}\\
%\hline Item~8 & Item~9 & Item~10 & Item~11\\
%\hline
%\end{tabular}
%\end{table}