@@ -34,7 +34,7 @@ For this first experiment, we provide empirical evidence that the JPEG formulati

...

@@ -34,7 +34,7 @@ For this first experiment, we provide empirical evidence that the JPEG formulati

\subsection{ReLu Approximation Accuracy}

\subsection{ReLu Approximation Accuracy}

\label{sec:exprla}

\label{sec:exprla}

Next, we with to examine the impact of the ReLu approximation. We start by examining the raw error on individual $8\times8$ blocks. For this test, we take random $4\times4$ pixel blocks in the range $[-1, 1]$ and scale them to $8\times8$ using a box filter. Fully random $8\times8$ blocks do not accurately represent the statistics of real images and are known to be a worst case for the DCT transform. The $4\times4$ blocks allow for a large random sample size while still approximating real image statistics. We take 10 million such blocks and compute the average RMSE of our Approximated Spatial Masking (ASM) technique and compare it to computing ReLu directly on the approximation (APX). This test is repeated for all one to fifteen spatial frequencies. The result, shown in Figure \ref{fig:rba} shows that our ASM method gives a better approximation (lower RMSE) through the range of spatial frequencies.

Next, we examine the impact of the ReLu approximation. We start by examining the raw error on individual $8\times8$ blocks. For this test, we take random $4\times4$ pixel blocks in the range $[-1, 1]$ and scale them to $8\times8$ using a box filter. Fully random $8\times8$ blocks do not accurately represent the statistics of real images and are known to be a worst case for the DCT. The $4\times4$ blocks allow for a large random sample size while still approximating real image statistics. We take 10 million such blocks, compute the average RMSE of our Approximated Spatial Masking (ASM) technique, and compare it to computing ReLu directly on the approximation (APX). This test is repeated for each number of spatial frequencies from one to fifteen. The result, shown in Figure \ref{fig:rba}, shows that our ASM method gives a better approximation (lower RMSE) throughout the range of spatial frequencies.
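
To make the error measure concrete, one way to write the per-block RMSE is sketched here. The notation is illustrative and is an assumption rather than a definition taken from this section: $x$ denotes the upsampled $8\times8$ block, $\hat{r}^{(n)}$ denotes the output of either method (ASM or APX) when $n$ spatial frequencies are used, and the error is assumed to be measured in the pixel domain against the exact ReLu.
\begin{equation}
\mathrm{RMSE}_n = \sqrt{\frac{1}{64}\sum_{i=0}^{7}\sum_{j=0}^{7}\left(\hat{r}^{(n)}_{ij} - \max\left(0, x_{ij}\right)\right)^2}
\end{equation}
Under these assumptions, the reported value for each $n$ would be the mean of $\mathrm{RMSE}_n$ over all sampled blocks.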

\begin{figure*}

\begin{figure*}

\centering

\centering

...

@@ -76,4 +76,4 @@ As a final test, we show that if the models are trained in the JPEG domain, the

...

@@ -76,4 +76,4 @@ As a final test, we show that if the models are trained in the JPEG domain, the

\label{fig:rt}

\label{fig:rt}

\end{figure}

\end{figure}

Finally, we show the throughput for training and testing. For this we test on all three datasets by training and testing a spatial model and training and testing a JPEG model and measuring the time taken. This is then converted to an average throughput measurement. The experiment is performed on an NVIDIA Pascal GPU with a batch size of \TODO images. The results, shown in Figure \ref{fig:rt}, show that the JPEG model is able to outperform the spatial model in all cases, but that the performance on training is still limited. This is likely because of the more complex gradient created by the convolution and ReLu operations. At inference time, however, performance is greatly improved over the spatial model.

Finally, we show the throughput for training and testing. We train and test a spatial model and a JPEG model on all three datasets and measure the time taken, which is then converted to an average throughput measurement. The experiment is performed on an NVIDIA Pascal GPU with a batch size of 40 images. The results, shown in Figure \ref{fig:rt}, show that the JPEG model outperforms the spatial model in all cases, but that the performance on training is still limited. This is likely because of the more complex gradient created by the convolution and ReLu operations. At inference time, however, performance is greatly improved over the spatial model.
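
As a rough sketch of the conversion from measured time to throughput, assume that throughput is reported in images per second (the unit is not stated in this section). With $N$ the number of images processed and $T$ the measured wall-clock time in seconds, both illustrative symbols, the average throughput would be
\begin{equation}
\mathrm{throughput} = \frac{N}{T}.
\end{equation}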