
Nvidia OpenCL failure on K40m GPU using driver version 375.26

DataCite Commons. Updated 2020-08-29; indexed 2024-07-27.
Download link:
https://figshare.com/articles/Nvidia_OpenCL_failure_on_K40m_GPU_using_driver_version_375_26/6533915
Description:
<b>Purpose</b>

This is a simple test to demonstrate the failure of NVIDIA's OpenCL runtime packaged with the RHEL-6 Linux x64 GPU driver version 375.26 on a Tesla K40m GPU.
See our [github link](https://github.com/arghdos/nvidia_failure_mwe) for a full description as well.

<b>Methodology</b>

We have packaged a stripped-down version of the OpenCL Jacobian generated by pyJac for a [361-species isopentanol model](https://github.com/Niemeyer-Research-Group/pyJac-paper/blob/master/data/IC5H11OH/Sarathy_ic5_mech.cti).
The user supplies the code with the paths to two different OpenCL implementations, and the code tests each of them for the correct output.
The code checks a simple summation of the forward and reverse stoichiometric coefficients in the model, as seen in lines 80-101 of `jacobian_kernel.ocl`:

```
int nu_rev = -1;
int nu_fwd = -1;
i_0 = simple_map[i];
offset_next = net_reac_to_spec_offsets[i_0 + 1];
offset = net_reac_to_spec_offsets[i_0];
i_2 = thd_mask[i_0];
i_1 = rev_mask[i_0];
for (int net_ind = offset; net_ind <= -1 + offset_next; ++net_ind)
{
    nu_fwd = nu_fwd + reac_to_spec_nu[1 + 2 * net_ind];
    nu_rev = nu_rev + reac_to_spec_nu[2 * net_ind];
#ifdef PRINT
    if (i_0 == 1988 && 64 * gid(0) + lid(0) == 867)
    {
        printf("%d\t%d\n", reac_to_spec_nu[1 + 2 * net_ind], reac_to_spec_nu[2 * net_ind]);
    }
#endif
}
if(64 * gid(0) + lid(0) == 867 && (i_0 == 1975 || i_0 == 1988))
{
    printf("rxn:%d, nu_fwd_sum:%d, nu_rev_sum:%d\n", i_0, nu_fwd, nu_rev);
}
```

Here `nu_fwd` and `nu_rev` are summed over the species in each reaction. For the 867th condition and reactions 1975 and 1988 (0-based) in the mechanism, we print the forward and reverse nu sums.
These reactions are:

```
# Reaction 1976 (note: this is 1-based indexing from cantera)
reaction('ic5h9oh-2ooh-4o2 <=> ic5ohket2-4 + oh', [1.250000e+10, 0.0, 19450.0])
# 6s beta, +2kcal
# Reaction 1989
reaction('ic5ohket2-4 => oh + ch3chco + ch2oh + ch2o', [1.000000e+16, 0.0, 39000.0])
# rev / 0.000e+00 0.00 0.000e+00 /
```

As `nu_fwd` and `nu_rev` are both initialized to -1, we'd expect, for reaction 1975 (three species):

```
nu_fwd = -1 + (1 + 0 + 0) = 0
nu_rev = -1 + (0 + 1 + 1) = 1
```

and indeed, the correct output is:

```
rxn:1975, nu_fwd_sum:0, nu_rev_sum:1
```

Similarly, for reaction 1988 (five species: one reactant and four products):

```
nu_fwd = -1 + (1 + 0 + 0 + 0 + 0) = 0
nu_rev = -1 + (0 + 1 + 1 + 1 + 1) = 3
```

and the correct output:

```
rxn:1988, nu_fwd_sum:0, nu_rev_sum:3
```

<b>Testing</b>

The test is run on NVIDIA's OpenCL driver both with and without the PRINT macro defined. When PRINT is defined we get the correct output; otherwise we get the following incorrect output:

```
rxn:1975, nu_fwd_sum:0, nu_rev_sum:0
rxn:1988, nu_fwd_sum:-1, nu_rev_sum:3
```

A complete test run, using the Intel runtime as the alternate implementation, gives the following output:

```
python nvidia_test.py -nv /usr/lib64/ -hp /apps2/cuda/8.0.61/include/ -on Intel -op /apps2/opencl_runtime/16.1.1/intel/opencl/lib64/
gcc -fPIC -O3 -std=c99 -xc jacobian_kernel_main.ocl jacobian_kernel_compiler.ocl timer.ocl read_initial_conditions.ocl ocl_errorcheck.ocl -I/gpfs/gpfs1/apps2/cuda/8.0.61/include -Wl,-rpath,/usr/lib64 -L/usr/lib64 -lOpenCL -o test.out

./test.out NVIDIA 896 1 1

rxn:1975, nu_fwd_sum:0, nu_rev_sum:0
rxn:1988, nu_fwd_sum:-1, nu_rev_sum:3
896,4.357185000000000e+03,1.999950000000000e+02,1.410000000000000e+01

gcc -fPIC -O3 -std=c99 -xc -DPRINT jacobian_kernel_main.ocl jacobian_kernel_compiler.ocl timer.ocl read_initial_conditions.ocl ocl_errorcheck.ocl -I/gpfs/gpfs1/apps2/cuda/8.0.61/include -Wl,-rpath,/usr/lib64 -L/usr/lib64 -lOpenCL -o test.out

./test.out NVIDIA 896 1 1

rxn:1975, nu_fwd_sum:0, nu_rev_sum:1
0 1
0 1
0 1
0 1
1 0
rxn:1988, nu_fwd_sum:0, nu_rev_sum:3
896,3.998452000000000e+03,2.075790000000000e+02,1.631400000000000e+01

gcc -fPIC -O3 -std=c99 -xc jacobian_kernel_main.ocl jacobian_kernel_compiler.ocl timer.ocl read_initial_conditions.ocl ocl_errorcheck.ocl -I/gpfs/gpfs1/apps2/cuda/8.0.61/include -Wl,-rpath,/gpfs/gpfs1/apps2/opencl_runtime/16.1.1/intel/opencl-1.2-6.4.0.25/lib64 -L/gpfs/gpfs1/apps2/opencl_runtime/16.1.1/intel/opencl-1.2-6.4.0.25/lib64 -lOpenCL -o test.out

./test.out Intel 896 1 1
Compilation started
Compilation done
Linking started
Linking done
Device build started
Device build done
Kernel was successfully vectorized (4)
Kernel was successfully vectorized (4)
Done.

rxn:1975, nu_fwd_sum:0, nu_rev_sum:1
rxn:1988, nu_fwd_sum:0, nu_rev_sum:3
896,5.648410000000000e+02,9.851000000000001e+00,1.066080000000000e+02
```

<b>Requirements</b>

- The OpenCL implementations must be capable of using `printf` in OpenCL code.
- The OpenCL library paths may point either to an OpenCL ICD loader (e.g., [ocl-icd](https://github.com/OCL-dev/ocl-icd)) or to the libraries directly.
- Only v375.26 of the Tesla driver has been tested. If you can test any other drivers and find the same bug, please feel free to file an issue so we may update the list of faulty drivers.
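The pass/fail check described under Testing can be sketched in Python. This is only an illustration, not the logic of `nvidia_test.py`: the names `expected_sums`, `parse_sums`, `mismatches`, and `EXPECTED` are hypothetical. The coefficient pairs for reaction 1988 are the ones printed in the `-DPRINT` run above, while those for reaction 1975 are assumed from the worked sums and their order is a guess.

```python
import re

# Pattern for the kernel's diagnostic line, e.g. "rxn:1975, nu_fwd_sum:0, nu_rev_sum:1".
RXN_LINE = re.compile(r"rxn:(\d+), nu_fwd_sum:(-?\d+), nu_rev_sum:(-?\d+)")


def expected_sums(pairs):
    """Mirror the kernel loop: both sums start at -1, then add each (fwd, rev) pair."""
    nu_fwd, nu_rev = -1, -1
    for fwd, rev in pairs:
        nu_fwd += fwd
        nu_rev += rev
    return nu_fwd, nu_rev


def parse_sums(output):
    """Map reaction index -> (nu_fwd_sum, nu_rev_sum) for each diagnostic line found."""
    return {int(r): (int(f), int(v)) for r, f, v in RXN_LINE.findall(output)}


def mismatches(output, expected):
    """Return {reaction: (got, want)} for reactions whose sums are wrong or missing."""
    got = parse_sums(output)
    return {rxn: (got.get(rxn), want)
            for rxn, want in expected.items() if got.get(rxn) != want}


# (fwd, rev) pairs per species; hypothetical inputs, see the note above the block.
EXPECTED = {
    1975: expected_sums([(1, 0), (0, 1), (0, 1)]),
    1988: expected_sums([(0, 1), (0, 1), (0, 1), (0, 1), (1, 0)]),
}

# Diagnostic lines copied from the NVIDIA run without PRINT and from the Intel run.
nvidia_no_print = ("rxn:1975, nu_fwd_sum:0, nu_rev_sum:0\n"
                   "rxn:1988, nu_fwd_sum:-1, nu_rev_sum:3\n")
intel = ("rxn:1975, nu_fwd_sum:0, nu_rev_sum:1\n"
         "rxn:1988, nu_fwd_sum:0, nu_rev_sum:3\n")

print(mismatches(nvidia_no_print, EXPECTED))  # both reactions are reported
print(mismatches(intel, EXPECTED))            # empty dict: sums agree
```

Parsing only the `rxn:` lines means the timing line and the Intel build messages in the captured output are ignored, so the same check can be applied to the raw output of either runtime.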
Provider:
figshare
Created:
2018-06-14