I can confirm that on my setup. Increasing the parallelization beyond 3-4 concurrent threads no longer increases the inference speed significantly.
This is a telltale sign that some of the cores are starving because data doesn’t arrive fast enough…
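If you want to pin down where the plateau starts on your own machine, here is a minimal sketch. It assumes you have already measured throughput (e.g. tokens per second) at several thread counts yourself. The function name `saturation_point` and the 10% threshold are my own illustrative choices, not part of any inference tool.

```python
def saturation_point(throughput_by_threads, min_gain=0.10):
    """Return the thread count after which adding more threads stops paying off.

    throughput_by_threads: dict mapping thread count -> measured throughput
                           (any unit, as long as it is the same for every entry).
    min_gain: hypothetical threshold; a step counts as "no real improvement"
              when the relative throughput gain is below this value.
    Returns None if every step still improves throughput by at least min_gain.
    """
    counts = sorted(throughput_by_threads)
    # Compare each measured thread count with the next larger one.
    for prev, nxt in zip(counts, counts[1:]):
        gain = throughput_by_threads[nxt] / throughput_by_threads[prev] - 1.0
        if gain < min_gain:
            return prev
    return None
```

Pass in your own measurements, one entry per thread count you tested. A plateau found this way is consistent with the cores waiting on data, but it does not prove that cause on its own, so treat it as a hint rather than a diagnosis.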