Memory Copy and Latency
One problem that a multi-die processor faces is of poor latency. It's fine if the data you are grabbing is sitting, quite literally, very close to the core in question, yet it's not optimal if you have to traverse multiple Infinity Fabric links. Appreciating the similarity between Threadripper and the sever-centric Epyc chips, you can read all about how this 'hopping' can cause potential hurdles here.
This is one reason why AMD chose to introduce a Game Mode that uses local memory alone, thus reducing latency to 64.8ns in our testing.