2 years ago
#71551
Dave
Duplicate reference key values for SSIS lookup cause package to hang
I have a package that is used to load a slowly changing dimension in a data warehouse. The package loads 0-20 new rows of data on the nightly process. When testing the full load capability, which would load about 300K rows of data, the package never completes. There is a warning that is generated both in the nightly runs and the full load run.
Extract MEMBER_COMMUNICATION_DIM: Warning: The MEMBER_COMMUNICATION_DIM encountered duplicate reference key values when caching reference data. This error occurs in Full Cache mode only. Either remove the duplicate key values, or change the cache mode to PARTIAL or NO_CACHE.
It is true that there are duplicates in the lookup data. My question is, why do these duplicate rows cause the package to never complete? And why is it only when loading a large data set?
The package is only using the lookup no match output to populate the table. My understanding from the lookup tool is that it will return the first match that it finds, so if you are using the matched data then having duplicates can be problematic, since you cannot guarantee which row will match. However, since the lookup tool is only being used to check that the row does not already exist in the table, I am not sure why it is causing this package to never complete.
This is the lookup query being used SELECT MEM_COMM_KEY,MEM_COMM_HASH FROM MEMBER_COMMUNICATION_DIM
The lookup is matching on the hash column from the table, to the hash column on the source table.
ssis
0 Answers
Your Answer