The Problem
An FTP receive adapter is picking files up twice and creating duplicates during processing. This article investigates the cause and then gives a workaround.
Analysis
It is well known that the FTP adapter can pick a file up twice if the operation that creates the file is not atomic. In our case the application writes to a staging folder and then moves the file to a completed folder. We can dismiss this as the reason for our duplicates because the final move is atomic.
1400+ files are created in a burst in the FTP folder. The settings of the BizTalk FTP adapter are shown below.
The pipeline in the receive location contains a SAB archiving component, and this is how we first saw the duplicate file issue. In BizTalk 2013R2, if a SAB component tries to archive a file with the same name as an existing file, it creates a file with the same name with a GUID attached to it. Thus we observe evidence of file duplicates like this:
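For readers who want to see what the write-then-move pattern looks like on the producing side, here is a minimal sketch. It is not the application's actual code: the function name and folder arguments are hypothetical, and it assumes the staging and completed folders are on the same filesystem, which is what makes the final move atomic.

```python
import os


def publish_atomically(staging_dir, completed_dir, name, data):
    """Write `data` into staging_dir, then move it into completed_dir.

    Hypothetical helper for illustration only. A poller watching
    completed_dir never sees a half-written file, because the file only
    appears there through a single rename.
    """
    staging_path = os.path.join(staging_dir, name)
    final_path = os.path.join(completed_dir, name)

    # Write the whole file where the poller is not looking.
    with open(staging_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # os.replace is a single rename; it is atomic when both paths are on
    # the same filesystem (assumed here).
    os.replace(staging_path, final_path)
    return final_path
```

With this pattern the consumer only ever sees complete files, which is why a partially written file can be ruled out as the cause here.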
Searching the FTP adapter log for these duplicate file names, I first found a listing of all the files in the FTP folder, like this:
Secondly, the log showed the same file being retrieved twice, followed by a second listing. Note that the default behaviour is to retrieve 20 files in a batch, delete those files, and then retrieve the next 20 until all 1400+ files have been retrieved.
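The expected batch behaviour can be modelled in a few lines. This is only a simplified sketch of the retrieve-then-delete cycle described above, not the adapter's implementation; the function name is hypothetical and the "remote folder" is just an in-memory list.

```python
def drain_in_batches(remote_files, batch_size=20):
    """Model the expected cycle: retrieve a batch, delete it, repeat.

    Returns the list of batches in the order they were retrieved.
    Illustrative model only; the real adapter talks to an FTP server.
    """
    remaining = list(remote_files)
    batches = []
    while remaining:
        batch = remaining[:batch_size]       # "retrieve" up to batch_size files
        batches.append(batch)
        remaining = remaining[batch_size:]   # "delete" the retrieved files
    return batches
```

In this model every file is retrieved exactly once, because each batch is removed before the next one is taken. A duplicate in the real log therefore suggests that a delete did not take effect before the following listing or retrieval.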
Finally a delete occurs for the same file.
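Spotting the repeated retrievals by eye is tedious with 1400+ files. A small sketch like the one below can help, assuming the retrieved file names have already been extracted from the adapter log into a list. The extraction step is left out because it depends on the log format, and the function name is hypothetical.

```python
from collections import Counter


def find_duplicate_retrievals(retrieved_names):
    """Return {file name: count} for every name retrieved more than once."""
    counts = Counter(retrieved_names)
    return {name: n for name, n in counts.items() if n > 1}
```

An empty result means every file was retrieved exactly once.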
I was surprised by this behaviour because the FTP adapter should not do this. At this stage I recreated the exact same issue on my development machine so I could study it more closely. The results for the same files were very reproducible.
I noticed that the duplicates always came at about the same point in the burst of files. I reasoned that somehow the deletion of the default batch of 20 files was being screwed up and that it had something to do with the large burst of files arriving at one time.
Workaround
With this thought in hand I found a workaround. I reasoned that the issue lay in repeatedly deleting batches of 20 files, so I reduced the batch size to 1.
Gratifyingly, running the same unit test as before gave no duplicates.
Conclusion
I have shown that tuning the Batch property on the FTP receive adapter can be useful, especially if you are observing duplicate pickups when processing a large volume of files. I think that this is a bug in BizTalk 2013R2 because the same unit test in BizTalk 2010 does not create duplicates.