I was recently troubleshooting a DFS issue for a customer that appeared sporadically for end users and made no sense to the on-site administrators. DFS was set up with two Windows Server 2008 R2 SP1 file servers as targets, and a domain-based DFS namespace was set up to publish two root paths.
Symptoms: End users experienced long delays when saving data to their home directory, which was mapped through the DFS namespace. They saw the same lag when reading from that directory. Yet it did not happen, and was not reproducible, from the desktop.
Troubleshooting: Dumping the client DFS referral, domain and provider caches showed nothing wrong. Mapping to each of the servers directly also worked as expected, so DNS and the referral queries to the domain controllers seemed to be fine as far as I could tell. Finally an end user hit the issue, and I was able to determine that this user only had the problem when DFS server 2 was the file server it received as its target from Active Directory.
I focused my attention on DFS file server 2 and started looking at the logs. The application log didn’t show anything helpful, but the DFS Replication log in the “crimson channel” did show warnings that the staging directory, the target replication directory, or both did not have enough free space. The warning was raised with event ID 4502.
The staging quota for the replication group was set to a very high value in MB so I was certain that it was not out of disk space. There was plenty of free space on the volume in disk management as well.
I then started looking outside of DFS for what could be causing the space issue for the staging directory. FSRM was in fact installed and in use. Both quotas and file screen templates had been set up on the directory that was being replicated by DFS, which included the staging directory in its default path, for example D:\Share\dfsrprivate. The quota was set to a hard limit of 5GB, which was in fact causing the free space issue on the staging directory no matter what quota size was set in DFSR. I believe that when the staging directory is running low on or out of free space, a cleanup process runs to free up more space so that replication can continue. Digging further, I opened the DFS Replication debug log, which by default is located in the C:\Windows\debug directory, and noticed many, many access is denied errors. The file types logged with access is denied were the same file types that the FSRM file screen template was configured to block.
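The two checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tool I ran on site: the function names, the quota paths and the file names are all hypothetical, and the list of denied files is assumed to have been pulled out of the debug log by hand. The first function asks whether any FSRM quota path sits at or above the staging directory; the second tallies denied files by extension and splits them by whether the file screen blocks that extension.

```python
from collections import Counter
from pathlib import PureWindowsPath


def covering_quota_paths(staging_dir, quota_paths):
    """Return quota paths that equal the staging directory or one of its ancestors."""
    staging = PureWindowsPath(staging_dir)
    candidates = (staging, *staging.parents)
    # PureWindowsPath compares case-insensitively, like Windows paths do.
    return [q for q in quota_paths if PureWindowsPath(q) in candidates]


def tally_denied(denied_files, screened_patterns):
    """Count denied files per extension, split by whether the extension is screened."""
    screened = {p.lower().lstrip("*") for p in screened_patterns}  # "*.mp3" -> ".mp3"
    counts = Counter(PureWindowsPath(f).suffix.lower() for f in denied_files)
    blocked = {ext: n for ext, n in counts.items() if ext in screened}
    other = {ext: n for ext, n in counts.items() if ext not in screened}
    return blocked, other


# Hypothetical inputs for illustration only.
quota_hits = covering_quota_paths(r"D:\Share\dfsrprivate", [r"d:\share", r"D:\Other"])
blocked, other = tally_denied(
    [r"D:\Share\a\track01.mp3", r"D:\Share\b\movie.AVI", r"D:\Share\c\notes.docx"],
    ["*.mp3", "*.avi"],
)
print(quota_hits, blocked, other)
```

If the first list is not empty, a quota applies to the staging directory regardless of what DFSR thinks its staging quota is. If nearly everything lands in the blocked dictionary, the file screen is the likely source of the access is denied entries.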
It clicked when I lined up three things: the time stamps of the free space warnings in the DFS Replication log in the crimson channel, the time stamps of the access is denied errors in the debug log, and the reports from staff and end users that the issue was sporadic when saving or reading files. I immediately opened perfmon to check the disk for I/O. Saving and opening = reading and writing to the disk. “Lag” = high disk I/O. The disk was in fact being hit harder than it should have been during normal operations.
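Lining up the time stamps by eye works, but the same comparison can be sketched in Python. This is a minimal sketch under assumptions: the time stamps below are invented for illustration, the five minute window is an arbitrary choice, and the function name is hypothetical. It pairs each free space warning with the access is denied errors logged close to it.

```python
from datetime import datetime, timedelta


def correlate(warnings, errors, window=timedelta(minutes=5)):
    """Map each warning time to the error times within +/- window of it."""
    return {w: [e for e in errors if abs(e - w) <= window] for w in warnings}


FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical time stamps, not taken from the real logs.
warnings = [datetime.strptime("2012-01-10 09:15:00", FMT)]
errors = [
    datetime.strptime(t, FMT)
    for t in ("2012-01-10 09:13:30", "2012-01-10 09:16:10", "2012-01-10 11:00:00")
]

matches = correlate(warnings, errors)
for warning, nearby in matches.items():
    print(warning, "->", len(nearby), "errors nearby")
```

Warnings that consistently have a cluster of errors around them, at the times users complained, point at the same root cause.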
Resolution: We removed the file screen template and quota on E:\Share\*, which included the dfsrprivate directory. We then restarted both the FSRM and DFS Replication services in the services applet to generate new logs and events. The warnings about staging directory free space did not resurface. The DFS Replication debug log in C:\Windows\debug was clean, with no errors. Disk I/O ramped up in perfmon as the DFS Replication backlog caught up, but then flattened out. Reproducing the issue had been difficult from the beginning, but I could not reproduce it at all after the changes were made, and I have not heard of the issue since.
How, then, can you use FSRM file screening effectively with DFS and DFS Replication, since the two technologies don’t seem to be aware of each other? I came across the following blog post, which mentions using DFS file filters to ensure that extensions included in the FSRM file screening are also excluded from DFS replication groups. Look for reason #9.
Easy enough! I started plugging some extensions into the DFS Replication group file filters box to test this out. After typing in 254 characters, I could not add any more. The Audio and Video Files file group alone in FSRM has a list of extensions longer than this character limit. What’s up with that? Well, as this is a domain-based DFS namespace and configuration information is stored in AD, off to Active Directory we go. Opening ADSI Edit and navigating to the following attribute will allow more than 254 characters to be added.
Right-click CN=FolderName and select Properties. Navigate to the attribute msDFSR-FileFilter; there you can edit the value and add more characters than the DFS Management GUI allows.
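Typing a long extension list by hand into ADSI Edit is error-prone, so here is a small Python sketch for building the filter value first. Treat it as an assumption-laden helper rather than a verified procedure: the function names are hypothetical, the starting patterns are what I understand the default DFSR file filter to be, the comma separator is assumed to match how the attribute stores its value, and the 254 figure is simply the limit I ran into in the GUI.

```python
GUI_LIMIT = 254  # character limit observed in the DFS Management filter box


def build_file_filter(extensions, existing=("~*", "*.bak", "*.tmp")):
    """Return a comma-separated filter string without duplicate patterns."""
    patterns = list(existing)
    seen = {p.lower() for p in patterns}
    for ext in extensions:
        # Accept "mp3", ".mp3" or "*.mp3" and normalise to "*.mp3".
        pattern = "*." + ext.strip().lstrip("*").lstrip(".").lower()
        if pattern not in seen:
            seen.add(pattern)
            patterns.append(pattern)
    return ",".join(patterns)  # assumption: comma-separated value


def fits_gui(filter_value):
    """True if the value is short enough to be typed into the GUI box."""
    return len(filter_value) <= GUI_LIMIT


short_filter = build_file_filter(["mp3", ".AVI", "*.mp3"])
long_filter = build_file_filter([f"ext{i}" for i in range(100)])
print(short_filter)
print(len(long_filter), fits_gui(long_filter))
```

A value that fails fits_gui is one that would have to be pasted into msDFSR-FileFilter through ADSI Edit instead of the DFS Management console.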
I’m currently building this exact scenario in my lab environment to confirm my theory; hopefully the values added in ADSI Edit will be recognized and excluded from replication. If anyone reads this and can test it, please post back any findings.
Hope it helps someone, someday.